Walnut genome analysis

Jan Dvorak, Ming-Cheng Luo, Mallikarjuna Aradhya, Dianne Velasco, Charles A. Leslie, Sandie L. Uratsu, Monica T. Britton, Russell L. Reagan, Gale H. McGranahan and Abhaya M. Dandekar

ABSTRACT

The goal of this project is to build a set of comprehensive genomic tools for walnut. These will facilitate a more precise evaluation of breeding populations and will accelerate development of improved walnut cultivars to address the needs of both California growers and the consumers of this important agricultural commodity. Development of these tools includes (1) construction of a physical map of the walnut genome, (2) a detailed survey of walnut gene expression, and (3) fine-scale genetic and association mapping of economically important traits. Two bacterial artificial chromosome (BAC) libraries comprising a total of 129,024 clones (64,512 each) were constructed from Persian walnut (Juglans regia cv. Chandler) DNA. Average insert lengths were 135 kb (HindIII) and 120 kb (MboI) for the two libraries, respectively, providing approximately 20x genome coverage. To date, 124,890 BAC clones have been fingerprinted using the five-color SNaPshot HICF technology. The fingerprints have been edited, and 113,073 could be used for contig assembly with the FPC program. A total of 916 contigs and 4,830 singletons were obtained. A total of 54,912 BAC end sequences (BES) have also been produced, one BES per BAC. A mapping population of 265 progeny of the cross ‘Chandler’ x ‘Idaho’, which segregates for all commercially important walnut traits, was evaluated with 15 SSRs to verify the parentage of F1 progeny. In addition, the genetic structure of 399 trees among 204 diverse accessions, including 62 elite germplasm entries used for breeding, was examined in developing a population for association mapping of walnut traits. A total of 21 cDNA libraries were constructed for the characterization of the walnut transcriptome using next-generation DNA sequencing.
These genomic tools will significantly strengthen ongoing California walnut breeding efforts by facilitating marker-assisted selection strategies. The use of well-defined markers will significantly increase selection efficiency, the discovery of new genes, and rapid integration of these genes into genetic backgrounds adapted to California environmental conditions, thus accelerating the development of improved walnut cultivars.

PROJECT OBJECTIVES

1. Physical mapping of the walnut genome
2. Genetic and association mapping of economically important walnut traits
3. Functional mapping of the walnut genome
4. Development of a ‘Walnut Genome Resource (WGR)’, a web-based knowledge base of walnut genomic information

California Walnut Board 35 Walnut Research Reports 2009

PROCEDURES

Objective 1: Physical mapping of the walnut genome

A physical map of the walnut genome will be built concurrently with development of the genetic map. To construct the physical map, walnut genomic DNA fragments were cloned in a bacterial artificial chromosome (BAC) vector. Each BAC clone was digested with different restriction enzymes, and the clones were ordered into contigs based on the overlap of their fragment patterns. Ends of the BAC clones were then sequenced using Sanger DNA sequencing technology. Each of the BAC end sequences (BES) generated by this process is collinear with the BAC segments and thus corresponds to the sequence of nucleotides along a walnut chromosome. The presence of gene sequence tags (GSTs) within the BES will be confirmed through their expression in walnut tissues (Objective 3). The physical map also provides a scaffold upon which to assemble the complete walnut genomic sequence when such sequencing is performed.

Objective 2: Genetic and association mapping of economic traits in walnut
Two different approaches were proposed for walnut genome mapping: (1) linkage analysis of a conventional mapping population derived from a cross between parents that differ for the traits under consideration; and (2) association genetic analysis of a natural population, such as a germplasm collection, with genotypes of unknown or mixed ancestry that represent a common gene pool. Extensive DNA libraries of both the mapping and association mapping populations have been developed. We are currently developing genotypic data using microsatellite polymorphisms to validate the full-sib nature of the mapping population and to assess the genetic structure of the association mapping population.

Objective 3: Functional mapping of the walnut genome

Genetic and physical maps describe the structure of the genome, but it is also essential to precisely document gene expression and to link specific traits (Objective 2) and GSTs (Objective 1) to underlying metabolic and biochemical processes. A key step toward this is gene transcript sequencing to identify expressed genes. Twenty tissue-specific gene transcript libraries have been constructed using tissue samples collected over the 2008 growing season. These transcripts were copied into DNA and cloned, and high-throughput Illumina Solexa sequencing is in progress to generate millions of Expressed Sequence Tags (ESTs). The ESTs will be deposited in public databases, compiled, and annotated computationally to identify genes involved in important metabolic pathways. The ESTs will also be analyzed to validate GSTs through their expression pattern in different walnut tissues.

Objective 4: Development of a ‘Walnut Genome Resource (WGR)’, a web-based knowledge base of walnut genomic information

A web-based browser is being developed as data from Objectives 1, 2 and 3 begin to accumulate.
The database will contain all physical and linkage mapping information as well as all EST sequences, integrated with the walnut physical and genetic maps, to better inform the walnut research community and to provide access to these genomic resources.

RESULTS AND DISCUSSION

Objective 1: Physical mapping of the walnut genome

Physical mapping consists of cloning large genomic DNA fragments in a suitable vector such as a BAC and ordering the fragments so that their sequence reflects the order of nucleotides in a chromosome. The first step in the construction of physical maps is the construction of BAC libraries. Two BAC libraries were constructed from DNA isolated from in vitro grown shoots of Persian walnut (Juglans regia cv. Chandler). Walnut genomic DNA was fragmented with either HindIII or MboI restriction endonucleases. A total of 129,024 clones, 64,512 per BAC library, were arrayed in 336 plates of 384 wells each. The average insert size was around 135 kb and 120 kb for the HindIII and MboI libraries, respectively. Assuming the walnut genome is approximately 800 Mb, these two BAC libraries (64,512 x 135 kb + 64,512 x 120 kb, or about 16.5 Gb of cloned DNA) represent approximately 20x genome equivalents. Each BAC library has been stored in triplicate.

BAC fingerprinting and BAC end sequencing were initiated in 2008. We are using a previously developed fluorescence-based, high-throughput BAC DNA fingerprinting technique (Luo et al. 2003) that sizes DNA fragments from each BAC clone by capillary electrophoresis, creating a unique fragment profile, or fingerprint, for each BAC clone (Fig. 1). The “SNaPshot” fragment pattern for each BAC clone, generated with different restriction endonucleases, is analyzed on a 96-capillary ABI3730XL robotic DNA sequence analyzer. These BAC fingerprints are rapidly edited using previously developed software (GenoProfiler; You et al. 2006). Another computer program, FPC (Soderlund et al.
2000), searches for overlaps between BAC fingerprints, creating contiguous sequences of BAC clones (contigs). These contigs (Fig. 2) reflect the sequences of nucleotides along individual chromosomes. As of November 20, 2009, 124,890 BAC clones have been fingerprinted with the 5-dye SNaPshot fingerprinting technique. The fingerprints have been edited, and 113,073 could be used for contig assembly with the FPC program. A total of 916 contigs and 4,830 singletons were obtained. A total of 54,912 BAC end sequences (BES) have also been produced, one BES per BAC.

Objective 2: Genetic and association mapping of economically important traits in walnut

Genotyping of mapping populations to confirm their full-sib origin: The F1 mapping population consisting of 265 individuals of the cross “Chandler x Idaho” has been genotyped and parentage confirmed. We have augmented the DNA, and it will be available for SNP genotyping when the platform is ready in the first quarter of 2010. Phenotypic data on the following traits of breeding value have been recorded for the second year: (1) lateral vs. terminal bearing; (2) leafing and harvesting dates; (3) nut size; (4) shell thickness; (5) shell seal; (6) plumpness; (7) fill (nut/kernel ratio); (8) kernel color; and (9) yield.

Association mapping using the walnut germplasm collection: We have increased the association mapping population of English walnut to 452 trees, which have been genotyped using 21 microsatellite loci. A cluster analysis (CA) of the microsatellite data indicated mild genetic structure within cultivated walnut (Fig. 3). The elite germplasm, consisting of cultivars, breeders’ selections, and germplasm frequently used in breeding programs, showed moderate differentiation. The Chinese germplasm appeared to be unique, exhibited considerable divergence, and was closely allied with SE Asian germplasm.
Walnuts from South Asia were found to be the most diverse, and there was further evidence of differentiation within this group. A small group of West Asian germplasm is nested within this group. Both South and West Asian germplasm have been extensively used in the California breeding program. Overall, the cultivated walnut (Juglans regia) shows marginal differentiation and is suitable for association genetic analysis.

The principal component analysis (PCA) of the walnut microsatellite data largely confirmed the results of the CA; however, the genetic differentiation among the groups is more pronounced and more clearly resolved (Fig. 4). All six groups identified in the CA were evident in the PCA, but there was significant overlap among clusters, and the minimum spanning tree superimposed on the PC plot suggested that there is still significant genetic affinity within and among the groups (Fig. 5).

The data have been further analyzed to quantify and describe the genetic diversity in the collection. There is a significant deficiency of heterozygotes at all loci compared to Hardy-Weinberg expected frequencies (Table 1), indicating some level of population substructuring within the collection, as illustrated in the cluster analysis. All six groups identified in the CA and PCA showed significant inbreeding within groups, suggesting a complex but still mild population structure within the walnut germplasm collection (Table 2). Orthogonal partitioning of genetic variation indicated that most of the variation resides within groups, among individuals (~87%), and only a small proportion of the total variation is explained by genetic differentiation among groups (~13%) (Table 3). However, the FST indicated that there is still significant differentiation among groups.

Objective 3: Functional mapping of the walnut genome

Functional mapping represents a composite profile of the transcriptome of walnut.
The transcriptome represents the transcribed RNA of any particular organism. In plants, gene expression is both temporally and spatially regulated in different tissues and organ systems. A profile of the transcriptome in a plant organ such as fruit, for example, would sample all mRNA expressed in this organ and thus represents all genes expressed in fruit. Since only the functional part of the genome is observed, this provides a rapid and direct way to analyze the genes that regulate the different fruit traits expressed.

The first step is the isolation of total RNA from the tissue. Messenger RNA, which represents the transcripts of individual genes and makes up about 1-3% of total RNA, is first separated using magnetic beads carrying oligo-dT nucleotides that bind to the poly-A tails on mRNA molecules. The mRNA is then chemically fragmented, and the single-stranded mRNAs are converted to double-stranded cDNA via in vitro cDNA synthesis. Each mRNA population isolated from a particular walnut tissue and converted into a corresponding population of complementary DNA (cDNA library) represents the mRNA population of a particular plant and tissue at a particular time during development. These cDNA libraries are ligated to adapters so that they can be PCR amplified to create sufficient quantities of molecules for sequencing. Sequencing uses next-generation DNA sequencers that can profile an entire transcriptome in a single run, here Solexa technology on an Illumina Genome Analyzer (http://genomecenter.ucdavis.edu/dna_technologies/uhtsequencing.html). Such new technologies are able to generate millions of sequences corresponding to a greater diversity of the mRNA population at a lower cost than traditional Sanger sequencing. A single library is loaded onto one lane in a GAII sequencer, which generates 10-20 million sequencing reads of 20-80 nucleotides in length from both ends of each cDNA molecule.
The resulting sequences will be compiled to generate a reference walnut Unigene set, representing a high percentage of the walnut gene space.

For this project, 16 samples of walnut tissue were gathered from Chandler trees in the Stuke block at UC Davis between April and October 2008 (Table 4). Four additional samples were taken from Chandler plant material maintained in the lab of Gale McGranahan. RNA isolation and construction of cDNA libraries for each sample are nearly complete, with just two libraries remaining to be constructed (Table 4). Each cDNA sample will be sequenced separately so that the profile of genes expressed in each tissue can be determined. Paired-end (85 bp x 85 bp) sequencing is underway on an Illumina Genome Analyzer using Solexa technology. The sequencing of three libraries has just been completed, yielding 10-20 million paired-end sequences of 20-85 bp in length from each cDNA library.

Bioinformatic processing of the sequences will be conducted at the UC Davis Genome Center Bioinformatics Core, where these sequences will be compiled with the 18,000 existing walnut ESTs to generate a Unigene set, which is expected to describe most of the expressed walnut genome. This set of genes will be annotated using Blast2GO (http://blast2go.bioinfo.cipf.es/; Götz et al. 2008). This software package was recently used to annotate the set of 8,622 walnut genes compiled in our walnut-nematode project. There, Gene Ontology (GO) categories were assigned to 57% of the sequences, including 18% which were also matched to Enzyme Commission (EC) identifiers. The GO categories and EC identifiers can be used to map the associated proteins onto metabolic pathways. These data can then be used to determine which metabolic pathways are active in each tissue and at various time points in the growing season, which amounts to a functional map of gene activity in walnut.
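The last step described above, going from EC identifiers to the pathways active in each tissue, can be sketched roughly as follows. This is a minimal illustration and not the project's pipeline: the gene names, the EC-to-pathway lookup (with informal pathway labels) and the per-tissue expression sets are all hypothetical placeholders, with only the tissue codes VB and RT borrowed from Table 4. A real analysis would take its annotations from Blast2GO output.

```python
from collections import defaultdict

# Hypothetical annotation: gene -> EC identifiers (gene names are made up).
GENE_TO_EC = {
    "gene_a": ["1.10.3.1"],
    "gene_b": ["4.3.1.24"],
    "gene_c": ["2.3.1.74"],
    "gene_d": [],  # a gene with no EC identifier assigned
}

# Hypothetical EC -> pathway lookup; pathway labels are informal placeholders.
EC_TO_PATHWAY = {
    "1.10.3.1": "phenolic oxidation",
    "4.3.1.24": "phenylpropanoid biosynthesis",
    "2.3.1.74": "flavonoid biosynthesis",
}

# Hypothetical sets of genes detected in each tissue library.
EXPRESSED = {
    "VB": {"gene_a", "gene_b"},
    "RT": {"gene_b", "gene_c", "gene_d"},
}


def active_pathways(expressed, gene_to_ec, ec_to_pathway):
    """Return {tissue: {pathway: number of expressed genes mapped to it}}."""
    result = {}
    for tissue, genes in expressed.items():
        counts = defaultdict(int)
        for gene in genes:
            for ec in gene_to_ec.get(gene, []):
                pathway = ec_to_pathway.get(ec)
                if pathway is not None:
                    counts[pathway] += 1
        result[tissue] = dict(counts)
    return result


if __name__ == "__main__":
    table = active_pathways(EXPRESSED, GENE_TO_EC, EC_TO_PATHWAY)
    for tissue in sorted(table):
        print(tissue, sorted(table[tissue].items()))
```

Counting genes per pathway, rather than only recording presence or absence, keeps the door open for the kind of per-tissue comparison the report describes, although read-count-based expression levels would be needed for anything quantitative.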
Objective 4: Development of a ‘Walnut Genome Resource (WGR)’, a web-based knowledge base of walnut genomic information

A genome resource or knowledgebase is a database that provides access to genetic, physical, and functional mapping data generated in this project. The resource, which is now web accessible (http://walnutgenome.ucdavis.edu/), will have two distinct components: one for visualizing the genetic map and one for visualizing the physical map. The database will provide access to all of the fingerprinting data along with the BAC end sequences (Fig. 1). Tools are available to integrate and represent this information as a physical map showing individual contigs (Fig. 2). The physical map is a scaffold on which we will integrate genetic mapping data of walnut phenotypic information, molecular markers, and expressed genes as that information becomes available.

The main objective of functional mapping is gene annotation and functional categorization of ESTs, the first step toward linking specific traits and conditions to metabolic and biochemical processes. Every EST enters a functional mapping pipeline that uses Blast2GO to assign sequences to metabolic pathways and, in some cases, to regulatory roles in quality traits, pathogen resistance and stress response. This data pipeline includes gene set enrichment, a statistical analysis to determine which pathways and functional classes are expressed at higher or lower levels in each tissue sample compared to the walnut Unigene set. Patterns in gene expression are visualized by unsupervised clustering, which reveals patterns based on expression alone across several samples, and by functional visualizations. The Pathway Tools Omics Viewer (http://www.plantcyc.org; Zhang et al. 2005), and MapMan (http://mapman.gabipd.org; Thimm et al.
2004) visualization platforms can display high throughput expression data of individual genes and whole pathways in combination with standard pathway maps and graphical classifications of gene functions. We have used MapMan to visualize expression data from experiments on tomato, apple, citrus, and grape, in addition to walnut.

REFERENCES

Götz S., García-Gómez J.M., Terol J., Williams T.D., Nagaraj S.H., Nueda M.J., Robles M., Talón M., Dopazo J., Conesa A. (2008) High-throughput functional annotation and data mining with the Blast2GO suite. Nucleic Acids Res. 36:3420-3435.

Luo M.C., Thomas C., You F.M., Hsiao J., Ouyang S., Buell C.R., Malandro M., McGuire P.E., Anderson O.D., Dvorak J. (2003) High-throughput fingerprinting of bacterial artificial chromosomes using the SNaPshot labeling kit and sizing of restriction fragments by capillary electrophoresis. Genomics 82:378-389.

Soderlund C., Humphray S., Dunham A., French L. (2000) Contigs built with fingerprints, markers, and FPC V4.7. Genome Research 10:1772-1787.

Thimm O., Blasing O., Gibon Y., Nagel A., Meyer S., Kruger P., Selbig J., Muller L.A., Rhee S.Y., Stitt M. (2004) MAPMAN: a user-driven tool to display genomics data sets onto diagrams of metabolic pathways and other biological processes. The Plant Journal 37:914-939.

Yu J., Pressoir G., Briggs W.H., Bi I.V., Yamasaki M., Doebley J.F., McMullen M.D., Gaut B.S., Nielsen D.M., Holland J.B., Kresovich S., Buckler E.S. (2006) A unified mixed-model method for association mapping that accounts for multiple levels of relatedness. Nat. Genet. 38:203-208.

Zhang P., Foerster H., Tissier C., Mueller L., Paley S., Karp P., Rhee S.Y. (2005) MetaCyc and AraCyc. Metabolic pathway databases for plant research. Plant Physiology 138:27-37.

Table 1. Locus-wide genetic variability in walnut germplasm

Locus    #Genotype  H(o)     H(e)     P
WGA001   450        0.62222  0.80766  0.00000
WGA202   437        0.61327  0.83625  0.00000
WGA384   427        0.45902  0.58113  0.00000
WGA331   445        0.58652  0.65681  0.01330
WGA332   451        0.54767  0.64695  0.00011
WGA321   451        0.60310  0.72423  0.00000
WGA009   424        0.65566  0.77891  0.00000
WGA118   449        0.64588  0.79499  0.00000
WGA225   432        0.38889  0.49816  0.00000
WGA004   447        0.58166  0.68289  0.00000
WGA069   451        0.55876  0.81452  0.00000
WGA089   450        0.47778  0.66620  0.00000
WGA349   414        0.28986  0.82987  0.00000
WGA338   450        0.52000  0.55639  0.00975
WGA178   451        0.67184  0.74688  0.00657
WGA318   399        0.34837  0.82008  0.00000
WGA106   450        0.33556  0.42703  0.00011
WGA237   446        0.33632  0.59402  0.00000
WGA071   450        0.34222  0.38149  0.00010
WGA242   450        0.49556  0.65027  0.00000
WGA223   425        0.53412  0.82727  0.00000

Table 2. Group specific fixation indices (inbreeding coefficient)

Group                FIS      P
China                0.08503  0.000000
SE Asia              0.14087  0.003910
Breeders’ gene pool  0.12687  0.000000
Bred                 0.05423  0.076246
W Asia               0.09642  0.001955
S Asia               0.10898  0.000000

Table 3. Orthogonal partitioning of genetic variation within and among groups

Source of variation  d.f.  Sum of squares  Variance components  Percentage of variation
Among groups         5     523.378         0.70793              13.37**
Within groups        898   4118.848        4.58669              86.63**
Total                903   4642.226        5.29462

Fixation index: 0.134**

Table 4: Functional mapping of the walnut genome (Objective 3), walnut tissues gathered April to Nov 2008 for cDNA library construction and EST analysis

Sample No  Tissue Source      Genotype  Developmental Stage  Source    Harvest Date  Code  cDNA Library  RNA seq done
1          Vegetative Bud     Chandler  Vegetative           Tree      4/1/08        VB    Yes           Done
2          Leaf Young         Chandler  Vegetative           Tree      4/15/08       LY    Yes           Done
3          Root               Chandler  Vegetative           Pot       8/27/08       RT    Yes           Done
4          Callus Interior    Chandler  Vegetative           In Vitro  10/14/08      CI    Yes           Waiting
5          Callus Exterior    Chandler  Vegetative           In Vitro  10/14/08      CE    Yes           Done
6          Pistillate Flower  Chandler  Vegetative           Tree      4/17/08       FL    Yes           Done
7          Catkins            Chandler  Immature             Tree      4/1/08        CK    Yes           Done
8          Somatic Embryo     Chandler  Immature             In Vitro  10/14/08      SE    Yes           Done
9          Leaf-mature        Chandler  Vegetative           Tree      6/5/08        LM    Yes           Waiting
10         Leaves             Chandler  Vegetative           Tree      4/17/08       LE    Yes           Waiting
11         Fruit immature     Mixed     Immature             Tree      6/5/08        IF    Yes           Waiting
12         Hull immature      Chandler  Immature             Tree      6/5/08        HL    Yes           Waiting
13         Packing Tissue     Chandler  Immature             Tree      6/5/08        PT    Yes           Waiting
14         Hull Peel          Chandler  Mature               Tree      9/4/08        HP    Yes           Waiting
15         Hull Cortex        Chandler  Mature               Tree      9/4/08        HC    Yes           Waiting
16         Packing Tissue     Chandler  Mature               Tree      9/4/08        PK    Yes           Waiting
17         Pellicle           Chandler  Mature               Tree      9/4/08        PL    No            No
18         Embryo             Mixed     Mature               Tree      9/4/08        EM    Yes           Done
19         Leaf – late        Chandler  Senescent            Tree      10/15/08      LS    No            No
20         Hull – dehiscing   Chandler  Senescent            Tree      10/15/08      HU    Yes           Waiting
21         Transition wood    J. nigra  Transition Zone      Tree      04/02/08      KW    Yes           Waiting

Figure 1. Example of a single fingerprint for a walnut BAC clone. To date 65,280 BACs have been fingerprinted with 92% being of high quality suitable for contig assembly.
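The heterozygote deficit reported in Table 1 and the fixation index in Table 3 can be re-derived from the tabulated values. The sketch below assumes the usual textbook definitions: a per-locus inbreeding coefficient F = 1 - H(o)/H(e), and a fixation index equal to the among-group variance component divided by the total variance. The report does not say which estimators its software used, so these formulas are an assumption, and the function names are illustrative.

```python
# Three loci copied from Table 1: locus -> (H(o), H(e)).
TABLE1_SUBSET = {
    "WGA001": (0.62222, 0.80766),
    "WGA338": (0.52000, 0.55639),
    "WGA349": (0.28986, 0.82987),
}

# Variance components copied from Table 3.
VAR_AMONG_GROUPS = 0.70793
VAR_WITHIN_GROUPS = 4.58669


def inbreeding_coefficient(h_obs, h_exp):
    """Assumed definition: F = 1 - H(o)/H(e); F > 0 means fewer heterozygotes than expected."""
    return 1.0 - h_obs / h_exp


def fixation_index(var_among, var_within):
    """Assumed definition: share of total variance that lies among groups."""
    return var_among / (var_among + var_within)


if __name__ == "__main__":
    for locus, (h_obs, h_exp) in TABLE1_SUBSET.items():
        print(locus, round(inbreeding_coefficient(h_obs, h_exp), 4))
    fst = fixation_index(VAR_AMONG_GROUPS, VAR_WITHIN_GROUPS)
    print("fixation index:", round(fst, 3))  # 0.134, matching Table 3
```

Under these assumptions the among-group share also reproduces the 13.37% shown in the percentage column of Table 3, which suggests the tabulated fixation index is simply that ratio.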
Figure 2. Example of an assembled contig, containing 557 BAC clones, ca. 4.7 Mb in length.

Figure 3. An unrooted tree depicting the genetic structure and differentiation within the walnut (Juglans regia) germplasm collection based on 21 microsatellite loci.

Figure 4. 3D projection of walnut germplasm along the first three principal axes.

Figure 5. 3D projection of walnut germplasm accessions along the first three PC axes with minimum spanning tree superimposed.
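Figure 5 superimposes a minimum spanning tree on the PC plot. As a rough sketch of how such a tree can be built, the example below runs Prim's algorithm on Euclidean distances between points in PC space. The accession labels and coordinates are invented for illustration, and the report does not say which software or distance measure produced the published tree, so this is one plausible construction rather than the authors' method.

```python
import math


def minimum_spanning_tree(points):
    """Prim's algorithm on Euclidean distances.

    points: dict mapping a label to a tuple of coordinates.
    Returns a list of (label_in_tree, new_label, distance) edges.
    """
    labels = list(points)
    if not labels:
        return []
    in_tree = {labels[0]}
    edges = []
    while len(in_tree) < len(labels):
        best = None
        # Find the shortest edge joining the tree to a point outside it.
        for a in in_tree:
            for b in labels:
                if b in in_tree:
                    continue
                d = math.dist(points[a], points[b])
                if best is None or d < best[2]:
                    best = (a, b, d)
        in_tree.add(best[1])
        edges.append(best)
    return edges


# Hypothetical accessions with made-up scores on PC1, PC2 and PC3.
PC_SCORES = {
    "acc1": (0.0, 0.0, 0.0),
    "acc2": (1.0, 0.0, 0.0),
    "acc3": (1.0, 1.0, 0.0),
    "acc4": (5.0, 5.0, 5.0),
}

if __name__ == "__main__":
    for a, b, d in minimum_spanning_tree(PC_SCORES):
        print(f"{a} - {b}: {d:.3f}")
```

Long edges in such a tree, like the one reaching the outlying point here, mark accessions or groups that sit apart from the rest, while short edges linking points across clusters are one way to read the "genetic affinity within and among the groups" noted in the text.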