Background: Pythium ultimum is a ubiquitous oomycete plant pathogen responsible for a variety of diseases on a broad range of crop and ornamental species. Results: The P. ultimum genome (42.8 Mb) encodes 15,290 genes and has extensive sequence similarity and synteny with related Phytophthora species, including the potato blight pathogen Phytophthora infestans. Whole transcriptome sequencing revealed expression of 86% of genes, with detectable differential expression of suites of genes under abiotic stress and in the presence of a host. The predicted proteome includes a large repertoire of proteins involved in plant-pathogen interactions, although, surprisingly, the P. ultimum genome does not encode any classical RXLR effectors and relatively few Crinkler genes in comparison to related phytopathogenic oomycetes. A lower number of enzymes involved in carbohydrate metabolism were present compared to Phytophthora species, with the notable absence of cutinases, suggesting a significant difference in virulence mechanisms between P. ultimum and more host-specific oomycete species. Although we observed a high degree of orthology with Phytophthora genomes, there were novel features of the P. ultimum proteome, including an expansion of genes involved in proteolysis and genes unique to Pythium. We identified a small gene family of cadherins, proteins involved in cell adhesion, the first report of these in a genome outside the metazoans. Conclusions: Access to the P. ultimum genome has revealed not only core pathogenic mechanisms within the oomycetes but also lineage-specific genes associated with the alternative virulence and lifestyles found within the pythiaceous lineages compared to the Peronosporaceae.
Thellungiella salsuginea (also known as T. halophila) is a close relative of Arabidopsis that is very tolerant of drought, freezing, and salinity and may be an appropriate model to identify the molecular mechanisms underlying abiotic stress tolerance in plants. We produced 6578 ESTs, which represented 3628 unique genes (unigenes), from cDNA libraries of cold-, drought-, and salinity-stressed plants from the
Motivation: The growth of sequence data has been accompanied by an increasing need to analyze data on distributed computer clusters. The use of these systems for routine analysis requires scalable and robust software for data management of large datasets. Software is also needed to simplify data management and make large-scale bioinformatics analysis accessible and reproducible to a wide class of target users. Results: We have developed a workflow management system named Ergatis that enables users to build, execute and monitor pipelines for computational analysis of genomics data. Ergatis contains preconfigured components and template pipelines for a number of common bioinformatics tasks such as prokaryotic genome annotation and genome comparisons. Outputs from many of these components can be loaded into a Chado relational database. Ergatis was designed to be accessible to a broad class of users and provides a user-friendly, web-based interface. Ergatis supports high-throughput batch processing on distributed compute clusters and has been used for data management in a number of genome annotation and comparative genomics projects.
Genome sequence and analysis of the tuber crop potato. The Potato Genome Sequencing Consortium. Potato (Solanum tuberosum L.) is the world's most important non-grain food crop and is central to global food security. It is clonally propagated, highly heterozygous, autotetraploid, and suffers acute inbreeding depression. Here we use a homozygous doubled-monoploid potato clone to sequence and assemble 86% of the 844-megabase genome. We predict 39,031 protein-coding genes and present evidence for at least two genome duplication events indicative of a palaeopolyploid origin. As the first genome sequence of an asterid, the potato genome reveals 2,642 genes specific to this large angiosperm clade. We also sequenced a heterozygous diploid clone and show that gene presence/absence variants and other potentially deleterious mutations occur frequently and are a likely cause of inbreeding depression. Gene family expansion, tissue-specific expression and recruitment of genes to new pathways contributed to the evolution of tuber development. The potato genome sequence provides a platform for genetic improvement of this vital crop.
Single nucleotide polymorphism discovery in elite North American potato germplasm. Hamilton et al. BMC Genomics 2011, 12:302
The International Cancer Genome Consortium (ICGC) is a collaborative effort to characterize genomic abnormalities in 50 different cancer types. To make this data available, the ICGC has created the ICGC Data Portal. Powered by the BioMart software, the Data Portal allows each ICGC member institution to manage and maintain its own databases locally, while seamlessly presenting all the data in a single access point for users. The Data Portal currently contains data from 24 cancer projects, including ICGC, The Cancer Genome Atlas (TCGA), Johns Hopkins University, and the Tumor Sequencing Project. It consists of 3478 genomes and 13 cancer types and subtypes. Available open access data types include simple somatic mutations, copy number alterations, structural rearrangements, gene expression, microRNAs, DNA methylation and exon junctions. Additionally, simple germline variations are available as controlled access data. The Data Portal uses a web-based graphical user interface (GUI) to of...
Background: Current breeding approaches in potato rely almost entirely on phenotypic evaluations; molecular markers, with the exception of a few linked to disease resistance traits, are not widely used. Large-scale sequence datasets generated primarily through Sanger Expressed Sequence Tag projects are available from a limited number of potato cultivars, and access to next-generation sequencing technologies permits rapid generation of sequence data for additional cultivars. When coupled with the advent of high-throughput genotyping methods, an opportunity now exists for potato breeders to incorporate considerably more genotypic data into their decision-making. Results: To identify a large number of Single Nucleotide Polymorphisms (SNPs) in elite potato germplasm, we sequenced normalized cDNA prepared from three commercial potato cultivars: 'Atlantic', 'Premier Russet' and 'Snowden'. For each cultivar, we generated 2 Gb of sequence which was assembled into...
We present the genome sequences of a new clinical isolate of the important human pathogen, Aspergillus fumigatus, A1163, and two closely related but rarely pathogenic species, Neosartorya fischeri NRRL181 and Aspergillus clavatus NRRL1. Comparative genomic analysis of A1163 with the recently sequenced A. fumigatus isolate Af293 has identified core, variable
Genome sequence of the necrotrophic plant pathogen, Pythium ultimum, reveals original pathogenicity mechanisms and effector repertoire
Background: Low levels of sample contamination can have disastrous effects on the accurate identification of somatic variation in tumor samples. Detection of sample contamination in DNA is generally based on observation of low-frequency variants that suggest more than a single source of DNA is present. This strategy works with standard DNA samples but is especially problematic in solid tumor FFPE samples because there can be huge variations in allele frequency (AF) due to massive copy number changes arising from large gains and losses across the genome. The tremendously variable allele frequencies make detection of contamination challenging. A method not based on individual AF is needed for accurate determination of whether a sample is contaminated and to what degree. Methods: We used microhaplotypes to determine whether sample contamination is present. Microhaplotypes are sets of variants on the same sequencing read that can be unambiguously phased. Instead of measuring AF, the number a...
Genes related to sex and reproduction are known to evolve rapidly; however, the mechanism for rapid evolutionary change is proving to be more complex than a simple relaxation of selective constraint. We compared the divergence between orthologous human and mouse fertility genes according to their degree of dispensability as suggested by mouse knockout mutation phenotypes. The dataset consisted of 161 orthologous genes affecting fertility and 803 orthologous genes affecting viability. We find that essential fertility genes affecting both sexes evolve at a rate similar to that of essential viability genes, but that within sexes the degree of dispensability is not an important factor affecting the rate of fertility gene evolution. We also find no difference in the evolutionary rates of fertility genes that affect the male versus the female; however, there are a greater number of sterility genes that affect the male. Generally, there are a significantly greater number of fertility genes that affect one sex rather than both, suggesting that fertility genes tend toward sex-specific functions, particularly in the male. Our findings support the hypothesis that the rapid evolution of sex- and reproduction-related genes is facilitated through an increased specialization of gene function and that dispensability is not a major factor determining their evolutionary rate.
Potato (Solanum tuberosum) is the third most important food crop in the world. Potato tubers must be stored at cold temperatures to prevent sprouting, minimize disease losses, and supply consumers and the processing industry with high-quality tubers ...
Papers by Brett Whitty