Primary levels of the marine food chain may play an important role in the fate of petroleum hydrocarbons in both chemically dispersed and undispersed oil spills. HSP-60 proteins, members of the chaperonin family of stress proteins, are induced in response to a wide variety of environmental agents, including UV light, heavy metals, and xenobiotics. Increased production and storage of carbohydrate in I. galbana has been associated with aging and stress. Thus, HSP-60 and carbohydrate storage were selected as sublethal endpoints of exposure for the primary producer I. galbana, a golden-brown, unicellular alga and a significant component of the marine phytoplankton community. The authors have found that I. galbana cultures exposed to water-accommodated fractions (WAF) of Prudhoe Bay Crude Oil (PBCO) and PBCO/dispersant preparations efficiently induce HSP-60. Studies indicated that WAF produced a dose-related response in I. galbana, which increased as a function of time. Dispersant alone showed the greatest induction, while combined WAF-dispersant showed less induction, suggesting a possible competition between crude oil and algae for dispersant interaction. In addition, they have demonstrated that I. galbana accumulates carbohydrates in response to exposure to WAF and PBCO/dispersant preparations; carbohydrate accumulation therefore represents another index of stress in this organism. They were interested in determining whether induction of stress proteins, and HSP-60 in particular, represented an adaptive mechanism allowing this alga to better cope with exposure to petroleum hydrocarbons released in the marine environment during an oil spill.
In an effort to determine whether stress protein induction serves as a protective adaptive response to exposure to petroleum hydrocarbons, they examined the effect of heat shock induction on the accumulation of carbohydrates by these organisms in response to exposure to WAF and dispersed oil preparations.
Many bioassays examine the effects of a single stressor on organism health, but in the environment organisms are rarely, if ever, exposed to a single stressor. The development of an assay that measures overall organism health would be desirable. Stress proteins, including hsp60, are a group of highly conserved proteins that are induced in response to a variety of environmental agents and are well suited to measure the effects of multiple stressors due to their integrative nature (Sanders (1993) Critical Reviews in Toxicology 23, 49–75). They have been postulated to confer a protective response in the cell, shielding it from further protein damage. The goal of this investigation was to demonstrate an integrative stress response to multiple environmental contaminants using the marine rotifer, Brachionus plicatilis, which has a demonstrated ability to produce hsp60 (Cochrane et al. (1991) Comparative Biochemistry and Physiology 98C, 385–390).
bioRxiv (Cold Spring Harbor Laboratory), Apr 13, 2017
Understanding gene regulation and function requires a genome-wide method capable of capturing both gene expression levels and isoform diversity at the single-cell level. Short-read RNAseq is limited in its ability to resolve complex isoforms because it fails to sequence full-length cDNA copies of RNA molecules. Here, we investigate whether RNAseq using the long-read single-molecule Oxford Nanopore MinION sequencer is able to identify and quantify complex isoforms without sacrificing accurate gene expression quantification. After benchmarking our approach, we analyse individual murine B1a cells using a custom multiplexing strategy. We identify thousands of unannotated transcription start and end sites, as well as hundreds of alternative splicing events in these B1a cells. We also identify hundreds of genes expressed across B1a cells that display multiple complex isoforms, including several B cell-specific surface receptors. Our results show that we can identify and quantify complex isoforms at the single-cell level.
As a step towards simplifying and reducing the cost of haplotype-resolved de novo assembly, we describe new methods for accurately phasing nanopore data with the Shasta genome assembler and a modular tool for extending phasing to the chromosome scale called GFAse. We test using new variants of Oxford Nanopore Technologies’ (ONT) PromethION sequencing, including those using proximity ligation, and show that newer, higher accuracy ONT reads substantially improve assembly quality.
The human Y chromosome has been notoriously difficult to sequence and assemble because of its complex repeat structure including long palindromes, tandem repeats, and segmental duplications [1–3]. As a result, more than half of the Y chromosome is missing from the GRCh38 reference sequence and it remains the last human chromosome to be finished [4, 5]. Here, the Telomere-to-Telomere (T2T) consortium presents the complete 62,460,029 base pair sequence of a human Y chromosome from the HG002 genome (T2T-Y) that corrects multiple errors in GRCh38-Y and adds over 30 million base pairs of sequence to the reference, revealing the complete ampliconic structures of TSPY, DAZ, and RBMY gene families; 41 additional protein-coding genes, mostly from the TSPY family; and an alternating pattern of human satellite 1 and 3 blocks in the heterochromatic Yq12 region. We have combined T2T-Y with a prior assembly of the CHM13 genome [4] and mapped available population variation, clinical variants, and functional genom...
Historically, RNA has been sequenced as cDNA copies derived from reverse transcription of cellular RNA followed by PCR amplification. Recently, RNA sequencing using nanopores has emerged as an alternative. Using this technology, individual cellular RNA strands are read directly as they are driven through nanoscale pores by an applied voltage. The speed of translocation is regulated by a helicase that is loaded onto each RNA strand by an adapter that also facilitates capture by the nanopore electric field. Here we describe a technique for adapting human ribosomal RNA subunits for nanopore sequencing. Using this strategy, a single Oxford Nanopore MinION run delivered 470,907 sequence reads of which 396,048 aligned to ribosomal RNA, with 28S, 18S, 5.8S, and 5S coverage of 6,053, 369,472, 16,058, and 4,465 reads, respectively. Example alignments that reveal putative nucleotide modifications are provided.
The SARS-CoV-2 virus has a complex transcriptome characterised by multiple, nested subgenomic RNAs used to express structural and accessory proteins. Long-read sequencing technologies such as nanopore direct RNA sequencing can recover full-length transcripts, greatly simplifying the assembly of structurally complex RNAs. However, these techniques do not detect the 5′ cap, thus preventing reliable identification and quantification of full-length, coding transcript models. Here we used Nanopore ReCappable Sequencing (NRCeq), a new technique that can identify capped full-length RNAs, to assemble a complete annotation of SARS-CoV-2 sgRNAs and annotate the location of capping sites across the viral genome. We obtained robust estimates of sgRNA expression across cell lines and viral isolates and identified novel canonical and non-canonical sgRNAs, including one that uses a previously un-annotated leader-to-body junction site. The data generated in this work constitute a useful resource for...
Archives of Environmental Contamination and Toxicology, 1999
Hsp60 induction was selected as a sublethal endpoint of toxicity for Brachionus plicatilis exposed to a water-accommodated fraction (WAF) of Prudhoe Bay crude oil (PBCO), a PBCO/dispersant (Corexit 9527) fraction, and Corexit 9527 alone. To examine the effect of multiple stressors, exposures modeled San Francisco Bay, where copper levels are approximately 5 µg/L, salinity is 22‰, significant oil transport and refining occur, and petroleum releases have occurred historically. Rotifers were exposed to copper at 5 µg/L for 24 h, followed by one of the oil/dispersant preparations for 24 h. Batch-cultured rotifers, rather than cysts, were used in this study to model wild populations. SDS-PAGE with western blotting using hsp60-specific antibodies and chemiluminescent detection were used to isolate, identify, and measure induced hsp60 as a percentage of control values. Both PBCO/dispersant and dispersant-alone preparations induced significant levels of hsp60. However, hsp60 expression was reduced to that of controls at high WAF concentrations, suggesting interference with protein synthesis. Rotifers that had been preexposed to copper maintained elevated levels of hsp60 upon treatment with WAF at all concentrations. Results suggest that induction of hsp60 by chronic low-level exposure may serve as a protective mechanism against subsequent or multiple stressors and that hsp60 levels are not additive for the toxicants tested in this study, giving no dose-response relationship. The methods employed in this study could be useful for quantifying hsp60 levels in wild rotifer populations.
Adaptation to sublethal exposure to crude oil by phytoplankton is poorly understood. Use of chemical dispersants for oil spill remediation increases petroleum hydrocarbon concentrations in water, while exposing marine organisms to potentially toxic concentrations of dispersant. Heat shock proteins (hsps) have been found to serve as an adaptive and protective mechanism against environmental stresses. The objective of this project was to examine the induction of hsps in Isochrysis galbana, a golden-brown alga, following exposure to the water-accommodated fraction (WAF) of Prudhoe Bay crude oil (PBCO) and PBCO chemically dispersed with Corexit 9527® (dispersed oil: DO). Initial experiments using 35S-labeled amino acids and 2-dimensional electrophoresis with subsequent western blotting identified and confirmed hsp60, a member of the chaperonin family of stress proteins, as being efficiently induced by heat shock in this species. One-dimensional SDS-PAGE and western blotting, with hsp60 antibodies and chemiluminescence detection, were used to quantitate hsp60 following exposure to a range of environmental temperatures and concentrations of WAF and DO preparations. I. galbana cultured in 22 parts per thousand (‰) salinity showed a statistically significant increase (p
2,3,7,8-Tetrachlorodibenzo-p-dioxin (TCDD) is the most toxic congener of a large class of toxic pollutants, collectively known as halogenated dioxins (1,2). The dioxins, together with the halogenated dibenzofurans, polychlorinated biphenyls, and polybrominated biphenyls, belong to a larger class of toxins collectively referred to as the halogenated aromatic hydrocarbons (HAHs). HAHs are characterized by a common set of toxic effects and biochemical changes, including immunosuppression, carcinogenesis, teratogenesis, and induction of cytochrome CYP1A1 and other components of xenobiotic detoxification enzyme systems. One of the most prominent
Wilms tumor is the most common childhood kidney cancer. Fifteen percent of patients with Wilms tumor have germline pathogenic variants in genes or regions such as WT1 or the 11p15 region. Variants in these regions can include structural or copy number alterations or alterations in methylation. In the majority of cases of Wilms tumor, no known pathogenic variant can be found using state-of-the-art technologies, including comprehensive approaches such as Illumina whole exome sequencing. One explanation for this is that such technologies have difficulty detecting structural variants (SVs) in areas associated with repeat or low-complexity sequence. In addition, Illumina technology does not immediately support direct methylation detection. Therefore, we hypothesized that analysis of bilateral Wilms tumor of unknown etiology using long-read sequencing could reveal molecular events of potential clinical interest. We performed in-depth genomic analysis on a whole blood DNA sample from a patient with a bilateral Wilms tumor. This patient had no significant family history of cancer and previously tested negative for Beckwith-Wiedemann syndrome by methylation testing of the 11p15 region; clinical exome sequencing of the patient's germline detected no variants associated with Wilms tumors. We sequenced the genome at 40x depth using PromethION Nanopore sequencing. In total, 29,180 SVs (deletions or insertions larger than 30 base pairs) were detected using Sniffles and 26,480 were detected with SVIM. Only SVs that were detected by both methods were considered for downstream analysis. Variants were annotated and filtered using a short-read catalog of SVs (gnomAD-SV), a long-read catalog, the Database of Genomic Variants catalog, and by comparison to 11 in-house genomes. We focused on SVs, copy number variants, and methylation events affecting genes previously associated with Wilms tumor.
Our long-read sequencing approach detected compound heterozygotes using phased variant calls. A heterozygous missense mutation was identified in haplotype one, while a 300 base pair insertion in an Alu element was present in haplotype two. These two compound heterozygous variants overlap an exon of the OVCH2 gene, and the Alu element was not detected by the prior Illumina analysis. Additionally, we determined the frequency of methylation in CpG sites genome-wide using nanopolish. Using a normal blood sample from an unrelated individual as a control, we searched for extreme differences across large and gene promoter regions. Hypermethylation in the promoter regions of genes in the 11p15.5 locus was observed in the patient as compared to the control. Hypomethylation in this region is associated with Beckwith-Wiedemann syndrome. In conclusion, nanopore technology is able to detect variants missed by Illumina sequencing and has the potential to yield new findings of interest in a case of a child with suspected cancer predisposition syndrome. Citation Format: Allison R. Cheney, Jean Monlong, Holly C. Beale, Hugh Olsen, Ellen Towle Kephart, Katrina Learned, Shanna White, Julian A. Martinez-Agosto, Noah Federman, Mark Akeson, Miten Jain, Vivian Y. Chang, Olena M. Vaske. Long-read sequencing characterization of a patient with bilateral Wilms tumor of unknown etiology [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2021; 2021 Apr 10-15 and May 17-21. Philadelphia (PA): AACR; Cancer Res 2021;81(13_Suppl):Abstract nr 261.
There is growing recognition that epivariations, most often recognized as promoter hypermethylation events that lead to gene silencing, are associated with a number of human diseases. However, little information exists on the prevalence and distribution of rare epigenetic variation in the human population. In order to address this, we performed a survey of methylation profiles from 23,116 individuals using the Illumina 450k array. Using a robust outlier approach, we identified 4,452 unique autosomal epivariations, including potentially inactivating promoter methylation events at 384 genes linked to human disease. For example, we observed promoter hypermethylation of BRCA1 and LDLR at population frequencies of ∼1 in 3,000 and ∼1 in 6,000, respectively, suggesting that epivariations may underlie a fraction of human disease which would be missed by purely sequence-based approaches. Using expression data, we confirmed that many epivariations are associated with outlier gene expression. Analysis of variation data and monozygous twin pairs suggests that approximately two-thirds of epivariations segregate in the population secondary to underlying sequence mutations, while one-third are likely sporadic events that occur post-zygotically. We identified 25 loci where rare hypermethylation coincided with the presence of an unstable CGG tandem repeat, validated the presence of CGG expansions at several loci, and identified the putative molecular defect underlying most of the known folate-sensitive fragile sites in the genome. Our study provides a catalog of rare epigenetic changes in the human genome, gives insight into the underlying origins and consequences of epivariations, and identifies many hypermethylated CGG repeat expansions.
The advent of a miniaturized DNA sequencing device with a high-throughput contextual sequencing capability embodies the next generation of large scale sequencing tools. The MinION™ Access Programme (MAP) was initiated by Oxford Nanopore Technologies™ in April 2014, giving public access to their USB-attached miniature sequencing device. The MinION Analysis and Reference Consortium (MARC) was formed by a subset of MAP participants, with the aim of evaluating and providing standard protocols and reference data to the community. Envisaged as a multi-phased project, this study provides the global community with the Phase 1 data from MARC, where the reproducibility of the performance of the MinION was evaluated at multiple sites. Five laboratories on two continents generated data using a control strain of Escherichia coli K-12, preparing and sequencing samples according to a revised ONT protocol. Here, we provide the details of the protocol used, along with a preliminary analysis of the c...
In Vivo Modulation of 17β-Estradiol-Induced Vitellogenin Synthesis and Estrogen Receptor in Rainbow Trout (Oncorhynchus mykiss) Liver Cells by β-Naphthoflavone. ANDERSON, M. J., OLSEN, H., MATSUMURA, F., AND HINTON, D. E. (1996). Toxicol. Appl. Pharmacol. 137, 210–218. Vitellogenesis or egg yolk production represents a key estrogen initiated process in oviparous vertebrates which is crucial for oocyte maturation. Previous in vitro studies have shown that cytochrome P4501A1...
may have either antiestrogenic or estrogen potentiating activity based on compound and its ability to induce CYP1A1 (Anderson et al., 1995; Villalobos et al., 1995). For example, 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD), 2,3,7,8-tetrachlorodibenzofuran, 2,3,4,7,8-pentachlorodibenzofuran, and β-naphthoflavone (βNF) proved antiestrogenic at concentrations tested, while the ortho-substituted, weak CYP1A1-inducing polychlorinated biphenyl (PCB) 2,3,4,4′,5-pentachlorobiphenyl (congener 114) was estrogen potentiating.
Motivation: Nucleotide modification status can be decoded from the Oxford Nanopore Technologies nanopore-sequencing ionic current signals. Although various algorithms have been developed for nanopore-sequencing-based modification analysis, more detailed characterizations, such as modification numbers, corresponding signal levels and proportions are still lacking. Results: We present a framework for the unsupervised determination of the number of nucleotide modifications from nanopore-sequencing readouts. We demonstrate the approach can effectively recapitulate the number of modifications, the corresponding ionic current signal levels, as well as mixing proportions under both DNA and RNA contexts. We further show, by integrating information from multiple detected modification regions, that the modification status of DNA and RNA molecules can be inferred. This method forms a key step of de novo characterization of nucleotide modifications, shedding light on the interpretation of various...
Nanopore sequencing devices read individual RNA strands directly. This facilitates identification of exon linkages and nucleotide modifications; however, using conventional methods the 5′ and 3′ ends of poly(A) RNA cannot be identified unambiguously. This is due in part to the architecture of the nanopore/enzyme-motor complex, and in part to RNA degradation in vivo and in vitro that can obscure transcription start and end sites. In this study, we aimed to identify individual full-length human RNA isoform scaffolds among ~4 million nanopore poly(A)-selected RNA reads. First, to identify RNA strands bearing 5′ m7G caps, we exchanged the biological cap for a modified cap attached to a 45-nucleotide oligomer. This oligomer adaptation method improved 5′ end sequencing and ensured correct identification of the 5′ m7G capped ends. Second, among these 5′-capped nanopore reads, we screened for ionic current signatures consistent with a 3′ polyadenylation site. Combining these two steps, we i...
We describe a method for direct tRNA sequencing using the Oxford Nanopore MinION. The principal technical advance is custom adapters that facilitate end-to-end sequencing of individual transfer RNA (tRNA) molecules at subnanometer precision. A second advance is a nanopore sequencing pipeline optimized for tRNA. We tested this method using purified E. coli tRNA(fMet), tRNA(Lys), and tRNA(Phe) samples. 76–92% of individual aligned tRNA sequence reads were full length. As a proof of concept, we showed that nanopore sequencing detected all 43 expected isoacceptors in total E. coli MRE600 tRNA as well as isodecoders that further define that tRNA population. Alignment-based comparisons between the three purified tRNAs and their synthetic controls revealed systematic nucleotide miscalls that were diagnostic of known modifications. Systematic miscalls were also observed proximal to known modifications in total E. coli tRNA alignments, including a highly conserved pseudouridine in the T loop. This work highlights the potential of nanopore direct tRNA sequencing as well as improvements needed to implement tRNA sequencing for human healthcare applications.
De novo assembly of a human genome using nanopore long-read sequences has been reported, but it used more than 150,000 CPU hours and weeks of wall-clock time. To enable rapid human genome assembly, we present Shasta, a de novo long-read assembler, and polishing algorithms named MarginPolish and HELEN. Using a single PromethION nanopore sequencer and our toolkit, we assembled 11 highly contiguous human genomes de novo in 9 d. We achieved roughly 63× coverage, 42-kb read N50 values and 6.5× coverage in reads >100 kb using three flow cells per sample. Shasta produced a complete haploid human genome assembly in under 6 h on a single commercial compute node. MarginPolish and HELEN polished haploid assemblies to more than 99.9% identity (Phred quality score QV = 30) with nanopore reads alone. Addition of proximity-ligation sequencing enabled near chromosome-level scaffolds for all 11 genomes. We compare our assembly performance to existing methods for diploid, haploid and trio-binned h...
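For readers unfamiliar with the quality scale quoted in the Shasta abstract, the link between 99.9% identity and QV = 30 can be worked through in one line. This is a sketch assuming the standard Phred definition; the symbol e (per-base error rate, taken here as one minus identity) is introduced for illustration and does not appear in the abstract.

\[
\mathrm{QV} = -10 \log_{10} e, \qquad e = 1 - 0.999 = 10^{-3} \;\Rightarrow\; \mathrm{QV} = -10 \log_{10}\!\left(10^{-3}\right) = 30
\]

Under the same assumption, each further tenfold reduction in error rate (for example, 99.99% identity) adds 10 to the score.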
Primary levels of the marine food chain may play an important role in the fate of petroleum hydro... more Primary levels of the marine food chain may play an important role in the fate of petroleum hydrocarbons in both chemically dispersed and un-dispersed oil spills. HSP-60 proteins, members of the chaperonin family of stress proteins, are induced in response to a wide variety of environmental agents, including UV light, heavy metals, and xenobiotics. Increased production and storage of carbohydrate in I. galbana has been associated with aging and stress. Thus, HSP-60 and carbohydrate storage were selected as sublethal endpoints of exposure to the primary producer, I. galbana, a golden brown, unicellular algae, and a significant component of the marine phytoplankton community. The authors have found that I. galbana cultures exposed to water-accommodated fractions (WAF) of Prudhoe Bay Crude Oil (PBCO), and PBCO/dispersant preparations efficiently induce HSP-60. Studies indicated that WAF produced a dose-related response in I. galbana, which increased as a function of time. Dispersant alone showed the greatest induction, while combined WAF-dispersant showed less induction, suggesting a possible competition between crude oil and algae for dispersant interaction. In addition, they have demonstrated that I. galbana accumulates carbohydrates in response to exposure to WAF and PBCO/dispersant preparations and therefore represents another index of stress in this organism. They weremore » interested in determining if induction of stress proteins and HSP60 in particular represented an adaptive-mechanism, allowing this algae to better cope with exposure to petroleum hydrocarbons released in the marine environment during an oil spill. 
In an effort to determine if stress protein induction serves as a protective adaptive response to exposure to petroleum hydrocarbons they examined the effect of heat shock induction on the accumulation of carbohydrates by these organisms in response to exposure to WAF and dispersed oil preparations.« less
ABSTRACT Many bioassays examine the effects of a single Stressor on organism health, but in the e... more ABSTRACT Many bioassays examine the effects of a single Stressor on organism health, but in the environment organisms are rarely, if ever, exposed to a single Stressor. The development of an assay that measures overall organism health would be desirable. Stress proteins, including hsp60, are a group of highly conserved proteins that are induced in response to a variety of environmental agents and are well suited to measure the effects of multiple Stressors due to their integrative nature (Sanders (1993) Critical Reviews in Toxicology23, 49–75). They have been postulated to confer a protective response in the cell, shielding it from further protein damage. The goal of this investigation was to demonstrate an integrative stress response to multiple environmental contaminants using the marine rotifer, Brachionus plicatilis, which has a demonstrated ability to produce hsp60 (Cochrane et al. (1991) Comparative Biochemistry and Physiology98C, 385–390).
bioRxiv (Cold Spring Harbor Laboratory), Apr 13, 2017
Understanding gene regulation and function requires a genome-wide method capable of capturing bot... more Understanding gene regulation and function requires a genome-wide method capable of capturing both gene expression levels and isoform diversity at the single-cell level. Short-read RNAseq is limited in its ability to resolve complex isoforms because it fails to sequence fulllength cDNA copies of RNA molecules. Here, we investigate whether RNAseq using the longread single-molecule Oxford Nanopore MinION sequencer is able to identify and quantify complex isoforms without sacrificing accurate gene expression quantification. After benchmarking our approach, we analyse individual murine B1a cells using a custom multiplexing strategy. We identify thousands of unannotated transcription start and end sites, as well as hundreds of alternative splicing events in these B1a cells. We also identify hundreds of genes expressed across B1a cells that display multiple complex isoforms, including several B cellspecific surface receptors. Our results show that we can identify and quantify complex isoforms at the single cell level.
As a step towards simplifying and reducing the cost of haplotype resolvedde novoassembly, we desc... more As a step towards simplifying and reducing the cost of haplotype resolvedde novoassembly, we describe new methods for accurately phasing nanopore data with the Shasta genome assembler and a modular tool for extending phasing to the chromosome scale called GFAse. We test using new variants of Oxford Nanopore Technologies’ (ONT) PromethION sequencing, including those using proximity ligation and show that newer, higher accuracy ONT reads substantially improve assembly quality.
The human Y chromosome has been notoriously difficult to sequence and assemble because of its com... more The human Y chromosome has been notoriously difficult to sequence and assemble because of its complex repeat structure including long palindromes, tandem repeats, and segmental duplications1–3. As a result, more than half of the Y chromosome is missing from the GRCh38 reference sequence and it remains the last human chromosome to be finished4, 5. Here, the Telomere-to-Telomere (T2T) consortium presents the complete 62,460,029 base pair sequence of a human Y chromosome from the HG002 genome (T2T-Y) that corrects multiple errors in GRCh38-Y and adds over 30 million base pairs of sequence to the reference, revealing the complete ampliconic structures ofTSPY,DAZ, andRBMYgene families; 41 additional protein-coding genes, mostly from theTSPYfamily; and an alternating pattern of human satellite 1 and 3 blocks in the heterochromatic Yq12 region. We have combined T2T-Y with a prior assembly of the CHM13 genome4and mapped available population variation, clinical variants, and functional genom...
Historically, RNA has been sequenced as cDNA copies derived from reverse transcription of cellula... more Historically, RNA has been sequenced as cDNA copies derived from reverse transcription of cellular RNA followed by PCR amplification. Recently, RNA sequencing using nanopores has emerged as an alternative. Using this technology, individual cellular RNA strands are read directly as they are driven through nanoscale pores by an applied voltage. The speed of translocation is regulated by a helicase that is loaded onto each RNA strand by an adapter that also facilitates capture by the nanopore electric field. Here we describe a technique for adapting human ribosomal RNA subunits for nanopore sequencing. Using this strategy, a single Oxford Nanopore MinION run delivered 470,907 sequence reads of which 396,048 aligned to ribosomal RNA, with 28S, 18S, 5.8S, and 5S coverage of 6053, 369,472, 16,058, and 4465 reads, respectively. Example alignments that reveal putative nucleotide modifications are provided.
The SARS-CoV-2 virus has a complex transcriptome characterised by multiple, nested subgenomic RNA... more The SARS-CoV-2 virus has a complex transcriptome characterised by multiple, nested subgenomic RNAsused to express structural and accessory proteins. Long-read sequencing technologies such as nanopore direct RNA sequencing can recover full-length transcripts, greatly simplifying the assembly of structurally complex RNAs. However, these techniques do not detect the 5′ cap, thus preventing reliable identification and quantification of full-length, coding transcript models. Here we used Nanopore ReCappable Sequencing (NRCeq), a new technique that can identify capped full-length RNAs, to assemble a complete annotation of SARS-CoV-2 sgRNAs and annotate the location of capping sites across the viral genome. We obtained robust estimates of sgRNA expression across cell lines and viral isolates and identified novel canonical and non-canonical sgRNAs, including one that uses a previously un-annotated leader-to-body junction site. The data generated in this work constitute a useful resource for...
Archives of Environmental Contamination and Toxicology, 1999
Hsp60 induction was selected as a sublethal endpoint of toxicity for Brachionus plicatilis exposed to a water accommodated fraction (WAF) of Prudhoe Bay crude oil (PBCO), a PBCO/dispersant (Corexit 9527) fraction and Corexit 9527 alone. To examine the effect of multiple stressors, exposures modeled San Francisco Bay, where copper levels are approximately 5 µg/L, salinity is 22‰, significant oil transport and refining occurs, and petroleum releases have occurred historically. Rotifers were exposed to copper at 5 µg/L for 24 h, followed by one of the oil/dispersant preparations for 24 h. Batch-cultured rotifers were used in this study to model wild populations instead of cysts. SDS-PAGE with Western Blotting using hsp60-specific antibodies and chemiluminescent detection were used to isolate, identify, and measure induced hsp60 as a percentage of control values. Both PBCO/dispersant and dispersant alone preparations induced significant levels of hsp60. However, hsp60 expression was reduced to that of controls at high WAF concentrations, suggesting interference with protein synthesis. Rotifers that had been preexposed to copper maintained elevated levels of hsp60 upon treatment with WAF at all concentrations. Results suggest that induction of hsp60 by chronic low-level exposure may serve as a protective mechanism against subsequent or multiple stressors and that hsp60 levels are not additive for the toxicants tested in this study, giving no dose-response relationship. The methods employed in this study could be useful for quantifying hsp60 levels in wild rotifer populations.
Adaptation to sublethal exposure to crude oil by phytoplankton is poorly understood. Use of chemical dispersants for oil spill remediation increases petroleum hydrocarbon concentrations in water, while exposing marine organisms to potentially toxic concentrations of dispersant. Heat shock proteins (hsps) have been found to serve as an adaptive and protective mechanism against environmental stresses. The objective of this project was to examine the induction of hsps in Isochrysis galbana, a golden-brown alga, following exposure to the water-accommodated fraction (WAF) of Prudhoe Bay crude oil (PBCO) and PBCO chemically dispersed with Corexit 9527® (dispersed oil: DO). Initial experiments using 35S-labeled amino acids and 2-dimensional electrophoresis with subsequent western blotting identified and confirmed hsp60, a member of the chaperonin family of stress proteins, as being efficiently induced by heat shock in this species. One-dimensional SDS-PAGE and western blotting, with hsp60 antibodies and chemiluminescence detection, were used to quantitate hsp60 following exposure to a range of environmental temperatures and concentrations of WAF and DO preparations. I. galbana cultured in 22 parts per thousand (‰) salinity showed a statistically significant increase (p
(TCDD) is the most toxic congener of a large class of toxic pollutants, collectively known as halogenated dioxins (1,2). The dioxins, together with the halogenated dibenzofurans, polychlorinated biphenyls, and polybrominated biphenyls, belong to a larger class of toxins collectively referred to as the halogenated aromatic hydrocarbons (HAHs). HAHs are characterized by a common set of toxic effects and biochemical changes, including immunosuppression, carcinogenesis, teratogenesis, and induction of cytochrome CYP1A1 and other components of xenobiotic detoxification enzyme systems. One of the most prominent
Wilms tumor is the most common childhood kidney cancer. 15% of patients with Wilms tumor have germline pathogenic variants in genes or regions such as WT1 or the 11p15 region. Variants in these regions can include structural or copy number alterations or alterations in methylation. In the majority of cases of Wilms tumor no known pathogenic variant could be found using the state-of-the-art technologies, including comprehensive approaches such as Illumina whole exome sequencing. One explanation for this is that such technologies have difficulties in detecting structural variants (SVs) in areas associated with repeat or low complexity sequence. In addition, Illumina technology does not immediately support direct methylation detection. Therefore, we hypothesized that analysis of bilateral Wilms tumor of unknown etiology using long-read sequencing could reveal molecular events of potential clinical interest. We performed in-depth genomic analysis on a whole blood DNA sample from a patient with a bilateral Wilms tumor. This patient had no significant family history of cancer, and previously tested negative for Beckwith-Wiedemann syndrome by methylation testing of the 11p15 region; clinical exome sequencing of the patient's germline detected no variants associated with Wilms tumors. We sequenced the genome at 40x depth using PromethION Nanopore sequencing. 29180 SVs (deletions or insertions larger than 30 base pairs) were detected using Sniffles and 26480 were detected with SVIM. Only SVs that were detected by both methods were considered for downstream analysis. Variants were annotated and filtered using a short-read catalog of SVs (gnomAD-SV), a long-read catalog, the Database of Genomic Variants catalog, and by comparison to 11 in-house genomes. We focused on SVs, copy number variants, and methylation events affecting genes previously associated with Wilms tumor.
Our long-read sequencing approach detected compound heterozygotes using phased variant calls. A heterozygous missense mutation was identified in haplotype one, while a 300 base pair insertion in an ALU element was present in haplotype two. These two compound heterozygous variants overlap an exon of the OVCH2 gene, and the ALU element was not detected by the prior Illumina analysis. Additionally, we determined the frequency of methylation in CpG sites genomewide using nanopolish. Using a normal blood sample from an unrelated individual as a control, we searched for extreme differences across large and gene promoter regions. Hypermethylation in the promoter regions of genes in the 11p15.5 locus was observed in the patient as compared to the control. Hypomethylation in this region is associated with Beckwith-Wiedemann syndrome. In conclusion, nanopore technology is able to detect variants missed by Illumina sequencing, and has the potential to yield new findings of interest in a case of a child with suspected cancer predisposition syndrome. Citation Format: Allison R. Cheney, Jean Monlong, Holly C. Beale, Hugh Olsen, Ellen Towle Kephart, Katrina Learned, Shanna White, Julian A. Martinez-Agosto, Noah Federman, Mark Akeson, Miten Jain, Vivian Y. Chang, Olena M. Vaske. Long-read sequencing characterization of a patient with bilateral Wilms tumor of unknown etiology [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2021; 2021 Apr 10-15 and May 17-21. Philadelphia (PA): AACR; Cancer Res 2021;81(13_Suppl):Abstract nr 261.
There is growing recognition that epivariations, most often recognized as promoter hypermethylation events that lead to gene silencing, are associated with a number of human diseases. However, little information exists on the prevalence and distribution of rare epigenetic variation in the human population. In order to address this, we performed a survey of methylation profiles from 23,116 individuals using the Illumina 450k array. Using a robust outlier approach, we identified 4,452 unique autosomal epivariations, including potentially inactivating promoter methylation events at 384 genes linked to human disease. For example, we observed promoter hypermethylation of BRCA1 and LDLR at population frequencies of ∼1 in 3,000 and ∼1 in 6,000, respectively, suggesting that epivariations may underlie a fraction of human disease which would be missed by purely sequence-based approaches. Using expression data, we confirmed that many epivariations are associated with outlier gene expression. Analysis of variation data and monozygous twin pairs suggests that approximately two-thirds of epivariations segregate in the population secondary to underlying sequence mutations, while one-third are likely sporadic events that occur post-zygotically. We identified 25 loci where rare hypermethylation coincided with the presence of an unstable CGG tandem repeat, validated the presence of CGG expansions at several loci, and identified the putative molecular defect underlying most of the known folate-sensitive fragile sites in the genome. Our study provides a catalog of rare epigenetic changes in the human genome, gives insight into the underlying origins and consequences of epivariations, and identifies many hypermethylated CGG repeat expansions.
The advent of a miniaturized DNA sequencing device with a high-throughput contextual sequencing capability embodies the next generation of large scale sequencing tools. The MinION™ Access Programme (MAP) was initiated by Oxford Nanopore Technologies™ in April 2014, giving public access to their USB-attached miniature sequencing device. The MinION Analysis and Reference Consortium (MARC) was formed by a subset of MAP participants, with the aim of evaluating and providing standard protocols and reference data to the community. Envisaged as a multi-phased project, this study provides the global community with the Phase 1 data from MARC, where the reproducibility of the performance of the MinION was evaluated at multiple sites. Five laboratories on two continents generated data using a control strain of Escherichia coli K-12, preparing and sequencing samples according to a revised ONT protocol. Here, we provide the details of the protocol used, along with a preliminary analysis of the c...
In Vivo Modulation of 17β-Estradiol-Induced Vitellogenin Synthesis and Estrogen Receptor in Rainbow Trout (Oncorhynchus mykiss) Liver Cells by β-Naphthoflavone. ANDERSON, M. J., OLSEN, H., MATSUMURA, F., AND HINTON, D. E. (1996). Toxicol. Appl. Pharmacol. 137, 210–218. Vitellogenesis or egg yolk production represents a key estrogen initiated process in oviparous vertebrates which is crucial for oocyte maturation. Previous in vitro studies have shown that cytochrome P4501A1...
may have either antiestrogenic or estrogen potentiating activity based on compound and its ability to induce CYP1A1 (Anderson et al., 1995; Villalobos et al., 1995). For example, 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD), 2,3,7,8-tetrachlorodibenzofuran, 2,3,4,7,8-pentachlorodibenzofuran, and β-naphthoflavone (βNF) proved antiestrogenic at concentrations tested, while the ortho-substituted, weak CYP1A1-inducing polychlorinated biphenyl (PCB) 2,3,4,4′,5-pentachlorobiphenyl (congener 114) was estrogen potentiating.
Motivation: Nucleotide modification status can be decoded from the Oxford Nanopore Technologies nanopore-sequencing ionic current signals. Although various algorithms have been developed for nanopore-sequencing-based modification analysis, more detailed characterizations, such as modification numbers, corresponding signal levels and proportions are still lacking. Results: We present a framework for the unsupervised determination of the number of nucleotide modifications from nanopore-sequencing readouts. We demonstrate the approach can effectively recapitulate the number of modifications, the corresponding ionic current signal levels, as well as mixing proportions under both DNA and RNA contexts. We further show, by integrating information from multiple detected modification regions, that the modification status of DNA and RNA molecules can be inferred. This method forms a key step of de novo characterization of nucleotide modifications, shedding light on the interpretation of various...
Nanopore sequencing devices read individual RNA strands directly. This facilitates identification of exon linkages and nucleotide modifications; however, using conventional methods the 5′ and 3′ ends of poly(A) RNA cannot be identified unambiguously. This is due in part to the architecture of the nanopore/enzyme-motor complex, and in part to RNA degradation in vivo and in vitro that can obscure transcription start and end sites. In this study, we aimed to identify individual full-length human RNA isoform scaffolds among ~4 million nanopore poly(A)-selected RNA reads. First, to identify RNA strands bearing 5′ m7G caps, we exchanged the biological cap for a modified cap attached to a 45-nucleotide oligomer. This oligomer adaptation method improved 5′ end sequencing and ensured correct identification of the 5′ m7G capped ends. Second, among these 5′-capped nanopore reads, we screened for ionic current signatures consistent with a 3′ polyadenylation site. Combining these two steps, we i...
We describe a method for direct tRNA sequencing using the Oxford Nanopore MinION. The principal technical advance is custom adapters that facilitate end-to-end sequencing of individual transfer RNA (tRNA) molecules at subnanometer precision. A second advance is a nanopore sequencing pipeline optimized for tRNA. We tested this method using purified E. coli tRNA fMet, tRNA Lys, and tRNA Phe samples. 76–92% of individual aligned tRNA sequence reads were full length. As a proof of concept, we showed that nanopore sequencing detected all 43 expected isoacceptors in total E. coli MRE600 tRNA as well as isodecoders that further define that tRNA population. Alignment-based comparisons between the three purified tRNAs and their synthetic controls revealed systematic nucleotide miscalls that were diagnostic of known modifications. Systematic miscalls were also observed proximal to known modifications in total E. coli tRNA alignments, including a highly conserved pseudouridine in the T loop. This work highlights the potential of nanopore direct tRNA sequencing as well as improvements needed to implement tRNA sequencing for human healthcare applications.
De novo assembly of a human genome using nanopore long-read sequences has been reported, but it used more than 150,000 CPU hours and weeks of wall-clock time. To enable rapid human genome assembly, we present Shasta, a de novo long-read assembler, and polishing algorithms named MarginPolish and HELEN. Using a single PromethION nanopore sequencer and our toolkit, we assembled 11 highly contiguous human genomes de novo in 9 d. We achieved roughly 63× coverage, 42-kb read N50 values and 6.5× coverage in reads >100 kb using three flow cells per sample. Shasta produced a complete haploid human genome assembly in under 6 h on a single commercial compute node. MarginPolish and HELEN polished haploid assemblies to more than 99.9% identity (Phred quality score QV = 30) with nanopore reads alone. Addition of proximity-ligation sequencing enabled near chromosome-level scaffolds for all 11 genomes. We compare our assembly performance to existing methods for diploid, haploid and trio-binned h...
Papers by Hugh Olsen