W2 DNA Sequencing - The Human Genome Project - Stu
Goals:
• To create a complete map of the human genome
• To improve our understanding of human biology and disease
• To develop new technologies and methods for genome analysis
• To address ethical, legal, and social issues
SIO1003 | DNA Sequencing & The Human Genome Project
Human Genome Project timeline (figure; milestone labels recovered from the slide)
• NRC recommends HGP
• U.S. HGP begins
• Goal for genetic map exceeded
• Human physical map covers 98% of human genome
• Human gene map (16,000 genes)
• Human gene map (30,181 genes)
• Draft
• 2nd data: from IHGSC, 16 million DNA seq reads
Key Features
• Uses ddNTPs to terminate DNA synthesis
• DNA synthesis reactions are run in four separate tubes, one per ddNTP
• Radioactive dATP is included in every tube, so the DNA products are radioactive.
• Each reaction yields a series of DNA fragments whose sizes can be measured by electrophoresis.
• The last base in each of these fragments is known.
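The four-tube logic above can be sketched in a few lines of Python. This is a toy model, not a simulation of real chemistry: it assumes every possible termination product is present in the right tube, and the template sequence and function names are hypothetical.

```python
# Toy model of four-tube dideoxy (chain-termination) sequencing.
# Assumption: every possible terminated fragment is produced exactly once.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def synthesized_strand(template):
    """Return the newly synthesized strand (5'->3') for a template given 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(template))


def four_tube_fragments(template):
    """Map each ddNTP tube to the lengths of the fragments it terminates."""
    tubes = {"A": [], "C": [], "G": [], "T": []}
    for length, base in enumerate(synthesized_strand(template), start=1):
        # A fragment of this length ends in `base`, so it shows up in that tube.
        tubes[base].append(length)
    return tubes


def read_gel(tubes):
    """Read the sequence from the shortest fragment to the longest."""
    last_base_by_length = {
        length: base for base, lengths in tubes.items() for length in lengths
    }
    return "".join(last_base_by_length[n] for n in sorted(last_base_by_length))


template = "TACGGATC"  # hypothetical template, written 5'->3'
tubes = four_tube_fragments(template)
read = read_gel(tubes)
print(tubes)
print(read)  # GATCCGTA
```

Because the last base of every fragment is known from its tube, sorting fragments by size is enough to recover the synthesized strand.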
• Automated
• No radioactive labelling
• Each ddNTP is fluorescently labelled
• Excited by a laser; 4 different fluorescence signals are detected
• Fluorescence intensity is translated into a “peak”
• All 4 chain terminators can share one tube, so each sample runs in a single lane on a gel
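As an illustration of how four fluorescence channels become base calls, here is a minimal sketch. It assumes one already-aligned intensity sample per peak, and the `min_ratio` cut-off for ambiguous peaks is a hypothetical parameter, not a value used by any particular instrument.

```python
# Toy base caller for a four-colour fluorescence trace.
# Assumptions: one intensity sample per peak, channels already aligned,
# and `min_ratio` is an illustrative threshold only.


def call_bases(trace, min_ratio=2.0):
    """Call the strongest channel at each peak, or 'N' if it is not dominant."""
    calls = []
    for intensities in trace:
        ranked = sorted(intensities, key=intensities.get, reverse=True)
        top, second = ranked[0], ranked[1]
        if intensities[top] >= min_ratio * intensities[second]:
            calls.append(top)
        else:
            calls.append("N")  # ambiguous peak: no confident call
    return "".join(calls)


trace = [
    {"A": 0.05, "C": 0.02, "G": 0.90, "T": 0.03},
    {"A": 0.80, "C": 0.10, "G": 0.05, "T": 0.05},
    {"A": 0.05, "C": 0.40, "G": 0.05, "T": 0.45},  # two channels nearly equal
    {"A": 0.02, "C": 0.85, "G": 0.08, "T": 0.05},
]
called = call_bases(trace)
print(called)  # GANC
```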
Procedure
• gDNA is fragmented by sonication or hydrodynamic shearing.
• All sticky-ended fragments are blunt-ended using the polymerase and exonuclease activities of T4 DNA polymerase.
• T4 polynucleotide kinase is added so that the 5' ends are phosphorylated.
• Fragments are separated by size.
• A library is created for each size class in plasmids and transformed into E. coli cells.
• Vector DNA is purified (and amplified) from each library.
• Each DNA strand is sequenced (a primer can be annealed upstream of the insert in the vector, then any sequencing-by-synthesis method can be used).
• A computer program called a base caller filters out poor calls (bioinformatics).
• The assembler finds overlapping segments and generates contigs (long contiguous stretches of nucleotides).
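The assembly step can be illustrated with a greedy overlap merge. This is a simplified sketch, not how production assemblers work: it assumes error-free reads from a single strand, and the reads, the `min_len` parameter, and the function names are all hypothetical.

```python
# Toy greedy assembler: repeatedly merge the pair of sequences with the
# longest suffix-prefix overlap until no overlap of at least min_len remains.
# Assumptions: error-free reads, one strand only, no repeats in the target.


def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that is a prefix of `b` (0 if < min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Return the contigs left after greedy merging of overlapping reads."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


reads = ["GTTAGC", "ATGCGTAC", "GTACGTTA"]  # hypothetical shotgun reads
contigs = greedy_assemble(reads)
print(contigs)  # ['ATGCGTACGTTAGC']
```

Reads that share no sufficiently long overlap stay as separate contigs, which is why deeper coverage tends to give longer contigs.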
2000
• Sanger Sequencing
• Maxam-Gilbert Sequencing
2nd Generation (2006-2010)
• Pyrosequencing
• Sequencing by Reversible Terminator Chemistry
• Sequencing by Ligation
3rd Generation (2010-2015)
• Single Molecule Fluorescent Sequencing
• Single Molecule Real Time Sequencing
• Semiconductor Sequencing
• Nanopore Sequencing
4th Generation
• Aims to conduct genomic analysis directly in the cell.
Newer-generation sequencing methods
• Allow us to sequence DNA/RNA more quickly and more affordably than Sanger sequencing
• Revolutionized genomics & molecular biology
• Put bioinformatics into the hot seat
• Overcome limitations of conventional seq methods
Human Genome
Properties of chromosome bands
Dark bands (G bands)             | Pale bands (R bands)
AT-rich                          | GC-rich
DNase insensitive                | DNase sensitive
Condense early during cell cycle | Condense late during cell cycle
Replicate late                   | Replicate early
Gene poor                        | Gene rich
Large genes                      | Small genes
Tissue-specific                  | House-keeping
Few CpG islands                  | Many CpG islands
LINE rich                        | SINE, Alu repeats rich
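The AT-rich versus GC-rich contrast can be made concrete by computing GC content over fixed windows. This is only an illustration of the base-composition property: real band assignment comes from cytogenetic staining, and the sequence and window size below are hypothetical.

```python
# Windowed GC content: a rough way to see AT-rich versus GC-rich regions.
# Assumption: the sequence contains only A, C, G, T (case-insensitive).


def gc_content(seq):
    """Fraction of bases in `seq` that are G or C (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(1 for base in seq if base in "GC") / len(seq)


def windowed_gc(seq, window):
    """GC content of each consecutive, non-overlapping window of length `window`."""
    return [
        gc_content(seq[start:start + window])
        for start in range(0, len(seq) - window + 1, window)
    ]


sequence = "ATATGATTAA" + "GCGCGGACGC"  # hypothetical AT-rich then GC-rich stretch
profile = windowed_gc(sequence, window=10)
print(profile)  # [0.1, 0.9]
```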
Genome Size and Number of Protein-Coding Genes for a Select Handful of Species. Table
adapted from Van Straalen & Roelofs, 2006
• Gene prediction tools have become more accurate over the years. Improved precision brought the estimate down from a possible 45k genes to ~26k protein-coding genes (Human Genome Project) (Venter et al., 2001), and then to ~20k in 2008.
*Computational tools back then had high false positive rates.