Bio4241 Chap 9 Genomics
What is Genomics?
Study of complete set of genes
Global vs local
Genome Projects (Table)
Structural vs. Functional genomics
b) SSLPs
Types of SSLPs:
i) Minisatellite markers
c) RAPDs
2) Cytogenetic Maps
produced by relating locations of DNA markers to cytogenetic landmarks such as chromosome bands and
puffs
Ways to do this:
if a cloned DNA sequence is available for area of interest, label it as a probe and use it to hybridize to
chromosomes in situ
individual chromosomes are recognizable through morphology differences such as size, banding
pattern, centromere location
map the probe sequence to approximate position on chromosome
when the probe carries a fluorescent label, the technique is FISH - Fluorescence In Situ Hybridization
Physical Mapping
Physical mapping is an intermediate step in sequencing the entire genome
(genetic map > physical map > sequence map)
A complete physical map of the genome includes:
maps for each chromosome in the haploid chromosome set
for each chromosome, continuous overlapping cloned genomic DNA segments extending from one
telomere of the chromosome to the other
Vector - plasmid or phage chromosome used to carry a cloned DNA segment (or insert); see Chapter 8, MGA
Main types used: YAC (Yeast Artificial chromosomes) or Cosmids
BAC (Bacterial Artificial Chromosomes)
PAC (P1-derived Artificial Chromosomes, based on phage P1)
Contig - set of ordered overlapping clones that constitute a chromosomal region or a genome
2) Ordering by STSs
Sequence-Tagged Sites are short unique sequences that can be amplified using defined PCR primers
STSs derive from sequenced regions of the genome, so they can be used as landmarks for classifying clones
when creating a physical map
clones that share STSs must overlap; the more STSs they share, the more they overlap
resulting physical map is an STS content map
combination of fingerprinting and STS content mapping has resulted in complete and near-complete
physical maps for many organisms, such as C. elegans
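The overlap rule above (clones that share STSs must overlap) can be sketched in a few lines of Python. This is only an illustration: the clone names, STS names, and the helper functions shared_sts and build_contigs are hypothetical, and real STS content mapping must also cope with false-positive and false-negative PCR results.

```python
from itertools import combinations

# Hypothetical STS content data: clone name -> set of STSs that were
# amplified by PCR from that clone. All names are made up for illustration.
sts_content = {
    "cloneA": {"sts1", "sts2"},
    "cloneB": {"sts2", "sts3", "sts4"},
    "cloneC": {"sts4", "sts5"},
    "cloneD": {"sts9"},
}

def shared_sts(content):
    """Return {(clone1, clone2): set of shared STSs} for every overlapping pair."""
    overlaps = {}
    for a, b in combinations(sorted(content), 2):
        common = content[a] & content[b]
        if common:  # sharing at least one STS implies the clones overlap
            overlaps[(a, b)] = common
    return overlaps

def build_contigs(content):
    """Group clones into contigs: sets of clones linked through shared STSs."""
    neighbours = {clone: set() for clone in content}
    for a, b in shared_sts(content):
        neighbours[a].add(b)
        neighbours[b].add(a)
    contigs, seen = [], set()
    for start in sorted(content):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:  # depth-first walk over overlapping clones
            clone = stack.pop()
            if clone in group:
                continue
            group.add(clone)
            stack.extend(neighbours[clone] - group)
        seen |= group
        contigs.append(sorted(group))
    return contigs

print(shared_sts(sts_content))
print(build_contigs(sts_content))
```

In this toy data set cloneA, cloneB and cloneC fall into one contig, while cloneD shares no STS with any other clone and stays on its own.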
Techniques:
DNA Sequencing
Four bases include A, C, T, and G
Human genome is about 3 x 10^9 base pairs, carried on 22 autosomes plus the X and Y chromosomes
All current sequencing techniques are clone based
First make a clone or subclone library and then sequence all or part of inserts of individual clones in the
library. From these sequences form a consensus sequence
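Forming a consensus sequence from overlapping clone reads can be illustrated with a simple majority vote at each position. The sketch below assumes the reads have already been aligned (each read comes with a known start offset), which is the hard part in practice; the function name consensus and the example reads are hypothetical.

```python
from collections import Counter

def consensus(aligned_reads):
    """Majority-vote consensus.

    aligned_reads is a list of (offset, sequence) pairs, where offset is the
    position at which the read starts in the final sequence (assumed known).
    Positions covered by no read are reported as 'N'.
    """
    length = max(offset + len(seq) for offset, seq in aligned_reads)
    columns = [Counter() for _ in range(length)]
    for offset, seq in aligned_reads:
        for i, base in enumerate(seq):
            columns[offset + i][base] += 1
    return "".join(
        col.most_common(1)[0][0] if col else "N" for col in columns
    )

# Hypothetical reads from overlapping inserts; the third read carries one
# sequencing error (G instead of C) that the other reads outvote.
reads = [
    (0, "ACGTAC"),
    (3, "TACGGA"),
    (3, "TAGGGA"),
    (6, "GGATTC"),
]

print(consensus(reads))
```

The more reads cover a position, the more reliably an isolated sequencing error is outvoted, which is one reason each region is sequenced from several clones.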
a) Transposons
b) Retrotransposons
c) LINE (long interspersed elements)
d) SINE (short interspersed elements)
Functional Genomics
functional genomics includes study of expression and interaction of gene products on a global level, that
is, using genomic approaches to study some aspect of all gene products simultaneously
how molecules cooperate and interact to effect all the processes and phenotypes that make up a biological
system
genome refers to "gene" plus "ome", or the global data set for "all genes"
various other 'ome's are being worked on: transcriptome, proteome, interactome and phenome
transcriptome - sequence and expression patterns of all transcripts (where, when, how much)
proteome - sequence and expression patterns of all proteins (where, when, how much)
interactome - complete set of physical interactions between: all proteins and all DNA segments; all
proteins and RNA segments; and among all proteins
phenome - description of complete set of phenotypes produced by inactivation of gene function for
each gene in the genome
1) One protocol for detecting which genes are active at a particular stage of development in a cell:
an array of known cDNAs from different genes is applied to a chip
chip is exposed to a fluorescently labelled probe, such as RNA extracted from a particular cell at a
particular stage of development
binding of probe molecules to homologous DNA spots monitored automatically by laser
beam-illuminated microscope
detect spots on chip where probe binds to determine which genes are active at the particular stage of
interest
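The last step of the protocol above, deciding which spots count as bound, amounts to comparing each spot's fluorescence with background. The sketch below is a minimal illustration: the gene names, intensity values, background level and fold cutoff are all made-up assumptions, not values from any real scanner or analysis package.

```python
# Hypothetical fluorescence readings (arbitrary units) for four spots on a
# chip, one spot per cDNA. All names and numbers are invented.
spot_intensity = {
    "geneA": 1520.0,
    "geneB": 85.0,
    "geneC": 640.0,
    "geneD": 40.0,
}

BACKGROUND = 60.0   # assumed signal from an empty region of the chip
FOLD_CUTOFF = 3.0   # assumed: call a gene active at >= 3x background

def active_genes(intensities, background, fold_cutoff):
    """Return the genes whose spot signal is at least fold_cutoff x background."""
    return sorted(
        gene
        for gene, signal in intensities.items()
        if signal / background >= fold_cutoff
    )

print(active_genes(spot_intensity, BACKGROUND, FOLD_CUTOFF))
```

Here geneA and geneC would be scored as active at the stage sampled. The choice of cutoff is a judgment call, and a real analysis would also normalize between chips.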
Bioinformatics
deciphering meaning from the raw 4-letter DNA sequence by using computational analysis to predict
mRNA and polypeptide sequences.
2. A given DNA sequence can encode different things depending on its location within the DNA,
i.e. if located in a coding region, the sequence would code for amino acids; if located in a non-coding
region, the same sequence could act as a binding site for a regulatory protein.
1. cDNA sequences (complementary DNAs are DNA copies of mRNAs). cDNAs are aligned with genomic
DNA to determine the positions of introns and exons.
2. Docking site sequences marking the start and end points for the events in information transfer
(transcription, pre-mRNA splicing, translation).
3. Sequences of related polypeptides. A common tool for finding and aligning similar protein sequences is
BLAST (Basic Local Alignment Search Tool)
4. Codon bias - species-specific usage preferences for some codons over others encoding the same amino
acid. Presence of the preferred codons in a predicted mRNA sequence supports the accuracy of the
prediction.
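As a rough illustration of how codon bias (item 4) can support a prediction, the sketch below scores a candidate open reading frame by the fraction of its codons that match the preferred codon for their amino acid. The synonymous codon sets for Leu, Ala and Lys come from the standard genetic code, but the PREFERRED table is a hypothetical preference for an imaginary species, and preferred_fraction is an illustrative helper, not a published method.

```python
# Synonymous codons (standard genetic code) for three amino acids.
SYNONYMS = {
    "Leu": {"CTG", "CTC", "CTT", "CTA", "TTA", "TTG"},
    "Ala": {"GCC", "GCT", "GCA", "GCG"},
    "Lys": {"AAG", "AAA"},
}

# Hypothetical preferred codon per amino acid for an imaginary species.
PREFERRED = {"Leu": "CTG", "Ala": "GCC", "Lys": "AAG"}

def preferred_fraction(orf):
    """Fraction of scorable codons in orf that are the preferred codon.

    Only codons for amino acids listed in SYNONYMS are scored.
    Returns None if the sequence contains no scorable codon.
    """
    codons = [orf[i:i + 3] for i in range(0, len(orf) - len(orf) % 3, 3)]
    scored = preferred = 0
    for codon in codons:
        for amino_acid, synonyms in SYNONYMS.items():
            if codon in synonyms:
                scored += 1
                preferred += codon == PREFERRED[amino_acid]
                break
    return preferred / scored if scored else None

print(preferred_fraction("CTGGCCAAGCTG"))  # uses only preferred codons
print(preferred_fraction("TTAGCAAAACTT"))  # uses only non-preferred codons
```

A predicted reading frame that scores high is more consistent with the species' codon usage than one that scores low, though a real gene finder would weigh this together with the other three lines of evidence.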
Predictions of mRNA and polypeptide structure from genomic DNA sequence depend on an integration of
information from cDNA sequence, docking site predictions, polypeptide similarities, and codon bias.
Summary Figure