Unigene


UniGene

• UniGene is an NCBI database of EST clusters.


• Each cluster is a set of overlapping EST
sequences.
• The database is constructed from the
combined information in dbEST and the
GenBank mRNA database.
• Only ESTs with 3’ ends are clustered.
• The resulting 3’ EST sequences provide a more
unique representation of the transcripts.
• The next step is to remove contaminant
sequences that include bacterial vectors.
• The cleaned ESTs are used to search against a
database of known unique genes with BLAST.
• The compiling step identifies sequence
overlaps and derives the final sequence.
• During this step, errors in individual ESTs are
corrected; the sequences are then partitioned
into clusters and assembled into contigs.
• The final result is a set of nonredundant gene
clusters known as UniGene clusters.
• Each UniGene cluster represents a unique gene
and is further annotated with its gene locus
information, as well as information about
the tissue type where the gene has been
expressed.
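The screening and clustering steps above can be illustrated with a small sketch. This is a toy model under loud assumptions, not the actual UniGene pipeline: contamination is detected by shared exact k-mers with a vector sequence, overlaps are exact suffix-to-prefix matches rather than BLAST alignments, and the function names (`is_contaminated`, `overlap_len`, `cluster_ests`) are hypothetical.

```python
from itertools import combinations


def is_contaminated(est, vectors, k=12):
    """Toy screen: flag an EST that shares any exact k-mer with a vector.

    Assumption: real pipelines use alignment-based screening, not k-mers.
    """
    vector_kmers = {v[i:i + k] for v in vectors for i in range(len(v) - k + 1)}
    return any(est[i:i + k] in vector_kmers for i in range(len(est) - k + 1))


def overlap_len(a, b, min_len):
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if no exact overlap of at least `min_len` (>= 1) bases exists.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-n:] == b[:n]:
            return n
    return 0


def cluster_ests(ests, min_overlap=8):
    """Partition ESTs into clusters of transitively overlapping sequences.

    Uses a union-find structure: two ESTs end up in the same cluster when a
    chain of pairwise overlaps connects them. Containment of one EST inside
    another is not detected by this simplified overlap test.
    """
    parent = list(range(len(ests)))

    def find(i):
        # Follow parent links to the cluster root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(ests)), 2):
        if (overlap_len(ests[i], ests[j], min_overlap)
                or overlap_len(ests[j], ests[i], min_overlap)):
            parent[find(i)] = find(j)

    clusters = {}
    for i, est in enumerate(ests):
        clusters.setdefault(find(i), []).append(est)
    return list(clusters.values())
```

A typical use would first drop the flagged sequences, for example `clean = [e for e in ests if not is_contaminated(e, vectors)]`, and then call `cluster_ests(clean)`. Error correction and contig assembly within each cluster are left out of this sketch.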

GSS

• In the fields of bioinformatics and computational
biology, genome survey sequences (GSSs) are
nucleotide sequences similar to ESTs.
• The only difference is that most of them are
genomic in origin rather than mRNA.
• Genome Survey Sequences are typically
generated and submitted to NCBI by labs
performing genome sequencing.
• They are used, amongst other things, as a
framework for the mapping and sequencing of
• Genome survey sequencing is a new way to
map genome sequences.
• Current genome sequencing approaches are
mostly high-throughput shotgun methods,
and GSS is often used in the first step of
sequencing.
• GSSs can provide an initial global view of a
genome, which includes both coding and non-
coding DNA and contains repetitive sections of
the genome.

UCSC

• The UCSC genome browser is an online
genome browser hosted by the University of
California, Santa Cruz.
• It is an interactive website offering access to
genome sequence data from a variety of
vertebrate and invertebrate species.
• The UCSC genome browser hosts genomes
from a variety of organisms: as of September
2009, this included 24 vertebrates, 14
mammals, 13 insects, 11 species of
• The UCSC genome browser is part of a
package of tools accessible from the UCSC
genome bioinformatics website.
• The UCSC genome browser provides users
with visualizations of genomic results such
as SNP association studies, linkage studies,
chromosomal positions of genes, evolutionary
relationships, and alignments.
• The site includes many tools, such as the Genome
Browser, BLAT, Gene Sorter, and Genome Graphs.
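Because the browser is a website, a view of a given region can be opened through a link. The sketch below builds such a link for the `hgTracks` page using its `db` and `position` query parameters; the helper name `genome_browser_url` is hypothetical, and the assembly name and coordinates in the usage note are arbitrary illustrations.

```python
from urllib.parse import urlencode


def genome_browser_url(db, chrom, start, end):
    """Build a UCSC Genome Browser link for one chromosomal region.

    db    -- assembly identifier, e.g. "hg38"
    chrom -- chromosome name, e.g. "chr1"
    start, end -- region coordinates as integers
    """
    query = urlencode({"db": db, "position": f"{chrom}:{start}-{end}"})
    return f"https://genome.ucsc.edu/cgi-bin/hgTracks?{query}"
```

For example, `genome_browser_url("hg38", "chr1", 100000, 200000)` returns a link that can be pasted into a web browser; the colon in the position is percent-encoded by `urlencode`.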

TIGR

• TIGR Gene Indices (www.tigr.org/tdb/tgi.shtml)
is an EST database that uses a different
clustering method from UniGene.
• It compiles data from dbEST, GenBank mRNA
and genomic DNA data, and TIGR’s own
sequence database.
• Sequences are clustered only if they are more
than 95% identical over a region of more than
40 nucleotides in pairwise comparisons.
• BLAST and FASTA are used to identify sequence
overlaps.
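The clustering criterion above can be made concrete with a short sketch. It is a simplification under stated assumptions: it checks only ungapped alignments with a fixed 40-nucleotide sliding window, whereas the Gene Indices rely on BLAST and FASTA to find overlaps. The function name `shares_similar_region` and its parameters are hypothetical.

```python
def shares_similar_region(a, b, window=40, min_identity_pct=95):
    """True if some ungapped alignment of `a` and `b` contains a stretch of
    `window` aligned bases whose identity is strictly above min_identity_pct.

    Integer arithmetic is used so that exactly 95% (38 of 40) is rejected.
    """
    threshold = min_identity_pct * window  # compared against matches * 100

    # d is the diagonal: base a[i] is aligned with base b[i - d].
    for d in range(window - len(b), len(a) - window + 1):
        start = max(0, d)
        end = min(len(a), len(b) + d)
        matches = [a[i] == b[i - d] for i in range(start, end)]
        if len(matches) < window:
            continue

        # Slide a fixed-size window along the diagonal, updating the count.
        count = sum(matches[:window])
        if count * 100 > threshold:
            return True
        for k in range(window, len(matches)):
            count += matches[k] - matches[k - window]
            if count * 100 > threshold:
                return True
    return False
```

With the default settings, at least 39 of 40 aligned bases must match, so two sequences that differ at two positions within every 40-base window would not be clustered by this test.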