EX 1 TREE THINKING CONCEPTS Worksheet
According to Baum and Smith (2012), tree thinking is the ability to visualize evolution in tree form
and to use tree diagrams to communicate and analyze evolutionary phenomena. It is significant in the
development of accurate understanding of evolution and aids in the organization of knowledge on
biodiversity. Tree thinking is essential in the various fields of biology: from molecular biology to genetics,
and developmental biology to ecology. Moreover, tree thinking is vital in applied research, such as tracking
diseases like COVID-19, recording responses to climate change, and guiding conservation policies.
A tree diagram is composed of lines called branches (or edges), which are connected at nodes
(lineage-splitting events). The diagram needs to be directed (time runs in one direction along each branch)
and acyclic (lineages that diverge never subsequently fuse) for it to be considered as a tree in the formal
sense. The root of the tree is a special node that marks the point where time enters the diagram and is often
designated by an external branch whose tip is unlabeled.
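The formal conditions above can be checked mechanically. The following is a minimal sketch in Python, assuming the tree is stored as a hypothetical dictionary that maps each node to its parent (the root maps to None). The names is_rooted_tree, example_tree, and fused_loop are illustrative only and do not come from Mesquite or any other software.

```python
def is_rooted_tree(parent_of):
    """Return True if a child -> parent mapping describes a rooted tree.

    Storing a single parent per node already rules out a node with two
    parents (fused lineages), so the checks below are: exactly one root,
    every parent is a known node, and no path loops back on itself.
    """
    roots = [node for node, parent in parent_of.items() if parent is None]
    if len(roots) != 1:
        return False  # a tree in the formal sense has exactly one root
    for node in parent_of:
        seen = set()
        current = node
        # Walk from the node back toward the root, against the flow of time.
        while current is not None:
            if current in seen:
                return False  # revisited a node: the diagram has a cycle
            if current not in parent_of:
                return False  # parent was never declared as a node
            seen.add(current)
            current = parent_of[current]
    return True


# Hypothetical example: the root splits into n1 and C; n1 splits into A and B.
example_tree = {"root": None, "n1": "root", "A": "n1", "B": "n1", "C": "root"}

# Hypothetical counterexample: A and B point at each other, forming a cycle.
fused_loop = {"root": None, "A": "B", "B": "A"}

print(is_rooted_tree(example_tree))
print(is_rooted_tree(fused_loop))
```

The child-to-parent representation was chosen because it makes the direction of time explicit: following parents always moves toward the root.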
Mesquite is an extensible, open-source, modular software package for studying evolutionary biology.
It allows users to organize and analyze comparative data about organisms, with an emphasis on
phylogenetic analysis. It can also be used in population genetics and in non-phylogenetic
multivariate analysis. Mesquite accepts two major types of data: (1) morphological (discrete/categorical)
data and (2) molecular sequence data. For morphological data, a matrix is built by
scoring and entering the morphological information, which lets users construct their own phylogenetic
tree and consensus tree. Sequence data files (FASTA, GenBank, text files) are used for
molecular analysis. Mesquite allows users to load and align sequences, which are then used for
tree building. You may also examine ancestral character states on the generated molecular phylogenies.
Different methods are used to build phylogenetic trees from both types of data, namely Neighbor-Joining,
Parsimony, Minimum Evolution, Maximum Likelihood, and Bayesian analysis.
Objectives
PROCEDURES
The Basic Local Alignment Search Tool (BLAST) is used for finding regions of local similarity between
nucleotide or protein sequences. It is a suite of programs that generate alignments between
a “query” sequence and the “subject” sequences within a database. The program
compares the query with the sequences in the database and computes the statistical significance of the matches.
BLAST databases are built from concatenated FASTA-formatted sequences. FASTA is also the format of BLAST
query sequences: each record consists of a definition line, which begins with a “>” symbol and contains
identifiers and descriptive information, followed by a character string of single-letter nucleotide or amino acid codes.
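To make the FASTA layout concrete, here is a minimal sketch of a reader written in plain Python. The function name parse_fasta and the two sample records are hypothetical and were made up for illustration; they are not real database entries.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (definition, sequence) pairs."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new definition line closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            # Sequence lines may be wrapped, so collect and join them.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Made-up sample input: two records, the first wrapped over two lines.
sample = """>seq1 hypothetical sample A
ATGCGTAC
GGTA
>seq2 hypothetical sample B
TTGACC
"""

for definition, sequence in parse_fasta(sample):
    print(definition, len(sequence))
```

Concatenating several such records in one file gives the multi-sequence FASTA input from which a BLAST database is built.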
There are several types of BLAST searches. The National Center for Biotechnology Information (NCBI)
WebBLAST offers four main search types: (1) BLASTn (Nucleotide BLAST), (2) BLASTx (translated
nucleotide sequence searched against protein sequences), (3) tBLASTn (protein sequence searched against
translated nucleotide sequences), and (4) BLASTp (Protein BLAST). The NCBI BLAST homepage
(https://blast.ncbi.nlm.nih.gov/Blast.cgi) is shown below.
Figure 1. NCBI BLAST homepage.
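The four search types listed above differ only in the kind of query and the kind of database being searched. The sketch below captures that as a simple lookup table. The names BLAST_PROGRAMS and choose_blast_program are hypothetical helpers for this worksheet and are not part of any NCBI tool.

```python
# (query type, database type) -> search type, following the list above.
BLAST_PROGRAMS = {
    ("nucleotide", "nucleotide"): "BLASTn",
    ("nucleotide", "protein"): "BLASTx",    # query is translated first
    ("protein", "nucleotide"): "tBLASTn",   # database is translated
    ("protein", "protein"): "BLASTp",
}


def choose_blast_program(query_type, database_type):
    """Return the search type that matches the query and database types."""
    key = (query_type.lower(), database_type.lower())
    if key not in BLAST_PROGRAMS:
        raise ValueError("types must be 'nucleotide' or 'protein'")
    return BLAST_PROGRAMS[key]


# A DNA sequence searched against a nucleotide database, as in this exercise.
print(choose_blast_program("nucleotide", "nucleotide"))
```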
1. To go to the Nucleotide BLAST page, click the Nucleotide BLAST button on the NCBI BLAST homepage or visit
this link:
https://blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=blastn&PAGE_TYPE=BlastSearch&LINK_LOC=blasthome
This will lead you to the Nucleotide BLAST page below.
2. Copy your assigned sequence and paste it into the Enter Query Sequence text box. You may also upload
the FASTA file. Click the BLAST button to start the search. Wait for the results to load (note: this
may sometimes take a few minutes).
3. Once the search is finished, the page lists the sequences producing significant alignments. In the
Descriptions tab, you will see the description, scientific name, max score, total score, query cover, E-value,
percent identity, accession length, and accession number of each hit. In the provided Excel file, supply the missing
data and include the first three species from your BLAST results. Predict the identity of each of the
sequences.
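If you export or copy the results table, picking out the first three species can be sketched as follows. This is only an illustration: the rows below are made-up placeholder values, not real BLAST output, and the sketch assumes that repeated hits to the same species count once. The function name first_species and the dictionary keys are hypothetical.

```python
def first_species(hits, limit=3):
    """Return the first `limit` distinct scientific names, in listed order."""
    names = []
    for hit in hits:
        name = hit["scientific_name"]
        if name not in names:
            names.append(name)  # keep the order in which hits were reported
        if len(names) == limit:
            break
    return names


# Placeholder rows for illustration only (NOT real BLAST results).
hits = [
    {"scientific_name": "Species alpha", "e_value": 0.0, "percent_identity": 100.0},
    {"scientific_name": "Species alpha", "e_value": 0.0, "percent_identity": 99.8},
    {"scientific_name": "Species beta", "e_value": 0.0, "percent_identity": 98.5},
    {"scientific_name": "Species gamma", "e_value": 2e-150, "percent_identity": 95.1},
    {"scientific_name": "Species delta", "e_value": 5e-120, "percent_identity": 91.0},
]

print(first_species(hits))
```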
5. You may now simplify the taxon names by double-clicking each taxon. Examine the sequences. This
is now the working file. Save your progress.
Note:
If the initial alignment does not produce well-aligned sequence data, you can
check and alter your sequences. A provided or downloaded sequence may have been
reversed or encoded as its complement. To correct the alignment, select the target sequence, click
ALTER, and choose the appropriate method. Read beforehand how each of these methods changes the
arrangement of your target sequence.
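To see what reversing and complementing do to a sequence, consider the following minimal Python sketch. It is a simplified illustration and not Mesquite's own implementation: it handles only A, C, G, and T (upper or lower case) and leaves any other symbol, such as N or a gap, unchanged. The function names are hypothetical.

```python
# Translation table that swaps each base with its complement.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse(seq):
    """Read the same strand backwards."""
    return seq[::-1]


def complement(seq):
    """Swap each base for its pairing partner, keeping the order."""
    return seq.translate(COMPLEMENT)


def reverse_complement(seq):
    """Give the opposite strand, read in its own 5' to 3' direction."""
    return complement(seq)[::-1]


target = "ATGC"
print(reverse(target))             # CGTA
print(complement(target))          # TACG
print(reverse_complement(target))  # GCAT
```

The three operations give three different results, which is why the method chosen must match the way the sequence was actually stored.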
Guide Questions:
1. How can you identify an organism or a given sample using BLAST? If the e-value for a given species is
“0,” how will you interpret the result? If the percent identity is 100%, how will you interpret the results?
What are the strengths and limitations of BLAST?
2. Are interpretations based on a tree from a single data set conclusive enough? Can such a tree accurately
describe the evolution and relationships of the taxa studied?
3. How reliable are molecular data in evolutionary studies? What are the pros and cons of using molecular
data in evolutionary studies?
Conduct all the procedures indicated in this worksheet. Compile and submit all the necessary outputs listed
in the checklist.
References:
Baum, D.A., Smith, S.D., & Donovan, S.S. (2005). The tree-thinking challenge. Science, 310: 979–980.
Baum, D.A., & Smith, S.D. (2012). Tree Thinking: An Introduction to Phylogenetic Biology.
Systematic Biology, 62(4): 634–637.
Huang, S., & Whittall, J.B. (2018). Tree of Trees: Using Campus Tree Diversity to Integrate
Molecular, Organismal, and Evolutionary Biology. The American Biology Teacher, 80(2): 144–151.
Maddison, W.P., & Maddison, D.R. (2019). Mesquite: a modular system for evolutionary analysis. Version
3.61. http://www.mesquiteproject.org
Omland, K.E., Cook, L.G., & Crisp, M.D. (2008). Tree thinking for all biology: the problem
with reading phylogenies as ladders of progress. BioEssays, 30: 854–867.
Wheeler, D. B. M. (2007). Chapter 9: BLAST QuickStart. In Bergman, N.H. (Ed.), Comparative Genomics,
Volume 1, 395–396.