2.3 CHROMOSOME STRUCTURE, DNA REPLICATION, AND GENOMES

Figure 2.6 Chromosome Organization. (Figure labels: chromosome, chromatin, chromatid, double-stranded DNA, telomere, centromere.) Chromosomes are highly coiled and condensed packages of DNA. In a nondividing cell, DNA exists in an unraveled state called chromatin. Histone proteins serve as particles around which DNA becomes tightly wound to give a "beads on a string" appearance when viewed with an electron microscope. During chromosome formation, which occurs when cells divide, chromatin is further compacted into tight fibers and supercoiled looped structures. Ultimately these supercoiled loops are tightly packed together with the assistance of other proteins to create an entire chromosome, a highly compact assembly of DNA. Each chromosome consists of two sister chromatids attached by a centromere. Chromosome arms are the portions of the chromatid on either side of the centromere, labeled as the p and q arms. The ends of a chromosome are called telomeres.

... called the haploid number (n) of chromosomes. All other cells of the body, such as skin cells, muscle cells, and liver cells, are known as somatic cells. Somatic cells from many organisms have two sets of chromosomes, called the diploid number (2n) of chromosomes. Human somatic cells contain 46 chromosomes. Somatic cells of a normal human male have 22 pairs of autosomes and an X and a Y chromosome, while cells of a normal female have 22 pairs of autosomes and two X chromosomes. Sex chromosomes are so named because they contain genes that influence sex traits and the development of reproductive organs, while the autosomes primarily contain the genes that affect other body features unrelated to sex, such as skin color and eye color.

Several characteristics are common to most eukaryotic chromosomes. Prokaryotes typically contain a circular chromosome with slightly different structures (Chapter 5).
Each eukaryotic chromosome consists of two thin rodlike structures of DNA called sister chromatids (Figure 2.6). The sister chromatids are exact replicas of each other, copied during DNA synthesis, which occurs just prior to chromosome formation. During cell division, the sister chromatids are separated so that newly forming cells receive the same amount of DNA as the original cell they arose from. Each eukaryotic chromosome has a single centromere. The centromere is a constricted region of the chromosome consisting of intertwined DNA and proteins that join the two sister chromatids to each other. This region of a chromosome also contains proteins that attach chromosomes to organelles called microtubules. Microtubules play an essential role in moving chromosomes and separating sister chromatids during cell division. The centromere delineates each sister chromatid into two arms: the
short arm, called the p arm, and the long arm, or q arm. Each arm of a chromosome ends with a segment called a telomere (Figure 2.6). Telomeres are highly conserved repetitive sequences of nucleotides that are important for attaching chromosomes to the nuclear envelope. Telomeres are a subject of intense research. As we will discuss in Chapter 11, changes in telomere length are believed to play a role in the aging process and in the development of certain types of cancers.

Q Do all species have the same number of chromosomes?
A No. Chromosome number is almost as diverse as the number of different species. Human cells have a haploid number of 23. In parentheses are haploid numbers for other species: fruit flies (4), yeast (16), cats (19), dogs (39).

Karyotype analysis for studying chromosomes

One of the most common ways to study chromosome number and basic aspects of chromosome structure is to prepare a karyotype. In karyotype analysis, cells are spread on a microscope slide and then treated with chemicals to release and stain the chromosomes. For example, G-banding, in which chromosomes are treated with a DNA-binding dye called Giemsa stain, creates a series of alternating light and dark bands in stained chromosomes. Each stained chromosome shows a unique and reproducible banding pattern that can be used to identify different chromosomes. Chromosomes can be aligned and paired based on their staining pattern and their size (Figure 2.7). In humans, chromosome 1 is the largest chromosome, while chromosome 21 is the smallest (although it is placed before chromosome 22 in a karyotype). Karyotypes are very valuable for studying and comparing chromosome structure. In Chapter 11, we will consider how karyotype analysis can be used to identify human genetic disease conditions associated with abnormalities in chromosome structure and number.

[Figure label: Human male G-bands]
DNA Replication

When a cell divides, it is essential that the newly created cells contain equal copies of replicated DNA. Somatic cells divide by a process called mitosis, wherein one cell divides to produce daughter cells, each of which contains an identical copy of the DNA of the original (parent) cell. For instance, a human skin cell divides to produce two daughter cells, each containing 23 pairs of chromosomes. Gametes are formed in a process called meiosis, wherein a parent cell divides to create up to four daughter cells, which can be either sperm or egg cells. During meiosis, the chromosome number in daughter cells is cut in half to the
spread out onto a glass microscope slide to release their chromosomes. Chromosomes are stained and aligned based on their overall size, position of the centromere, and their staining pattern to create a karyotype.
CHAPTER 2 AN INTRODUCTION TO GENES AND GENOMES

Figure 2.8 An Overview of DNA Replication. Nucleotide strands in a DNA molecule must first be separated (a). Each strand serves as a template for the synthesis of new strands, producing two DNA molecules, each containing one original strand and one newly synthesized strand (b). (Figure labels: parental DNA; nucleotides; after separation, both parental strands serve as templates; two identical daughter molecules of DNA; original (parental) strands; newly synthesized strands.)

haploid number. Sperm and egg cells contain a single set of 23 chromosomes. Through sexual reproduction, a fertilized egg called the zygote is formed. The zygote, which divides by mitosis to form an embryo and eventually a complete human, contains 46 chromosomes: 23 paternal chromosomes and 23 maternal chromosomes.

Prior to cell division by either mitosis or meiosis, DNA must be replicated in the cell. DNA replication occurs by a process called semiconservative replication. An overview of this process is shown in Figure 2.8. Before replication begins, the two complementary strands of the double helix must be pulled apart into single strands. Once separated, the two strands serve as templates for copying two new strands of DNA. At the end of this process, two new double helices are formed. Each helix contains one original (parental) DNA strand and one newly synthesized strand, thus the term "semiconservative."

DNA replication occurs in a series of stages involving a number of different proteins. Because prokaryotes contain circular chromosomes, DNA replication in prokaryotes is slightly different from that in eukaryotes. Here we consider DNA replication in eukaryotes. Replication is initiated by DNA helicase, an enzyme that separates the two strands of nucleotides, literally "unzipping" the DNA by breaking hydrogen bonds between complementary base pairs (Figure 2.9).
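The semiconservative outcome described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the function names, the example sequence, and the representation of a double helix as a pair of strings aligned position by position are all illustrative and do not come from the text. Strand orientation and the enzymes involved are ignored.

```python
from typing import List, Tuple

# Toy model of semiconservative replication (illustrative names, not from the text).
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the base-paired partner of a strand, position by position."""
    return "".join(PAIRS[base] for base in strand)


def replicate(duplex: Tuple[str, str]) -> List[Tuple[str, str]]:
    """Separate the two strands and copy each one into a new partner strand."""
    parental_a, parental_b = duplex       # strand separation
    new_b = complement(parental_a)        # new strand copied from template a
    new_a = complement(parental_b)        # new strand copied from template b
    # Each daughter keeps one parental strand and gains one new strand.
    return [(parental_a, new_b), (new_a, parental_b)]


parent = ("ATGCCGTA", complement("ATGCCGTA"))
daughters = replicate(parent)
for daughter in daughters:
    print(daughter)
```

Both daughter pairs print identically to the parent pair, which is the point of the model: each template strand fully determines its new partner, so two identical double helices result.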
The separated strands form a replication fork. As helicase unwinds the DNA, single-strand binding proteins attach to each strand and prevent them from base pairing and reforming a double helix. This step is important because the DNA strands must be held apart during DNA replication. The separation of complementary strands occurs in regions of the DNA called origins of replication. Bacterial chromosomes have a single origin. Because of their large size, eukaryotic chromosomes have multiple origins. Starting DNA replication at multiple origins allows eukaryotic chromosomes to be copied rapidly.

The next step in DNA replication involves the addition of short segments of RNA approximately 10 to 15 nucleotides long. These sequences, called RNA primers, are synthesized by an enzyme called primase. Primers start the process of DNA replication because they serve as binding sites for DNA polymerase, the key enzyme that makes new strands of DNA. DNA polymerase binds to
each single strand, moving along the strand and using it as a template to copy a new strand of DNA. During this process, DNA polymerase uses nucleotides present in the cell to synthesize complementary strands of DNA. DNA polymerase always works in one direction, synthesizing new strands in a 5' to 3' orientation and adding nucleotides to the 3' end of a newly synthesized strand (Figure 2.9) by forming phosphodiester bonds between the phosphate of one nucleotide and the sugar in the previous nucleotide. Because DNA polymerase only proceeds in a 5' to 3' direction, replication along one strand, the leading strand, occurs in a continuous fashion (Figure 2.9). Synthesis on the opposite strand, the lagging strand, occurs in a discontinuous fashion because DNA polymerase must wait for the replication fork to open. On the lagging strand, short pieces of DNA called Okazaki fragments (named after Reiji and Tsuneko Okazaki, the scientists who discovered these fragments) are synthesized as the DNA polymerase literally "backstitches" its way into the opening replication fork. The RNA primers are removed, and the resulting gaps are filled by DNA polymerase. Covalent bonds between Okazaki fragments in the lagging strand are then formed by DNA ligase to ensure that there are no gaps in the phosphodiester backbone.

Figure 2.9 Semiconservative Replication of DNA
1) Helicases unwind the parental double helix.
2) Single-strand binding proteins stabilize the unwound parental DNA.
3) The leading strand is synthesized continuously in the 5' to 3' direction by DNA polymerase.
4) The lagging strand is synthesized discontinuously. Primase synthesizes a short RNA primer, which is extended by DNA polymerase to form an Okazaki fragment.
5) After the RNA primer is replaced by DNA (by another DNA polymerase, not shown), DNA ligase joins the Okazaki fragment to the growing strand.
(Other figure labels: parental DNA; DNA ligase; overall direction of replication.)
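The "backstitching" of the lagging strand can be illustrated with a small Python sketch. It rests on several simplifying assumptions that are not in the text: the two strands of a helix are taken to run in opposite directions, RNA primers are left out, fragments have a fixed hypothetical length, and all function and variable names are made up for illustration.

```python
# Toy model of discontinuous (lagging-strand) synthesis. Illustrative only:
# primers are omitted and the fragment length is an arbitrary assumption.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand):
    """New DNA for a template, with both strings written 5' to 3'."""
    return "".join(PAIRS[base] for base in reversed(strand))


def lagging_strand(template, fragment_length):
    """Make one fragment per newly exposed stretch of template, then join them."""
    fragments = []
    for start in range(0, len(template), fragment_length):
        exposed = template[start:start + fragment_length]  # stretch opened by the fork
        fragments.append(reverse_complement(exposed))      # built 5' to 3', away from the fork
    # Ligase-like step: later fragments lie on the 5' side of earlier ones.
    joined = "".join(reversed(fragments))
    return fragments, joined


# Lagging-strand template written 5' to 3', with the fork opening left to right.
template = "ATGCCGTAAGCT"
fragments, joined = lagging_strand(template, 4)
print(fragments)
print(joined)
```

Once joined, the fragments are identical to what a single continuous copy of the same template would have produced, which is why ligation leaves no gaps in the finished strand.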
Remember the functions of enzymes involved in DNA synthesis. In the next chapter, we will discuss how DNA polymerase and DNA ligase are routinely used in the lab during DNA cloning and analysis experiments.

What Is a Genome?

DNA contains the instructions for life: genes. The entire set of genes in an organism's DNA is called the genome. The human genome contains an estimated 35,000 genes scattered among approximately 3 billion base pairs of DNA. The study of genomes, a discipline called genomics, is currently one of the most active and rapidly advancing areas of biological science. Throughout this book, we will discuss aspects of the Human Genome Project, a worldwide effort to identify all human genes on each chromosome. The Human Genome Project is an enormous undertaking in genomics that is providing scientists with exciting insight into human genes, their locations, and functions.

Q Does the size of an organism's genome relate to an organism's complexity?
A Absolutely not. Genome size varies greatly from organism to organism, but the size of an organism's genome does not relate to its complexity. Humans and mice share a similar number of base pairs (~3 billion) and a similar number of genes, an estimated 30,000 to 40,000. Plants such as Arabidopsis thaliana contain approximately 25,000 genes in a 97 million base pair genome; fruit flies (Drosophila melanogaster) have around 13,000 genes in a genome of 165 million base pairs. While nonscientists might not consider mice or plants to be as "complex" as humans, genome studies tell us that complexity is far more than just the number of genes an organism contains. It is incorrect to think about humans as being more complex than other life forms. For instance, A. thaliana, a weed that has proven valuable for many studies in genetics, contains genes that allow it to derive energy from sunlight by photosynthesis. Human cells cannot convert energy by photosynthesis. All living organisms are complex, with unique capabilities dictated by genes and the way proteins produced by genes interact with one another.

2.4 RNA AND PROTEIN SYNTHESIS

Genes govern the activities and functions within a cell by directing the synthesis of proteins. Some of the myriad functions of these essential molecules follow:

Proteins are necessary for cell structure as important components of membranes and the cytoplasm.
Proteins carry out essential reactions in the cell as enzymes.
Proteins perform critical roles as hormones and other "signaling" molecules that cells use to communicate with one another.
Receptor proteins bind to other molecules such as hormones, and transport proteins enable molecules to enter and leave cells.
Proteins in the form of antibodies recognize and destroy foreign materials in the body.

Quite simply, cells cannot function without proteins. How does DNA make proteins? Actually, DNA does not make proteins directly. To synthesize proteins, genes are first copied into molecules called messenger RNA (mRNA) (Figure 2.10). RNA synthesis is called transcription because genes are literally transcribed (copied) from a DNA code into an RNA code. In turn, mRNA molecules, which are copies of genes, contain information that is deciphered into instructions for making a protein through a process known as translation.

Other than the fact that RNA molecules are single-stranded, the chemical composition of RNA is very similar to that of DNA. Its bases are also very similar to those of DNA. One key difference is that RNA contains a base called uracil (U) instead of thymine (T) (see Figure 2.4). The other primary difference is that RNA contains a pentose sugar called ribose, which has a slightly different structure than the deoxyribose sugar contained in DNA.
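The change of code from DNA to RNA can be sketched in Python. This is a minimal illustration, not a model of RNA polymerase: the pairing table assumes standard base pairing (with U pairing to A in RNA), the two example strands are hypothetical, and strand orientation is ignored by aligning sequences position by position.

```python
# Minimal sketch of transcription as a change of code from DNA to RNA.
# The sequences and names below are illustrative assumptions.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_strand):
    """Return the RNA that base pairs with a DNA strand, position by position."""
    return "".join(DNA_TO_RNA[base] for base in template_strand.upper())


coding_strand = "ATGTTCTGGTGA"    # the gene as usually written
template_strand = "TACAAGACCACT"  # its base-paired partner, which is the strand copied

mrna = transcribe(template_strand)
print(mrna)

# The mRNA matches the coding strand with every T replaced by U.
print(mrna == coding_strand.replace("T", "U"))
```

The second check reflects the key difference noted above: the RNA copy carries the same sequence as the gene except that uracil stands in for thymine.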
An easy way to remember the difference between transcription and translation is to remember that translation involves a change in code from RNA to protein, much like translating one language to another.

Through production of mRNA and protein synthesis, DNA controls the properties of a cell and its traits (Figure 2.10). This process of transcription and translation directs the flow of genetic information in cells, controlling a cell's activities and properties. Here we study basic principles of transcription and translation and aspects of how gene expression can be controlled by cells.

Copying the Code: Transcription

How is DNA used as a template to make RNA? RNA polymerase is the key enzyme of transcription. Inside the nucleus, RNA polymerase unwinds the DNA helix and then copies one strand of DNA into RNA. Unlike DNA replication, where the entire DNA molecule is copied, transcription occurs only in segments of a chromosome that contain genes. How does RNA polymerase know where to
begin transcription? Adjacent to most genes is a promoter, a specific sequence of nucleotides that allows RNA polymerase to bind at a specific location next to a gene (Figure 2.12; see also Figure 2.14). As we will discuss in more detail later in this chapter, proteins called transcription factors help RNA polymerase find the promoter and bind to DNA, and sequences called enhancers can also play important roles in transcription. After RNA polymerase binds to a promoter, it separates the DNA strands and moves along the DNA template strand, synthesizing a complementary strand of RNA in a 5' to 3' direction by forming phosphodiester bonds between ribonucleotides in much the same way that DNA polymerase copies DNA (Figure 2.11). When RNA polymerase reaches the end of a gene, it encounters a termination sequence. These sequences either bind specific proteins or base pair to create loops at the end of the RNA. As a result, the RNA polymerase and newly formed RNA are released from the DNA molecule.

Unlike DNA replication, where the DNA is copied only once each time a cell divides, multiple copies of mRNA are transcribed from each gene during transcription. Sometimes a cell transcribes thousands of copies of mRNA from a gene. Later in this section, you will see that cells with a high requirement for a particular protein generally produce large amounts of mRNA to encode that protein.

Transcription produces different types of RNA

We have already seen that mRNA is produced when many genes are copied into RNA. Two other types of RNA, transfer RNA (tRNA) and ribosomal RNA (rRNA), are also produced by transcription. Different RNA polymerases produce each type of RNA. As we will soon learn, only mRNA carries information that codes for the synthesis of a protein, but tRNA and rRNA are also essential for protein synthesis.

mRNA processing

In eukaryotic cells, the initial mRNA copied from a gene is called a primary transcript (pre-mRNA). This mRNA is immature and not fully functional.
Primary transcripts undergo a series of modifications, collectively called mRNA processing, before they are ready for protein synthesis. One modification involves RNA splicing (Figure 2.12). When the details of transcription were first worked out, scientists were surprised to learn that genes are interrupted by stretches of DNA, called introns, that do not contain protein-coding information. Introns are interspersed between exons, the protein-coding sequences of a gene. Both introns and exons are copied during transcription of mRNA.
Before mRNA can be used to make a protein, the exons must be spliced together. As a simple analogy, think of introns as randomly inserted letters in a sentence that must be removed before the sentence can make sense. In the splicing process, introns are cut out of the primary transcript and adjacent exons are spliced to form a fully functional mRNA molecule with no introns.

Splicing provides flexibility in the types of proteins that can ultimately be produced from a single gene. When genes were first discovered, scientists thought that a single gene could produce only one protein. But as the process of splicing was revealed, it became clear that when a gene contains several exons, splicing doesn't always occur in the same way. As a result, multiple proteins can be produced from a single gene. In a complex process called alternative splicing, splicing sometimes can join together certain exons and cut out other exons, essentially treating them as introns [Figure 2.12(b)]. This process creates multiple mRNAs of different sizes from the same gene. Each mRNA can then be used to produce different proteins with different, sometimes unique, functions. Alternative splicing allows several different protein products to be produced from the same gene sequence. For instance, certain genes used to produce antibodies are alternatively spliced to produce some antibody proteins that attach to the surface of cells as well as other antibodies with different structures that cause them to be secreted into the bloodstream. Similar splicing occurs with neurotransmitter genes, among many others. In fact, scientists originally believed that the human genome contained approximately 100,000 genes, based on a predicted number of proteins made by human cells. As you will learn in Chapter 3, genome scientists were very surprised to find that the human genome contains only around 35,000 genes. Much of the reason for this discrepancy is that many human genes can be spliced in different ways.
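Splicing and alternative splicing can be sketched as operations on a list of labeled segments. The transcript below is entirely hypothetical: the segment names, the sequences, and the `splice` helper are assumptions made for illustration, and real splicing depends on sequence signals that this toy model ignores.

```python
# Toy model of splicing: a primary transcript as an ordered list of segments.
# All names and sequences here are hypothetical.
pre_mrna = [
    ("exon1", "AUGGCU"),
    ("intron1", "GUAAGU"),
    ("exon2", "UUCGAA"),
    ("intron2", "GUGAGU"),
    ("exon3", "UGGUGA"),
]


def splice(transcript, skipped_exons=frozenset()):
    """Remove every intron, plus any exons treated as introns, and join the rest."""
    return "".join(
        sequence
        for name, sequence in transcript
        if name.startswith("exon") and name not in skipped_exons
    )


full_length = splice(pre_mrna)             # all three exons joined
alternative = splice(pre_mrna, {"exon2"})  # exon2 cut out like an intron

print(full_length)
print(alternative)
```

The same primary transcript yields two mature mRNAs of different lengths, which is how a single gene can give rise to more than one protein.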
Another type of processing occurs at the 5' end of mRNA, where a guanine base containing a methyl group is added (Figure 2.12). Known as a 5' cap, this structure plays a role in ribosome recognition of the 5' end of the mRNA molecule during translation. Lastly, in a process called polyadenylation, a string of adenine nucleotides around 100 to 300 nucleotides in length is added to the 3' end of the mRNA, creating a poly(A) "tail" (Figure 2.12). This tail protects mRNA from degradation in the cytoplasm, increasing its stability and availability for translation. Following processing, a mature mRNA leaves the nucleus and enters the cytoplasm, where it is now ready for translation (Figure 2.12).

Translating the Code: Protein Synthesis

The ultimate function of a gene is to produce a protein. We have seen how RNA is made through transcription. Here we look at a brief overview of translation, using information in mRNA to synthesize a protein from amino acids. Translation occurs in the cytoplasm of cells as a multistep process that involves several different types of RNA molecules. It will be much easier for you to understand the details of translation if you are familiar with the important functions of each type of RNA. The major components of translation are:
Messenger RNA (mRNA): an exact copy of a gene. It acts as a "messenger" of sorts by carrying the genetic code, encoded by DNA, from the nucleus to the cytoplasm, where this information can be read to produce a protein.
Ribosomal RNA (rRNA): short, single-stranded molecules around 1,500 to 4,700 nucleotides long. Ribosomal RNAs are important components of ribosomes, organelles that are essential for protein synthesis. Ribosomes recognize and bind to mRNA and "read" the mRNA during translation.
Transfer RNA (tRNA): molecules that transport amino acids to the ribosome during protein synthesis.

We have already discussed the details of mRNA structure, but before we explore the details of translation, you need to be familiar with the genetic code of mRNA and the specific structures of ribosomes and tRNA.

The genetic code

What is the "genetic code" contained within mRNA? As you will learn shortly, ribosomes read the code and then produce proteins, which are formed by joining the building blocks called amino acids (see the table of amino acids on the inside back cover). A chain of amino acids linked together by covalent bonds is a polypeptide. Some proteins consist of a single polypeptide chain, while others contain several polypeptide chains that must wrap and fold around each other to form complicated three-dimensional structures. We will discuss protein structure in more
detail in Chapter 4. Proteins can contain combinations of up to 20 different amino acids, yet there are only four bases in mRNA molecules, so how does this code work? What information is the ribosome decoding to tell a cell what amino acids belong in a protein? If there are only four nucleotides in mRNA, how can mRNA provide information coding for 20 different amino acids? The answers to these questions lie in the genetic code, a fascinating aspect of biology because it is a universal language of genetics used by virtually all living organisms. The code works in three-nucleotide units called codons, which are contained within mRNA molecules. Each codon codes for a single amino acid (Table 2.3). For instance, notice that the codon UAC codes for the amino acid tyrosine, while the codon UGC codes for the amino acid cysteine.

Although each codon codes for one amino acid, there is flexibility in the genetic code. There are 64 different potential codons, corresponding to all possible combinations of the four possible bases assembled into three-nucleotide codons (4^3). But because there are only 20 amino acids, most amino acids may be coded for by more than one codon. For example, notice in Table 2.3 that the amino acid lysine may be coded for by AAA and AAG. Having redundancy of codons increases the efficiency of translation. Some codons are present in mRNAs with greater frequency than others, just as some words in the English language are preferred over others with identical meanings.

Also contained in the genetic code are nucleotide sequences that tell ribosomes where to begin and end translation. The start codon, AUG, codes for the amino acid methionine and signals the starting point for mRNA translation. As a result, the first amino acid in many proteins is methionine, although this amino acid is removed shortly after translation in some proteins. Stop codons terminate translation.
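The arithmetic behind the 64 codons, and the way start and stop codons frame a message, can be checked with a short Python sketch. The codon table below is deliberately partial and covers only codons mentioned in this chapter; the function name, the example mRNA, and the "???" placeholder for codons missing from the partial table are illustrative assumptions.

```python
from itertools import product

# Every three-letter combination of the four RNA bases: 4 * 4 * 4 = 64 codons.
BASES = "UCAG"
all_codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]
print(len(all_codons))

# Partial codon table (only codons named in this chapter); not the full code.
PARTIAL_CODE = {
    "AUG": "Met", "UAC": "Tyr", "UGC": "Cys",
    "AAA": "Lys", "AAG": "Lys", "UUC": "Phe", "UGG": "Trp",
}
STOP_CODONS = {"UGA", "UAA", "UAG"}


def translate(mrna):
    """Read codons from the first AUG until a stop codon or the end of the mRNA."""
    start = mrna.find("AUG")
    if start == -1:
        return []                      # no start codon, nothing to translate
    amino_acids = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:
            break                      # stop codons do not add an amino acid
        amino_acids.append(PARTIAL_CODE.get(codon, "???"))
    return amino_acids


print(translate("GGAUGUUCUGGAAAUGACC"))
```

In the example, reading begins at the first AUG rather than at the first base, proceeds three bases at a time, and halts at UGA, so the bases after the stop codon contribute nothing to the protein.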
UGA is a commonly used stop codon in many mRNAs, but UAA and UAG are other stop codons (Table 2.3). Stop codons do not code for amino acids; they simply signal the end of translation.

Because the genetic code is universal, it is used by cells in humans, bacteria, plants, earthworms, fruit flies, and all other species. There are some subtle differences in the code in certain species, but at the basic level it operates the same way throughout biology. As we will discuss in Chapter 3, because the code is universal, biologists can use techniques called recombinant DNA technology to clone a human gene such as the insulin gene and insert it into bacteria so that the bacterial cells transcribe and translate insulin, a protein they normally do not produce. Another helpful aspect of the universal genetic code is that it enables scientists to clone a gene in one species, such as a mouse, and then use sequence information from the mouse gene to identify a similar gene in humans. Because different species share a common genetic code, this approach is a very common strategy for
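Table 2.3 The Genetic Code

First                                Second position                                 Third
position  U                  C                  A                  G                  position

U         UUU Phenylalanine  UCU Serine         UAU Tyrosine       UGU Cysteine       U
          UUC Phenylalanine  UCC Serine         UAC Tyrosine       UGC Cysteine       C
          UUA Leucine        UCA Serine         UAA Stop           UGA Stop           A
          UUG Leucine        UCG Serine         UAG Stop           UGG Tryptophan     G

C         CUU Leucine        CCU Proline        CAU Histidine      CGU Arginine       U
          CUC Leucine        CCC Proline        CAC Histidine      CGC Arginine       C
          CUA Leucine        CCA Proline        CAA Glutamine      CGA Arginine       A
          CUG Leucine        CCG Proline        CAG Glutamine      CGG Arginine       G

A         AUU Isoleucine     ACU Threonine      AAU Asparagine     AGU Serine         U
          AUC Isoleucine     ACC Threonine      AAC Asparagine     AGC Serine         C
          AUA Isoleucine     ACA Threonine      AAA Lysine         AGA Arginine       A
          AUG Methionine     ACG Threonine      AAG Lysine         AGG Arginine       G

G         GUU Valine         GCU Alanine        GAU Aspartic acid  GGU Glycine        U
          GUC Valine         GCC Alanine        GAC Aspartic acid  GGC Glycine        C
          GUA Valine         GCA Alanine        GAA Glutamic acid  GGA Glycine        A
          GUG Valine         GCG Alanine        GAG Glutamic acid  GGG Glycine        G

AUG also serves as the start codon.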
identifying human genes, including many involved in disease processes.

Figure 2.13 Stages of Protein Synthesis. (a) Ribosomes contain a large and a small subunit. Shown here is a ribosome attached to mRNA. Ribosomes contain two binding sites for tRNA molecules, called the A site and the P site. Abbreviated steps of translation are shown 1-4. (b) Shown is a diagrammatic example of the tRNA symbol used in this book. At one end of each tRNA is an amino acid binding site and at the opposite end is a 3-nucleotide anticodon sequence. (Figure labels: P site (peptidyl tRNA binding site); A site (aminoacyl tRNA binding site); empty tRNA; tRNA molecule with attached amino acid; 1) codon recognition; 2) peptide bond formation; 3) translocation; amino acid attachment site; amino acid "charging" reaction; anticodon; aminoacyl tRNA.)

Ribosomes and tRNA molecules

Ribosomes are complex structures consisting of aggregates of rRNA and proteins that form structures called subunits. Each ribosome contains two subunits, the large and small subunits. These subunits associate to form two grooves, called the A site and the P site, into which tRNA molecules can bind (Figure 2.13). Transfer RNAs are small molecules less than 100 nucleotides long. Transfer RNA molecules fold in intricate ways, and several nucleotides in a tRNA base pair with each other. As a result, a tRNA assumes a structure called a cloverleaf because, as regions of the molecule base pair, other unpaired segments create loops. At one end of each tRNA is an amino acid attachment site [Figure 2.13(b)]. Enzymes in the cytoplasm called aminoacyl tRNA synthetases attach a single amino acid to each tRNA molecule, creating what is known as an aminoacyl or "charged" tRNA. Charged tRNA molecules carry their amino acids to the ribosome and bind within grooves of the ribosome at the A site. At the opposite end of each tRNA molecule is a three-nucleotide sequence called an anticodon. tRNAs that carry different amino acids have different anticodon sequences.
As you will learn shortly, anticodons are designed to complementary base pair with codons in mRNA. Now that you know the "players" of translation (mRNA, ribosomes, and tRNA), we will examine how these components come together to produce a protein.

Stages of translation
There are some fundamental differences between translation in prokaryotes and eukaryotes. Here we provide an overview of basic aspects of the three major stages of translation in eukaryotes: initiation, elongation, and termination. The beginning of translation is called initiation. During initiation, the small ribosomal subunit binds to the 5' end of the mRNA molecule by recognizing the 5' cap of the mRNA. Other proteins called initiation factors are also involved in guiding the small subunit to the mRNA. The small subunit moves along the mRNA until it
encounters the start codon, AUG. Pausing at the start codon, the small subunit waits for the correct tRNA, called the initiator tRNA, to come along (Figure 2.13). This tRNA has the amino acid methionine (met) attached to it (remember that most proteins begin with this amino acid) and contains the anticodon UAC. The UAC anticodon binds to the start codon by complementary base pairing (Figure 2.13); then the large ribosomal subunit binds to this complex containing the small subunit, initiation factors, mRNA, and initiator tRNA. After all these components are in place, the ribosome can start translating a protein.

The next phase of translation is called elongation because during this phase additional tRNAs enter the ribosome, one at a time, and a growing polypeptide chain is elongated. The ribosome, paused at the second codon, waits for the second tRNA to enter the A site. In Figure 2.13, notice that the second codon is UUC, which codes for the amino acid phenylalanine (phe). The phe-tRNA enters the A site of the ribosome, and the anticodon (AAG) base pairs with the codon. After two tRNAs are attached to the ribosome, an enzyme in the ribosome called peptidyl transferase catalyzes the formation of a peptide bond between the amino acids (attached to their tRNAs). Peptide bonds join together amino acids to form a polypeptide chain. After the amino acids are attached to each other, the initiator tRNA, without methionine attached, is released from the ribosome. Released "empty" tRNAs are recycled by the cell. A new amino acid is attached to the tRNA so that it can be used again for translation. The newly forming polypeptide remains attached to the tRNA in the A site. During a phase called translocation, the ribosome shifts so that the tRNA and growing protein move into the P site of the ribosome. The tRNA with a growing polypeptide chain attached is called a peptidyl tRNA.
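Codon and anticodon pairing can be written as a simple lookup in Python. This sketch follows the way the chapter writes anticodons, matching each codon base with its partner position by position; the function name is an illustrative assumption, and strand orientation is not modeled.

```python
# Base pairing between an mRNA codon and a tRNA anticodon, written position
# by position as in the examples above. Orientation is ignored in this sketch.
RNA_PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_for(codon):
    """Return the anticodon that base pairs with the given codon."""
    return "".join(RNA_PAIRS[base] for base in codon)


for codon in ("AUG", "UUC", "UGG"):
    print(codon, anticodon_for(codon))
```

The first two results reproduce the pairs given in the text: the start codon AUG pairs with the initiator tRNA anticodon UAC, and the phenylalanine codon UUC pairs with AAG.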
The A site of the ribosome is now aligned with the third codon in sequence (UGG, which codes for tryptophan), and the ribosome waits for the proper aminoacyl tRNA to enter the A site. The cycle continues as described to attach the next
amino acid (tryptophan) to the growing protein and repeats itself as the ribosome moves along the mRNA.

Elongation cycles continue to form a new protein until the ribosome encounters a stop codon (for instance, UGA). This signals the third stage of translation, called termination. Remember that stop codons do not code for an amino acid. Proteins called release factors interact with the stop codon to terminate translation. The ribosomal subunits come apart and release from the mRNA, and the newly synthesized protein is released into the cell. Ribosomes do recycle and subsequently can bind to any other mRNA molecule (not just the mRNA for one particular gene) and start the process of translation again.

YOU DECIDE
Access to Biotechnology Products for Everyone?

Now that you have studied what genes are and how they are used to create proteins, in Chapter 3 you will learn how genes can be identified, cloned, and studied in great detail. One benefit of gene cloning has been the identification of genes involved in human disease conditions. As a result, it is possible to make many gene products in the laboratory and use them for medical purposes. For instance, when the gene for insulin was cloned in bacteria, it became possible to produce large amounts of insulin for treating people with certain forms of diabetes. Similarly, cloning of the gene for human growth hormone (hGH), which stimulates growth of bones and muscles during childhood, provided a readily available source of this hormone. Available by prescription only, hGH is widely and effectively used to treat children with certain forms of short stature or dwarfism. Dwarfism is generally defined as a condition that results in an adult height of 4'10" or shorter. The availability of hGH and other products of biotechnology raises an ethical question. Should hGH be available to everyone who wants taller children or only to those children with dwarfism?
Suppose parents wanted their average-sized son to be taller so that he would have a better chance of making his high school varsity basketball team. Should these parents be able to give their son hGH simply to enhance his height? You decide.

Basics of Gene Expression Control
Biologists use the term gene expression to refer to the production of mRNA (and sometimes protein) by a cell. Cells are exquisitely effective at controlling gene expression and translation to accommodate their needs. Not all genes are transcribed and translated at the same rate in all cells. All cells of an organism contain the same genome, so how and why are skin cells different from brain cells or liver cells? Different cell types have different properties and carry out different functions because cells can regulate or control the genes they express. At any given time in a cell, only certain genes are "turned on" or expressed to produce proteins, while many other genes are silenced or repressed. These genes may only be expressed by cells at certain times, in response to specific cues from inside or outside of the cell, to make proteins as needed. These cues can be environmental signals such as temperature changes, nutrients in the external environment, hormones, or other complex chemical signals exchanged by cells.

[Figure 2.14 Promoters, Transcription Factors, and Enhancers. Labels: activators (e.g., hormone/receptor protein); enhancers (e.g., TGTTCT); start site of gene; gene; transcription initiation complex. (1) Activator proteins bind to enhancer sequences in the DNA. (2) DNA bending brings the bound activators closer to the promoter. Other transcription factors and RNA polymerase are nearby. (3) Protein-binding domains on the activators attach to certain transcription factors and help them form an active transcription initiation complex on the promoter that stimulates RNA synthesis by RNA polymerase.]

How can genes be turned on and off in response to different signals? Biologists call this process gene regulation. Prokaryotic cells and eukaryotic cells regulate gene expression in a number of different ways. One common mechanism used by both types of cells is called transcriptional regulation: controlling the amount of
mRNA transcribed from a particular gene as a way to turn genes on or off. Here we provide an introduction to transcriptional regulation and consider basic examples of this process in eukaryotes and prokaryotes.

Transcriptional regulation of gene expression

Because the amount of protein translated by a cell is often directly related to the amount of mRNA in the cell, cells can regulate the amount of mRNA produced for any given gene to control indirectly the amount of protein a cell produces. How do cells know which genes to turn on and which to shut off? To understand transcriptional regulation we need to look at the role of promoter sequences more closely. Promoters are found "upstream" of gene sequences, meaning that they are found at the 5' end of a gene. Genes in prokaryotes and eukaryotes do not all use the same promoter sequences. In eukaryotes, common promoter sequences found upstream of many genes include a TATA box (TATAAAA), located about 30 nucleotides (-30) upstream of the start site of a gene, and a CAAT box (GGCCAATCT), located about 80 nucleotides (-80) upstream of a gene (Figure 2.14). Earlier in this chapter we discussed how RNA polymerase initiates transcription by binding to promoter sequences adjacent to genes. For most eukaryotic genes, RNA polymerase cannot properly recognize and bind to a promoter unless transcription factors are also present at the promoter. Transcription factors are DNA-binding proteins that can bind promoters and interact with RNA polymerase to stimulate transcription of a gene (Figure 2.14). In eukaryotes and prokaryotes, common transcription factors interact with promoters for many genes; however, both types of cells also use specific transcription factors that only interact with certain promoters. Transcription of some genes also depends on the binding of specific transcription factors to regulatory sequences adjacent to the promoter.
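As a rough illustration of the promoter elements just described, the Python sketch below scans the DNA upstream of a start site for exact copies of the TATA box (TATAAAA) and CAAT box (GGCCAATCT) sequences and reports where each match begins relative to the start site. The function name `find_promoter_elements`, the example sequence, and the exact-match search are assumptions made for this example only; real promoters often differ from these consensus sequences.

```python
# Consensus promoter elements quoted in the text.
PROMOTER_ELEMENTS = {
    "TATA box": "TATAAAA",
    "CAAT box": "GGCCAATCT",
}


def find_promoter_elements(upstream):
    """Return {element name: [positions]} for exact consensus matches.

    `upstream` is the DNA immediately 5' of the start site, written 5'->3',
    so its last base sits at position -1. Each position is the negative
    offset of the first base of a match relative to the start site.
    """
    upstream = upstream.upper()
    hits = {}
    for name, motif in PROMOTER_ELEMENTS.items():
        positions = []
        start = upstream.find(motif)
        while start != -1:
            positions.append(start - len(upstream))
            start = upstream.find(motif, start + 1)
        hits[name] = positions
    return hits


# Hypothetical 90-base upstream region, invented for illustration.
example_upstream = "C" * 10 + "GGCCAATCT" + "G" * 41 + "TATAAAA" + "C" * 23

if __name__ == "__main__":
    for name, positions in find_promoter_elements(example_upstream).items():
        print(name, positions)
```

In the invented sequence the CAAT box begins at -80 and the TATA box at -30, matching the approximate positions given in the text.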
In addition, many genes that are tightly regulated by cells also contain regulatory sequences called enhancers. Enhancer sequences are usually located around 50 or more base pairs
upstream of the promoter, but they can also be located downstream of a gene. Enhancer sequences bind regulatory proteins, generally referred
to as activators. Activator molecules interact with transcription factors and RNA polymerase, forming a complex that stimulates (activates) transcription of a gene. Cells use a wide variety of different activator molecules. Each activator binds to a particular enhancer sequence. Some activator molecules can be hormones. For instance, the male sex steroid testosterone can act as an activator to stimulate gene expression. You may know that testosterone stimulates cellular activities such as muscle and hair growth in developing boys, but how does testosterone work? This hormone binds to a receptor protein inside cells. The testosterone-receptor protein complex acts as an activator to bind to specific enhancer elements on DNA called androgen-response elements (5'-TGTTCT-3'). These elements are usually found close to a promoter. In turn, testosterone and its receptor stimulate gene expression. The female sex steroid estrogen works in a similar manner. But testosterone and other activators don't stimulate expression of all genes in all cells. Activators act only on those genes that contain enhancer sequences that they can bind to. For instance, testosterone stimulates expression of genes involved in muscle growth and hair growth because these genes contain androgen-response elements. Transcription of other genes without androgen-response elements is not directly affected by the hormone. Incidentally, steroid abuse by bodybuilders and athletes looking to increase muscle mass and tone can cause serious long-term health effects, in part because steroids abnormally stimulate gene expression for prolonged periods of time.

Through activators and enhancers, cells can use transcriptional control to regulate gene expression and control cellular activities. Some genes even contain repressor sequences that decrease transcription.
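The idea that an activator acts only on genes carrying its enhancer can be sketched in a few lines of Python. This is a deliberately simplified model: the gene names and regulatory sequences are hypothetical, only one DNA strand is searched, and a simple substring match stands in for activator binding.

```python
ANDROGEN_RESPONSE_ELEMENT = "TGTTCT"  # enhancer sequence quoted in the text


def genes_stimulated_by(enhancer_motif, regulatory_regions):
    """Return the sorted names of genes whose regulatory DNA contains the motif."""
    return sorted(
        name
        for name, dna in regulatory_regions.items()
        if enhancer_motif in dna.upper()
    )


# Hypothetical gene names and regulatory sequences, invented for illustration.
example_regions = {
    "muscle_growth_gene": "GGATGTTCTCCATATAAAAGC",
    "hair_growth_gene": "CCTGTTCTAAGGTATAAAAGC",
    "unrelated_gene": "GGACCCGGGAATATAAAAGC",
}

if __name__ == "__main__":
    print(genes_stimulated_by(ANDROGEN_RESPONSE_ELEMENT, example_regions))
```

Only the two invented genes that contain TGTTCT are reported; the third has a TATA box but no androgen-response element, so the model leaves it unaffected by the hormone.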
Because different cells produce different transcription factors and activator molecules, genes can be turned on in some tissues and not others. Skin cells turn on different genes than muscle cells do, so each cell type produces different proteins, giving each cell type different functions. Consequently, tissue- and cell-specific gene expression is
one way for cells to control the proteins they express even though all body cells contain the same genome. These important control mechanisms are part of why different cells have different functions. In addition, identifying the promoters, enhancer sequences, and transcription factors that bind these sections of a gene is important for making many biotechnology products. For instance, identifying transcription factors that stimulate the expression of proteins needed for bone growth and development is helping scientists develop new drugs that can be used to stimulate bone growth in people suffering from forms of arthritis when their cells no longer produce bone growth-stimulating factors.
Bacteria are very important organisms for many applications of biotechnology such as producing human proteins. In several sections of this book, we will discuss how gene expression in bacteria can be regulated for a particular purpose. Many initial studies on gene regulation were carried out in bacteria. Scientists discovered that bacteria use a variety of mechanisms to regulate gene expression.

[Figure 2.15 The lac Operon. Labels: structural genes; repressor protein; lactose prevents repressor from blocking transcription; growth medium; polypeptide folding; β-galactosidase; permease; transacetylase. By controlling the lac operon, bacterial cells can regulate gene expression in response to availability of the sugar lactose. In the absence of lactose, the lac repressor binds to the operator, blocking transcription of the operon. In the presence of lactose, lactose binds and inactivates the repressor, allowing transcription of the operon to occur.]

Bacteria and other microorganisms can and must rapidly control gene expression in response to environmental conditions such as growth nutrients, temperature,
and light intensity. One interesting aspect of gene expression and regulation in bacteria is that many bacterial genes are organized in arrangements called operons. Operons are essentially clusters of several related genes that are located together and controlled by a single promoter. The genes of an operon can be regulated in response to changes within the cell, and many genes controlling nutrient metabolism by bacteria are organized as operons. Bacteria can use operons to tightly regulate gene expression in response to their nutrient requirements. Here we present a well-studied, classic example of gene regulation in bacteria by describing the lac operon (Figure 2.15). The lac operon consists of the following three genes:

lac z, encoding the enzyme β-galactosidase
lac y, encoding the enzyme permease
lac a, encoding the enzyme acetylase

Together, these three enzymes are necessary for the transport and breakdown of lactose by bacterial cells. Lactose, a sugar present in milk, is an important energy source for many bacteria. For bacteria to metabolize lactose, the sugar must be transported into cells by permease and then degraded into glucose and galactose by β-galactosidase. The function of acetylase is not clear, although it may play a role in protecting cells against toxic products of lactose degradation. The lac operon is regulated by a protein called the lac repressor, which is encoded by a separate gene called the lac i gene. When bacteria are grown in the absence of lactose, the repressor protein binds to a sequence within the lac operon promoter (p) called the operator (o). By binding to the operator, the repressor blocks RNA polymerase from binding to the promoter and blocks transcription of the z, y, and a genes in the operon (Figure 2.15). This is a nice way for bacteria to control their metabolism.
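The on/off behavior summarized in Figure 2.15 can be sketched as a tiny boolean model in Python. The function names are hypothetical, and the model is an all-or-nothing simplification that assumes lactose availability is the only input controlling the operon.

```python
LAC_OPERON_GENES = ("lac z", "lac y", "lac a")


def repressor_bound_to_operator(lactose_present):
    """In this model the lac repressor sits on the operator only without lactose."""
    return not lactose_present


def transcribed_genes(lactose_present):
    """Return the operon genes transcribed under the given growth condition."""
    if repressor_bound_to_operator(lactose_present):
        return ()  # RNA polymerase is blocked, so no operon mRNA is made
    return LAC_OPERON_GENES  # one promoter controls all three genes together


if __name__ == "__main__":
    for lactose in (False, True):
        print("lactose present:", lactose, "->", transcribed_genes(lactose))
```

Because the three genes share a single promoter, the model returns either all of them or none, which is the defining feature of an operon.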
Why expend energy transcribing genes and translating proteins if there is no lactose available? Conversely, in the presence of lactose, the sugar acts as an inducer molecule that stimulates transcription of the lac operon. Lactose binds to the lac repressor,
changing the shape of the repressor protein and preventing it from binding to the operator (Figure 2.15). With no repressor in its way, RNA polymerase can bind to the lac promoter and stimulate transcription of the operon. Transcribed mRNA from the operon is translated to produce the enzymes required by the cell to metabolize lactose. We conclude this chapter with a brief discussion of how genes can be affected by mutation.

2.5 Mutations: Causes and Consequences

A mutation is a change in the nucleotide sequence of DNA. Mutations are a major cause of genetic diversity. For instance, the underlying basis of the evolution of species to develop and acquire new characteristics is governed by mutations of genes over time. Mutations can also be detrimental. Mutation of a gene can result in the production of an altered protein that functions poorly, or in some cases the gene no longer encodes a functional protein. Such mutations can cause genetic diseases. In this section, we provide an overview of different types of mutations and their consequences.

Types of Mutations

There are many different causes of mutation. Sometimes mutations can occur through spontaneous events such as errors during DNA replication. For instance, DNA polymerase can insert the wrong nucleotide into a newly synthesized strand of DNA, say, inserting a T where a C belongs. Even though enzymes in cells work to detect and correct mistakes, errors occasionally occur during DNA replication. Mutations can also be induced by environmental causes. For example, chemicals called mutagens, many of which mimic the structure of nucleotides, can mistakenly be introduced into DNA and change DNA structure. Exposure to X-rays or ultraviolet light from the sun can also mutate DNA (that glowing tan you may enjoy during the summer is not as healthy as you think!).
Regardless of how mutations arise, depending on the type of mutation, they may have no effect on protein production, or they can dramatically change protein production and protein structure and function. Mutations can involve large changes in genetic information or single nucleotide changes in a gene, such as changing an A to a C or a G to a T. The most common mutations in a genome are single nucleotide changes (or a few nucleotide mutations) called point mutations. Point mutations often involve base pair substitutions, in which a base pair is replaced by a different base pair; insertions, in which a nucleotide is inserted into a gene sequence; or deletions, the removal of a base pair (Figure 2.16).
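The three kinds of point mutation can be tried out on a short sequence. The Python sketch below is a minimal illustration under stated assumptions: the mRNA `WILD_TYPE` is a hypothetical sequence, the helper names are invented, the codon table is only the subset of the standard genetic code needed here, and the edits are applied to mRNA for convenience even though mutations generally occur in DNA.

```python
# Partial codon table (standard genetic code), limited to codons used below.
CODON_TABLE = {
    "AUG": "Met", "AAG": "Lys", "UUU": "Phe", "GGC": "Gly",
    "AGC": "Ser", "AGU": "Ser", "UUG": "Leu", "GCU": "Ala",
    "UAA": "Stop",
}


def substitute(seq, index, base):
    """Replace the base at `index` with `base`."""
    return seq[:index] + base + seq[index + 1:]


def insert(seq, index, base):
    """Insert `base` in front of the base currently at `index`."""
    return seq[:index] + base + seq[index:]


def delete(seq, index):
    """Remove the base at `index`."""
    return seq[:index] + seq[index + 1:]


def translate(mrna):
    """Read codons from the first base until a stop codon or the end."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE.get(mrna[i:i + 3], "?")
        if amino_acid == "Stop":
            break
        protein.append(amino_acid)
    return protein


WILD_TYPE = "AUGAAGUUUGGCUAA"  # hypothetical mRNA: Met-Lys-Phe-Gly-Stop

if __name__ == "__main__":
    variants = {
        "wild type": WILD_TYPE,
        "substitution (G to A)": substitute(WILD_TYPE, 9, "A"),
        "insertion (extra U)": insert(WILD_TYPE, 3, "U"),
        "deletion (one A lost)": delete(WILD_TYPE, 3),
    }
    for label, mrna in variants.items():
        print(f"{label:22} {mrna:17} {'-'.join(translate(mrna))}")
```

In this invented example the substitution changes a single amino acid, while the insertion and the deletion change how every downstream codon is read.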
[Figure 2.16 Types of Mutations. Panels: normal (wild type) gene; base-pair substitution, including a nonsense mutation that creates a stop codon; frameshift causing immediate nonsense; insertion or deletion of 3 nucleotides, with no extensive frameshift. Mutations can influence the genetic code of mRNA and resulting proteins translated from a gene. Shown here is a portion of mRNA copied from a gene, but mutations generally occur within DNA.]

Mutations in a gene can have different consequences on the protein translated. Mutations ultimately exert their effects on a cell by changing the properties of a protein, which in turn can affect traits. Gene mutation can cause changes in protein structure and function, synthesis of a nonfunctional protein, or no protein synthesis, any of which can lead to change or loss of a trait.

Proteins are large, complicated molecules. To work correctly, most proteins must be folded into complex three-dimensional shapes. Changing one or two amino acids in a protein can alter the overall shape of a protein, dramatically disrupting its function or in some cases preventing the protein from functioning at all. Mutations may have no effect on a protein if the mutation changes the codon sequence of a gene to another codon that codes for the same amino acid (Figure 2.16). These are considered silent mutations because they have no effect on the structure and function of the protein. Similarly, a mutation can change a codon so that a different amino acid is coded for. These missense mutations can also be considered "silent" if the new amino acid coded for doesn't change protein structure and function. However, if the newly coded amino acid changes the structure of the protein, then its function may be significantly changed. Later in this chapter, we'll consider a dramatic example of how a single-nucleotide mutation is responsible for the human genetic condition called sickle-cell disease. Sometimes mutations called nonsense mutations change a codon for an amino acid into a stop codon, which causes an abnormally shortened protein to be translated, usually creating a nonfunctional protein. Insertions or deletions can also dramatically affect the protein produced from a gene by creating "frameshifts." As shown in Figure 2.16, inserting a
nucleotide (U in this example) causes the reading frame of the codons to be shifted to the right of the insertion, changing the protein encoded by the mRNA. Frameshifts often create nonfunctional proteins.

Mutations Can Be Inherited or Acquired

It is important to realize that not all mutations have the same effect on body cells. The effects of a mutation depend not only on the type of mutation that occurs but also on the cell type in which a mutation occurs. Gene mutations can be inherited or acquired. Inherited mutations are those mutations that are passed to offspring through gametes (sperm or egg cells). As a result, the mutation will be present in the genome of all of the offspring's cells. Inherited mutations can cause birth defects or inherited genetic diseases. We will study a number of different inherited genetic diseases throughout this book. Acquired mutations occur in the genome of somatic cells (remember that these include all other cell types except for the gametes) and are not passed along to offspring. Although not inherited, acquired mutations can cause abnormalities in cell growth leading to cancerous tumor formation, metabolic disorders, and other conditions. For instance, prolonged exposure to ultraviolet light can cause acquired mutations in skin cells, leading to skin cancer. Understanding the genetic basis of cancer and many other human diseases is a major area of biotechnology research, which we will discuss in more detail in Chapter 11.

[Figure 2.17 Molecular Basis of Sickle-Cell Disease. (a) Normal hemoglobin and normal red blood cells. Hemoglobin, the oxygen-binding protein of red blood cells, consists of four polypeptide chains. A portion of the normal hemoglobin gene is shown here along with its transcribed mRNA and the first seven amino acids for one of the hemoglobin polypeptides, which contains a total of 146 amino acids. (b) Sickle-cell hemoglobin and sickled red blood cells. The defective gene that causes sickle-cell disease contains a single base pair change in its sequence that alters the hemoglobin protein translated. This subtle change alters the shape of red blood cells, causing them to sickle. Sickled cells are fragile, they can clump and block blood vessels, and they do not transport oxygen well. Annotations: In the DNA, the mutant template strand has an A where the normal template has a T. The mutant mRNA has a U instead of an A in one codon. The mutant (sickle-cell) hemoglobin has a valine (Val) instead of a glutamic acid (Glu).]

[Figure 2.18 Biotechnology Is Being Used to Detect and Correct Mutations. Cartoon caption: "You're lucky nobody was hurt. Your base pairs are out of alignment and that has your reading frames all messed up."]

Mutations Are the Basis of Variation in Genomes and a Cause of Human Genetic Diseases

Mutations form the molecular basis of many human genetic diseases. Sickle-cell disease was the first genetic disease discovered whose cause was pinpointed to a particular mutation (Figure 2.17). Sickle-cell disease is caused by a single nucleotide change, a base pair substitution, in the gene coding for one of the polypeptides in the protein hemoglobin. Hemoglobin is the oxygen-binding protein in red blood cells. Red blood cells contain millions of hemoglobin molecules, each of which consists of four polypeptide chains (Figure 2.17). A point mutation in this gene, technically called the β-globin gene, changes the genetic code so that the gene codes for a different amino acid at position 6 in one of the hemoglobin polypeptides. As a result, the amino acid valine is inserted into sickle-cell hemoglobin instead of glutamic acid. Individuals with two defective copies of
the hemoglobin gene suffer the effects of sickle-cell disease. This subtle mutation alters the oxygen-transporting ability of hemoglobin and dramatically changes the shape of red blood cells to an abnormal sickled shape. Sickled cells block blood vessels, and patients suffer from poor oxygen delivery to tissues, causing joint pain and other symptoms. Sickle-cell disease is one of the most well-understood inherited genetic disorders. In Chapter 3, we will explore how scientists working on the Human Genome Project have identified and are analyzing the roughly 3 billion base pairs that comprise human DNA. As scientists have learned more about the human genome, they have discovered that DNA sequences from people of different backgrounds around the world are very similar. Regardless of ethnicity, human genomes are approximately 99.9% identical. In other words, we all have about 99.9% the same DNA sequences as President George W. Bush, Shaquille O'Neal, Britney Spears, Saddam Hussein, Michael Jackson, and virtually any other human on the earth! But since there are about 0.1% differences in DNA between individuals, or around 1 base out of every thousand, this means there are roughly 3 million differences between different individuals. Many of these variations are created by mutation, and most of these have no obvious effects; however, other mutations strongly influence cell functions, behavior, and susceptibility to genetic diseases. These variations are important and are the basis for differences in all inherited traits among people, from height and eye color to personality, intelligence, and lifespan. Most genetic variation between human genomes is created by substitutions of individual nucleotides called single nucleotide polymorphisms (SNPs).
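The "roughly 3 million differences" figure follows directly from the two numbers in the text, and the same counting idea applies to any pair of aligned sequences. The short Python sketch below works through the arithmetic; the constant and function names are hypothetical, and the two short sequences in the usage example are invented.

```python
GENOME_SIZE_BP = 3_000_000_000  # roughly 3 billion base pairs (from the text)
ONE_DIFFERENCE_PER = 1_000      # about 1 base in every 1,000 differs (0.1%)


def expected_differences(genome_size_bp, one_in):
    """Estimate how many bases differ between two genomes of this size."""
    return genome_size_bp // one_in


def count_differences(seq_a, seq_b):
    """Count positions where two aligned, equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


if __name__ == "__main__":
    print(expected_differences(GENOME_SIZE_BP, ONE_DIFFERENCE_PER))
    # Two invented 8-base sequences that differ at a single site (one SNP).
    print(count_differences("AGCTTAGC", "AGCTAAGC"))
```

A single-site difference such as the one counted in the usage example is what the text calls a SNP.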
For instance, a certain base in a given region of DNA sequence from President Bush may read "A", Shaquille O'Neal's sequence may read "T", Britney Spears's sequence may read "C", and the same site in your sequence may read "G." Most SNPs
are harmless because they occur in intron regions of DNA; however, when they do occur in exons, they can affect the structure and function of a protein, which can influence cell function and result in disease. Sickle-cell disease is caused by an SNP. Refer to the figure in Chapter 1 that compares a gene sequence from three different people. An SNP in this gene in one person may have no effect on protein structure and function if it is a silent mutation. Other SNPs cause disease if they change protein structure and function. In Chapter 11, we will consider several genetic disease conditions, discuss how defective genes can be detected, and examine how scientists are working on gene therapy approaches to cure these diseases (Figure 2.18).

CAREER PROFILE
Careers in Genomics

It is an incredibly exciting time to consider a career in genomics. Never before have there been greater opportunities or a wider variety of career options for anyone interested in genomes. In addition to studying the human genome, scientists are actively involved in studying the genomes of many other species including model organisms such as mice, fruit flies, zebrafish, agriculturally important crop plants and plant pests, disease-causing microbes, and marine organisms. As enormous amounts of genome information become available, decades of work will be necessary to study what different genes do. Deciphering the secrets contained in genomes will involve the combined efforts of many people. A recent publication from the National Institutes of Health (Genetic Basics, NIH Publication No. 01-662, www.nigms.nih.gov) proclaimed that "Help Wanted" signs are up all over the world to recruit thousands and thousands of human brains to contribute to the study of genomics.

Career opportunities in genomics primarily fall into four major categories: (1) laboratory scientists, (2) clinical doctors, (3) genetic counselors, and (4) bioinformatics experts. Laboratory scientists are the so-called bench scientists because they conduct experiments at the lab bench on a daily basis. Lab scientists are often involved in carrying out experiments to discover and clone genes and study their functions. Entry-level lab scientist opportunities exist as lab technicians for those with Associate's and Bachelor's degrees, and higher-level scientist positions such as laboratory director positions require a Master's degree or a Doctor of Philosophy (Ph.D.) degree. Clinical doctors are M.D.-trained physicians who conduct research and interact with patients as part of research teams. Clinical doctors may be involved in research studies to treat patients with new genetics-based treatments such as gene therapy protocols. Physicians are also important because genomics is having and will have profound effects on medicine and the treatment of human diseases. Genetic counselors help people understand genomics information such as how genes affect disease susceptibility. Counselor positions usually require a B.S. degree with training in psychology, biology, and genetics.

Bioinformatics is a discipline that merges biology with computer science. Bioinformatics involves storing, analyzing, and sharing gene and protein data. The tremendous amount of DNA sequence data being generated by genome projects has made bioinformatics a rapidly emerging area that is absolutely essential for genomics. Bioinformaticians generally have solid training in both biology and computer science, and they work together with bench scientists to analyze genome data. Most positions in bioinformatics require a B.S. or M.S. degree. Visit the Human Genome Program Information page for an outstanding site on career possibilities in genetics that also includes links to many other valuable resources. Also visit the Federation of American Societies for Experimental Biology career opportunity website. (See Keeping Current: Web Links at the end of the chapter for specific URLs.)

QUESTIONS & ACTIVITIES

Answers can be found in Appendix 1.

1. Compare and contrast genes and chromosomes, and describe their roles in the cell.
2. If the sequence of one strand of a DNA molecule is 5'-AGCCCCGACTCTATTC-3', what is the sequence of the complementary strand?
3. What does the phrase "gene expression" mean?
4. Suppose that you identified a new strain of bacteria. If the DNA content of this organism's cells is 13% adenine, approximately what percentage of this organism's genome consists of guanines? Explain your answer.
5. Provide at least three important differences between DNA and RNA.
6. Consider the following sequence of mRNA: 5'-AGCACCAUGCCCCGAACCUCAAAGUGAAACAAAAA-3'. How many codons are included in this mRNA? How many amino acids are coded for by this sequence? Determine the amino acid sequence