Rat Genomics (001-095)
G. Thomas Hayman
Jennifer R. Smith
Melinda R. Dwinell
Mary Shimoyama
Editors
Rat
Genomics
Methods in Molecular Biology
Series Editor
John M. Walker
School of Life and Medical Sciences
University of Hertfordshire
Hatfield, Hertfordshire, UK
Edited by
G. Thomas Hayman
Department of Biomedical Engineering, Rat Genome Database, Medical College of Wisconsin,
Milwaukee, WI, USA
Jennifer R. Smith
Department of Biomedical Engineering, Rat Genome Database, Medical College of Wisconsin,
Milwaukee, WI, USA
Melinda R. Dwinell
Genomic Sciences and Precision Medicine Center, Medical College of Wisconsin, Milwaukee, WI, USA
Department of Physiology, Rat Genome Database, Medical College of Wisconsin, Milwaukee, WI,
USA
Mary Shimoyama
Department of Biomedical Engineering, Rat Genome Database, Medical College of Wisconsin,
Milwaukee, WI, USA
This Humana imprint is published by the registered company Springer Science+Business Media, LLC, part of Springer
Nature.
The registered company address is: 233 Spring Street, New York, NY 10013, U.S.A.
Dedication
This book is gratefully dedicated to the rat research community and to colleagues and
friends who are gone too soon: Rat Genome Database members, Dr. Timothy F. Lowry
and Dr. Victoria Petri, and founder of the Rat Resource and Research Center, Dr. John
K. Critser. They are sincerely missed.
Preface
It is an exciting time to be involved in rat research. The rich history of physiological and
behavioral data that is available for the rat spans over 150 years of research. This extensive
body of data provides a solid foundation for the use of genomic technologies such as
whole-genome and whole-exome sequencing, single-nucleotide variant discovery, and
transcriptomics to explore similarities and differences between established rat models for
human diseases such as kidney disease, cancer, and metabolic syndrome, as well as new
models like the hybrid rat diversity panel. Recent advances in the use of genome-editing
reagents and embryonic stem cells now allow researchers to produce new, more targeted
models and to discover the molecular mechanisms underlying both normal and disease-
related physiological processes. Thanks to the improvements in cryopreservation and
rederivation, new models can be produced, studied for a period of time, and then preserved
and stored to await new questions or the advent of new technologies to uncover the
answers to questions we can’t answer now and, in some cases, don’t even know we should
be asking. The emerging areas of interest, such as the microbiome, have opened up new
vistas for researchers interested in the interactions between genetics and the environment.
This book provides both a historical perspective on rat research through the years and
practical information to support researchers either currently involved in genomic research
or planning to begin such a project. In some cases, a detailed protocol is provided for
researchers looking to move into a new area of investigation or to leverage a new
technology. In other cases, a detailed review of the existing models or a description of
available resources can help the researcher find, understand, and/or utilize the information,
the data, and the tools that they need to support their research efforts. Whatever the
application, it is becoming increasingly obvious that in this so-called post-genomic era, no
single type of research is sufficient to answer the increasingly complex questions of human
disease and translational research. The rat as a biomedical model is uniquely poised to
provide the ideal combination of established experimental models, extensive physiological
data, and genomic manipulability to facilitate exploration of the underlying biology.
Acknowledgments
We are thankful for nearly 20 years of funding from the National Heart, Lung, and Blood
Institute on behalf of the National Institutes of Health. We appreciate the contributions of
the authors and the assistance of Prof. John Walker and Ms. Anna Rakovsky, which helped
make this book a reality.
Contents
Dedication
Preface
Acknowledgments
Contributors
Index
Contributors
Abstract
The laboratory rat, Rattus norvegicus, has been used in biomedical research for more than 150 years,
and in many cases remains the model of choice for studies of physiology, behavior, and complex
human disease. This book provides detailed information on a number of methodologies that can be
used in rat. This chapter gives an introduction to rat as a species and as a biomedical model,
providing historical information, a brief introduction to the current state of rat research, and a
perspective on the future of rat as a model for human disease.
G. Thomas Hayman et al. (eds.), Rat Genomics, Methods in Molecular Biology, vol. 2018,
https://doi.org/10.1007/978-1-4939-9581-3_1, © Springer Science+Business Media, LLC, part of Springer Nature 2019
2 Jennifer R. Smith et al.
The use of rats in biomedical research began more than 160 years
ago. The first recorded use of rats for scientific investigation was a
study by J. M. Philipeaux in 1856 [11] on the results of adrenalectomy
in albino rats (the 1836 article by Samuel Moss, “Notes on
the habits of a domesticated White Rat and a Terrier Dog that
lived in harmony together” notwithstanding). Since that time, rats
have been used as models to study a wide variety of biological,
physiological, and medical subjects.
2.1 Nutrition
In 1993, Dr. Janet R. Hunt [12] stated, “Rats were the principal
animal used to discover most of the vitamins, the essential trace
elements, and the essential amino acids. As a result, more is
known about the nutritional requirements of the rat than about any
other species.”
The first study of the nutritional quality of proteins in a mammal
was an article published in both The Lancet [13] and the
Proceedings of the Royal Society of London [14] in 1863 entitled
“Experiments on food; its destination and uses.” In it, William
S. Savory detailed how he fed rats “nitrogenous” (high protein,
very low fat), “non-nitrogenous” (high carbohydrate and fat,
very low protein), and mixed diets to ascertain whether
nitrogenous materials were utilized for “heat production,”
tissue formation, or both. Savory explained his use of rats in a
footnote: “Rats were chosen as subjects for these experiments
because they are omnivorous and will readily feed on almost
any kind of diet. Moreover from their size they are very
convenient to manage.”
2.2 Breeding, Genetics, and Characterization
The practice of breeding rats for variations in coat color substantially
predates their use in laboratory science. It was common in Japan
at least as far back as the 1700s, as evidenced by the first guidebook
for breeding fancy rats, Yoso-tama-no-kake-hashi,
published in 1775 [34]. Text and illustrations in the book
describe and show a variety of coat colors and patterns, many of
which are still seen in modern laboratory rat strains and fancy rat
lines. In addition, the author gave advice for breeding these rats so
as to not lose their “special characteristics.”
In his 1947 paper on the domestication of the rat, W. E. Castle
described a number of early studies of coat color inheritance in rat
[35]. Between 1877 and 1885, H. Crampe published a series of
articles detailing extensive breeding experiments beginning with a
tame albino female rat and a wild gray male and continuing with
successive rounds of interbreeding of the offspring. This pre-
Mendelian study showed inheritance patterns which we now know
to be controlled by three coat color mutations occurring at the c
(albino), a (non-agouti), and h (hooded) loci, although this
explanation for the various patterns of coat color and of
inheritance was far from clear at the time. Crampe’s prodigious
dataset was reviewed and reanalyzed using Mendelian principles
by Bateson in 1903. Doncaster used Crampe’s data to categorize
the offspring color patterns from his breeding experiments of
brown/gray, black, and albino rats, published in 1906. A
publication by MacCurdy and Castle in 1907 [36], which also
detailed breeding experiments to determine coat color in rats and
guinea pigs, was beginning to move closer to the idea that the
inheritance patterns of color and markings are controlled by more
than one factor, although obviously without an understanding of
the specific genes or mutations.
W. E. Castle began studying rat coat color mutations in 1907,
reporting the development of a “pink-eyed yellow” and a “red-
eyed yellow” mutant in England at that time [35]. Among other
pursuits, he and colleagues continued their exploration of the
genetics of coat color, mutations, and linkage in the rat through
1951, resulting in the publication of more than 20 articles on the
subject (e.g., [37–54]). Several of these were published with Dr.
Helen Dean King of the Wistar Institute, looking at the linkage
groups for the mutations (both those determining coat color and
physiological mutations such as “waltzing”) which were found
or confirmed by Dr. King during her domestication studies of wild
gray rats and breeding studies of the Wistar albinos. In the
aforementioned article on the domestication of the rat [35], Castle
lists 23 known mutations, 14 of which he was able to place into
four linkage groups, representing four of the 20 rat autosomes.
Dr. Henry H. Donaldson was a professor of neurology at the
University of Chicago while John B. Watson was working on his
degree. Donaldson had previously published papers on the nervous
systems of human and frog, but during his tenure at the University
of Chicago he was introduced to the rat as a model for human
neurology, probably through the influence of Swiss neuropathologist
Adolf Meyer. After careful consideration, Donaldson chose to
begin working with the albino rat. E. G. Conklin quoted him in his
1938 biography of Donaldson [55] as saying “It was found that the
nervous system of the rat grows in the same manner as that of man
—only some thirty times as fast. Further, the rat of three
Rat in Biomedical Research 7
2.3 Behavior
Reports about studies on rat behavior began in 1898 with a paper
in the inaugural issue of the American Journal of Physiology,
“Variations in daily activity produced by alcohol and by changes in
barometric pressure and diet, with a description of recording
methods,” by Colin C. Stewart [62]. The paper described a fairly
sophisticated system for recording the total daily activity of the
rats. A drum-shaped cage that the rat rotated by running in it was
connected to a clock modified to show the count of the number of
rotations of the cage and display it on the dial, possibly the first
semi-automated activity monitor. This paper was probably also
one of the first papers to look at the behavioral impact of nutrition
and of addictive substances such as alcohol.
The paper, “An experimental study of the mental processes
of the rat. II.” by Willard S. Small [63], published in the
American Journal of Psychology, appears to be the first
description of the use of a maze to test rat behavior. Small
mentioned that the “Hampton Court Maze served as model for
the apparatus. The diagram given in the Encyclopedia
Britannica was corrected to a rectangular form, as being easier
of construction.” Since the goal of the study was to examine
“the method of animal intelligence” rather than being a
quantitative study of the time needed for a rat to traverse the
maze, the author gave extensive descriptions of the movements
of the rats in the maze during two series of trials, one
consisting of five experiments and the other consisting of nine.
Not surprisingly, he noted that overall, the time to achieve the
reward at the center of the maze and the number of errors along
the way both diminished over time, although there were some
variations.
Two years later, in 1903, the psychologist John B. Watson,
who is best known for establishing the “behaviorism” theory of
psychology, earned his PhD from the University of Chicago on the
basis of his work on the relationship between brain myelination
and learning in rats of various ages [64]. Watson also employed
mazes as a test of learning ability. In the book “Behavior: an
introduction to comparative psychology” [65], published in 1914,
Watson began his description of the modified Hampton Court
maze by saying that it was “too well known to require
description,” implying that such mazes were in common use.
Although best known for his work on human behaviors and
behaviorism as a branch of psychology, Watson was a strong
2.4 Endocrinology
As mentioned previously, the first recorded use of rats for the study
of endocrinology was in an 1856 study by J. M. Philipeaux on the
results of adrenalectomy in albino rats. Like Philipeaux’s
3.1 Rat as a Precision Model for Disease
From Philipeaux’s first adrenalectomy studies in the 1850s, rats have
been used as models for human physiology and disease. In
“Animal Models of Human Nutrition” Dr. Janet R. Hunt
references E. V. McCollum’s decision to use rats in his nutrition
studies “because of their convenient size, omnivorous feeding
habits and lack of economic value” and goes on to state that the
“omnivorous feeding patterns of the rat usually make it a better
model for human nutrition questions than a strict herbivore such as
the rabbit” [12]. Likewise, H. C. Sherman, who was the first to
develop quantitative measures of nutrients based on their ability to
correct diseases stemming from nutritional deficiencies, said of his
research with rats, “These animals are my burettes and balances.
They give
3.2 The Development of Rat Strains to Study Disease Mechanisms
3.2.1 Cardiovascular Diseases
In 1958, Smirk and Hall reported results of ongoing breeding of
genetically hypertensive (GH) rats from the Wistar-derived rat
colony at the University of Otago Medical School [96]. They
developed several lines of rats by both cross-breeding and
brother-sister mated inbreeding. In the first report, they showed that the
line produced by cross-breeding had a higher average blood
pressure than the lines produced by inbreeding (141.95 ± 12.53 mmHg
for cross-bred males vs. 135.81 ± 8.2 mmHg for one strain of
inbred males and 124.14 ± 10.64 mmHg for control males). The
differences reported are not as great as for the SHR rats; however,
they are statistically significant. In a subsequent publication [97],
in which they reported on the development of cardiac hypertrophy
in the B strain of their genetically hypertensive rats, they stated
that more than 50% of the male rats in that strain “have blood
pressures exceeding 150 mmHg.”
In 1962, L. K. Dahl et al. published the first report of two
strains of rats, selected from an outbred (unselected) colony of
Sprague Dawley rats [98]. At the time of publication, these rats
had been selectively bred by brother-sister matings for three
generations and already displayed a divergence in the effects of salt on
their blood pressure. The rats were selected based on blood
pressure measurements after being fed a diet containing 11.6% sea salt
and administration of triiodothyronine (T3), which had been shown to
accelerate the development of hypertension in these rats. Later
studies demonstrated that the “S,” i.e., salt sensitive (SS), rats
developed increased blood pressure whether on T3 + high salt or
on high salt alone, whereas the blood pressure of the “R,” or salt
resistant, rats was less than or equal to that of the parental
Sprague Dawley rats regardless of the conditions. By contrast,
3.2.2 Metabolic Syndrome
The Lyon hypertensive (LH), normotensive (LN), and hypotensive
(LL) rat strains were first developed in 1973 in Lyon, France, from
a colony of Sprague Dawley rats [108]. The rats for breeding were
selected on the basis of the mean systolic blood pressure at
6–12 weeks of age and the slope of systolic blood pressure vs. age.
Although only selected for blood pressure traits, the inbred LH
strain was found to also display an increased body weight and
increased plasma lipids. Plasma phospholipids, total cholesterol,
HDL-cholesterol, and VLDL + LDL-cholesterol were all elevated
in the LH strain relative to LN and LL. Interestingly, at 5 weeks of
age, the LL strain had the highest plasma triglyceride level.
However, as the rats aged, the triglyceride level in the LH rats
increased so that at 32 weeks of age LH was significantly higher
than either LN or LL, but LL was still significantly higher than
LN. Further studies showed that the LH strain displayed
additional metabolic disorders, namely an increase in both the
insulin level and the insulin:glucose ratio [109].
Like the LH rat strain, the SHR strain was selected for
increased blood pressure, but was later found to display symptoms
of metabolic dysfunction. Fasting glucose was greater in SHR rats
than in WKY rats [110], and the insulin response to an oral glucose
challenge was higher in SHR than in WKY, suggesting possible
insulin resistance [111].
The OLETF strain was developed at the Tokushima Research
Institute in Japan from a spontaneously diabetic rat discovered in
1984 in an outbred colony of Long-Evans rats [112, 113]. The rats
were mildly obese and developed spontaneous hyperglycemia. Sex
differences were noted in the course of the disease. Males
developed hyperglycemia much earlier than females (25 weeks of
age for males vs. 65 weeks for females). Over time, male rats also
became hypoinsulinemic and required insulin therapy to survive,
which was not seen in females. Histopathological changes in the
pancreatic islets and in the kidney were also seen in males but not
females.
The Zucker “fatty” rat was first discovered as a mutant in the
13M rat stock, an outbred line derived from black offspring of
albinos from the colony of Dr. H. C. Sherman (Columbia University),
crossed with wild males (the “M” line) and additional rats
from the Sherman colony [114] at the Harriet G. Bird Memorial
Laboratory in Stow, Massachusetts. The mutation was named
“fatty” because when present in a homozygous form the rat
became extremely obese as a juvenile. Since heterozygous
littermates were lean and phenotypically indistinguishable from the
non-mutant homozygotes, the mutation was understood to be a
recessive allele in a single gene and has since been shown to be a
p.Gln269Pro mutation in the extracellular domain of the leptin
receptor. The original description of the Zucker rat includes
severe hyperlipidemia and kidney lesions, but not hyperglycemia
[115].
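The reasoning behind the "recessive allele in a single gene" conclusion can be illustrated with a simple cross. The following is a minimal sketch, not taken from the original studies: the allele symbols "F" and "f", the function names, and the phenotype labels are placeholders chosen for illustration. It enumerates the four equally likely allele combinations from two heterozygous parents and shows that only the homozygous mutant class would be expected to be obese.

```python
from collections import Counter
from itertools import product


def cross(parent1, parent2):
    """Enumerate offspring genotypes for a single-locus cross.

    Each parent is a two-character string of alleles, e.g. "Ff".
    Returns a Counter of genotypes over the four equally likely
    allele combinations (one allele drawn from each parent).
    """
    return Counter("".join(sorted(a + b)) for a, b in product(parent1, parent2))


def phenotype(genotype, recessive="f"):
    """Placeholder labels: only homozygous-recessive animals are 'obese'."""
    return "obese" if genotype == recessive * 2 else "lean"


# Heterozygote x heterozygote, with "f" standing in for the mutant allele.
offspring = cross("Ff", "Ff")

phenotypes = Counter()
for genotype, n in offspring.items():
    phenotypes[phenotype(genotype)] += n

print(dict(offspring))   # genotype counts out of 4
print(dict(phenotypes))  # phenotype counts out of 4
```

Under these assumptions the heterozygous and non-mutant homozygous classes share the lean phenotype, which matches the observation that heterozygous littermates could not be distinguished by appearance.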
3.2.3 Behavior and Addiction
As noted earlier in this chapter, the use of rats for studies of
behavior began over 100 years ago with studies by C. C. Stewart
on the effect of alcohol consumption, diet, and barometric pressure
on activity in captive rats. Since that time, rats have been the
model of choice for the study of behavior and addiction, and for
testing treatments for psychiatric disorders. The body of literature
covering these topics is enormous: a search in PubMed for “rat
behavior” in 2018 returned over 150,000 articles, including almost
6500 review articles.
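A literature count of this kind can be repeated programmatically. The sketch below assumes the NCBI E-utilities ESearch service and its JSON output; the function names are illustrative, the sample response body is made up so that the example runs offline, and a live query today would not be expected to reproduce the 2018 figure.

```python
import json
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities ESearch service.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_count_query(term):
    """Return an ESearch URL for a PubMed search that requests no record IDs.

    retmax=0 suppresses the ID list; the JSON reply still carries the hit count.
    """
    params = {"db": "pubmed", "term": term, "retmax": "0", "retmode": "json"}
    return EUTILS_ESEARCH + "?" + urlencode(params)


def parse_count(response_text):
    """Extract the hit count from an ESearch JSON response body."""
    return int(json.loads(response_text)["esearchresult"]["count"])


url = build_count_query("rat behavior")
print(url)

# Made-up response body for illustration only; not a real PubMed result.
sample = '{"esearchresult": {"count": "123"}}'
count = parse_count(sample)
print(count)
```

To run a real query, the URL would be fetched with any HTTP client and the response text passed to parse_count.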
Often rat strains developed for the study of a non-behavioral
phenotype are found to also show behavioral abnormalities. For
instance, the Spontaneously Hypertensive Rat (SHR), established
as a model of age-related hypertension, was found to develop
vascular brain disorder with associated behavioral changes as a
result of its increased blood pressure, and has also been used as a
model for Attention Deficit Hyperactivity Disorder (ADHD) as a
result of observed changes to the catecholaminergic transmission
system [118]. Similarly, the WAG/Rij rat, a model for absence
epilepsy, also showed depression-like symptoms [119]. The
Wistar Kyoto (WKY) strain was originally bred as a normotensive
control for the SHR but has been shown to display
“depressive-like symptoms,” including increased immobility in the forced
swim test [120] and anhedonia characterized by lower consumption of a
sweet-tasting solution in response to acute or chronic mild stress
[121]. Physiologically, the WKY exhibited abnormalities in
dopaminergic and noradrenergic responses and the HPA axis and TSH
systems [122]. The WKY rat, in addition to being considered a
model for depression, also displayed traits considered to be
indicative of anxiety such as reduced activity in the open field as well as
development of stress-induced ulcers. Another model of
depression, the Flinders Sensitive Line (FSL) rat, on the other hand,
displayed similar immobility in the forced swim test but did not
appear to have an increased tendency toward anxiety [120]. It was
shown, however, that young FSL rats engaged in more “intrusive”
3.2.4 Cancer
In 1919, Dr. F. D. Bullock and Dr. M. R. Curtis at the Crocker
Institute of Cancer Research (Columbia University, New York,
NY, USA) began a project to produce a number of inbred strains
of rat for use in their studies of cancer. Previous work to induce
neoplasms using tapeworm infestation as a chronic irritant had
demonstrated that rats from some sources were more susceptible
to tumor induction than those from other sources [133]. The group
began inbreeding rats from four commercial breeders (Fischer,
Zimmerman, Marshall, and August) in 1919 and expanded their
efforts to include rats originally sourced from Copenhagen,
Denmark, in 1920 [7]. From these efforts, at least ten inbred
strains, including ACI, COP, Marshall 520 (M520), Fischer 344
(F344), and the now extinct Fischer 230, were developed, a
number of which are still in active use for cancer research.
Between 1920 and 1970, Dr. Curtis with her colleague Dr. W. F.
Dunning and coworkers produced a substantial number of articles
(for example, [134–142]) on tapeworm-induced sarcomas, strain
differences in susceptibility to a variety of spontaneous tumors,
chemical carcinogenesis, and chemotherapy using these inbred
strains.
Because of its history of use for cancer research, the F344/N
rat strain was the model of choice for the National Cancer
Institute, and subsequently the National Toxicology Program
(NTP), for standardized bioassays of carcinogenicity for chemical
compounds. During the more than three decades of use for these
studies, the NCI and NTP amassed an immense, publicly available
dataset derived from the testing of thousands of possibly
carcinogenic compounds [143, 144]. However, in 2006, an NTP
workshop, “Animal Models for the NTP Rodent Cancer Bioassay:
Strains and Stocks—Should We Switch?” [145] reviewed the use
of the F344/N strain for future studies. Workshop participants
concluded that, due to problems with infertility, seizures, and
chylothorax in the F344/N colony in particular, and a variable but
relatively high incidence of spontaneous tumors, particularly
testicular interstitial cell tumors and mononuclear cell leukemia,
inherent to the F344 strain in general [146], the F344/N strain
should no longer be used for NTP bioassays. After considering
several alternatives, including using a different substrain of F344
or using F1 rats from a cross between F344 and BN, the
recommendation was to replace the inbred strain with outbred rats.
The initial
4.1.3 GERRC
With the advent of genome editing technologies for the rat, the
demand for genetically modified rats for use as disease models
has skyrocketed. However, for many researchers, production of
a gene-edited rat to confirm the involvement of a gene or
genomic region in their disease or phenotype of interest was
out of reach due to the lack of expertise and/or funding to
produce such models. In 2013, Dr. Howard Jacob and his
colleagues at the Medical College of Wisconsin were awarded a
grant to begin the MCW Gene Editing Rat Resource Center
(GERRC, https://rgd.mcw.edu/wg/gerrc/). The GERRC was
designed to leverage existing infrastructure and expertise in
gene editing to support the needs of the rat research
4.1.4 Commercial Vendors
In addition to the large non-commercial repositories already
mentioned, there are several commercial sources for rat strains.
Sprague Dawley, Inc. was started in 1925 by Robert Dawley near
Madison, Wisconsin. The original breeding stock was purported to
have come from mating a hooded male of unclear origin with
albino females from Wistar stock, and subsequently with the
albino offspring of that mating. The original male was described
as “a hybrid hooded male rat of exceptional size and vigor which
genetically was half-white” [149]. The line was partially inbred,
then changed to random breeding, and the parental strain was
considered outbred. Sprague Dawley, Inc. was acquired by Harlan
in 1980 to form Harlan Sprague Dawley, which was in turn
acquired by Envigo, Inc. in 2015 (http://www.envigo.com/).
Envigo sells 14 strains of rats, two of which are offered specifically
as aged animals: Sprague Dawley® outbred rats (SD) and
Fischer 344 inbred rats. The rats
offered by Envigo include direct descendants of a number of the
original laboratory rat stocks, including Holtzman rats (an offshoot
of SD), Lewis rats, and Wistar outbred rats, in addition to the
aforementioned SD rats.
Charles River Laboratories (https://www.criver.com/)
was started in 1947 by veterinarian Dr. Henry Foster to breed rats
and supply them to laboratories in the Boston, Massachusetts
area. Their catalog currently lists 40 rat strains that are available,
including 22 inbred strains and 16 outbred.
Horizon Discovery’s stock of “off the shelf” knockout rats
(https://www.horizondiscovery.com/in-vivo-models), which
originated with Sage Laboratories, includes
models for several research areas. These include knockouts of
xenobiotic sensors and drug transporters for
toxicology/ADMET (absorption, distribution,
4.2 Where to Find Data for Rat Model Strains
The first two of the FAIR principles for data management
[150, 151] require that data be both Findable and Accessible.
This requires the development and maintenance of data stores and
knowledgebases to consolidate and integrate data from various
sources, and in many cases, to expand, interpret, and/or analyze
the data through processes such as manual curation of the
literature. Such resources also create environments in which researchers
can access and utilize the data for their own analyses and
download both the original data and their analysis results for their
own records. The resources available for rat data will be covered
more completely in Chapter 3, but we will touch on some of these
sources here.
4.2.1 RGD
Arguably, the most diverse and inclusive source for rat data is the
Rat Genome Database (RGD, https://rgd.mcw.edu, [152]).
RGD was started in 1999 “to collect, consolidate and integrate
data generated from ongoing rat genetic and genomic research
efforts and make these data widely available to the scientific
community.” From the beginning, RGD was intended as a multiple
datatype and cross-species resource, including data for rat genes,
markers, quantitative trait loci (QTLs), and strains, as well as
homologous mouse and human genes for comparative purposes.
This cross-disciplinary focus has continued. RGD now houses data
which associate disease, phenotype, molecular function, biological
process, subcellular localization, molecular pathway, gene-
chemical interactions, and protein-protein interactions with the
genomes of rat, human, mouse, dog, squirrel, chinchilla, pig, and
bonobo. Many of these associations are via the genes for these
species. In addition, RGD imports data for disease and phenotype
associations for human variants, as well as the extensive
phenotype data for mouse genes and QTLs to assist with
comparative genomic analyses.
research categories that strain has been used for, strain
characteristics and breeding performance, and references where applicable.
In many cases, there is a picture of the rat and an image of
representative organs for that strain. As also mentioned previously,
NBRP-Rat performs extensive phenotyping on strains submitted to
their repository. These data are available on their website under
the “Phenome” tab
(http://www.anim.med.kyoto-u.ac.jp/NBR/phenome.aspx)
in both graphical and tabular format.
4.2.3 NCBI, Ensembl, and UCSC Genome Browser for Genes and Genomics
The National Center for Biotechnology Information (NCBI,
https://www.ncbi.nlm.nih.gov/), the European
Bioinformatics Institute’s Ensembl
(https://www.ensembl.org/index.html), and the University of
California, Santa Cruz’s (UCSC, https://genome.ucsc.edu/)
Genome resources are multispecies
resources with diverse datasets that include genome sequences,
gene, transcript and protein records, protein domain
information, functional data, and more. Much of the data
provided are consolidated from other resources, including the
Rat Genome Database in terms of functional annotations for rat
genes as well as curated records for QTLs. NCBI and Ensembl
both do gene predictions for whole genome assemblies.
Because they use different algorithms, the predicted
gene sets are not identical, although there is substantial
overlap. For more information about the NCBI and Ensembl
genome annotation pipelines, see Chapter 2 in this book. These
resources also supply tools for analysis of the data they
provide, including genome browsers for viewing genes and
other genomic elements in the wider genomic context.
4.2.5 dbSNP/EBI’s European Variation Archive
The dbSNP database at NCBI and EBI’s European Variation
Archive (EVA) have, in the past, both accepted submissions of rat
genomic variant data to be included in their multispecies variant
resources. As of 2017, however, dbSNP is no longer storing or
presenting variant data for nonhuman species, making EVA the
major source for nonhuman variants. The data presented include
genomic positions, affected genes, where applicable, and predicted
or validated variant consequences for corresponding transcripts.
4.4 Molecular Genetic Tools
The molecular genetic toolbox for rat includes genetic and genomic
data as well as a variety of tools for using them. Rat
genetics has been a foundational research focus since the first
studies on the genetics of coat color linkage and inheritance in
the late nineteenth and early twentieth centuries. Since that time,
the field has evolved to incorporate the use of genetic markers and
single nucleotide variants as markers for QTLs, establishment of a
reference genome sequence for the rat, and whole genome
sequencing (WGS) of a number of inbred rat strains which are
considered to be either established models of human disease or
control strains for those models. Many of the chapters that follow
outline research, methods and resources that came as a direct result
of this molecular genetic toolbox and/or associated investments
into infrastructure for the rat.
4.4.1 Genetic Markers
Until the advent of whole genome sequencing, assignment of
genes and genetic markers to relative positions on
chromosomes was accomplished by somatic cell hybrids, genetic
or radiation hybrid mapping, or in situ hybridization. In 1990
[161] and again in 1991 [162], Levan et al. published the then-
current rat gene map, consisting of 214 genes and 11 linkage
groups with
24 Jennifer R. Smith et al.
4.4.2 Genome Sequence
The project to sequence the rat genome was initiated in 2000
with a Request for Application (RFA) to form a “Network for
Large-Scale Sequencing of the Rat Genome”
(https://grants.nih.gov/grants/guide/rfa-files/RFA-HG-00-002.html).
The initiative was funded jointly by the National
Human Genome Research Institute (NHGRI) and the National
Heart, Lung and Blood Institute (NHLBI). The stated goal was
to produce “a working draft version (3-4 fold sequence
coverage) of the rat genome sequence in two years or less.”
The specification that the result would be a working draft
version indicated that there was no intention to “finish” the
sequence, meaning that, although the draft was expected to be
of
4.5 Genome-Related Data
The availability of whole genome sequence for the rat has resulted
in increased availability of genome-dependent data such as
variants and RNA sequencing data.
4.5.1 Strain-Specific Variants
Along with upgrading and maintaining the BN reference, there
have been many attempts to identify and catalog regions of genetic
variation in strains and substrains. As detailed earlier in this
chapter, the rat has served as a physiological model for over 100
years, and as multiple strains have been developed to study
pathophysiological processes, comparative analyses between
phenotypically similar and phenotypically disparate rat models
may yield insights into the underlying mechanisms. As such, a
number of whole genome sequencing projects were undertaken in
order to find and catalog these genetic differences between rat
strains.
Atanur et al. [170] sequenced the genomes of 27 rat strains,
including both cardiovascular and metabolic models of disease as
well as control strains, and determined variations using the
Rnor_3.4 assembly. Two years later, Hermsen et al. [171] not
only reanalyzed the sequence data for these 27 strains, but also
expanded the analysis by including an additional 13 strains,
aligning all of the sequences against the Rnor_5.0 assembly, and
calling the variants using updated software.
Findings from these and other similar studies suggested that in
pathophysiological models of diseases like hypertension, relevant
phenotypes may arise due to different combinations of genetic
factors, and this, in turn, reflects the complex nature of these
diseases in humans [170, 172].
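The kind of cross-strain comparison described above can be illustrated with a minimal sketch. It assumes that variant calls for each strain are already available as sets of (chromosome, position, reference allele, alternate allele) tuples called against the same assembly. The function names, strain labels, and variants are hypothetical and are not taken from the cited studies.

```python
from typing import Dict, Set, Tuple

Variant = Tuple[str, int, str, str]  # (chromosome, position, ref allele, alt allele)


def strain_specific_variants(calls: Dict[str, Set[Variant]]) -> Dict[str, Set[Variant]]:
    """Return, for each strain, the variants found in no other strain."""
    result = {}
    for strain, variants in calls.items():
        # Pool the variants of every other strain, then subtract them.
        others = set().union(*(v for s, v in calls.items() if s != strain))
        result[strain] = variants - others
    return result


def shared_variants(calls: Dict[str, Set[Variant]]) -> Set[Variant]:
    """Return the variants present in every strain."""
    sets = list(calls.values())
    return set.intersection(*sets) if sets else set()


# Hypothetical toy data; real call sets contain millions of variants.
toy_calls = {
    "strain_A": {("1", 100, "A", "G"), ("2", 200, "C", "T")},
    "strain_B": {("1", 100, "A", "G"), ("3", 300, "G", "A")},
    "control": {("3", 300, "G", "A")},
}
private = strain_specific_variants(toy_calls)
```

Set operations of this kind only make sense when all strains were called against the same reference assembly with comparable software, which is one reason for reanalyzing all strains together against a single assembly.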
When the Rnor_6.0 version of the rat reference genome was
released, an interim remapping of variants from some of the
strains was done, utilizing UCSC’s Batch Coordinate Conversion
(“LiftOver”) tool to convert the coordinates of SNPs from the
v5.0 assembly to the corresponding coordinates on the v6.0
assembly. This, however, does not always give uniformly reliable
results when
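The idea behind converting coordinates between assemblies can be sketched as follows. This is a simplified illustration, not the UCSC implementation: it assumes a single chromosome, same-strand alignment, and a hypothetical list of aligned blocks given as (old_start, old_end, new_start) in 0-based half-open coordinates. Real chain files also encode chromosome names, strand, and alignment scores.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

Block = Tuple[int, int, int]  # (old_start, old_end, new_start), 0-based half-open


def lift_position(blocks: List[Block], pos: int) -> Optional[int]:
    """Map a position on the old assembly to the new assembly.

    `blocks` must be sorted by old_start and must not overlap. Returns None
    when the position falls outside every aligned block, which is one reason
    a coordinate conversion can fail for a given variant.
    """
    starts = [b[0] for b in blocks]
    i = bisect_right(starts, pos) - 1  # last block starting at or before pos
    if i < 0:
        return None
    old_start, old_end, new_start = blocks[i]
    if pos >= old_end:
        return None  # position lies in a gap between aligned blocks
    return new_start + (pos - old_start)


# Hypothetical blocks: the second block is shifted by an insertion in the new assembly.
toy_blocks = [(0, 100, 50), (150, 300, 200)]
```

A position inside an aligned block is shifted by that block's offset, while a position in an unaligned region has no counterpart and returns None.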
4.5.2 Variant Archives
In May of 2017, NCBI announced that they would no longer
support the submission, storage, and presentation of nonhuman
variants, including single nucleotide polymorphisms in dbSNP and
larger structural variations in dbVar. Responsibility for all
nonhuman variants was transitioned to the European Variation
Archive (EVA) at EMBL-EBI as of November of 2017
(https://www.ebi.ac.uk/eva/?Help#key-steps-transitional-process).
During and immediately following the initial transition
period, all nonhuman variants were transferred from NCBI’s
dbSNP database to the EVA. The EVA now accepts variant
submissions, assigns unique IDs to each variant, and periodically
consolidates redundant variant records into single “reference
variant” records with the same rs-formatted IDs previously
assigned by dbSNP. In addition, EVA normalizes the data to
ensure that variant positions are standardized, annotates variant
effects, and calculates allele frequencies. Variants, whether from
small studies with only a few variants or from high-throughput
studies producing millions of variants, are submitted in VCF files.
A VCF validation software suite is provided so submitters can
ensure their files are ready for loading to expedite the process. Rat
researchers unfamiliar with the validation and submission
process can submit their variants to the Rat Genome Database.
RGD can incorporate the variants into the Variant Visualizer tool
and concurrently help with the submission process to the EVA.
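The consolidation of redundant submissions into a single record can be illustrated with a minimal sketch. It is not the EVA pipeline: the function names and the submitted IDs are hypothetical, and the normalization shown only trims bases shared by the reference and alternate alleles. Full normalization also left-aligns indels against the reference sequence, which is omitted here.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Record = Tuple[str, str, int, str, str]  # (submitted_id, chrom, pos, ref, alt)
Key = Tuple[str, int, str, str]          # (chrom, pos, ref, alt) after trimming


def trim_alleles(pos: int, ref: str, alt: str) -> Tuple[int, str, str]:
    """Remove bases shared by both alleles, keeping at least one base in each."""
    # Trim shared trailing bases first; the position is unaffected.
    while len(ref) > 1 and len(alt) > 1 and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    # Then trim shared leading bases, advancing the position each time.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


def consolidate(records: Iterable[Record]) -> Dict[Key, List[str]]:
    """Group submitted records that describe the same change at the same site."""
    groups: Dict[Key, List[str]] = defaultdict(list)
    for submitted_id, chrom, pos, ref, alt in records:
        key = (chrom, *trim_alleles(pos, ref, alt))
        groups[key].append(submitted_id)
    return dict(groups)


# Hypothetical submissions: the first two describe the same A>G change.
toy_records = [
    ("sub1", "1", 100, "A", "G"),
    ("sub2", "1", 99, "CA", "CG"),
    ("sub3", "1", 100, "A", "T"),
]
clusters = consolidate(toy_records)
```

Each key in `clusters` stands for one consolidated record, and the list under it holds the submissions that would be merged into it.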
4.6 Gene Manipulation in the Rat
Transgenic rat models have been generated for more than 30 years
by microinjection of donor DNA into embryos [176, 177]
and used to study the function of a gene of interest. Early methods
to create mutations within specific genes include ENU
(N-ethyl-N-nitrosourea) mutagenesis and introduction of a Sleeping
Beauty transposon [178]. Both of these methods successfully
identified mutations in targeted genes but are limited in efficiency
and specificity. Many pups need to be screened using a variety of
strategies to identify positive mutant founders [179]. Although these
random mutagenesis strategies require large-scale screening
efforts to identify specific mutations, ENU-induced mutant models
have been archived at the Rat Resource and Research Center
(http://www.rrrc.us/) and the PhysGen Knockout Program for
subsequent follow-up phenotyping [178]. Additionally, an archive
of ENU-induced mutant sperm was created by the Kyoto University
Mutant Rat Archive [180] to be used for large-scale screening
or rederivation of models using intracytoplasmic sperm injection.
The Sleeping Beauty transposon system was implemented in rats
after successful use in mice [181, 182]. This strategy had several
advantages over ENU mutagenesis, including the ability to modify
both the transposon and transposases and the fact that only a few
transposon insertions were made in each founder [178]. Similar
to the ENU models, the Rat Resource and Research Center
repository has nearly 100 transposon-derived models available to
investigators.
The advent of site-directed nucleases to target specific genes
in rat embryos has rapidly changed the gene engineering landscape
for investigators using rat models [183]. These sequence-specific
nucleases, including zinc finger nucleases (ZFN), TALENs, and
CRISPR/Cas approaches, have rapidly allowed the rat genome to
be manipulated in ways previously only available in the mouse.
These strategies were initially used to create a double-stranded
break in the genomic DNA at specific locations in the genome.
These breaks are typically repaired by nonhomologous end joining
4.7 Rat in the Larger Context
Rat is an excellent model for human disease, and in many cases of
complex disease it is the model of choice. But for cases where
another model is preferred, or where additional information is
needed, a researcher needs to be able to leverage the research
done in other organisms and apply that information to their
system. In these cases, it is helpful, possibly even necessary, to be
able to access data for multiple species on a single site, and even in
a single view. To meet this need, groups like the Rat Genome
Database (RGD) and the Alliance of Genome Resources offer sites
that
5 Future Directions
References
1. Lack JB, Hamilton MJ, Braun JK, Mares MA, Van Den Bussche RA (2013) Comparative phylogeography of invasive Rattus rattus and Rattus norvegicus in the U.S. reveals distinct colonization histories and dispersal. Biol Invasions 15(5):1067–1087. https://doi.org/10.1007/s10530-012-0351-5
2. Courchamp F, Chapuis JL, Pascal M (2003) Mammal invaders on islands: impact, control and control impact. Biol Rev Camb Philos Soc 78(3):347–383
3. Centers for Disease Control and Prevention. https://www.cdc.gov/rodents/diseases/index.html. Accessed 15 Aug 2018
4. Lin XD, Guo WP, Wang W, Zou Y, Hao ZY, Zhou DJ et al (2012) Migration of Norway rats resulted in the worldwide distribution of Seoul hantavirus today. J Virol 86(2):972–981. https://doi.org/10.1128/jvi.00725-11
5. Kosoy M, Khlyap L, Cosson J-F, Morand S (2015) Aboriginal and invasive rats of genus Rattus as hosts of infectious agents. Vector Borne Zoonotic Dis 15(1):3–12. https://doi.org/10.1089/vbz.2014.1629
6. Harper GA, Bunbury N (2015) Invasive rats on tropical islands: their population biology and impacts on native species. Glob Ecol Conserv 3:607–627. https://doi.org/10.1016/j.gecco.2015.02.010
7. Lindsey JR (1979) Historical foundations. In: Baker HJ, Lindsey JR, Weisbroth SH (eds) The laboratory rat: biology and diseases, American College of Laboratory Animal Medicine Series, vol 1. Academic Press, New York, pp 1–36
8. Puckett EE, Park J, Combs M, Blum MJ, Bryant JE, Caccone A et al (2016) Global population divergence and admixture of the brown rat (Rattus norvegicus). Proc Biol Sci
effectiveness of selection and of the theory of gametic purity in Mendelian crosses. The Carnegie Institution of Washington, Washington, DC
38. Castle WE, Wright S (1915) Two color mutations of rats which show partial coupling. Science 42(1075):193–195. https://doi.org/10.1126/science.42.1075.193
39. Castle WE (1919) Piebald rats and the theory of genes. Proc Natl Acad Sci U S A 5(4):126–130
40. Castle WE, Wachter WL (1924) Variations of linkage in rats and mice. Genetics 9(1):1–12
41. Castle WE (1925) A sex difference in linkage in rats and mice. Genetics 10(6):580–582
42. King HD, Castle WE (1935) Linkage studies of the rat (Rattus norvegicus). Proc Natl Acad Sci U S A 21(6):390–399
43. King HD, Castle WE (1937) Linkage studies of the rat (Rattus norvegicus): II. Proc Natl Acad Sci U S A 23(2):56–60
44. Castle WE, King HD (1940) Linkage studies of the rat (Rattus norvegicus): III. Proc Natl Acad Sci U S A 26(9):578–580
45. Castle WE, King HD (1941) Linkage studies of the rat (Rattus norvegicus): V. Proc Natl Acad Sci U S A 27(8):394–398
46. Castle WE, King HD, Daniels AL (1941) Linkage studies of the rat (Rattus norvegicus): IV. Proc Natl Acad Sci U S A 27(6):250–254
47. Castle WE (1941) Influence of certain color mutations on body size in mice, rats, and rabbits. Genetics 26(2):177–191
48. Castle WE (1944) Linkage of Waltzing in the rat. Proc Natl Acad Sci U S A 30(9):226–230
49. Castle WE, King HD (1944) Linkage studies of the rat (Rattus norvegicus): VI. Proc Natl Acad Sci U S A 30(4):79–82
50. Castle WE (1946) Linkage in the albino chromosome of the rat. Proc Natl Acad Sci U S A 32(2):33–36
51. Castle WE, King HD (1947) Linkage studies of the rat. J Hered 38(11):341–344
52. Castle WE, King HD (1948) Linkage studies of the rat: IX. Cataract. Proc Natl Acad Sci U S A 34(4):135–136
53. Castle WE, King HD (1949) Linkage studies of the rat: X. Proc Natl Acad Sci U S A 35(9):545–546
54. Castle WE (1951) Variation in the hooded pattern of rats, and a new allele of hooded. Genetics 36(3):254–266
55. Conklin EG (1938) Biographical memoir of Henry Herbert Donaldson 1857–1938. Biogr Mem Natl Acad Sci 20:227–243
56. Donaldson HH (1915) The rat: reference tables and data for the albino rat and the Norway rat. The Wistar Institute of Anatomy and Biology, Philadelphia
57. Donaldson HH (1924) The rat: data and reference tables for the albino rat (Mus norvegicus albinus) and the Norway rat (Mus norvegicus). The Wistar Institute of Anatomy and Biology, Philadelphia
58. Ogilvie MB (2007) Inbreeding, eugenics, and Helen Dean King (1869–1955). J Hist Biol 40(3):467–507. https://doi.org/10.1007/s10739-006-9117-1
59. King HD, Donaldson HH (1929) Life processes and size of the body and organs of the gray Norway rat during ten generations in captivity. In: Stockard CR, Evans HM (eds) The American Anatomical Memoirs, vol 14. The Wistar Institute of Anatomy and Biology, Philadelphia
60. King HD (1939) Life processes in gray Norway rats during fourteen years in captivity. In: Stockard CR, Evans HM (eds) The American Anatomical Memoirs, vol 17. The Wistar Institute of Anatomy and Biology, Philadelphia
61. Wilkins AS, Wrangham RW, Fitch WT (2014) The “domestication syndrome” in mammals: a unified explanation based on neural crest cell behavior and genetics. Genetics 197(3):795–808. https://doi.org/10.1534/genetics.114.165423
62. Stewart CC (1898) Variations in daily activity produced by alcohol and by changes in barometric pressure and diet, with a description of recording methods. Am J Phys 1(1):40–56. https://doi.org/10.1152/ajplegacy.1898.1.1.40
63. Willard SS (1901) Experimental Study of the Mental Processes of the Rat. II. Am J Psychol 12(2):206–239. https://doi.org/10.2307/1412534
64. Watson JB (1903) Animal education; an experimental study on the psychical development of the white rat, correlated with the growth of its nervous system. The University of Chicago Press, Chicago, p 122
65. Watson JB (1914) Behavior: an introduction to comparative psychology. Holt, New York
66. Schulkin J, Rozin P, Stellar E (1994) Curt P. Richter – February 20, 1894–December 21, 1988. Biogr Mem Natl Acad Sci 65:311–320
67. Richter CP (1922) A behavioristic study of the activity of the rat. Williams & Wilkins Company, Baltimore
68. Richter CP (1936) Increased salt appetite in adrenalectomized rats. Am J Phys 115(1):155–161. https://doi.org/10.1152/ajplegacy.1936.115.1.155
69. Richter CP (1953) Experimentally produced behavior reactions to food poisoning in wild and domesticated rats. Ann N Y Acad Sci 56(2):225–239
70. Richter CP (1968) Experiences of a reluctant rat-catcher: the common Norway rat-friend or enemy? Proc Am Philos Soc 112(6):403–415
71. Stotsenberg J (1909) On the growth of the albino rat (Mus norvegicus var. albus) after castration. Anat Rec 3(4):233–244. https://doi.org/10.1002/ar.1090030410
72. Stotsenburg JM (1913) The effect of spaying and semi-spaying young albino rats (Mus norvegicus albinus) on the growth in body weight and body length. Anat Rec 7(6):183–194. https://doi.org/10.1002/ar.1090070602
73. Hammett FS (1924) Studies of the thyroid apparatus. XX. The effect of thyro-parathyroidectomy and parathyroidectomy at 75 days of age on the growth of the brain and spinal cord of male and female albino rats. J Comp Neurol 37(1):15–30. https://doi.org/10.1002/cne.900370103
74. Long JA, Evans HM (1922) The oestrous cycle in the rat and its associated phenomena, vol 6. Memoirs of the University of California. University of California Press, Berkeley, CA
75. Evans HM, Long JA (1921) The effect of feeding the anterior lobe of the hypophysis on the oestrous cycle of the rat. Anat Rec 21(1):62. https://doi.org/10.1002/ar.1090210105
76. Evans HM, Long JA (1921) The effect of the anterior lobe administered intraperitoneally upon growth, maturity, and oestrous cycles of the rat. Anat Rec 21(1):62–63. https://doi.org/10.1002/ar.1090210105
77. Simpson ME, Evans HM (1946) Comparison of the spermatogenic and androgenic properties of testosterone propionate with those of pituitary ICSH in hypophysectomized 40-day old male rats. Endocrinology 39(5):281–285. https://doi.org/10.1210/endo-39-5-281
78. Li CH, Kalman C, Evans HM (1947) The effect of the hypophyseal growth hormone on the alkaline phosphatase of rat plasma. J Biol Chem 169(3):625–629
79. Evans HM, Becks H, Asling CW, Li CH (1947) Gigantism produced in normal female rats by chronic treatment with pure pituitary growth hormone. Anat Rec 97(3):333
80. Evans HM, Becks H (1948) The gigantism produced in normal rats by injection of the pituitary growth hormone; skeletal changes; tibia, costochondral junction, and caudal vertebrae. Growth 12(1):43–54
81. Li CH, Simpson ME, Evans HM (1948) The gigantism produced in normal rats by injection of the pituitary growth hormone; main chemical components of the body. Growth 12(1):39–42
82. Evans HM, Simpson ME, Li CH (1948) The gigantism produced in normal rats by injection of the pituitary growth hormone; body growth and organ changes. Growth 12(1):15–32
83. Li CH, Simpson ME, Evans HM (1949) Influence of growth and adrenocorticotropic hormones on the body composition of hypophysectomized rats. Endocrinology 44(1):71–75. https://doi.org/10.1210/endo-44-1-71
84. Asling CW, Walker DG, Simpson ME, Evans HM (1950) Differences in the skeletal development attained by 60-day-old female rats hypophysectomized at ages varying from 6 to 28 days. Anat Rec 106(4):555–569
85. Walker DG, Simpson ME, Asling CW, Evans HM (1950) Growth and differentiation in the rat following hypophysectomy at 6 days of age. Anat Rec 106(4):536–554
86. Koneff AA, Moon HD, Simpson ME, Li CH, Evans HM (1951) Neoplasms in rats treated with pituitary growth hormone. IV Pituitary gland. Cancer Res 11(2):113–117
87. Van Dyke DC, Garcia JF, Simpson ME, Huff RL, Contopoulos AN, Evans HM (1952) Maintenance of circulating red cell volume in rats after removal of the posterior and intermediate lobes of the pituitary. Blood 7(10):1005–1016
88. Asling CW, Walker DG, Simpson ME, Li CH, Evans HM (1952) Deaths in rats submitted to hypophysectomy at an extremely early age and the survival effected by growth hormone. Anat Rec 114(1):49–65
89. Walker DG, Asling CW, Simpson ME, Li CH, Evans HM (1952) Structural alterations in rats hypophysectomized at six days of age and their correction with growth hormone. Anat Rec 114(1):19–47
90. Fels IG, Simpson ME, Evans HM (1953) The effect of magnesium ion upon the alkaline phosphatase activity in the thyroid of the hypophysectomized rat. J Biol Chem 204(2):807–814
91. Nelson MM, Lyons WR, Evans HM (1953) Comparison of ovarian and pituitary
mammary cancer by this agent. Cancer Res 12(10):702–706
138. Dunning WF, Curtis MR, Madsen ME (1947) The induction of neoplasms in five strains of rats with acetylaminofluorene. Cancer Res 7(3):134–140
139. Dunning WF, Curtis MR, Maun ME (1949) The effect of dietary fat and carbohydrate on diethylstilbestrol-induced mammary cancer in rats. Cancer Res 9(6):354–361
140. Dunning WF, Curtis MR, Segaloff A (1953) Strain differences in response to estrone and the induction of mammary gland, adrenal, and bladder cancer in rats. Cancer Res 13(2):147–152
141. Dunning WF, Curtis MR, Stevens M (1968) Comparative carcinogenic activity of dimethyl and trimethyl derivatives of benz(a)anthracene in Fischer line 344 rats. Proc Soc Exp Biol Med 128(3):720–722
142. Dunning WF, Curtis MR, Stevens ML, Dumenigo F (1967) Five transplantable leukemias in the Fischer rat, and their responsiveness to steroids. Cancer Res 27(6 Pt 2):696–727
143. Zeiger E (2017) Reflections on a career and on the history of genetic toxicity testing in the National Toxicology Program. Mutation Res 773:282–292. https://doi.org/10.1016/j.mrrev.2017.03.002
144. Bucher JR (2002) The National Toxicology Program rodent bioassay: designs, interpretations, and scientific contributions. Ann N Y Acad Sci 982:198–207
145. King-Herbert A, Thayer K (2006) NTP workshop: animal models for the NTP rodent cancer bioassay: stocks and strains--should we switch? Toxicol Pathol 34(6):802–805. https://doi.org/10.1080/01926230600935938
146. King-Herbert AP, Sills RC, Bucher JR (2010) Commentary: update on animal models for NTP studies. Toxicol Pathol 38(1):180–181. https://doi.org/10.1177/0192623309356450
147. Kim U, Clifton KH, Furth J (1960) A highly inbred line of Wistar rats yielding spontaneous mammo-somatotropic pituitary and other tumors. J Natl Cancer Inst 24:1031–1055
148. Shimoyama M, Smith JR, Bryda E, Kuramoto T, Saba L, Dwinell M (2017) Rat genome and model resources. ILAR J 58(1):42–58. https://doi.org/10.1093/ilar/ilw041
149. Poiley SM (1955) History and information concerning the rat colonies in the animal section of the National Institutes of Health, in “Rat Quality: A Consideration of Heredity, Diet and Disease.” Proceedings of the Symposium Held at Columbia University, College of Physicians and Surgeons, New York, January 31, 1952. Q Rev Biol 30:4. https://doi.org/10.1086/401094
150. Wilkinson MD, Dumontier M, Aalbersberg IJ, Appleton G, Axton M, Baak A et al (2016) The FAIR Guiding Principles for scientific data management and stewardship. Sci Data 3:160018. https://doi.org/10.1038/sdata.2016.18
151. FAIR principles for data stewardship (2016) Nat Genetics 48(4):343. https://doi.org/10.1038/ng.3544
152. Shimoyama M, De Pons J, Hayman GT, Laulederkind SJ, Liu W, Nigam R et al (2015) The Rat Genome Database 2015: genomic, phenotypic and environmental variations and disease. Nucleic Acids Res 43(Database issue):D743–D750. https://doi.org/10.1093/nar/gku1026
153. Laulederkind SJ, Liu W, Smith JR, Hayman GT, Wang SJ, Nigam R et al (2013) PhenoMiner: quantitative phenotype curation at the rat genome database. Database 2013:bat015. https://doi.org/10.1093/database/bat015
154. Wang SJ, Laulederkind SJ, Hayman GT, Petri V, Liu W, Smith JR et al (2015) PhenoMiner: a quantitative phenotype database for the laboratory rat, Rattus norvegicus. Application in hypertension and renal disease. Database 2015:bau128. https://doi.org/10.1093/database/bau128
155. Shimoyama M, Nigam R, McIntosh LS, Nagarajan R, Rice T, Rao DC, Dwinell MR (2012) Three ontologies to define phenotype measurement data. Front Genet 3:87. https://doi.org/10.3389/fgene.2012.00087
156. Smith JR, Park CA, Nigam R, Laulederkind SJ, Hayman GT, Wang SJ et al (2013) The clinical measurement, measurement method and experimental condition ontologies: expansion, improvements and new applications. J Biomed Semantics 4(1):26. https://doi.org/10.1186/2041-1480-4-26
157. Nigam R, Munzenmaier DH, Worthey EA, Dwinell MR, Shimoyama M, Jacob HJ (2013) Rat Strain Ontology: structured controlled vocabulary designed to facilitate access to strain data at RGD. J Biomed Semantics 4(1):36. https://doi.org/10.1186/2041-1480-4-36
158. Hayman GT, Laulederkind SJ, Smith JR, Wang SJ, Petri V, Nigam R et al (2016) The disease portals, disease-gene annotation and the RGD disease ontology at the Rat Genome
199. Smith CM, Hayamizu TF, Finger JH, Bello SM, McCright IJ, Xu J et al (2018) The mouse Gene Expression Database (GXD): 2019 update. Nucleic Acids Res 47(Database-Issue):D774–D779. https://doi.org/10.1093/nar/gky922
200. Ruzicka L, Bradford YM, Frazer K, Howe DG, Paddock H, Ramachandran S et al (2015) ZFIN, The zebrafish model organism database: updates and new directions. Genesis 53(8):498–509. https://doi.org/10.1002/dvg.22868
201. Ashbrook DG, Mulligan MK, Williams RW (2018) Post-genomic behavioral genetics: from revolution to routine. Genes Brain Behav 17(3):e12441. https://doi.org/10.1111/gbb.12441
202. Bennett BJ, Farber CR, Orozco L, Kang HM, Ghazalpour A, Siemers N, Neubauer M, Neuhaus I, Yordanova R, Guan B, Truong A, Yang WP, He A, Kayne P, Gargalovic P, Kirchgessner T, Pan C, Castellani LW, Kostem E, Furlotte N, Drake TA, Eskin E, Lusis AJ (2010) A high-resolution association mapping panel for the dissection of complex traits in mice. Genome Res 20(2):281–290. https://doi.org/10.1101/gr.099234.109
203. Saba L, Hoffman P, Tabakoff B (2017) Using baseline transcriptional connectomes in rat to identify genetic pathways associated with predisposition to complex traits. Methods Mol Biol 1488:299–317. https://doi.org/10.1007/978-1-4939-6427-7_14
204. Perry ME, Valdes KM, Wilder E, Austin CP, Brooks PJ (2018) Genome editing to ‘re-write’ wrongs. Nat Rev Drug Discov 17(10):689–690. https://doi.org/10.1038/nrd.2018.91
Chapter 2
Abstract
The first and only published version of the rat reference genome sequence was RGSC3.1, accomplished
by the Rat Genome Sequencing Project Consortium. Here we present the history of the community effort
to correct sequence errors and fill gaps in the process of refining the sequence and providing
researchers with a high-quality rat reference sequence. The genome assembly improvements, the addition
of different evidence resources over time, such as RNA-Seq data, and software development methodologies
had a positive impact on the gene model annotations. Over the years we observed a great increase in the
numbers of genes, protein coding sequences, predicted transcripts, and transcript features. Before the
sequencing of the rat genome was possible, first biochemical and then genomic markers such as RAPD,
AFLP, RFLP, and SSLP were fundamental in research studies involving cross-breeding between different
rat strains, in finding the level of polymorphism, linkage mapping, and phylogeny. Linkage maps provide
information on recombination rates, give insight into intra- and interspecies gene rearrangements, and help
to identify Mendelian loci and Quantitative Trait Loci (QTL). In the 1990s many reports were published
on the construction of rat linkage maps that incorporated increasing numbers of markers and facilitated
the localization of disease loci. Current genetic monitoring and linkage mapping rely on single
nucleotide polymorphisms (SNPs). The Rat Genome Database collects information on genetic variation
from the worldwide community of rat researchers and provides tools for searching and retrieving these
data. As of today we show details about almost 605 million variants coming from many studies in our
Variant Visualizer tool.
Key words Reference genome, Gene model annotations, Genomic markers, Variants, SNPs
1.1 History of the Rat Genome Sequencing Project
The laboratory rat (Rattus norvegicus) became the third mammalian
genome to be sequenced when the Rat Genome Sequencing
Project Consortium (RGSPC) published a high-quality draft
sequence of the rat genome. The project was a collaborative effort
involving sequencing and analyses by researchers at 40
organizations from seven countries, coordinated by the Baylor
College of Medicine Human Genome Sequencing Center (BCM-HGSC).
Funding was primarily supplied by the National Human Genome
G. Thomas Hayman et al. (eds.), Rat Genomics, Methods in Molecular Biology, vol. 2018,
https://doi.org/10.1007/978-1-4939-9581-3_2, © Springer Science+Business Media, LLC, part of Springer Nature 2019
44 Monika Tutaj et al.
The first public release of the rat genome sequence was
designated RGSC/Rnor2.0. Its release in November of 2002 was
followed shortly thereafter by the release of Rnor version 2.1 in
January of 2003. Release notes for the v2.1 assembly stated that a
“reduction of assembly artifacts has slightly reduced the number of
bases assembled while increasing the size of contigs, scaffolds,
and ultrabactigs” (reports available at the BCM-HGSC ftp site).
The total size of the assembly mapped onto chromosomes was
2.72 Gbp for the 2.0 release and 2.66 Gbp for the 2.1 release,
while the average size of ultrabactigs and scaffolds increased by
5.9% and 17.2%, respectively.
Rnor3.0 represented a complete reassembly of the genome
sequence. Improvements incorporated into this version of the
sequence included the addition of new sequences (over 1100 new
BACs to cover gaps), better software accuracy and relevance,
utilization of an improved marker set from the Medical College of
Wisconsin, and a new FPC map from the BC Cancer Agency
Genome Sciences Centre [5]. The new FPC map was built by
automated assembly of BAC clones based on “fingerprinting,”
followed by a process of manual curation and sequencing of
clones, and the use of human and mouse orthology information to
resolve conflicts and to correct placement of sequence units. This
process of automated and manual editing expanded the contiguity
of the rat fingerprint map and in turn allowed for targeted BAC
clone selection and filling of contig gaps, as well as linking some
of the unlocalized segments of the rat assembly to
chromosomes [3]. Rnor3.0 was rapidly followed by the release of
version 3.1 in June of 2003. Rnor3.1 was considered a minor
update to the previous assembly with changes only affecting
chromosomes 7 and X. Although not the first public version of
the sequence, Rnor3.1 was the first (and only) published version
of the rat reference genome sequence [1]. Analysis of the
assembly revealed that it had an overall average of approximately
sevenfold sequence coverage, with 60% provided by WGS and
40% by BACs. The assembly covered about 90% of the estimated
2.75 Gbp rat genome and contained a similar number of genes as
described for human and mouse (20,000–25,000) [1, 6].
In 2004, a series of minor updates to the assembly brought the
designation to Rnor3.4. Updates included three additions of
finished BAC sequences and the correction of several alignment
switch points. Despite additional work done to improve the
assembly during the interim, the Rnor3.4 assembly remained the de
facto reference assembly for the rat for almost eight years. After
2004 a number of improvements were proposed and funded by the
NHGRI to provide a more complete genome with improved
accuracy. New assemblies were released in March of 2008 (Rnor4.0)
and November of 2009 (Rnor4.1), both of which utilized reads
downloaded from NCBI in January of 2007. The assemblies were
1.2 Other Rat Genome Assemblies
Since 2004, the genomes of a number of other rat strains have
been sequenced by the rat community. In 2008, the STAR
consortium used a combination of shotgun sequencing, low
coverage WGS, and BAC end sequencing to discover almost three
million single nucleotide variants from the SS/Jr, WKY/Bbb,
GK/Ox, SHRSP/Bbb, Sprague Dawley, and F344/Stm strains [9].
A subset of approximately 20,000 of these were used to genotype
167 inbred strains and 2 recombinant inbred (RI) panels (the 31
HXB-BXH strains and the 33 FXLE-LEXF strains). Almost
10,000 of the SNVs were then used to genotype an additional 89
F2 animals from a cross between the BN/Par and GK/Ox strains.
In 2010 the SHR/OlaIpcv rat genome was sequenced at 10.7-fold
coverage by paired-end sequencing on the Illumina platform.
Initially 681.8 million reads were mapped to the BN reference
genome (v3.4) and covered 97.7% of the reference assembly by at
least three reads [10]. Subsequently, the NGS data set of the
SHR/OlaIpcv strain was expanded, thereby increasing the median
coverage of this genome to 23-fold in 2012. In the same study, the
researchers also generated whole genome NGS data from the same
genetic material that was used to create the BN reference sequence
(referred to in that study as “Eve”), as well as from BN-Lx, a
mutant strain closely related to BN. The data from the new BN
sequence (32-fold NGS coverage) were used to search for
Rnor3.4 assembly errors [11].

Table 1
Statistics of rat genome assemblies

Table 2
Archived references

In 2013, two more genomes were sequenced, becoming the
first non-reference de novo assemblies of rat genomes [12]. The
DA and F344 rat strains were sequenced with an average depth of
32× using Illumina technology. Researchers employed a
reference-aided assembly method (RAM), using the BN reference
genome as well as the Short Oligonucleotide Alignment Program
(SOAP), and GapCloser, an algorithm for contig-end extension and
gap filling. First, a semi-finished genome was constructed by
aligning sequencing reads to Rnor3.4 using SOAPaligner [13], then
an independent de novo assembly of contigs and scaffolds was
generated using SOAPdenovo [14]. Finally, the genome draft was
assembled by anchoring scaffolds onto the semi-finished
genome. The read alignment of each strain with the BN genome
covered 98% of the reference (three reads or more). The DA and
F344 genome drafts were 1.94% and 1.91% larger than the BN
genome, respectively. The DA and F344 genome drafts
contained more than 49 million novel base pairs for each genome
that bridged around 400,000 gaps of the BN genome [12].
In 2013 the eight inbred strains (ACI/N, BN/SsN, BUF/N,
F344/N, M520/N, MR/N, WKY/N, and WN/N) that had
been used as the founder/progenitor strains for the NIH’s
heterogeneous stock (N:HS) rats were sequenced using
SOLiD technology at an average of 22-fold sequence
coverage, representing
Rat Genome Assemblies, Annotation, and Variants 49
~88% of the reference genome [15]. The same year another
large-scale sequencing project was completed. The sequencing of 27 rat
strains that served as popular disease models of hypertension,
diabetes, and insulin resistance resulted in the identification of a
number of genomic variants and coevolved gene clusters [16]. The
researchers produced the sequence data with 20-fold coverage on
average for all strains except the BBDP/Wor and WKY/NHsd
strains, which reached approximately tenfold coverage.
Variant data from each of these large-scale strain-specific
sequencing projects are available at the Rat Genome Database
(see below for details).
1.3 Reported Reference Genome Errors
The regions that posed special problems to complete genome
assembly were regions with unusual repeat structures, polymorphisms,
possible BAC rearrangements, and low sequence coverage.
The Rnor3.4 assembly contained many gaps, inconsistencies, and
sequence errors due to the relatively low coverage and errors associated
with capillary technology. Genetic single nucleotide polymorphism
(SNP) mapping by the STAR consortium in 2008
identified discrepancies between the genetic map and the draft
genome: a p11-centromeric segment of chromosome 1 was
wrongly inserted into the p14-telomeric region of chromosome
17, and intra- and inter-chromosomal relocations were observed in
regions of chromosomes 2, 4, 11, 12, 14, and 17 [9]. The relocation
in the p14 region of chromosome 17 and one conflict on
chromosome 9 were discovered during the revision of differences
between BCM and Celera rat genome assemblies [9]. In the study
of the rat genomic variation in complex traits, four pairs of regions
on chromosomes 1, 4 (2 regions), 9, 12, 14, 17, and 19 showed
high inter-chromosomal linkage disequilibrium due to mis-assembly
of the Rnor3.4 reference sequence [15]. Analyses at the
Rat Genome Database of changes in NCBI (National Center for
Biotechnology Information) gene position annotations between rat
genome assemblies showed a number of co-localized genes that
were re-annotated in upgraded reference versions, frequently
together, to different chromosomes (see Table 3). We observed
that 25 of the 49 genes that occupied an 8.3 Mbp region of
chromosome 1 in reference version 3.4 were relocated to two
different genomic regions in v5.0: 17 genes moved to
chromosome 8, while 6 genes moved to chromosome 9.
Altogether, we found 7 clusters of 19 to 96 co-localized genes that
span 1.2 to 8.3 Mbp regions on chromosomes 1, 4, 7, 8, 17, and
X of assembly v3.4 and that changed genomic
position in v5.0 (see Fig. 1). The changes were less profound
between assemblies v5.0 and v6.0. Four clusters of 10 to 93 genes
spanning 1.5 to 6 Mbp regions of chromosomes 1, 3, and X
changed genomic position in the v6.0 assembly with reference to
v5.0. However, there were also numerous smaller changes in
other chromosomes in both transitions: 435 genes in
Table 3
Annotation changes between different rat genome assemblies

Compared     Chromosome      Number of        Mbp   Genomic position (bp)    Number of        Chromosome
assemblies   (old position)  genes(a)/total         on older assembly        genes(a)/total   (new position)
Rnor3.4      1               25/49            8.3   58877734–67150234        17/49            8
and                                                                          6/49             9
Rnor5.0      4               19/28            4.6   99068636–103646030       17/28            3
             7               49/69            1.7   137288635–1389558414     49/69            X
                                                                             9/69             6
             8               40/41            1.2   40493892–41711691        18/41            15
                                                                             16/41            4
             17              27/33            5.3   44656–5318612            28/33            1
             X               96/132           7     153730373–160683450      83/132           1
                             23/132           6.5   122692432–129236338      17/132           3
Rnor5.0      1               93/113           6     147946356–153934661      95/113           X
and                          10/113           6.5   64184207–70664968        6/113            11
Rnor6.0      3               20/23            1.5   52231912–53715589        20/23            X
             X               60/63            1.6   114700497–116300972      60/63            7

(a) Including protein-coding genes, noncoding genes, pseudogenes, and genes under revision
Fig. 1 Number of genes re-annotated from chromosomes of the assembly Rnor3.4 to different
genomic positions in Rnor5.0. The number of genes is limited to 100 for better visualization
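
The kind of comparison summarized in Table 3 can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure used at RGD: the function name `relocated_clusters`, the input layout (gene ID mapped to chromosome and start position), and the `max_gap` threshold for calling genes co-localized are all assumptions made for the example.

```python
from collections import defaultdict


def relocated_clusters(old_pos, new_pos, max_gap=1_000_000, min_genes=2):
    """Group genes whose chromosome differs between two assemblies.

    old_pos / new_pos: dict mapping gene ID -> (chromosome, start_bp).
    A cluster is a run of relocated genes on one old chromosome in which
    neighbouring start positions are at most max_gap bp apart
    (hypothetical threshold chosen for illustration).
    """
    # Keep only genes present in both assemblies that changed chromosome.
    moved = sorted(
        (old_pos[g][0], old_pos[g][1], g)
        for g in old_pos
        if g in new_pos and new_pos[g][0] != old_pos[g][0]
    )

    clusters, current = [], []
    for chrom, start, gene in moved:
        # Start a new cluster when the chromosome changes or the gap is too large.
        if current and (chrom != current[-1][0] or start - current[-1][1] > max_gap):
            clusters.append(current)
            current = []
        current.append((chrom, start, gene))
    if current:
        clusters.append(current)

    summary = []
    for cluster in clusters:
        if len(cluster) < min_genes:
            continue
        destinations = defaultdict(int)
        for _, _, gene in cluster:
            destinations[new_pos[gene][0]] += 1
        summary.append({
            "old_chr": cluster[0][0],
            "start": cluster[0][1],
            "end": cluster[-1][1],
            "n_genes": len(cluster),
            "new_chr_counts": dict(destinations),
        })
    return summary


# Toy data (invented): three neighbouring genes leave chromosome 1.
old = {"g1": ("1", 100), "g2": ("1", 200), "g3": ("1", 300),
       "g4": ("1", 5_000_000), "g5": ("2", 50)}
new = {"g1": ("8", 10), "g2": ("8", 20), "g3": ("9", 30),
       "g4": ("1", 5_000_100), "g5": ("2", 60)}
for row in relocated_clusters(old, new):
    print(row)
```

Each summary row corresponds loosely to one row group of Table 3: the old chromosome and span, the number of relocated genes, and how many of them landed on each new chromosome.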
2.1 Gene Model Prediction
In 2004, the Ensembl gene prediction pipeline predicted 20,973
genes with 28,516 transcripts and 205,623 exons for the Rnor3.1
assembly [1]. The improvement provided by reassembly of the
reference sequence in general, and by the Rnor6.0 assembly in
particular, had a positive impact on the assembly annotation. Gene
model predictions consider known protein and transcript data for
rat, as well as homology to other sequences, including rodent
proteins, non-rodent vertebrate proteins, rat cDNA data from
RefSeq and EMBL, and mouse cDNAs from Riken, RefSeq, and
EMBL. The statistics depend on the quality of the genome
sequence, the gene prediction method, the alignment criteria, and
the amount of expressed sequence evidence. Table 4 lists the
current number of gene model predictions provided by NCBI for
the v3.4, v5.0, and v6.0 rat genome assemblies. There is an
increase in the numbers of genes, protein coding sequences
(CDS), and defined noncoding 5′ and 3′ untranslated regions
(UTRs). Genome annotations and prediction accuracy benefit from
the addition of different evidence resources, such as
RNA-Seq data, and from new methodologies. This is clearly
demonstrated by the substantial increase in the number of predicted
transcripts and transcript features for the v5.0 and v6.0 assemblies,
where the incorporation of RNA-Seq transcriptomic data
improved the identification of isoforms, UTRs, exon boundaries,
and transcripts with only low expression. It is worth noting that the
number of noncoding genes doubles from v3.4 to v5.0 and more
than triples from v3.4 to v6.0.
Table 4
Number of genomic features for the rat assemblies
2.2 NCBI and Ensembl Gene Annotation Models
Genome annotations, i.e., the prediction and localization of genes
and other genomic elements on a genome sequence, differ between
NCBI and Ensembl because of variations in annotation strategies,
algorithms, and input data. The NCBI Eukaryotic Genome Annotation
Pipeline [18] utilizes a suite of informatic tools that includes
the alignment programs Splign and ProSplign and the gene prediction
program Gnomon to generate sets of genes with their
associated transcripts and proteins. The annotation process relies
heavily on the availability of transcript or protein evidence for the
species. Originally implemented in 2000 as a semi-manual process
that aligned GenBank and RefSeq transcripts to the genome using the
BLAST algorithm and then supplemented these with ab initio gene
predictions from GenomeScan [19], NCBI’s pipeline has gone
through a number of substantial improvements. These include
the addition of EST and protein data as input and the development
of the splicing-aware aligners Splign for transcripts and ProSplign
for proteins. The addition of RNA-Seq data improved the quality of
the annotations, particularly for organisms that have little or no
experimental mRNA or EST data available. Reengineering the
pipeline using a new framework for parallel execution enhanced
its extensibility, robustness, and reproducibility, and
improved tracking, all of which were necessary to keep pace with
both annotation of new genomic sequences and reannotation of
improved genome assemblies.
The current Eukaryotic Genome Annotation Pipeline takes as
input same-species transcripts, proteins, and RNA-Seq reads and,
where necessary, transcripts and proteins from closely related
species. Input transcripts include known coding and noncoding
2.3 Gene Annotation Differences
The genome size for human is 3.257 Gb (GRCh38), slightly
larger than for rat (2.870 Gb, Rnor6.0) or mouse (2.819 Gb,
GRCm38), but there are substantial differences in the number
of annotated proteins and transcripts among the three species.
Currently, expressed sequence evidence is much
more abundant for human and mouse than for rat. Table 5
presents data collected from the annotation pages for individual
species that are available in the NCBI and Ensembl databases
[22, 23]. More than 2 times as many transcripts
and 3–4 times as much EST data are used for the human and mouse
gene prediction models as for the rat in both Ensembl and
NCBI. In the NCBI prediction, 4 times more protein sequences
are used for the human model, and 2 times more for the mouse.
We compared the number of genes
between NCBI and Ensembl for human, mouse, and rat (see
Fig. 2). There are 19,633 rat genes shared by the two gene
models, 25,372 human genes, and 24,637 mouse genes. Even
though the amount of evidence for both human and mouse is
much higher than for rat, the proportion of overlapping
predicted genes is low. This suggests that the results of the
prediction depend not on the amount of evidence provided
but on the design of the prediction strategy. We counted
how many of NCBI’s and Ensembl’s predicted genes have been
assigned the same genomic position in the genome reference
(gene boundaries, from start to stop position). We found that
an exact match of position applies to only a limited number of
genes: 15% of human genes (9595), 12% of rat genes (3902), and
10% of mouse genes (5637). Examples of differences in rat
gene model prediction between Ensembl and NCBI are shown
in Fig. 3. Some genes are predicted to be in the same genomic
location but differ in length, number of exons, or the exons’
positions. In some cases, exons have the same predicted
positions but different genes are assigned to that position by
the two models. There are examples of single genes in one
database that are split into two genes in the other. In some
locations one model predicts the presence of genes whereas the
other one does not. The number and length of predicted transcripts
also differ.
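
The two counts described above (genes shared between the two annotation sets, and genes with identical boundaries) can be illustrated with a minimal Python sketch. It assumes, purely for the example, that each annotation has already been reduced to a mapping from a common gene identifier to a (chromosome, start, stop) tuple; the function name `compare_gene_models` and the toy records are hypothetical and are not taken from NCBI or Ensembl.

```python
def compare_gene_models(ncbi, ensembl):
    """Compare two annotation sets keyed by a shared gene identifier.

    ncbi / ensembl: dict mapping gene ID -> (chromosome, start, stop).
    Returns counts of shared genes, source-specific genes, and shared
    genes whose boundaries match exactly.
    """
    shared = ncbi.keys() & ensembl.keys()
    exact = {gene for gene in shared if ncbi[gene] == ensembl[gene]}
    return {
        "shared": len(shared),
        "ncbi_only": len(ncbi) - len(shared),
        "ensembl_only": len(ensembl) - len(shared),
        "exact_position": len(exact),
    }


# Toy records (invented identifiers and coordinates).
ncbi_genes = {
    "geneA": ("1", 1000, 5000),
    "geneB": ("1", 8000, 9500),
    "geneC": ("2", 300, 900),
}
ensembl_genes = {
    "geneA": ("1", 1000, 5000),   # same boundaries in both models
    "geneB": ("1", 7900, 9500),   # same gene, different start
    "geneD": ("3", 100, 400),     # predicted by one model only
}
print(compare_gene_models(ncbi_genes, ensembl_genes))
```

Matching on a shared identifier is a simplification; a real comparison would first need a cross-reference between NCBI and Ensembl gene IDs.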
Table 5
Resources for generating gene models
Fig. 2 The comparison of gene annotations for human, rat, and mouse. Number of genes with unique ID
shared between NCBI and Ensembl

                                    Human    Mouse    Rat
Total gene number (IDs), NCBI       60496    69020    47959
Total gene number (IDs), Ensembl    63967    53946    32883
Shared between NCBI and Ensembl     25372    24637    19633
3.1 Genomic Markers
There are over 700 inbred strains of rats, and the history of their
generation and evolution is not always well known. Markers are
important in research studies involving cross-breeding between
different rat strains, and essential in finding the level of
polymorphism and genetic homogeneity between them (inter-strain and
intra-strain differences).
Years before the sequencing of the rat genome was possible,
there was an intensive search for novel markers that could be
integrated into rat genetic and radiation hybrid maps [27–29].
The inbred mouse and rat strains were known by coat colors
and MHC until the 1970s, when biochemical markers
Fig. 3 Examples of differences in gene model prediction between ENSEMBL (yellow-blue) and NCBI
(red-brown) using the JBrowse genome viewer [24]: (a) genes in the same genomic location differ in
length and number of exons or other genes are assigned to that position; (b) the same genes differ
in the exons’ position; (c) one gene in NCBI is split into two genes in ENSEMBL; (d) in some locations one
model predicts the presence of genes whereas the other one does not; (e) number and length of
predicted transcripts differ
Table 7
Columns: Assembly | SNVs total | Number of rat strains | Secondary analysis provider | Primary data source | Publication PMID | Publication or data provider | Sequence platform | dbSNP | dbSNP build

Assembly 3.4 | SNVs total 205,581,620 | dbSNP 4,877,558 | dbSNP build dbSnp136
  28 | ICL | ICL, MCW, WTSI, UI, MDC, KNAW, ERIBA-UMCG, CHPM-UT, UMMC, UniSR, ICAMS, ILA KyotoU, EMBL-EBI, DZHK, INSERM; Simonis et al. 2012 | 23890820, 24628878, 20430781 | Atanur et al. 2013, Ma et al. 2014, Atanur et al. 2010 | Illumina HiSeq 2000, SOLiD 2, 3 and 4 (2 strains)
  12 | HI-KNAW | RGSMC(1): WTCHG, HI-KNAW, ERIBA-UMCG, KI-CNS, ICAMS, IUSM, INSERM, KI-MBB, MDC, INc-UAB, CEA-CNG, MPIMG, ECRC, HGSC-BCM, WTSI, ICL, EMBL-EBI, DZHK | 23708188 | Baud et al. 2013 | SOLiD 4 and 5500, Illumina HiSeq 2000 (1 strain)
  2 | HI-KNAW | HI-KNAW, ICL, BCGSC, IPHYS-CAS | 22541052 | Simonis et al. 2012 | SOLiD 2, 3 and 4 (2 strains)
  2 | ICAHN | BGI, FIMR, UESTC | 23695301 | Guo et al. 2013 | Illumina HiSeq 2000
  4 | UMich | UMich | PhD thesis(2) | Dr. Jun Z. Li group | Illumina HiSeq 2000
  7 | MCW | MCW (1 strain SMPH-UW) | NA | Dr. Howard Jacob group | Illumina HiSeq 2000
  3 | MDC | MDC | NA | Dr. Norbert Huebner group | Illumina/Solexa, Genome Analyzer II

Assembly 5.0 | SNVs total 261,827,016 | dbSNP 4,806,887 | dbSNP build dbSnp138
  42 | HI-KNAW | HI-KNAW, MDC, UCSM, MRC-ICS, ERIBA-UMCG; Ma et al. 2014, Guo et al. 2013, Atanur et al. 2013, Baud et al. 2013, Simonis et al. 2012 | 25943489 | Hermsen et al. 2015 | Illumina HiSeq 2000
  20 | MiB-KyushuU | MiB-KyushuU, KDRI, ILA-KyotoU | 27882299 | Yoshihara et al. 2016 | Illumina NextSeq 500
  3 | UDEL | Nemours, UDEL, Penn Med | 26502805 | Barthold et al. 2016 | Illumina HiSeq 2500
  9 | MCW | MCW (1 strain SMPH-UW) | NA | Dr. Howard Jacob group | Illumina HiSeq 2000

Assembly 6.0 | SNVs total 137,495,090 | dbSNP 4,726,744 | dbSNP build dbSnp146
  25 | RGD | Atanur et al. 2013 | NA | in preparation | Illumina HiSeq 2000
  9 | MCW | MCW (1 strain SMPH-UW) | NA | Dr. Howard Jacob group | Illumina HiSeq 2000
  8 | MCW (lift-over results) | HI-KNAW | NA | Dr. Michael Flister group | SOLiD 4 and 5500

(1) Rat Genome Sequencing and Mapping Consortium. (2) Ref. [57].
Abbreviations: BCM-HGSC Human Genome Sequencing Center, Baylor College of Medicine, USA; BGI BGI-Shenzhen, China; CBMR The Novo Nordisk Foundation Center for Basic Metabolic Research, University of Copenhagen, Denmark; CEA-CNG Commissariat à l’Énergie Atomique, Centre National de Génotypage, France; DZHK German Centre for Cardiovascular Research, Germany; ECRC Experimental and Clinical Research Center, Charité Universitätsmedizin Berlin, Germany; ERIBA UMCG European Research Institute for the Biology of Ageing, University Medical Center Groningen, Netherlands; EMBL-EBI European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, UK; FIMR Laboratory of Experimental Rheumatology, Feinstein Institute for Medical Research, USA; HGSC-BCM Human Genome Sequencing Center, Baylor College of Medicine, USA; HI-KNAW Hubrecht Institute, Royal Netherlands Academy of Arts and Sciences, Netherlands; ICL Imperial College London, UK; ICAHN The Icahn School of Medicine at Mount Sinai, USA; ICAMS The Institute of Cardiovascular and Medical Sciences, University of Glasgow, UK; ILA-KyotoU Institute of Laboratory Animals, Kyoto University, Japan; INc-UAB Institute of Neurosciences, Universitat Autònoma de Barcelona, Spain; INSERM INSERM UMR-S872, Cordeliers Research Centre & INSERM U698, Hôpital Bichat, France; IPHYS-CAS Institute of Physiology, Czech Academy of Sciences, Czech Republic; IUSM Department of Medical and Molecular Genetics, Indiana University School of Medicine, USA; KDRI Department of Technology Development, Kazusa DNA Research Institute, Japan; KI-CNS Department of Clinical Neuroscience, Karolinska Institute, Sweden; KI-MBB Department of Medical Biochemistry and Biophysics, Karolinska Institute, Sweden; MiB-KyushuU Medical Institute of Bioregulation, Kyushu University, Japan; MDC Max Delbrück Center for Molecular Medicine, Germany; MCW Medical College of Wisconsin, USA; MPIMG Department of Computational Biology, Max Planck Institute for Molecular Genetics, Germany; MRC-ICS Physiological Genomic and Medicine Group, Institute of Clinical Sciences, UK; Nemours Nemours Alfred I. duPont Hospital for Children, USA; Penn Med Perelman School of Medicine, University of Pennsylvania, USA; SMPH-UW School of Medicine and Public Health, University of Wisconsin, USA; UCSF University of California, San Francisco, USA; UCSM Department of Pharmacology, University of Colorado School of Medicine, USA; UDEL Center for Bioinformatics and Computational Biology, University of Delaware, USA; UESTC School of Life Science and Technology, University of Electronic Science and Technology of China, China; UMich Department of Human Genetics, University of Michigan, USA; UMMC University of Mississippi Medical Center, USA; UniSR The Vita-Salute San Raffaele University, Italy; WTCHG Wellcome Trust Centre for Human Genetics, UK; WTSI The Wellcome Trust Sanger Institute, UK
indels were located in coding regions or splice sites, and 489 genes
were affected by variant changes. Only 19 of 193 genes expressed
in liver were differentially expressed between SHR and BN-Lx.
The variant predictions were correct, but the unaffected genes
produced transcripts without the affected exons. The duplicated gene
Mx2 showed heterozygous SNVs, and the expression was also
heterozygous in the RNA-seq data. In BN-Lx, which does not
carry the duplications, all positions were homozygous at the DNA
and RNA levels. The results implied that many of the analyzed
liver transcripts were not spliced according to the available
annotations. Changes in transcript structure rarely overlapped with
genomic variants [11].
The Rat Genome Sequencing and Mapping Consortium analyzed
160 phenotypes from an outbred rat heterogeneous stock
(NIH-HS) and showed the presence of segregating variation in
commonly used laboratory rats [15]. NIH-HS rats were phenotyped
for six disease models (anxiety, diabetes, hypertension,
aortic elastic lamina ruptures, multiple sclerosis, and osteoporosis)
and several related risk factors (lipid and cholesterol levels,
cardiac hypertrophy, etc.). The researchers identified 355 QTL
for 122 phenotypes using 265,551 polymorphic high-quality
SNPs. They investigated the extent to which the variants would
identify genes and causative mutations. 212 QTL (62%) had no
candidate variant. The median proportion of heritability explained
by QTL in rats and mice was above 30% and differed substantially
from the mean proportion of heritability in humans, which was less
than 10%. An important observation was that genetic variants
present in both inbred rat strains and inbred mouse strains rarely
contributed to the same phenotype [15].
Atanur et al. studied coevolved gene clusters that they named
“putative artificial selective sweep (PASS)” regions, defined
by the presence of many fixed rare variants and at least one variant
that contributed to selection [16]. PASS regions co-localized with
QTL, indicating an increased genetic variation in these regions
and therefore an enrichment of variation in genes associated with
the disease phenotypes for which the strains were selected. They
identified 9,665,340 SNVs and 3,502,117 short indels across 27 rat
strains, including 11 models of hypertension, diabetes, and insulin
resistance; 29,131 of the SNVs were nonsynonymous coding (NSC)
variants. Half of all single-strain SNVs resided within 189
segments that occupied only 0.8% of the genome, so private SNVs
were concentrated in a small number of discrete regions of the
genome. They identified clusters of coevolved transcripts that
were unique for each disease model. In the Milan hypertensive rat
strain (MHS), a cluster of 65 transcripts (47 genes) was found
containing NSC sequence variants. The NSC cluster included the
Add1 gene, which is known to cause hypertension in MHS rats due to
an amino acid substitution (F316Y). ADD1 is also associated with
human
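
The notion of private (single-strain) SNVs and their concentration in a small part of the genome can be illustrated with a short Python sketch. This is not the PASS detection method of Atanur et al. [16]; it only shows, under assumed inputs, how private SNVs might be extracted from a table of variant carriers and how the fraction falling inside given segments could be computed. The function names, the input layout, and the toy coordinates are hypothetical.

```python
def private_snvs(genotypes):
    """Return, per strain, the SNV sites carried by that strain alone.

    genotypes: dict mapping (chromosome, position) -> set of strain names
    carrying the non-reference allele at that site.
    """
    private = {}
    for site, carriers in genotypes.items():
        if len(carriers) == 1:
            (strain,) = carriers
            private.setdefault(strain, []).append(site)
    return {strain: sorted(sites) for strain, sites in private.items()}


def fraction_in_segments(sites, segments):
    """Fraction of sites that fall inside any (chromosome, start, end) segment.

    Segment ends are treated as inclusive.
    """
    if not sites:
        return 0.0
    inside = sum(
        any(chrom == c and start <= pos <= end for c, start, end in segments)
        for chrom, pos in sites
    )
    return inside / len(sites)


# Toy data (invented): MHS carries three private SNVs, two of them clustered.
genotypes = {
    ("1", 100): {"MHS"},
    ("1", 150): {"MHS"},
    ("1", 9000): {"MHS", "SHR"},   # shared, therefore not private
    ("2", 500): {"SHR"},
    ("1", 70000): {"MHS"},
}
private = private_snvs(genotypes)
share = fraction_in_segments(private["MHS"], [("1", 50, 200)])
print(private["MHS"], round(share, 2))
```

In this toy case two of the three private MHS sites fall inside the single segment, mirroring on a small scale the observation that private SNVs cluster in discrete regions.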
3.3 Phylogeny
Researchers interested in the relationships between rat strains used
different types of markers to produce phylogenetic trees. Early
researchers, using 28 biochemical marker loci, distinguished 52
genetically different rat strains that grouped in three clusters [49].
In later studies, highly variable microsatellites and high-density
SNPs were used as genetic markers to construct phylogenetic
trees, establish the relatedness of organisms, and predict
Table 8
Comparison of phylogeny studies of laboratory rats

Publication year   First author   Number of strains
1984               Festing        52
1995               Canzian        13
1997               Canzian        63 (214 sub)
2003               Thomas         48
2005               Smits*         39
2006               Mashimo        93
2008               Nijman         37
2008               STAR           167
2013               Atanur         28
2015               Hermsen*       41
2015               Battula        51
2017               Puckett        326 (29 inb)

Further columns: Marker type (Bioch, RAPD, SSLP, SNP); Number of markers
Row groups: Brown Norway, Long Evans, Sprague Dawley 1, Sprague Dawley 2, Wistar - WKY, Wistar - BB, F344, Sabra & Cohen, PVG, BD, ACI, BN

* Phylogeny studies that did not explore phylogeny distances
4 Conclusion
variants (SNV), and QTL. The data are available for three rat
genome assemblies for a range of commonly used laboratory rat
strains. This repository is valuable for researchers who use rats in
medical research and also for those who do comparative analysis
using other organisms. RGD’s major goal is to present rat genomic
and phenotypic data in a form that is easy to interpret, to assist in
experimental design, and thereby to facilitate rat research and
interspecies comparison. RGD’s resources may improve the
reproducibility of scientific research between laboratories and thus
help ensure the overall quality of biomedical animal research.
References
1. Gibbs RA, Weinstock GM, Metzker ML, Muzny DM, Sodergren EJ, Scherer S et al (2004) Genome sequence of the Brown Norway rat yields insights into mammalian evolution. Nature 428(6982):493–521
2. Havlak P, Chen R, Durbin KJ, Egan A, Ren Y, Song XZ et al (2004) The Atlas genome assembly system. Genome Res 14(4):721–732
3. Krzywinski M, Wallis J, Gösele C, Bosdet I, Chiu R, Graves T et al (2004) Integrated and sequence-ordered BAC- and YAC-based physical maps for the rat genome. Genome Res 14(4):766–779
4. Kren V, Qi N, Krenova D, Zidek V, Sladká M, Jáchymová M, Míková B et al (2001) Y-chromosome transfer induces changes in blood pressure and blood lipids in SHR. Hypertension 37(4):1147–1152
5. Gibbs R, Weinstock G (2005) Upgrading the DNA sequence of the rat genome. White paper available at https://www.genome.gov/pages/research/sequencing/seqproposals/ratupgradeseq.pdf
6. van Boxtel R, Cuppen E (2010) Rat traps: filling the toolbox for manipulating the rat genome. Genome Biol 11(9):217. https://doi.org/10.1186/gb-2010-11-9-217
7. Prokop JW, Underwood AC, Turner ME, Miller N, Pietrzak D, Scott S et al (2013) Analysis of Sry duplications on the Rattus norvegicus Y-chromosome. BMC Genomics 14:792. https://doi.org/10.1186/1471-2164-14-792
8. Rozen S, Warren WC, Weinstock G, O’Brien SJ, Gibbs RA, Richard K et al (2006) Sequencing and annotating new mammalian Y chromosomes. White paper available at https://www.genome.gov/pages/research/sequencing/seqproposals/ychromosomewp.pdf
9. STAR Consortium, Saar K, Beck A, Bihoreau MT, Birney E, Brocklebank D et al (2008) SNP and haplotype mapping for genetic analysis in the rat. Nat Genet 40(5):560–566. https://doi.org/10.1038/ng.124
10. Atanur SS, Birol I, Guryev V, Hirst M, Hummel O, Morrissey C et al (2010) The genome sequence of the spontaneously hypertensive rat: analysis and functional significance. Genome Res 20(6):791–803. https://doi.org/10.1101/gr.103499.109
11. Simonis M, Atanur SS, Linsen S, Guryev V, Ruzius FP, Game L et al (2012) Genetic basis of transcriptome differences between the founder strains of the rat HXB/BXH recombinant inbred panel. Genome Biol 13(4):r31. https://doi.org/10.1186/gb-2012-13-4-r31
12. Guo X, Brenner M, Zhang X, Laragione T, Tai S, Li Y et al (2013) Whole-genome sequences of DA and F344 rats with different susceptibilities to arthritis, autoimmunity, inflammation and cancer. Genetics 194(4):1017–1028. https://doi.org/10.1534/genetics.113.153049
13. Li R, Yu C, Li Y, Lam TW, Yiu SM, Kristiansen K et al (2009) SOAP2: an improved ultrafast tool for short read alignment. Bioinformatics 25(15):1966–1967. https://doi.org/10.1093/bioinformatics/btp336
14. Li R, Zhu H, Ruan J, Qian W, Fang X, Shi Z et al (2010) De novo assembly of human genomes with massively parallel short read sequencing. Genome Res 20(2):265–272. https://doi.org/10.1101/gr.097261.109
15. Rat Genome Sequencing and Mapping Consortium, Baud A, Hermsen R, Guryev V, Stridh P, Graham D et al (2013) Combined sequence-based and genetic mapping analysis of complex traits in outbred rats. Nat Genet 45(7):767–775. https://doi.org/10.1038/ng.2644
16. Atanur SS, Diaz AG, Maratou K, Sarkis A, Rotival M, Game L et al (2013) Genome sequencing reveals loci under artificial selection that underlie disease phenotypes in the laboratory rat. Cell 154(3):691–703. https://doi.org/10.1016/j.cell.2013.06.040
17. Aitman TJ, Dong R, Vyse TJ, Norsworthy PJ, Johnson MD, Smith J et al (2006) Copy number polymorphism in Fcgr3 predisposes to glomerulonephritis in rats and humans. Nature 439(7078):851–855
18. Thibaud-Nissen F, Souvorov A, Murphy T, DiCuccio M, Kitts P (2013) Eukaryotic genome annotation pipeline. In: The NCBI handbook, 2nd edn. National Center for Biotechnology Information, Bethesda. https://www.ncbi.nlm.nih.gov/books/NBK169439/
19. Yeh RF, Lim LP, Burge CB (2001) Computational inference of homologous gene structures in the human genome. Genome Res 11(5):803–816
20. Aken BL, Ayling S, Barrell D, Clarke L, Curwen V, Fairley S et al (2016) The Ensembl gene annotation system. Database (Oxford) 2016:baw093. https://doi.org/10.1093/database/baw093
21. Birney E, Clamp M, Durbin R (2004) GeneWise and Genomewise. Genome Res 14(5):988–995
22. National Center for Biotechnology Information (2005) US National Library of Medicine, Bethesda. http://www.ncbi.nlm.nih.gov. Accessed 1 Feb 2015
23. Yates A, Akanni W, Amode MR, Barrell D, Billis K, Carvalho-Silva D et al (2016) Ensembl 2016. Nucleic Acids Res 44:D710–D716. https://doi.org/10.1093/nar/gkv1157
24. Buels R, Yao E, Diesh CM, Hayes RD, Munoz-Torres M, Helt G et al (2016) JBrowse: a dynamic web platform for genome visualization and analysis. Genome Biol 17:66. https://doi.org/10.1186/s13059-016-0924-1
25. Kumar D, Yadav AK, Jia X, Mulvenna J, Dash D (2015) Integrated transcriptomic-proteomic analysis using a proteogenomic workflow refines rat genome annotation. Mol Cell Proteomics 15(1):329–339. https://doi.org/10.1074/mcp.M114.047126
26. Wu PY, Phan JH, Wang MD (2013) Assessing the impact of human genome annotation choice on RNA-seq expression estimates. BMC Bioinformatics 11:S8. https://doi.org/10.1186/1471-2105-14-S11-S8
27. Serikawa T, Kuramoto T, Hilbert P, Mori M, Yamada J, Dubay CJ et al (1992) Rat gene mapping using PCR-analyzed microsatellites. Genetics 131(3):701–721
28. Jacob HJ, Brown DM, Bunker RK, Daly MJ, Dzau VJ, Goodman A et al (1995) A genetic linkage map of the laboratory rat, Rattus norvegicus. Nat Genet 9(1):63–69
29. Steen RG, Kwitek-Black AE, Glenn C, Gullings-Handley J, Van Etten W, Atkinson OS et al (1999) A high-density integrated genetic linkage and radiation hybrid map of the laboratory rat. Genome Res 9(6):AP1–AP8. Erratum in: Genome Res 1999 9(8):793
30. Hutton JJ, Roderick TH (1970) Linkage analyses using biochemical variants in mice. 3. Linkage relationships of eleven biochemical markers. Biochem Genet 4(2):339–350
31. Moutier R, Toyama K, Charrier MF (1973) Biochemical polymorphism in the rat, Rattus norvegicus: genetic study of four markers. Biochem Genet 8(3):321–328
32. Botstein D, White RL, Skolnick M, Davis RW (1980) Construction of a genetic linkage map in man using restriction fragment length polymorphisms. Am J Hum Genet 32(3):314–331
33. Williams JG, Kubelik AR, Livak KJ, Rafalski JA, Tingey SV (1990) DNA polymorphisms amplified by arbitrary primers are useful as genetic markers. Nucleic Acids Res 18:6531–6535
34. Bryda EC, Riley LK (2008) Multiplex microsatellite marker panels for genetic monitoring of common rat strains. J Am Assoc Lab Anim Sci 47(3):37–41
35. Jacob HJ, Lindpaintner K, Lincoln SE, Kusumi K, Bunker RK, Mao YP et al (1991) Genetic mapping of a gene causing hypertension in the stroke-prone spontaneously hypertensive rat. Cell 67(1):213–224
36. Levan G, Szpirer J, Szpirer C, Klinga K, Hanson C, Islam MQ (1991) The gene map of the Norway rat (Rattus norvegicus) and comparative mapping with mouse and man. Genomics 10:699–718
37. Bihoreau M-T, Sebag-Montefiore L, Godfrey RF, Wallis RH, Brown JH, Danoy PA et al (1997) A high-resolution consensus linkage map of the rat, integrating radiation hybrid and genetic maps. Genomics 75:57–69
38. Brown DM, Matise TC, Koike G, Simon JS, Winer ES, Zangen S et al (1998) An integrated genetic linkage map of the laboratory rat. Mamm Genome 9(7):521–530
39. Jensen-Seaman MI, Furey TS, Payseur BA, Lu Y, Roskin KM, Chen CF et al (2004) Comparative recombination rates in the rat, mouse, and human genomes. Genome Res 14(4):528–538
40. Littrell J, Tsaih SW, Baud A, Rastas P, Solberg-Woods L, Flister MJ (2018) A high-resolution genetic map for the laboratory rat. G3 (Bethesda) 8(7):2241–2248
41. Bhérer C, Campbell CL, Auton A (2017) Refined genetic maps reveal sexual dimorphism in human meiotic recombination at multiple scales. Nat Commun 8:14994
42. Morgan AP, Gatti DM, Najarian ML, Keane TM, Galante RJ, Pack AI et al (2017) Structural variation shapes the landscape of recombination in mouse. Genetics 206:603–619
43. Ulirsch JC, Nandakumar SK, Wang L, Giani FC, Zhang X, Rogov P et al (2016) Systematic functional dissection of common genetic variation affecting red blood cell traits. Cell 165(6):1530–1545. https://doi.org/10.1016/j.cell.2016.04.048
44. Wood AR, Esko T, Yang J, Vedantam S, Pers TH, Gustafsson S et al (2014) Defining the role of common variation in the genomic and biological architecture of adult human height. Nat Genet 46(11):1173–1186. https://doi.org/10.1038/ng.3097
45. Shimoyama M, De Pons J, Hayman GT, Laulederkind SJ, Liu W, Nigam R et al (2015) The Rat Genome Database 2015: genomic, phenotypic and environmental variations and disease. Nucleic Acids Res 43(Database issue):D743–D750
46. Twigger SN, Pruitt KD, Fernández-Suárez XM, Karolchik D, Worley KC, Maglott DR et al (2008) What everybody should know about the rat genome and its online resources. Nat Genet 40(5):523–527. https://doi.org/10.1038/ng0508-523
47. Hermsen R, de Ligt J, Spee W, Blokzijl F, Schäfer S, Adami E et al (2015) Genomic landscape of rat strain and substrain variation. BMC Genomics 16:357. https://doi.org/10.1186/s12864-015-1594-1
48. She R, Jarosz DF (2018) Mapping causal variants with single-nucleotide resolution reveals biochemical drivers of phenotypic change. Cell 172(3):478–490. https://doi.org/10.1016/j.cell.2017.12.015
49. Festing MF, Bender K (1984) Genetic relationships between inbred strains of rats. An analysis based on genetic markers at 28 biochemical loci. Genet Res 44(3):271–281
50. Canzian F, Ushijima T, Pascale R, Sugimura T, Dragani TA, Nagao M (1995) Construction of a phylogenetic tree for inbred strains of rat by arbitrarily primed polymerase chain reaction (AP-PCR). Mamm Genome 6(4):231–235
51. Canzian F (1997) Phylogenetics of the laboratory rat Rattus norvegicus. Genome Res 7(3):262–267
52. Thomas MA, Chen CF, Jensen-Seaman MI, Tonellato PJ, Twigger SN (2003) Phylogenetics of rat inbred strains. Mamm Genome 14(1):61–64
53. Mashimo T, Voigt B, Tsurumi T, Naoi K, Nakanishi S, Yamasaki K et al (2006) A set of highly informative rat simple sequence length polymorphism (SSLP) markers and genetically defined rat strains. BMC Genet 7:19
54. Nijman IJ, Kuipers S, Verheul M, Guryev V, Cuppen E (2008) A genome-wide SNP panel for mapping and association studies in the rat. BMC Genomics 9:95. https://doi.org/10.1186/1471-2164-9-95
55. Battula KK, Nappanveettil G, Nakanishi S, Kuramoto T, Friedman JM, Kalashikam RR (2015) Genetic relatedness of WNIN and WNIN/Ob with major rat strains in biomedical research. Biochem Genet 53(4–6):132–140. https://doi.org/10.1007/s10528-015-9679-8
56. Smits BM, Guryev V, Zeegers D, Wedekind D, Hedrich HJ, Cuppen E (2005) Efficient single nucleotide polymorphism discovery in laboratory rat strains using wild rat-derived SNP candidates. BMC Genomics 6:170
57. Ren Y (2016) Multi-omics analysis of a rat model of aerobic exercise capacity and metabolic fitness. PhD dissertation, University of Michigan, Michigan
Chapter 3
Abstract
Resources for rat researchers are extensive, including strain repositories and databases all around the
world. The Rat Genome Database (RGD) serves as the primary rat data repository, providing both manually
and computationally collected data from other databases.
Key words Database, Genomics, Analysis, Visualization, Disease, Phenotype, Pathway, Gene, Annotation,
Model organism
1 Introduction
G. Thomas Hayman et al. (eds.), Rat Genomics, Methods in Molecular Biology, vol. 2018,
https://doi.org/10.1007/978-1-4939-9581-3_3, © Springer Science+Business Media, LLC, part of Springer Nature 2019
72 Stanley J. F. Laulederkind et al.
2.1 RRRC
Many important rat strains for life science research have been maintained by scientists in
individual laboratories. This type of resource propagation is inefficient and susceptible to
changes in funding or local interest. The NIH rat model repository workshop was held in 1998,
with scientists from around the world discussing the needs, opportunities, and parameters for
optimal standardization, maintenance, and distribution of genetically defined rat strains. Those
scientists strongly encouraged the NIH to establish a national rat genetics resource center, and
as a result, the Rat Resource and Research Center (RRRC) was established in 2001.
The service functions of the RRRC (https://www.rrrc.us/) involve the procurement of
non-commercial rat lines, sperm and embryo cryopreservation, cryoresuscitation or rederivation
with pathogen and genotype quality control, genotyping and cytogenetic services, gut microbiome
characterization, and distribution of rats, cell lines, and tissues to biomedical investigators.
RRRC also performs research to make improvements in rat model development and enhancement.
been supplied with rat strains or rat DNA from NBRP-Rat. Protocols
for cryopreservation and rederivation techniques have also
been supplied by NBRP-Rat to the research community [10].
NBRP-Rat’s Phenome Project was a reevaluation of more than
150 strains based on over 100 phenotypic parameters in seven
general categories [11]. A major benefit of all these phenotypic
measurements is the generation of biological ranges of various
parameters, which allows visualization of normal and abnormal
values for the different rat strains examined. The data can be
visualized on the NBRP-Rat web site or in the RGD PhenoMiner
tool [12].
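The idea of a biological range can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that a range is summarized as the mean plus or minus two sample standard deviations of a set of reference measurements; this is not necessarily how NBRP-Rat or PhenoMiner derive or display ranges. The function names, strain labels, and all numbers are hypothetical.

```python
from statistics import mean, stdev

# Hypothetical reference measurements for one phenotypic parameter
# (for example, one value per strain). All numbers are invented.
REFERENCE_VALUES = [118.0, 120.0, 122.0, 124.0]


def biological_range(values, k=2.0):
    """Return (low, high) as mean +/- k sample standard deviations.

    The mean +/- k*SD rule is an assumption made for this sketch.
    """
    if len(values) < 2:
        raise ValueError("need at least two values to estimate a spread")
    center = mean(values)
    spread = stdev(values)
    return center - k * spread, center + k * spread


def classify(value, value_range):
    """Label a measurement relative to a (low, high) range."""
    low, high = value_range
    if value < low:
        return "below range"
    if value > high:
        return "above range"
    return "within range"


if __name__ == "__main__":
    ref_range = biological_range(REFERENCE_VALUES)
    # Hypothetical strains and values, for illustration only.
    for strain, value in {"strain_x": 121.0, "strain_y": 190.0}.items():
        print(strain, classify(value, ref_range))
```

Computing the range from a reference set, rather than from all values including the one being tested, keeps an extreme value from widening the range that is used to judge it.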
More than 700 rat strains have been deposited at NBRP-Rat, with most of those available as
cryopreserved sperm or embryos and the remainder available as live animals. All of the deposited
strains can be obtained by interested researchers. The Kyoto University Rat Mutant Archive
(KURMA) was added to NBRP-Rat to supply N-ethyl-N-nitrosourea (ENU) mutant strains, which offer
many models for biomedical research. More than 150 strains have been genotyped by NBRP-Rat with
more than 300 microsatellite markers [13, 14]. These genotyped rats have provided data to create
phylogenetic charts and SSLP charts, which allow a visual approximation of the genetic distance
between different strains. There are also various other tools which allow public access to data
at NBRP-Rat.
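As a rough illustration of how marker genotypes can be turned into a distance between strains, the sketch below computes the fraction of shared markers at which two inbred strains carry different alleles. This simple allele-sharing measure is an assumption chosen for clarity; the source does not state which distance NBRP-Rat uses for its charts. The marker names (M1 to M4) and allele sizes are invented, and the strain names are used only as labels.

```python
from itertools import combinations

# Hypothetical microsatellite genotypes: marker -> allele size (bp).
# Inbred strains are treated as homozygous, so one allele per marker.
# Marker names and allele sizes are invented for illustration.
GENOTYPES = {
    "F344": {"M1": 150, "M2": 210, "M3": 180, "M4": 132},
    "LEW": {"M1": 150, "M2": 214, "M3": 180, "M4": 132},
    "BN": {"M1": 156, "M2": 220, "M3": 190, "M4": 132},
}


def allele_sharing_distance(a, b):
    """Fraction of shared markers at which two strains differ."""
    shared = sorted(set(a) & set(b))
    if not shared:
        raise ValueError("no markers in common")
    differing = sum(1 for marker in shared if a[marker] != b[marker])
    return differing / len(shared)


def distance_matrix(genotypes):
    """Pairwise distances keyed by alphabetically ordered strain pairs."""
    return {
        (s1, s2): allele_sharing_distance(genotypes[s1], genotypes[s2])
        for s1, s2 in combinations(sorted(genotypes), 2)
    }


if __name__ == "__main__":
    for pair, dist in distance_matrix(GENOTYPES).items():
        print(pair, dist)
```

A matrix of this kind is the usual input to tree-building or clustering methods, which is one way a phylogenetic chart could be drawn from genotype data.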
2.3 Gene Editing Rat Resource Center
The Gene Editing Rat Resource Center (GERRC; http://rgd.mcw.edu/wg/gerrc) at the Medical College
of Wisconsin committed in 2013 to produce about 200 genetically modified rat strains over a
five-year period for use by researchers. These selected strains have specific genes knocked out
using several different gene editing technologies. There were two application rounds each year,
during which researchers requested genes to be knocked out in a specific strain, with up to two
applications allowed per laboratory. Applications were reviewed by an external advisory board to
determine which models, up to 25, were to be made. After the strains were created, usually 9 to 12
months after application, the requesting investigator received the first breeder pair. Any other
breeder pairs are available to other investigators on a first come, first served basis. An
annotated list of all the mutant strains generated by the project is available on the GERRC web
site.
3.1 Rat Genome Project
The original 2004 release of the reference genome for the rat [15] was done by the Rat Genome
Sequencing Consortium (RGSC) led by the Human Genome Sequencing Center at Baylor College of
Medicine (BCM-HGSC). Access to the original data and assembly updates (including Rnor 6.0) is
available on the BCM-HGSC web site (https://www.hgsc.bcm.edu/other-mammals/rat-genome-project)
and at the National Center for Biotechnology Information (NCBI)
(https://www.ncbi.nlm.nih.gov/genome/73).
3.2.1 RGD Data Objects
RGD stores data about various “objects,” including genes, quantitative trait loci (QTLs),
markers, references, strains, and cell lines. Report pages for these objects are presented in a
similar format. The most data-rich report pages are gene, QTL, and strain pages.
Disease data for genes, QTLs, and strains can also be accessed
Rat Resources 75
through various RGD Disease Portals. Pathway data for genes can
be accessed through the RGD Pathway Portal. Physiological data
for strains is accessible through the Phenotypes and Models Portal.
Quantitative Trait Loci
Another type of RGD object presented on report pages is the Quantitative Trait Locus (QTL), which
is a large region of DNA associated with a physiological or pathological phenotype. RGD has data
on a large variety of QTLs (rat, mouse, and human) describing traits that range from physiologic
and anatomic traits, like blood pressure and organ weight, to disease traits for cancer,
diabetes, and other pathological conditions.
The top section of a QTL report page provides the QTL name, trait and measurement type.
Significance scores, map information, and strains crossed to derive the QTL are also provided.
The Annotation section contains disease annotations with DO terms, phenotype annotations with MP
(rat, mouse) or HP (human) terms, and experimental data annotations, which use the following
ontologies: the Vertebrate Trait Ontology (VT), the Clinical Measurement Ontology (CMO), the
Measurement Method Ontology (MMO), and the Experimental Condition Ontology (XCO). The CMO, MMO,
and XCO are RGD-produced and maintained ontologies. References and disease portal links are also
provided in the Annotation section of the QTL report page. The “Region” section provides position
markers for the QTL, and genes, markers, and overlapping QTLs in the region.
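Finding the genes, markers, and QTLs that lie in a QTL region amounts to an interval-overlap test on genomic coordinates. The following Python sketch shows that test under simple assumptions: closed intervals, a single assembly, and a hypothetical `Feature` record. The names and positions are invented and do not describe RGD's schema or any real QTL.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """A genomic feature with closed coordinates on one assembly (hypothetical)."""
    symbol: str
    chrom: str
    start: int
    stop: int


def overlaps(a, b):
    """True if two features share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start <= b.stop and b.start <= a.stop


def features_in_region(region, features):
    """Symbols of all features that overlap the given region, in input order."""
    return [f.symbol for f in features if overlaps(region, f)]


# Invented example data: one QTL region and four candidate genes.
EXAMPLE_QTL = Feature("ExampleQtl1", "1", 1_000_000, 5_000_000)
EXAMPLE_GENES = [
    Feature("GeneA", "1", 900_000, 1_100_000),    # straddles the left boundary
    Feature("GeneB", "1", 5_000_001, 5_200_000),  # starts one base past the end
    Feature("GeneC", "2", 2_000_000, 2_100_000),  # different chromosome
    Feature("GeneD", "1", 3_000_000, 3_050_000),  # fully inside
]

if __name__ == "__main__":
    print(features_in_region(EXAMPLE_QTL, EXAMPLE_GENES))
```

The same test applies to overlapping QTLs: passing a list of QTL features instead of genes returns the QTLs that share sequence with the region.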
Fig. 1 Gene Report Page for Rat Ptgs1. (A) The top half of the page contains general information,
ortholog assignments, genomic positions, JBrowse model, and links to external sites. (B) The bottom half
of the page has annotations in various categories, genomic information, sequence information, and more,
all in expandable, labeled bars
Fig. 2 Ontology Term Browser. The RGD term browser with “rat strain” (RS) selected in the top
panel. (A) Selection “SHR” in the bottom panel has an accompanying “View Strain Report” link
3.2.2 Portal Access to RGD Data
Disease Portals
RGD currently has 12 disease portals encompassing many disease areas from developmental and
age-related to cardiovascular and neurological. Each portal is an entry point where investigators
can access data and tools relevant to their research area. One can access rat, mouse, and human
genes and QTLs, and rat strains annotated to a selected disease category or subcategory (see
Fig. 3A). Annotations for a disease-related phenotype, biological process, or
Fig. 3 Hematologic Disease Portal Home Page. (A) Drop down menus for selection of disease category
and specific disease. (B) Numerical summary of results for the selected disease category/disease. (C)
GViewer display of results with approximate positions of all disease genes, QTLs, and strains. (D) Lists of
genes, QTLs, and strains annotated to selected disease category/disease. (E) Graphs showing Gene
Ontology annotations for all selected disease-annotated genes, using GO slim (subset)
representations of the three GO aspects
Fig. 4 De Novo Pyrimidine Biosynthetic Pathway Diagram. The diagram is accompanied by a text
description above it and a key to the left of it
are listed, linked to their respective report pages. The bottom of the
page shows graphs displaying GO annotation enrichment data.
Pathway Portal
The RGD Pathway Portal presently contains 200 interactive pathway diagram pages organized into
five branches, based on the five branches of the Pathway Ontology, which was developed at RGD.
Some pathway pages are organized into suites of related pathways, and suite networks (higher
order organizations of suites). The molecular pathway diagrams (see Fig. 4) are designed with
Elsevier’s Pathway Studio software (https://support.pathwaystudio.com/) and feature hyperlinks
from most of the objects in the diagram to RGD pages representing the respective term, gene,
chemical, or associated secondary pathway. Beneath the diagram is a downloadable list of genes in
the pathway (see Fig. 5A), with tabs for rat, human, mouse, and other species. Below the gene
lists are tables of additional elements in the pathway (see Fig. 5B), disease
Fig. 5 Pathway Gene/Element Lists. A number of gene lists are found on pathway diagram pages
below the diagram. (A) A list of genes annotated to de novo pyrimidine biosynthetic pathway and its
children terms. The list includes links to RGD gene report pages, JBrowse, and reference pages. (B) A
list of additional elements in the pathway. (C) A list of disease ontology terms/genes that can be
toggled by the title bar to genes/disease terms. All the disease terms link to ontology report pages
and the gene symbols link to gene report pages. (D) A list of additional pathways associated with
genes annotated to the diagrammed pathway. (E) A list of phenotypes associated with the genes
annotated to the diagrammed pathway
JBrowse
The JBrowse genome browser [25, 26] from the Generic Model Organism Database project
(http://www.gmod.org) is an interactive tool which allows researchers to visualize a variety of
genetic and phenotypic data types in their genomic context. Virtually all of the data within the
Rat Genome Database has been associated with the genome sequence in one way or another. As
fundamental datasets such as genes, quantitative trait loci, microsatellite and SNP markers, and
sequence resources such as ESTs, are aligned with the genome sequence, they bring with them
phenotypic and other information. This information includes gene-chemical interaction data,
genetic associations with disease, RNA-Seq data, synteny views of rat, mouse, and human genomes,
and many types of variant/mutation data. Any or all of these can be accessed via the JBrowse
genome browser and their relationship to the genomic sequence explored.
Fig. 6 InterViewer Search/Results. The target protein (rat Grb2) that initiated the search is shown in the
center of the graphic display. Individual proteins are indicated by color-coded circles (red: rat,
green: mouse, blue: human). The types of interactions are designated by color-coded lines between
the circles
Gene Annotator
The Gene Annotator (GA) takes a list of gene symbols, RGD IDs, GenBank accession numbers, Ensembl
identifiers, and/or a chromosomal region, and retrieves annotation data from RGD. The tool will
retrieve annotations from most ontologies used at RGD for genes and their orthologs, as well as
links to additional information at other databases. The entry page (see Fig. 7A) is very similar
to the InterViewer entry page.
The first GA page after a search is an annotation/external link/species selection page where
everything is selected by default (see Fig. 7B). Clicking the submit button returns a page with
all annotations for the first gene (and selected orthologs) in the list. The lists include links
to RGD gene pages, ontology term pages, annotation pages, and external data pages (see Fig. 7C).
A list of links at the top of the page allows the user to pick a
particular type of analysis to view (Annotation Distribution or
Comparison Heat Map) or to send the gene list to another tool
by selecting the “All Analysis Tools” link.
On the “Annotation Distribution” page (see Fig. 7D) there are enrichment lists of terms by
category, which rank the terms according to how many of the searched genes are annotated to those
particular terms. Each entry in the list can be opened to see which genes and which specific
terms are in the annotations. Subsets of annotations can be displayed by selecting at least two
of the check boxes which appear to the right of every term in the lists.
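The ranking described for the “Annotation Distribution” page can be sketched in a few lines of Python: count, for each term, how many genes in the submitted list are annotated to it, then sort by that count. This is only a hedged illustration of the counting logic, not RGD's implementation; the gene symbols, the annotation table, and the function name are hypothetical.

```python
from collections import defaultdict

# Hypothetical gene -> annotated terms table. Gene symbols and the
# gene-to-term assignments are invented for illustration.
ANNOTATIONS = {
    "GeneA": {"hypertension", "inflammatory response"},
    "GeneB": {"hypertension"},
    "GeneC": {"hypertension", "glucose metabolic process"},
    "GeneD": {"inflammatory response"},
}


def annotation_distribution(gene_list, annotations):
    """Rank terms by the number of submitted genes annotated to each.

    Returns (term, sorted gene list) pairs, most-annotated term first;
    ties are broken alphabetically by term. Genes with no entry in the
    annotation table are ignored.
    """
    genes_by_term = defaultdict(set)
    for gene in gene_list:
        for term in annotations.get(gene, ()):
            genes_by_term[term].add(gene)
    ranked = sorted(genes_by_term.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return [(term, sorted(genes)) for term, genes in ranked]


if __name__ == "__main__":
    submitted = ["GeneA", "GeneB", "GeneC", "GeneD", "GeneE"]
    for term, genes in annotation_distribution(submitted, ANNOTATIONS):
        print(f"{len(genes)}\t{term}\t{', '.join(genes)}")
```

Keeping the gene list alongside each term mirrors the behavior described above, where an entry can be opened to see which genes contribute to it.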