Background: Size of the reference population and reliability of phenotypes are crucial factors influencing the reliability of genomic predictions. It is therefore useful to combine closely related populations. Increased accuracies of genomic predictions depend on the number of individuals added to the reference population, the reliability of their phenotypes, and the relatedness of the populations that are combined. Methods: This paper assesses the increase in reliability achieved when combining four Holstein reference populations of 4000 bulls each, from European breeding organizations, i.e. UNCEIA (France), VikingGenetics (Denmark, Sweden, Finland), DHV-VIT (Germany) and CRV (The Netherlands, Flanders). Each partner validated its own bulls using their national reference data and the combined data, respectively. Results: Combining the data significantly increased the reliability of genomic predictions for bulls in all four populations. Reliabilities increased by 10%, compared to reliabilities obtained with national reference populations alone, when they were averaged over countries and the traits evaluated. For different traits and countries, the increase in reliability ranged from 2% to 19%. Conclusions: Genomic selection programs benefit greatly from combining data from several closely related populations into a single large reference population.
In admixed populations, markers may be associated with different QTL depending on the origin of a given genomic segment. The goal of this study was to investigate whether taking into account the breed origin of alleles in a breed of origin genomic model (BOGM) can improve genomic predictions compared to a traditional genomic model (TGM) in admixed populations. Real genotype data of Danish Holstein and Jersey breeds were used as base populations to simulate F1 crosses. This was followed by simulating 5 discrete generations of random mating to achieve a highly admixed population. A single trait with heritability of 0.25 and 100 QTL was simulated on the genome. Three different scenarios were considered, where the QTL effects of two breeds were sampled from a multivariate normal distribution with correlation 1.0, 0.5, or 0.1. Accuracy and bias of models, which were measured as correlation and regression coefficient between true and estimated genomic breeding values, respectively, were vali...
The aim of this study was to investigate the effect of different strategies for handling low-quality or missing data on prediction accuracy for direct genomic values of protein yield, mastitis and fertility using a Bayesian variable model and a GBLUP model in the Danish Jersey population. The data contained 1 071 Jersey bulls that were genotyped with the Illumina Bovine 50K chip. After preliminary editing, 39 227 single nucleotide polymorphisms (SNPs) remained in the dataset. Four methods to handle missing genotypes were: 1) BEAGLE: missing markers were imputed using Beagle 3.3 software, 2) COMMON: missing genotypes at a locus were replaced by the most common genotype at this locus observed in the marker data, 3) EX-ALLELE: missing marker genotypes at a locus were treated as an extra allele, and 4) POP-EXP: missing genotypes at a locus were replaced with the population expectation at this locus. Among the methods used in this study, imputation with Beagle was the best approach to handle missing genotypes. Treating missing markers as a pseudo-allele, replacing missing markers with a population average or substituting the most common alleles each reduced the accuracy of genomic predictions. The results from this study suggest that missing genotypes should be imputed in order to improve genomic prediction. Editing the marker data with a stringent threshold on GenCall scores and then imputing the discarded genotypes did not lead to higher accuracy. All marker genotypes with a GenCall score over 0.15 should be retained for genomic prediction.
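The two simplest strategies in the abstract above, COMMON and POP-EXP, can be illustrated with a minimal sketch. It assumes genotypes are coded as allele counts 0/1/2 with `None` for a missing call, and it takes the "population expectation" to be the mean of the observed codes at the locus; both the coding and that reading of POP-EXP are assumptions, and the function names are hypothetical. Beagle imputation is not reproduced here.

```python
from collections import Counter
from statistics import mean


def impute_common(column):
    """COMMON: fill missing calls with the most frequent observed genotype."""
    observed = [g for g in column if g is not None]
    most_common = Counter(observed).most_common(1)[0][0]
    return [most_common if g is None else g for g in column]


def impute_pop_expectation(column):
    """POP-EXP (assumed reading): fill missing calls with the mean observed code."""
    observed = [g for g in column if g is not None]
    expected = mean(observed)
    return [expected if g is None else g for g in column]


# One locus across eight animals; None marks a missing genotype.
locus = [0, 1, 1, None, 2, 1, None, 0]
common = impute_common(locus)
pop_exp = impute_pop_expectation(locus)
print(common)
print(pop_exp)
```

Neither fill-in uses information from neighbouring markers, which is one plausible reason why both performed worse than haplotype-based imputation in the study.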
Accurate genomic prediction requires a large reference population, which is problematic for traits that are expensive to measure. Traits related to milk protein composition are not routinely recorded due to costly procedures and are considered to be controlled by a few quantitative trait loci of large effect. The amount of variation explained may vary between regions leading to heterogeneous (co)variance patterns across the genome. Genomic prediction models that can efficiently take such heterogeneity of (co)variances into account can result in improved prediction reliability. In this study, we developed and implemented novel univariate and bivariate Bayesian prediction models, based on estimates of heterogeneous (co)variances for genome segments (BayesAS). Available data consisted of milk protein composition traits measured on cows and de-regressed proofs of total protein yield derived for bulls. Single-nucleotide polymorphisms (SNPs), from 50K SNP arrays, were grouped into non-ove...
With the development of SNP chips, SNP information provides an efficient approach to further disentangle different patterns of genomic variances and covariances across the genome for traits of interest. Due to the interaction between genotype and environment as well as possible differences in genetic background, it is reasonable to treat the performances of a biological trait in different populations as different but genetically correlated traits. In the present study, we investigated the patterns of region-specific genomic variances, covariances and correlations between Chinese and Nordic Holstein populations for three milk production traits. Variances and covariances between Chinese and Nordic Holstein populations were estimated for genomic regions at three different levels of genome region (all SNP as one region, each chromosome as one region and every 100 SNP as one region) using a novel multi-trait random regression model which uses latent variables to model hetero...
Litter size and piglet mortality are important traits in pig production. The study aimed to identify quantitative trait loci (QTL) for litter size and mortality traits, including total number of piglets born (TNB), litter size at day 5 (LS5) and mortality rate before day 5 (MORT) in Danish Landrace and Yorkshire pigs by genome-wide association studies (GWAS). The phenotypic records and genotypes were available for 5,977 Landrace pigs and 6,000 Yorkshire pigs born from 1998 to 2014. A linear mixed model (LM) with a single SNP regression and a Bayesian mixture model (BM) including effects of all SNPs simultaneously were used for GWAS to detect significant QTL associations. The response variable used in the GWAS was the corrected phenotypic value, which was obtained by adjusting original observations for non-genetic effects. For BM, the QTL region was determined by using a novel post-Gibbs analysis based on the posterior mixture probability. The detected association patterns from LM and BM mo...
Dominance and imprinting genetic effects have been shown to contribute to genetic variance for certain traits but are usually ignored in genomic prediction of complex traits in livestock. The objectives of this study were to estimate variances of additive, dominance and imprinting genetic effects and to evaluate predictions of genetic merit based on genomic data for average daily gain (DG) and backfat thickness (BF) in Danish Duroc pigs. Corrected phenotypes of 8113 genotyped pigs from breeding and multiplier herds were used. Four Bayesian mixture models that differed in the type of genetic effects included: (A) additive genetic effects, (AD) additive and dominance genetic effects, (AI) additive and imprinting genetic effects, and (ADI) additive, dominance and imprinting genetic effects were compared using Bayes factors. The ability of the models to predict genetic merit was compared with regard to prediction reliability and bias. Based on model ADI, narrow-sense heritabilities of 0...
Abstract Text: This study investigated reliability of genomic prediction in various scenarios with regard to the relationship between test and reference animals and between animals within the reference population. Different reference populations were generated from EuroGenomics data, with 1288 Nordic Holstein bulls used as a common test population. A GBLUP model and a Bayesian mixture model were applied to predict genomic breeding values for bulls in the test data. Results showed that a closer relationship between test and reference animals led to a higher reliability, while a closer relationship among reference animals resulted in a lower reliability. Therefore, the design of the reference population is important for improving the reliability of genomic prediction. With regard to model, the Bayesian mixture model in general led to a slightly higher reliability of genomic prediction than the GBLUP model. Keywords: genomic prediction, genomic relationship, reliability
Non-additive genetic variation is usually ignored when genome-wide markers are used to study the genetic architecture and genomic prediction of complex traits in humans, wildlife, model organisms or farm animals. However, non-additive genetic effects may make an important contribution to total genetic variation of complex traits. This study presented a genomic BLUP model including additive and non-additive genetic effects, in which additive and non-additive genetic relationship matrices were constructed from information of genome-wide dense single nucleotide polymorphism (SNP) markers. In addition, this study for the first time proposed a method to construct a dominance relationship matrix using SNP markers and demonstrated it in detail. The proposed model was implemented to investigate the amounts of additive genetic, dominance and epistatic variations, and to assess the accuracy and unbiasedness of genomic predictions for daily gain in pigs. In the analysis of daily gain, four linear mod...
Data from the joint Nordic breeding value prediction for Danish and Swedish Holstein grandsire families were used to locate quantitative trait loci (QTL) for female fertility traits in Danish and Swedish Holstein cattle. Up to 36 Holstein grandsires with over 2,000 sons were genotyped for 416 microsatellite markers. Single trait breeding values were used for 12 traits relating to female fertility and female reproductive disorders. Data were analyzed by least squares regression analysis within and across families. Twenty-six QTL were detected on 17 different chromosomes. The best evidence was found for QTL segregating on Bos taurus chromosome (BTA)1, BTA7, BTA10, and BTA26. On each of these chromosomes, several QTL were detected affecting more than one of the fertility traits investigated in this study. Evidence for segregation of additional QTL on BTA2, BTA9, and BTA24 was found.
The use of genomic information in genetic evaluation has revolutionized dairy cattle breeding. It remains a major challenge to understand the genetic basis of variation for quantitative traits. Here, we study the genetic architecture for milk, fat, protein, mastitis and fertility indices in dairy cattle using NGS variants. The analysis was done using a linear mixed model (LMM) and a Bayesian mixture model (BMM). The top 10 QTL identified by LMM analyses explained 22.61, 23.86, 10.88, 18.58 and 14.83% of the total genetic variance for these traits respectively. Trait-specific sets of 4,964 SNPs from NGS variants (most 'associated' SNP for each 0.5 Mbp bin) explained 81.0, 81.6, 85.0, 60.4 and 70.9% of total genetic variance for milk, fat, protein, mastitis and fertility indices when analyzed simultaneously by BMM.
Newcastle disease (ND) and avian influenza (AI) are the most feared diseases in the poultry industry worldwide. They can cause flock mortality up to 100%, resulting in a catastrophic economic loss. This is the first study to investigate the feasibility of genomic selection for antibody response to Newcastle disease virus (Ab-NDV) and antibody response to avian influenza virus (Ab-AIV) in chickens. The data were collected from a crossbred population. Breeding values for Ab-NDV and Ab-AIV were estimated using a pedigree-based best linear unbiased prediction model (BLUP) and a genomic best linear unbiased prediction model (GBLUP). Single-trait and multiple-trait analyses were implemented. According to the analysis using the pedigree-based model, the heritability for Ab-NDV estimated from the single-trait and multiple-trait models was 0.478 and 0.487, respectively. The heritability for Ab-AIV estimated from the two models was 0.301 and 0.291, respectively. The estimated genetic correlation between the two traits was 0.438. A four-fold cross-validation was used to assess the accuracy of the estimated breeding values (EBV) in two validation scenarios. In the family sample scenario, each half-sib family was randomly allocated to one of four subsets, and in the random sample scenario, the individuals were randomly divided into four subsets. In the family sample scenario, compared with the pedigree-based model, the accuracy of the genomic prediction
Background: Growth and carcass traits are very important traits for broiler chickens. However, carcass traits can only be measured postmortem. Genomic selection may be a powerful tool for such traits because of its accurate prediction of breeding values of animals without own phenotypic information. This study investigated the efficiency of genomic prediction in Chinese triple-yellow chickens. As a new line, the Chinese triple-yellow chicken was developed by cross-breeding and had a small effective population. Two growth traits and three carcass traits were analyzed: body weight at 6 weeks, body weight at 12 weeks, eviscerating percentage, breast muscle percentage and leg muscle percentage. Results: Genomic prediction was assessed using a 4-fold cross-validation procedure for two validation scenarios. In the first scenario, each test data set comprised two half-sib families (family sample) and the rest represented the reference data. In the second scenario, the whole data set was randomly divided into four subsets (random sample). In each fold of validation, one subset was used as the test data and the others as the reference data. Genomic breeding values were predicted using a genomic best linear unbiased prediction model, a Bayesian least absolute shrinkage and selection operator model, and a Bayesian mixture model with four distributions. The accuracy of genomic estimated breeding value (GEBV) was measured as the correlation between GEBV and the corrected phenotypic value. Using the three models, the correlations ranged from 0.448 to 0.468 for the two growth traits and from 0.176 to 0.255 for the three carcass traits in the family sample scenario, and were between 0.487 and 0.536 for growth traits and between 0.312 and 0.430 for carcass traits in the random sample scenario. The differences in the prediction accuracies between the three models were very small; the Bayesian mixture model was slightly more accurate. According to the results from the random sample scenario, the accuracy of GEBV was 0.197 higher than the conventional pedigree index, averaged over the five traits. Conclusions: The results indicated that genomic selection could greatly improve the accuracy of selection in chickens, compared with conventional selection. Genomic selection for growth and carcass traits in broiler chickens is promising.
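The difference between the family sample and random sample scenarios lies only in how animals are assigned to the four folds. The sketch below is one way to build both kinds of split, not the authors' procedure; the function names, the `sire_of` mapping and the toy data (8 half-sib families of 5 animals) are hypothetical.

```python
import random


def random_folds(animal_ids, k=4, seed=0):
    """Random sample: shuffle individuals and deal them into k folds."""
    rng = random.Random(seed)
    ids = list(animal_ids)
    rng.shuffle(ids)
    return [ids[i::k] for i in range(k)]


def family_folds(sire_of, k=4, seed=0):
    """Family sample: assign whole half-sib families (by sire) to folds."""
    rng = random.Random(seed)
    sires = sorted(set(sire_of.values()))
    rng.shuffle(sires)
    fold_of_sire = {sire: i % k for i, sire in enumerate(sires)}
    folds = [[] for _ in range(k)]
    for animal, sire in sire_of.items():
        folds[fold_of_sire[sire]].append(animal)
    return folds


# Toy pedigree: 8 sires with 5 offspring each (hypothetical data).
sire_of = {f"animal{s}_{j}": f"sire{s}" for s in range(8) for j in range(5)}

fam = family_folds(sire_of)
rnd = random_folds(sire_of.keys())

# In each round, one fold is the test set and the other three form the reference.
for test_idx, test_set in enumerate(fam):
    reference = [a for i, f in enumerate(fam) if i != test_idx for a in f]
    print(test_idx, len(test_set), len(reference))
```

With family folds, test animals have no half-sibs in the reference data, which is consistent with the lower accuracies reported for the family sample scenario.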
Introduction: The state-of-the-art for dealing with multiple levels of relationship among the samples in genome-wide association studies (GWAS) is unified mixed model analysis (MMA). This approach is very flexible, can be applied to both family-based and population-based samples, and can be extended to incorporate other effects in a straightforward and rigorous fashion. Here, we present a complementary approach, called 'GENMIX (genealogy based mixed model)', which combines advantages from two powerful GWAS methods: genealogy-based haplotype grouping and MMA. Subjects and Methods: We validated GENMIX using genotyping data of Danish Jersey cattle and simulated phenotypes, and compared it with MMA. We simulated scenarios for three levels of heritability (0.21, 0.34, and 0.64), seven levels of MAF (0.05, 0.10, 0.15, 0.20, 0.25, 0.35, and 0.45) and five levels of QTL effect (0.1, 0.2, 0.5, 0.7 and 1.0 in phenotypic standard deviation units). Each of these 105 possible combinations (3 h² × 7 MAF × 5 effects) of scenarios was replicated 25 times. Results: GENMIX provides a better ranking of markers close to the causative locus' location. GENMIX outperformed MMA when the QTL effect was small and the MAF at the QTL was low. In scenarios where MAF was high or the QTL affecting the trait had a large effect, both GENMIX and MMA performed similarly. Conclusion: In discovery studies, where high-ranking markers are identified and later examined in validation studies, we therefore expect GENMIX to enrich candidates brought to follow-up studies with true positives over false positives more than the MMA would.
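The simulation design described in the abstract above is a full factorial grid, which can be enumerated directly. The following is only a bookkeeping sketch using the levels stated in the abstract; the variable names are hypothetical and no phenotypes are simulated.

```python
from itertools import product

heritabilities = [0.21, 0.34, 0.64]
mafs = [0.05, 0.10, 0.15, 0.20, 0.25, 0.35, 0.45]
qtl_effects = [0.1, 0.2, 0.5, 0.7, 1.0]  # in phenotypic standard deviation units
n_replicates = 25

# Every combination of heritability, MAF and QTL effect is one scenario.
scenarios = list(product(heritabilities, mafs, qtl_effects))
total_runs = len(scenarios) * n_replicates
print(len(scenarios), total_runs)
```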
The spectacular increase in productivity of dairy cattle has been accompanied by a concomitant decline in fertility. It is generally assumed that this decline is primarily due to the negative energy balance of high-producing cows at the peak of lactation. We herein describe the fine-mapping of a major fertility QTL in Nordic Red cattle, and identify a 660-kb deletion encompassing four genes as the causative variant. We show that the deletion is a recessive embryonically lethal mutation. This probably results from the loss of RNASEH2B, which is known to cause embryonic death in mice. Despite its dramatic effect on fertility, 13%, 23% and 32% of the animals carry the deletion in Danish, Swedish and Finnish Red Cattle, respectively. To explain this, we searched for favorable effects on other traits and found that the deletion has a strong positive effect on milk yield. This study demonstrates that embryonic lethal mutations account for a non-negligible fraction of the decline in fertility of domestic cattle, and that the associated positive effect on milk yield may account for part of the negative genetic correlation. Our study adds to the evidence that structural variants contribute to animal phenotypic variation, and that balancing selection might be more common in livestock species than previously appreciated.
Despite its importance, fertility has been declining in many cattle populations. In dairy cattle, this decline is often attributed to the negative correlation between fertility and production traits. Recent studies showed that embryonic lethal variants might also account for a non-negligible fraction of the fertility decline. Therefore, identification of such embryonic lethal variants is essential to improve fertility. We herein illustrate, with an example of a large recessive lethal deletion recently identified in Nordic Red cattle, that haplotype-based methods are particularly efficient for identifying such embryonic lethal variants. We first show that haplotypes can be used in traditional QTL mapping approaches and that they present very high linkage disequilibrium with underlying variants. Haplotypes can also be used in scans for lack of homozygosity. Indeed, if a haplotype is associated with a recessive lethal variant, significantly fewer living individuals will be homozygous for that haplotype than expected. For both approaches, haplotype-based methods were particularly efficient. The lack of homozygosity approach achieved higher significance than the QTL approach. Only frequent variants can be detected with both approaches unless huge genotyped cohorts are available. An alternative approach would rely on identifying potential harmful variants in next-generation sequencing data followed by the genotyping of a larger population for these variants.
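The logic of a lack-of-homozygosity scan, and why it only works for frequent haplotypes, can be sketched with a simple binomial calculation. This is a deliberate simplification that assumes random mating (Hardy-Weinberg proportions), so that the expected share of homozygotes is the squared haplotype frequency; it is not the test used in the study, and the function name and example numbers are hypothetical.

```python
from math import comb


def homozygote_deficit(n_animals, hap_freq, observed_homozygotes):
    """Return (expected homozygotes, P(X <= observed)) under random mating."""
    p_hom = hap_freq ** 2  # assumed Hardy-Weinberg homozygote proportion
    expected = n_animals * p_hom
    p_value = sum(
        comb(n_animals, k) * p_hom ** k * (1 - p_hom) ** (n_animals - k)
        for k in range(observed_homozygotes + 1)
    )
    return expected, p_value


# Zero living homozygotes among 1000 genotyped animals, for a rare and a
# more frequent haplotype (hypothetical numbers).
exp_rare, p_rare = homozygote_deficit(1000, 0.05, 0)
exp_common, p_common = homozygote_deficit(1000, 0.15, 0)
print(exp_rare, p_rare)
print(exp_common, p_common)
```

For the rare haplotype only a handful of homozygotes are expected, so observing none is weak evidence; for the more frequent haplotype the same observation is highly unlikely by chance, which matches the abstract's remark that only frequent variants can be detected without very large cohorts.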
Background: Genomic prediction uses two sources of information: linkage disequilibrium between markers and quantitative trait loci, and additive genetic relationships between individuals. One way to increase the accuracy of genomic prediction is to capture more linkage disequilibrium by regression on haplotypes instead of regression on individual markers. The aim of this study was to investigate the accuracy of genomic prediction using haplotypes based on local genealogy information.
Background: In China, the reference population of genotyped Holstein cattle is relatively small with, to date, 80 bulls and 2091 cows genotyped with the Illumina 54 K chip. Including genotyped Holstein cattle from other countries in the reference population could improve the accuracy of genomic prediction of the Chinese Holstein population. This study investigated the consistency of linkage disequilibrium between adjacent markers between the Chinese and Nordic Holstein populations, and compared the reliability of genomic predictions based on the Chinese reference population only or the combined Chinese and Nordic reference populations. Methods: Genomic estimated breeding values of Chinese Holstein cattle were predicted using a single-trait GBLUP model based on the Chinese reference dataset, and using a two-trait GBLUP model based on a joint reference dataset that included both the Chinese and Nordic Holstein data. Results: The extent of linkage disequilibrium was similar in the Chinese ...
Background: Genome-wide association study (GWAS) is a powerful tool for revealing the genetic basis of quantitative traits. However, studies using GWAS for conformation traits of cattle are comparatively few. This study aims to use GWAS to find candidate genes for body conformation traits. Results: The Illumina BovineSNP50 BeadChip was used to identify single nucleotide polymorphisms (SNPs) that are associated with body conformation traits. A least absolute shrinkage and selection operator (LASSO) was applied to detect multiple SNPs simultaneously for 29 body conformation traits with 1,314 Chinese Holstein cattle and 52,166 SNPs. In total, 59 genome-wide significant SNPs associated with 26 conformation traits were detected by genome-wide association analysis; five SNPs were within previously reported QTL regions (Animal Quantitative Trait Loci (QTL) database) and 11 were very close to the reported SNPs. Twenty-two SNPs were located within annotated gene regions, while the remainde...
Background: Size of the reference population and reliability of phenotypes are crucial factors in... more Background: Size of the reference population and reliability of phenotypes are crucial factors influencing the reliability of genomic predictions. It is therefore useful to combine closely related populations. Increased accuracies of genomic predictions depend on the number of individuals added to the reference population, the reliability of their phenotypes, and the relatedness of the populations that are combined. Methods: This paper assesses the increase in reliability achieved when combining four Holstein reference populations of 4000 bulls each, from European breeding organizations, i.e. UNCEIA (France), VikingGenetics (Denmark, Sweden, Finland), DHV-VIT (Germany) and CRV (The Netherlands, Flanders). Each partner validated its own bulls using their national reference data and the combined data, respectively. Results: Combining the data significantly increased the reliability of genomic predictions for bulls in all four populations. Reliabilities increased by 10%, compared to reliabilities obtained with national reference populations alone, when they were averaged over countries and the traits evaluated. For different traits and countries, the increase in reliability ranged from 2% to 19%. Conclusions: Genomic selection programs benefit greatly from combining data from several closely related populations into a single large reference population.
In admixed populations, markers may be associated to different QTL depending on the origin of a g... more In admixed populations, markers may be associated to different QTL depending on the origin of a given genomic segment. The goal of this study was to investigate if taking into account of the breed origin of alleles in a breed of origin genomic model (BOGM) can improve genomic predictions compared to a traditional genomic model (TGM) in admixture populations. Real genotype data of Danish Holstein and Jersey breeds were used as base populations to simulate F1 crosses. This was followed by simulating 5 discrete generations of random mating to achieve a highly admixed population. A single trait with heritability of 0.25 and 100 QTL was simulated on the genome. Three different scenarios were considered, where the QTL effects of two breeds were sampled from a multivariate normal distribution with correlation 1.0, 0.5, or 0.1. Accuracy and bias of models which were measured as correlation and regression coefficient between true and estimated genomic breeding values, respectively, were vali...
The aim of this study was to investigate the effect of different strategies for handling lowquali... more The aim of this study was to investigate the effect of different strategies for handling lowquality or missing data on prediction accuracy for direct genomic values of protein yield, mastitis and fertility using a Bayesian variable model and a GBLUP model in the Danish Jersey population. The data contained 1 071 Jersey bulls that were genotyped with the Illumina Bovine 50K chip. After preliminary editing, 39 227 single nucleotide polymorphism (SNPs) remained in the dataset. Four methods to handle missing genotypes were: 1) BEAGLE: missing markers were imputed using Beagle 3.3 software, 2) COMMON: missing genotypes at a locus were replaced by the most common genotype at this locus observed in the marker data, 3) EX-ALLELE: missing marker genotypes at a locus were treated as an extra allele, and 4) POP-EXP: missing genotypes at a locus were replaced with population expectation at this locus. It was shown that among the methods used in this study, the imputation with Beagle was the best approach to handle missing genotypes. Treating missing markers as a pseudo-allele, replacing missing markers with a population average or substituting the most common alleles each reduced the accuracy of genomic predictions. The results from this study suggest that missing genotypes should be imputed in order to improve genomic prediction. Editing the marker data with a stringent threshold on GenCall scores and then imputing the discarded genotypes did not lead to higher accuracy. All marker genotypes with a GenCall score over 0.15 should be retained for genomic prediction.
Accurate genomic prediction requires a large reference population, which is problematic for trait... more Accurate genomic prediction requires a large reference population, which is problematic for traits that are expensive to measure. Traits related to milk protein composition are not routinely recorded due to costly procedures and are considered to be controlled by a few quantitative trait loci of large effect. The amount of variation explained may vary between regions leading to heterogeneous (co)variance patterns across the genome. Genomic prediction models that can efficiently take such heterogeneity of (co)variances into account can result in improved prediction reliability. In this study, we developed and implemented novel univariate and bivariate Bayesian prediction models, based on estimates of heterogeneous (co)variances for genome segments (BayesAS). Available data consisted of milk protein composition traits measured on cows and de-regressed proofs of total protein yield derived for bulls. Single-nucleotide polymorphisms (SNPs), from 50K SNP arrays, were grouped into non-ove...
With the development of SNP chips, SNP information provides an efficient approach to further dise... more With the development of SNP chips, SNP information provides an efficient approach to further disentangle different patterns of genomic variances and covariances across the genome for traits of interest. Due to the interaction between genotype and environment as well as possible differences in genetic background, it is reasonable to treat the performances of a biological trait in different populations as different but genetic correlated traits. In the present study, we performed an investigation on the patterns of region-specific genomic variances, covariances and correlations between Chinese and Nordic Holstein populations for three milk production traits. Variances and covariances between Chinese and Nordic Holstein populations were estimated for genomic regions at three different levels of genome region (all SNP as one region, each chromosome as one region and every 100 SNP as one region) using a novel multi-trait random regression model which uses latent variables to model hetero...
Litter size and piglet mortality are important traits in pig production. The study aimed to identify quantitative trait loci (QTL) for litter size and mortality traits, including total number of piglets born (TNB), litter size at day 5 (LS5) and mortality rate before day 5 (MORT) in Danish Landrace and Yorkshire pigs by genome-wide association studies (GWAS). The phenotypic records and genotypes were available for 5,977 Landrace pigs and 6,000 Yorkshire pigs born from 1998 to 2014. A linear mixed model (LM) with a single SNP regression and a Bayesian mixture model (BM) including effects of all SNPs simultaneously were used for GWAS to detect significant QTL associations. The response variable used in the GWAS was the corrected phenotypic value, which was obtained by adjusting original observations for non-genetic effects. For BM, the QTL region was determined by using a novel post-Gibbs analysis based on the posterior mixture probability. The detected association patterns from LM and BM mo...
Dominance and imprinting genetic effects have been shown to contribute to genetic variance for certain traits but are usually ignored in genomic prediction of complex traits in livestock. The objectives of this study were to estimate variances of additive, dominance and imprinting genetic effects and to evaluate predictions of genetic merit based on genomic data for average daily gain (DG) and backfat thickness (BF) in Danish Duroc pigs. Corrected phenotypes of 8113 genotyped pigs from breeding and multiplier herds were used. Four Bayesian mixture models that differed in the type of genetic effects included: (A) additive genetic effects, (AD) additive and dominance genetic effects, (AI) additive and imprinting genetic effects, and (ADI) additive, dominance and imprinting genetic effects were compared using Bayes factors. The ability of the models to predict genetic merit was compared with regard to prediction reliability and bias. Based on model ADI, narrow-sense heritabilities of 0...
Abstract Text: This study investigated the reliability of genomic prediction in various scenarios with regard to the relationship between test and reference animals and between animals within the reference population. Different reference populations were generated from EuroGenomics data, and 1288 Nordic Holstein bulls were used as a common test population. A GBLUP model and a Bayesian mixture model were applied to predict genomic breeding values for bulls in the test data. Results showed that a closer relationship between test and reference animals led to a higher reliability, while a closer relationship between reference animals resulted in a lower reliability. Therefore, the design of the reference population is important for improving the reliability of genomic prediction. With regard to model, the Bayesian mixture model in general led to a slightly higher reliability of genomic prediction than the GBLUP model. Keywords: genomic prediction, genomic relationship, reliability
Non-additive genetic variation is usually ignored when genome-wide markers are used to study the genetic architecture and genomic prediction of complex traits in humans, wildlife, model organisms or farm animals. However, non-additive genetic effects may make an important contribution to the total genetic variation of complex traits. This study presented a genomic BLUP model including additive and non-additive genetic effects, in which additive and non-additive genetic relationship matrices were constructed from information of genome-wide dense single nucleotide polymorphism (SNP) markers. In addition, this study for the first time proposed a method to construct a dominance relationship matrix using SNP markers and demonstrated it in detail. The proposed model was implemented to investigate the amounts of additive genetic, dominance and epistatic variations, and to assess the accuracy and unbiasedness of genomic predictions for daily gain in pigs. In the analysis of daily gain, four linear mod...
Data from the joint Nordic breeding value prediction for Danish and Swedish Holstein grandsire families were used to locate quantitative trait loci (QTL) for female fertility traits in Danish and Swedish Holstein cattle. Up to 36 Holstein grandsires with over 2,000 sons were genotyped for 416 microsatellite markers. Single trait breeding values were used for 12 traits relating to female fertility and female reproductive disorders. Data were analyzed by least squares regression analysis within and across families. Twenty-six QTL were detected on 17 different chromosomes. The best evidence was found for QTL segregating on Bos taurus chromosome (BTA)1, BTA7, BTA10, and BTA26. On each of these chromosomes, several QTL were detected affecting more than one of the fertility traits investigated in this study. Evidence for segregation of additional QTL on BTA2, BTA9, and BTA24 was found.
The use of genomic information in genetic evaluation has revolutionized dairy cattle breeding. It remains a major challenge to understand the genetic basis of variation for quantitative traits. Here, we study the genetic architecture for milk, fat, protein, mastitis and fertility indices in dairy cattle using NGS variants. The analysis was done using a linear mixed model (LMM) and a Bayesian mixture model (BMM). The top 10 QTL identified by LMM analyses explained 22.61, 23.86, 10.88, 18.58 and 14.83% of the total genetic variance for these traits respectively. Trait-specific sets of 4,964 SNPs from NGS variants (most 'associated' SNP for each 0.5 Mbp bin) explained 81.0, 81.6, 85.0, 60.4 and 70.9% of total genetic variance for milk, fat, protein, mastitis and fertility indices when analyzed simultaneously by BMM.
Newcastle disease (ND) and avian influenza (AI) are the most feared diseases in the poultry industry worldwide. They can cause flock mortality of up to 100%, resulting in catastrophic economic loss. This is the first study to investigate the feasibility of genomic selection for antibody response to Newcastle disease virus (Ab-NDV) and antibody response to avian influenza virus (Ab-AIV) in chickens. The data were collected from a crossbred population. Breeding values for Ab-NDV and Ab-AIV were estimated using a pedigree-based best linear unbiased prediction model (BLUP) and a genomic best linear unbiased prediction model (GBLUP). Single-trait and multiple-trait analyses were implemented. According to the analysis using the pedigree-based model, the heritability for Ab-NDV estimated from the single-trait and multiple-trait models was 0.478 and 0.487, respectively. The heritability for Ab-AIV estimated from the two models was 0.301 and 0.291, respectively. The estimated genetic correlation between the two traits was 0.438. A four-fold cross-validation was used to assess the accuracy of the estimated breeding values (EBV) in two validation scenarios. In the family sample scenario, each half-sib family was randomly allocated to one of four subsets, and in the random sample scenario, the individuals were randomly divided into four subsets. In the family sample scenario, compared with the pedigree-based model, the accuracy of the genomic prediction
Background: Growth and carcass traits are very important traits for broiler chickens. However, carcass traits can only be measured postmortem. Genomic selection may be a powerful tool for such traits because of its accurate prediction of breeding values of animals without own phenotypic information. This study investigated the efficiency of genomic prediction in Chinese triple-yellow chickens. As a new line, the Chinese triple-yellow chicken was developed by cross-breeding and had a small effective population size. Two growth traits and three carcass traits were analyzed: body weight at 6 weeks, body weight at 12 weeks, eviscerating percentage, breast muscle percentage and leg muscle percentage. Results: Genomic prediction was assessed using a 4-fold cross-validation procedure for two validation scenarios. In the first scenario, each test data set comprised two half-sib families (family sample) and the rest represented the reference data. In the second scenario, the whole data were randomly divided into four subsets (random sample). In each fold of validation, one subset was used as the test data and the others as the reference data. Genomic breeding values were predicted using a genomic best linear unbiased prediction model, a Bayesian least absolute shrinkage and selection operator model, and a Bayesian mixture model with four distributions. The accuracy of genomic estimated breeding value (GEBV) was measured as the correlation between GEBV and the corrected phenotypic value. Using the three models, the correlations ranged from 0.448 to 0.468 for the two growth traits and from 0.176 to 0.255 for the three carcass traits in the family sample scenario, and were between 0.487 and 0.536 for growth traits and between 0.312 and 0.430 for carcass traits in the random sample scenario.
The differences in the prediction accuracies between the three models were very small; the Bayesian mixture model was slightly more accurate. According to the results from the random sample scenario, the accuracy of GEBV was 0.197 higher than that of the conventional pedigree index, averaged over the five traits. Conclusions: The results indicated that genomic selection could greatly improve the accuracy of selection in chickens, compared with conventional selection. Genomic selection for growth and carcass traits in broiler chickens is promising.
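The two validation scenarios described in the abstract above (whole half-sib families held out together versus individuals split at random) can be illustrated with a minimal Python sketch. This is not the authors' procedure: the function names, the round-robin allocation, the fixed seed and the toy pedigree are all assumptions made for illustration.

```python
import random
from collections import Counter


def random_sample_folds(animal_ids, n_folds=4, seed=0):
    """Random sample scenario: split individuals at random, ignoring families."""
    rng = random.Random(seed)
    shuffled = list(animal_ids)
    rng.shuffle(shuffled)
    # Deal shuffled animals round-robin so fold sizes differ by at most one.
    return {animal: i % n_folds for i, animal in enumerate(shuffled)}


def family_sample_folds(sire_of, n_folds=4, seed=0):
    """Family sample scenario: all half-sibs (same sire) share one fold."""
    rng = random.Random(seed)
    sires = sorted(set(sire_of.values()))
    rng.shuffle(sires)
    # Whole families, not individuals, are dealt round-robin to folds.
    fold_of_sire = {sire: i % n_folds for i, sire in enumerate(sires)}
    return {animal: fold_of_sire[sire] for animal, sire in sire_of.items()}


# Hypothetical toy pedigree: 8 sires with 5 offspring each.
sire_of = {f"animal{s}_{k}": f"sire{s}" for s in range(8) for k in range(5)}

family_folds = family_sample_folds(sire_of)
random_folds = random_sample_folds(sire_of.keys())

# Number of animals per fold in each scenario.
print(Counter(family_folds.values()))
print(Counter(random_folds.values()))
```

With eight families and four folds, each test set in the family scenario holds two half-sib families, so test animals have no paternal half-sibs in the reference data. In the random scenario, relatives are typically spread over test and reference sets, which is consistent with the higher accuracies reported for that scenario.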
Introduction: The state-of-the-art for dealing with multiple levels of relationship among the samples in genome-wide association studies (GWAS) is unified mixed model analysis (MMA). This approach is very flexible, can be applied to both family-based and population-based samples, and can be extended to incorporate other effects in a straightforward and rigorous fashion. Here, we present a complementary approach, called 'GENMIX (genealogy based mixed model)', which combines advantages from two powerful GWAS methods: genealogy-based haplotype grouping and MMA. Subjects and Methods: We validated GENMIX using genotyping data of Danish Jersey cattle and simulated phenotypes, and compared it to MMA. We simulated scenarios for three levels of heritability (0.21, 0.34, and 0.64), seven levels of MAF (0.05, 0.10, 0.15, 0.20, 0.25, 0.35, and 0.45) and five levels of QTL effect (0.1, 0.2, 0.5, 0.7 and 1.0 in phenotypic standard deviation units). Each of these 105 possible combinations (3 h² × 7 MAF × 5 effects) of scenarios was replicated 25 times. Results: GENMIX provides a better ranking of markers close to the causative locus' location. GENMIX outperformed MMA when the QTL effect was small and the MAF at the QTL was low. In scenarios where MAF was high or the QTL affecting the trait had a large effect, both GENMIX and MMA performed similarly. Conclusion: In discovery studies, where high-ranking markers are identified and later examined in validation studies, we therefore expect GENMIX to enrich candidates brought to follow-up studies with true positives over false positives more than the MMA would.
The spectacular increase in productivity of dairy cattle has been accompanied by a concomitant decline in fertility. It is generally assumed that this decline is primarily due to the negative energy balance of high-producing cows at the peak of lactation. We herein describe the fine-mapping of a major fertility QTL in Nordic Red cattle, and identify a 660-Kb deletion encompassing four genes as the causative variant. We show that the deletion is a recessive embryonically lethal mutation. This probably results from the loss of RNASEH2B, which is known to cause embryonic death in mice. Despite its dramatic effect on fertility, 13%, 23% and 32% of the animals carry the deletion in Danish, Swedish and Finnish Red Cattle, respectively. To explain this, we searched for favorable effects on other traits and found that the deletion has a strong positive effect on milk yield. This study demonstrates that embryonic lethal mutations account for a non-negligible fraction of the decline in fertility of domestic cattle, and that associated positive effects on milk yield may account for part of the negative genetic correlation. Our study adds to the evidence that structural variants contribute to animal phenotypic variation, and that balancing selection might be more common in livestock species than previously appreciated.
Despite its importance, fertility has been declining in many cattle populations. In dairy cattle, this decline is often attributed to the negative correlation between fertility and production traits. Recent studies showed that embryonic lethal variants might also account for a non-negligible fraction of the fertility decline. Therefore, identification of such embryonic lethal variants is essential to improve fertility. We herein illustrate, with an example of a large recessive lethal deletion recently identified in Nordic Red cattle, that haplotype-based methods are particularly efficient to identify such embryonic lethal variants. We first show that haplotypes can be used in traditional QTL mapping approaches and that they present very high linkage disequilibrium with underlying variants. Haplotypes can also be used in scans for lack of homozygosity. Indeed, if a haplotype is associated with a recessive lethal variant, significantly fewer living individuals will be homozygous for that haplotype than expected. For both approaches, haplotype-based methods were particularly efficient. The lack of homozygosity approach achieved higher significance than the QTL approach. Only frequent variants can be detected with both approaches unless huge genotyped cohorts are available. An alternative approach would rely on identifying potential harmful variants in next-generation sequencing data followed by the genotyping of a larger population for these variants.
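The lack-of-homozygosity idea in the abstract above can be made concrete with a small Python sketch. It is only an illustration under stated assumptions, not the test used in the paper: the expected homozygote frequency is taken as the squared haplotype frequency (random mating, Hardy-Weinberg proportions), the count of homozygotes is treated as binomial, and the function name and example numbers are hypothetical.

```python
from math import exp, lgamma, log, log1p


def homozygote_deficit_pvalue(n_animals, hap_freq, observed_homozygotes):
    """One-sided binomial probability of seeing this few homozygotes or fewer.

    Assumes random mating, so each living animal is homozygous for the
    haplotype with probability hap_freq ** 2 (an illustrative assumption).
    """
    if not 0.0 < hap_freq < 1.0:
        raise ValueError("hap_freq must lie strictly between 0 and 1")
    p_hom = hap_freq ** 2
    total = 0.0
    for k in range(observed_homozygotes + 1):
        # Binomial probability of exactly k homozygotes, computed in log
        # space to avoid overflow in the binomial coefficient.
        log_pmf = (
            lgamma(n_animals + 1) - lgamma(k + 1) - lgamma(n_animals - k + 1)
            + k * log(p_hom) + (n_animals - k) * log1p(-p_hom)
        )
        total += exp(log_pmf)
    return min(total, 1.0)


# Hypothetical cohort of 1,000 genotyped animals with no living homozygotes.
common = homozygote_deficit_pvalue(1000, 0.10, 0)  # 10 homozygotes expected
rare = homozygote_deficit_pvalue(1000, 0.02, 0)    # 0.4 homozygotes expected
print(common, rare)
```

In this toy setting, a haplotype at frequency 0.10 is expected to yield 10 homozygotes among 1,000 animals, so observing none is strong evidence of a deficit. At frequency 0.02 only 0.4 homozygotes are expected, so observing none is unremarkable, which matches the abstract's remark that only frequent variants can be detected unless very large genotyped cohorts are available.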
Background: Genomic prediction uses two sources of information: linkage disequilibrium between markers and quantitative trait loci, and additive genetic relationships between individuals. One way to increase the accuracy of genomic prediction is to capture more linkage disequilibrium by regression on haplotypes instead of regression on individual markers. The aim of this study was to investigate the accuracy of genomic prediction using haplotypes based on local genealogy information.
Background: Size of the reference population and reliability of phenotypes are crucial factors influencing the reliability of genomic predictions. It is therefore useful to combine closely related populations. Increased accuracies of genomic predictions depend on the number of individuals added to the reference population, the reliability of their phenotypes, and the relatedness of the populations that are combined. Methods: This paper assesses the increase in reliability achieved when combining four Holstein reference populations of 4000 bulls each, from European breeding organizations, i.e. UNCEIA (France), VikingGenetics (Denmark, Sweden, Finland), DHV-VIT (Germany) and CRV (The Netherlands, Flanders). Each partner validated its own bulls using their national reference data and the combined data, respectively. Results: Combining the data significantly increased the reliability of genomic predictions for bulls in all four populations. Reliabilities increased by 10%, compared to reliabilities obtained with national reference populations alone, when they were averaged over countries and the traits evaluated. For different traits and countries, the increase in reliability ranged from 2% to 19%. Conclusions: Genomic selection programs benefit greatly from combining data from several closely related populations into a single large reference population.
Background: In China, the reference population of genotyped Holstein cattle is relatively small, with 80 bulls and 2091 cows genotyped to date with the Illumina 54 K chip. Including genotyped Holstein cattle from other countries in the reference population could improve the accuracy of genomic prediction of the Chinese Holstein population. This study investigated the consistency of linkage disequilibrium between adjacent markers between the Chinese and Nordic Holstein populations, and compared the reliability of genomic predictions based on the Chinese reference population only or the combined Chinese and Nordic reference populations. Methods: Genomic estimated breeding values of Chinese Holstein cattle were predicted using a single-trait GBLUP model based on the Chinese reference dataset, and using a two-trait GBLUP model based on a joint reference dataset that included both the Chinese and Nordic Holstein data. Results: The extent of linkage disequilibrium was similar in the Chinese ...
Background: Genome-wide association study (GWAS) is a powerful tool for revealing the genetic basis of quantitative traits. However, comparatively few studies have used GWAS for conformation traits of cattle. This study aims to use GWAS to find candidate genes for body conformation traits. Results: The Illumina BovineSNP50 BeadChip was used to identify single nucleotide polymorphisms (SNPs) that are associated with body conformation traits. A least absolute shrinkage and selection operator (LASSO) was applied to detect multiple SNPs simultaneously for 29 body conformation traits with 1,314 Chinese Holstein cattle and 52,166 SNPs. In total, 59 genome-wide significant SNPs associated with 26 conformation traits were detected by genome-wide association analysis; five SNPs were within previously reported QTL regions (Animal Quantitative Trait Loci (QTL) database) and 11 were very close to the reported SNPs. Twenty-two SNPs were located within annotated gene regions, while the remainde...
Papers by Mogens Lund