Breeding and genetics

The effect of missing marker genotypes on the accuracy of gene-assisted breeding value estimation: a comparison of methods*

H. A. Mulder (a1) (c1), T. H. E. Meuwissen (a2), M. P. L. Calus (a1) and R. F. Veerkamp (a1)

a1 Animal Breeding and Genomics Centre, Animal Sciences Group, Wageningen UR, P.O. Box 65, 8200 AB Lelystad, The Netherlands

a2 University of Life Sciences, Department of Animal and Aquacultural Sciences, N-1432 Ås, Norway


In livestock populations, missing genotypes for a large proportion of the animals are a major problem when implementing gene-assisted breeding value estimation for genes with known effect. The objective of this study was to compare different methods of dealing with missing genotypes in terms of the accuracy of gene-assisted breeding value estimation for identified bi-allelic genes, using Monte Carlo simulation. A nested full-sib half-sib structure was simulated with a mixed inheritance model with one bi-allelic quantitative trait locus (QTL) and a polygenic effect due to an infinite number of polygenes. The effect of the QTL was included in gene-assisted BLUP either by random regression on predicted gene content, i.e. the number of positive alleles, or by including haplotype effects in the model with an inverse IBD matrix to account for identity-by-descent relationships between haplotypes using linkage analysis information (IBD–LA). The inverse IBD matrix was constructed using segregation indicator probabilities obtained from multiple-marker iterative peeling. Gene contents for unknown genotypes were predicted using either multiple-marker iterative peeling or mixed model methodology. For both methods, gene-assisted breeding value estimation increased accuracies of total estimated breeding value (EBV) by 0% to 22% for genotyped animals in comparison to conventional breeding value estimation. For animals that were not genotyped, the increase in accuracy was much lower (0% to 5%), but still substantial when the heritability was 0.1 and when the QTL explained at least 15% of the genetic variance. Regression on predicted gene content yielded higher accuracies than IBD–LA. Allele substitution effects were, however, overestimated, especially when only sires and males in the last generation were genotyped. For juveniles without phenotypic records and traits measured only on females, the superiority of regression on gene content over IBD–LA was larger than when all animals had phenotypes.
Missing gene contents were predicted with higher accuracy using multiple-marker iterative peeling than using mixed model methodology, but the difference in accuracy of total EBV was negligible and mixed model methodology was computationally much faster than multiple-marker iterative peeling. For large livestock populations it can be concluded that gene-assisted breeding value estimation is, in practice, best performed by regression on gene contents, using mixed model methodology to predict missing marker genotypes and combining phenotypic information of genotyped and ungenotyped animals in one evaluation. This technique would, in principle, also be feasible for genomic selection. It is expected that genomic selection for ungenotyped animals using predicted single nucleotide polymorphism gene contents might be beneficial, especially for traits with low heritability.
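
The idea of regressing phenotypes on predicted gene content can be illustrated with a small Python sketch. It is not the authors' implementation. It assumes a nested full-sib half-sib design with unselected parents, replaces iterative peeling and mixed model methodology with the simplest possible predictor (the mean of the parental gene contents, which is the expected gene content of an offspring given its parents' genotypes), and uses ordinary least squares instead of gene-assisted BLUP. The function names `simulate_family_data` and `regression_slope` and all parameter values are hypothetical choices made for illustration only.

```python
import numpy as np


def simulate_family_data(n_sires=100, dams_per_sire=10, offspring_per_dam=4,
                         p=0.5, a=1.0, var_poly=1.0, var_e=2.0, seed=1):
    """Simulate offspring in a nested full-sib half-sib design with one bi-allelic QTL.

    All defaults are illustrative assumptions, not values from the study.
    """
    rng = np.random.default_rng(seed)
    n_dams = n_sires * dams_per_sire

    # Parental gene content: number of positive alleles (0, 1 or 2).
    gc_sire = rng.binomial(2, p, n_sires)
    gc_dam = rng.binomial(2, p, n_dams)
    # Parental polygenic values.
    u_sire = rng.normal(0.0, np.sqrt(var_poly), n_sires)
    u_dam = rng.normal(0.0, np.sqrt(var_poly), n_dams)

    # Sire and dam index of every offspring (dams are nested within sires).
    dam = np.repeat(np.arange(n_dams), offspring_per_dam)
    sire = dam // dams_per_sire
    n_off = dam.size

    # Each parent transmits the positive allele with probability gc / 2.
    gc = rng.binomial(1, gc_sire[sire] / 2) + rng.binomial(1, gc_dam[dam] / 2)

    # Polygenic value: parent average plus a Mendelian sampling term.
    u = 0.5 * (u_sire[sire] + u_dam[dam]) \
        + rng.normal(0.0, np.sqrt(0.5 * var_poly), n_off)
    # Phenotype: QTL effect (a per positive allele) + polygenic value + residual.
    y = a * gc + u + rng.normal(0.0, np.sqrt(var_e), n_off)

    # Predicted gene content of an ungenotyped offspring: mean of its parents.
    gc_pred = 0.5 * (gc_sire[sire] + gc_dam[dam])
    return gc, gc_pred, y


def regression_slope(x, y):
    """Ordinary least-squares slope of y on x (estimate of the allele substitution effect)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))


if __name__ == "__main__":
    gc, gc_pred, y = simulate_family_data()
    print("slope on true gene content:        ", regression_slope(gc, y))
    print("slope on predicted gene content:   ", regression_slope(gc_pred, y))
    print("accuracy of predicted gene content:", np.corrcoef(gc, gc_pred)[0, 1])
```

In this simplified setting both regressions target the same allele substitution effect, but the predicted gene content carries less information per animal, so its estimate is noisier. The sketch ignores selection, selective genotyping and relationships in the analysis model, so it cannot reproduce the overestimation of allele substitution effects or the accuracy gains reported above.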

(Received December 19 2008)

(Accepted July 27 2009)

(Online publication September 18 2009)


c1 E-mail:


* This paper was presented at the session ‘Genomics selection and bioinformatics’ of the 59th Annual meeting of the European Association for Animal Production held in Vilnius (Lithuania), 24–27 August 2008. Dr A. Maki-Tanila acted as guest editor.