
Use of partial least squares regression to predict single nucleotide polymorphism marker genotypes when some animals are genotyped with a low-density panel

Published online by Cambridge University Press: 04 January 2011

C. Dimauro*
Affiliation:
Dipartimento di Scienze Zootecniche, Università di Sassari, via De Nicola, 9-07100 Sassari, Italy
R. Steri
Affiliation:
Dipartimento di Scienze Zootecniche, Università di Sassari, via De Nicola, 9-07100 Sassari, Italy
M. A. Pintus
Affiliation:
Dipartimento di Scienze Zootecniche, Università di Sassari, via De Nicola, 9-07100 Sassari, Italy
G. Gaspa
Affiliation:
Dipartimento di Scienze Zootecniche, Università di Sassari, via De Nicola, 9-07100 Sassari, Italy
N. P. P. Macciotta
Affiliation:
Dipartimento di Scienze Zootecniche, Università di Sassari, via De Nicola, 9-07100 Sassari, Italy
* E-mail: dimauro@uniss.it

Abstract

High-density single nucleotide polymorphism (SNP) platforms are currently used in genomic selection (GS) programs to enhance the selection response. However, the genotyping of a large number of animals with high-throughput platforms is rather expensive and may represent a constraint for a large-scale implementation of GS. The use of low-density marker (LDM) platforms could overcome this problem, but different SNP chips may be required for each trait and/or breed. In this study, a strategy of imputation independent from trait and breed is proposed. A simulated population of 5865 individuals with a genome of 6000 SNP equally distributed on six chromosomes was considered. First, reference and prediction populations were generated by mimicking high- and low-density SNP platforms, respectively. Then, the partial least squares regression (PLSR) technique was applied to reconstruct the missing SNP in the low-density chip. The proportion of SNP correctly reconstructed by the PLSR method ranged from 0.78 to 0.97 when 90% and 50%, respectively, of genotypes were predicted. Moreover, data sets consisting of a mixture of actual and PLSR-predicted SNP or only actual SNP were used to predict genomic breeding values (GEBVs). Correlations between GEBV and true breeding values varied from 0.74 to 0.76, respectively. The results of the study indicate that the PLSR technique can be considered a reliable computational strategy for predicting SNP genotypes in an LDM platform with reasonable accuracy.
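The imputation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy founder-haplotype simulation, the 0/1/2 genotype coding, the rounding of predictions to the nearest genotype, the number of latent components, and all function names (`fit_plsr`, `impute_genotypes`, `simulate_genotypes`) are assumptions made here for illustration. The sketch fits a PLS regression on a reference population genotyped at every SNP and uses it to reconstruct the SNP that are absent from a low-density panel in a prediction population.

```python
import numpy as np


def fit_plsr(X, Y, n_components):
    """Fit PLS regression of Y on X; return (B, x_mean, y_mean)."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    Xk, Yc = X - x_mean, Y - y_mean
    W, P, C = [], [], []
    for _component in range(n_components):
        # Weight vector: dominant left singular vector of X'Y.
        u, _s, _vt = np.linalg.svd(Xk.T @ Yc, full_matrices=False)
        w = u[:, 0]
        t = Xk @ w                    # latent score for this component
        tt = t @ t
        if tt < 1e-12:                # X has no variation left to extract
            break
        p = Xk.T @ t / tt             # X loading
        c = Yc.T @ t / tt             # Y loading
        Xk = Xk - np.outer(t, p)      # deflate X before the next component
        W.append(w)
        P.append(p)
        C.append(c)
    W, P, C = (np.array(m).T for m in (W, P, C))
    # Regression coefficients expressed on the original (centred) X scale.
    B = W @ np.linalg.solve(P.T @ W, C.T)
    return B, x_mean, y_mean


def impute_genotypes(X_new, B, x_mean, y_mean):
    """Predict missing SNP and round to the nearest genotype code (assumption)."""
    y_hat = (np.asarray(X_new, dtype=float) - x_mean) @ B + y_mean
    return np.clip(np.rint(y_hat), 0, 2).astype(int)


def simulate_genotypes(rng, n_animals, n_snp, n_founders=8):
    """Toy population (assumption): each animal carries two founder haplotypes."""
    founders = rng.integers(0, 2, size=(n_founders, n_snp))
    pairs = rng.integers(0, n_founders, size=(n_animals, 2))
    return founders[pairs[:, 0]] + founders[pairs[:, 1]]   # codes 0, 1, 2


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    geno = simulate_genotypes(rng, n_animals=300, n_snp=60)
    reference, prediction = geno[:200], geno[200:]
    observed = np.arange(0, 60, 2)    # SNP kept on the low-density panel
    missing = np.arange(1, 60, 2)     # SNP to reconstruct (50% of the total)
    model = fit_plsr(reference[:, observed], reference[:, missing], n_components=6)
    imputed = impute_genotypes(prediction[:, observed], *model)
    accuracy = np.mean(imputed == prediction[:, missing])
    print(f"Proportion of genotypes correctly reconstructed: {accuracy:.2f}")
```

The proportion printed by the script plays the role of the "proportion of SNP correctly reconstructed" reported in the abstract, but the toy population is far smaller and simpler than the simulated one in the study, so the values are not comparable.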

Type
Full Paper
Copyright
Copyright © The Animal Consortium 2011

