Use of partial least squares regression to predict single nucleotide polymorphism marker genotypes when some animals are genotyped with a low-density panel

C. Dimauro; R. Steri; M. A. Pintus; G. Gaspa; N. P. P. Macciotta

doi:10.1017/S1751731110002600

Published online by Cambridge University Press: 04 January 2011

Affiliation (all authors): Dipartimento di Scienze Zootecniche, Università di Sassari, via De Nicola, 9-07100 Sassari, Italy
E-mail (C. Dimauro): dimauro@uniss.it

Abstract

High-density single nucleotide polymorphism (SNP) platforms are currently used in genomic selection (GS) programs to enhance the selection response. However, genotyping a large number of animals with high-throughput platforms is expensive and may constrain large-scale implementation of GS. The use of low-density marker (LDM) platforms could overcome this problem, but different SNP chips may be required for each trait and/or breed. In this study, an imputation strategy independent of trait and breed is proposed. A simulated population of 5865 individuals with a genome of 6000 SNP evenly distributed across six chromosomes was considered. First, reference and prediction populations were generated by mimicking high- and low-density SNP platforms, respectively. Then, the partial least squares regression (PLSR) technique was applied to reconstruct the missing SNP in the low-density chip. The proportion of SNP correctly reconstructed by the PLSR method ranged from 0.78, when 90% of genotypes were predicted, to 0.97, when 50% were predicted. Moreover, data sets consisting of a mixture of actual and PLSR-predicted SNP or of only actual SNP were used to predict genomic breeding values (GEBVs). Correlations between GEBV and true breeding values were 0.74 and 0.76, respectively. The results of the study indicate that the PLSR technique can be considered a reliable computational strategy for predicting SNP genotypes in an LDM platform with reasonable accuracy.
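The reconstruction step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes genotypes coded 0/1/2, fits one single-response PLS model (PLS1, NIPALS-style deflation) per missing marker, and rounds the continuous prediction to the nearest genotype class. The function names (`pls1_fit`, `pls1_predict`, `impute_genotype`) and the toy data are hypothetical and chosen only for illustration.

```python
# Minimal pure-Python PLS1 sketch for imputing one missing SNP.
# Assumptions (not from the paper): 0/1/2 genotype coding, one model per
# missing marker, NIPALS-style deflation, rounding to the nearest genotype.

def pls1_fit(X, y, n_components):
    """Fit PLS1 on reference animals: X = observed SNP, y = SNP to impute."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centred X
    f = [v - y_mean for v in y]                                # centred y
    comps = []
    for _ in range(n_components):
        # Weight vector: direction in X with maximal covariance with y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = sum(v * v for v in w) ** 0.5
        if norm < 1e-12:  # nothing left to explain, stop early
            break
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y by the part explained by this component.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        comps.append((w, load, q))
    return x_mean, y_mean, comps

def pls1_predict(model, x):
    """Continuous prediction for one animal genotyped on the low-density panel."""
    x_mean, y_mean, comps = model
    e = [v - m for v, m in zip(x, x_mean)]
    y_hat = y_mean
    for w, load, q in comps:
        t = sum(a * b for a, b in zip(e, w))
        y_hat += q * t
        e = [a - t * b for a, b in zip(e, load)]
    return y_hat

def impute_genotype(model, x):
    """Round the prediction to the nearest genotype class in {0, 1, 2}."""
    return min(2, max(0, round(pls1_predict(model, x))))

# Made-up toy data: three observed SNP per animal; the hidden SNP is in
# complete linkage disequilibrium with the first observed SNP.
X_ref = [[0, 1, 2], [1, 0, 1], [2, 2, 0], [1, 1, 1], [0, 2, 1], [2, 0, 2]]
y_ref = [row[0] for row in X_ref]
model = pls1_fit(X_ref, y_ref, n_components=3)

# Prediction population: hidden SNP known here only to score the imputation.
X_new = [[2, 1, 0], [0, 0, 2], [1, 2, 2]]
y_true = [2, 0, 1]
imputed = [impute_genotype(model, x) for x in X_new]
proportion_correct = sum(a == b for a, b in zip(imputed, y_true)) / len(y_true)
print(imputed, proportion_correct)
```

The last lines compute the same kind of summary reported in the abstract, the proportion of correctly reconstructed genotypes, on the toy prediction animals. With real panels the number of components would need to be tuned, for example by cross-validation in the reference population.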

Keywords

genomic selection; SNP prediction; genotype imputation

Type: Full Paper
Information: animal, Volume 5, Issue 6, June 2011, pp. 833–837

DOI: https://doi.org/10.1017/S1751731110002600
Copyright: © The Animal Consortium 2011
