AlleleCoder: a PERL script for coding co-dominant polymorphism data for PCA

Angela M. Baldo; David M. Francis; Martina Caramante; Larry D. Robertson; Joanne A. Labate

doi:10.1017/S1479262111000839


Published online by Cambridge University Press: 22 July 2011


Angela M. Baldo*: USDA, ARS, Plant Genetic Resources Unit, 630 W. North St., Geneva, NY 14456, USA
David M. Francis: Department of Horticulture and Crop Science, The Ohio State University, Ohio Agricultural Research and Development Center, 1680 Madison Ave., Wooster, OH 44691, USA
Martina Caramante: Dipartimento di Scienze del Suolo, della Pianta, dell'Ambiente e delle Produzioni Animali, Università degli Studi di Napoli ‘Federico II’, 80055 Portici, Napoli, Italy
Larry D. Robertson: USDA, ARS, Plant Genetic Resources Unit, 630 W. North St., Geneva, NY 14456, USA
Joanne A. Labate: USDA, ARS, Plant Genetic Resources Unit, 630 W. North St., Geneva, NY 14456, USA
*Corresponding author. E-mail: angela.baldo@ars.usda.gov


Abstract

A useful biological interpretation of diploid heterozygotes is in terms of the dose of the common allele (0, 1 or 2 copies). We have developed a PERL script that converts FASTA files into coded spreadsheets suitable for principal component analysis. In combination with R and R Commander, two- and three-dimensional plots can be generated for visualizing genetic relationships. Such plots are useful for characterizing plant genetic resources. This method nicely illustrated the spectrum of genetic diversity in tomato landraces and the varieties categorized according to human-mediated dispersal.
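The dose coding described in the abstract can be illustrated with a minimal sketch. This is not the authors' PERL script, and it does not read FASTA files. It assumes, purely for illustration, that each diploid genotype is already available as a two-character string such as "AG", with `None` for a missing call. The function names, the sample names and the alphabetical tie-breaking rule are all hypothetical.

```python
from collections import Counter


def common_allele(genotypes):
    """Return the most frequent allele at one locus, ignoring missing calls.

    Assumption: each genotype is a 2-character string such as "AG".
    Ties are broken alphabetically (an illustrative choice, not from the paper).
    """
    counts = Counter()
    for g in genotypes:
        if g is not None:
            counts.update(g)  # counts each of the two alleles
    if not counts:
        return None  # every call at this locus is missing
    return min(counts, key=lambda allele: (-counts[allele], allele))


def code_locus(genotypes):
    """Code each genotype as the dose (0, 1 or 2) of the common allele."""
    common = common_allele(genotypes)
    return [None if g is None else g.count(common) for g in genotypes]


def code_matrix(samples):
    """Code a {sample name: [genotype per locus]} mapping locus by locus."""
    names = list(samples)
    n_loci = len(samples[names[0]])
    coded = {name: [] for name in names}
    for j in range(n_loci):
        column = [samples[name][j] for name in names]
        for name, dose in zip(names, code_locus(column)):
            coded[name].append(dose)
    return coded


# Hypothetical data: three accessions scored at two loci.
samples = {
    "acc1": ["AA", "CT"],
    "acc2": ["AG", "TT"],
    "acc3": ["AA", None],
}
coded = code_matrix(samples)
for name, doses in coded.items():
    print(name, doses)
```

Each row of the coded table is one accession and each column one locus, which is the kind of numeric layout a principal component analysis routine expects. Missing calls are left as `None` here and would need to be handled before analysis.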

Keywords

genetic diversity; principal component analysis; single nucleotide polymorphism; SNP

Type: Short Communication
Information: Plant Genetic Resources, Volume 9, Issue 4, December 2011, pp. 528–530

DOI: https://doi.org/10.1017/S1479262111000839
Copyright: © NIAB 2011


