a1 Genetic Epidemiology, Queensland Institute of Medical Research, Brisbane, Australia. firstname.lastname@example.org
a2 Wellcome Trust Centre for Human Genetics, University of Oxford, United Kingdom.
a3 International Diabetes Institute, Caulfield, Victoria, Australia.
a4 Genetic Epidemiology, Queensland Institute of Medical Research, Brisbane, Australia.
The prioritization of genes within a candidate genomic region is an important step in the identification of causal gene variants affecting complex traits. Surprisingly, there have been very few reports of bioinformatics tools to perform such prioritization. The purpose of this article is to investigate the performance of 3 available positional candidate gene software tools: PosMed, GeneSniffer and SUSPECTS. The comparison was made for 40, 20 and 10 Mb regions in the human genome centred around known susceptibility genes for the common diseases breast cancer, Crohn's disease, age-related macular degeneration and schizophrenia. For 1 or more of the software tools, the known susceptibility gene was not always ranked highly, and in some cases was not ranked at all. There was large variation between the 3 tools regarding which genes were prioritized and the rank order of those genes. PosMed and GeneSniffer were most similar in their prioritized gene lists, whereas SUSPECTS identified the same candidate genes only for the narrowest (10 Mb) regions. Combining 2 or all 3 of the candidate gene finding tools was superior in terms of ranking positional candidates. It is possible to reduce the number of candidate genes from a starting set in a region of interest by combining a variety of candidate gene finding tools. Conversely, we recommend caution in relying solely on a single positional candidate gene prioritization tool. Our results confirm the obvious, that is, that starting with a narrower positional region gives a higher likelihood that the true susceptibility gene is selected, and that it is ranked highly. A narrow confidence interval for the mapping of complex trait genes by linkage can be achieved by maximizing marker informativeness and by having large samples. Our results suggest that the best approach to classify a minimum set of candidate genes is to take those genes that are prioritized by multiple prioritization tools.
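The recommendation to keep genes prioritized by multiple tools can be illustrated with a minimal sketch. It is not the procedure used in the study: the function name `consensus_candidates`, the `min_tools` threshold, the tie-breaking by mean rank, and the placeholder gene and tool labels are all assumptions made for illustration, and the example lists are invented rather than actual tool output.

```python
from collections import Counter


def consensus_candidates(ranked_lists, min_tools=2):
    """Return genes prioritized by at least `min_tools` tools.

    `ranked_lists` maps a tool label to its gene list, best-ranked first.
    Genes are ordered by the number of supporting tools (descending),
    then by mean rank across those tools (ascending), then by name.
    This ordering is an illustrative assumption, not the paper's method.
    """
    support = Counter()   # number of tools that list each gene
    rank_sums = {}        # sum of ranks over the tools that list the gene
    for genes in ranked_lists.values():
        seen = set()
        for rank, gene in enumerate(genes, start=1):
            if gene in seen:  # count a gene once per tool
                continue
            seen.add(gene)
            support[gene] += 1
            rank_sums[gene] = rank_sums.get(gene, 0) + rank
    kept = [g for g, n in support.items() if n >= min_tools]
    return sorted(kept, key=lambda g: (-support[g], rank_sums[g] / support[g], g))


# Hypothetical ranked outputs for one candidate region (placeholder names).
example_lists = {
    "tool_1": ["GENE_A", "GENE_B", "GENE_C"],
    "tool_2": ["GENE_B", "GENE_A", "GENE_D"],
    "tool_3": ["GENE_E", "GENE_B"],
}

shortlist = consensus_candidates(example_lists, min_tools=2)
```

Raising `min_tools` to the number of tools gives the strictest intersection, while `min_tools=1` returns the union of all lists, so the threshold controls how aggressively the starting set is reduced.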
(Received June 20 2007)
(Accepted June 27 2007)
c1 Address for correspondence: Tobias Thornblad, Genetic Epidemiology, Queensland Institute of Medical Research, 300 Herston Road, Brisbane 4029, Australia.