TY - GEN
T1 - Minimum parent-offspring recombination haplotype inference in pedigrees
AU - Zhang, Qiangfeng
AU - Chin, Francis Y.L.
AU - Shen, Hong
PY - 2005
Y1 - 2005
N2 - The problem of haplotype inference under the Mendelian law of inheritance on pedigree genotype data is studied. The minimum recombination principle states that genetic recombinations are rare and haplotypes with fewer recombinations are more likely to exist. Given genotype data on a pedigree, the problem of Minimum Recombination Haplotype Inference (MRHI) is to find a set of haplotype configurations consistent with the genotype data having the minimum number of recombinations. In this paper, we focus on a variation of the MRHI problem that gives more realistic solutions, namely the k-MRHI problem, which has the additional constraint that the number of recombinations in each parent-offspring pair is at most k. Although the k-MRHI problem is NP-hard even for k = 1, the k-MRHI problem with k > 1 can be solved efficiently by dynamic programming in O(nm03k+12m0) time by adopting an approach similar to the one used by Doi, Li and Jiang [4] on pedigrees with n nodes and at most mo heterozygous loci in each node. In particular, the 1-MRHI problem can be solved in O(nm042m0) time. We propose an O(n2m0) algorithm to find a node as the root of the pedigree tree so as to further reduce the time complexity to O(momin(t R)), where tR is the number of feasible haplotype configuration combinations in all trios in the pedigree tree when R is the root. If the pedigree has few generations, the 1-MRHI problem can be solved in O(min{nm042m0, nm0 l+12μR+l}) time, where μR is the number of heterozygous loci in the root, and 1 is the maximum path length from the root to the leaves in the pedigree tree. Experiments on both real and simulated data confirm the out-performance of our algorithm when compared with other popular algorithms. In most real cases, our algorithm gives the same haplotyping results but runs much faster. In some real cases, other algorithms give an answer which has the least number of recombinations, while our algorithm gives a more credible solution and runs faster.
AB - The problem of haplotype inference under the Mendelian law of inheritance on pedigree genotype data is studied. The minimum recombination principle states that genetic recombinations are rare and haplotypes with fewer recombinations are more likely to exist. Given genotype data on a pedigree, the problem of Minimum Recombination Haplotype Inference (MRHI) is to find a set of haplotype configurations consistent with the genotype data having the minimum number of recombinations. In this paper, we focus on a variation of the MRHI problem that gives more realistic solutions, namely the k-MRHI problem, which has the additional constraint that the number of recombinations in each parent-offspring pair is at most k. Although the k-MRHI problem is NP-hard even for k = 1, the k-MRHI problem with k > 1 can be solved efficiently by dynamic programming in O(nm03k+12m0) time by adopting an approach similar to the one used by Doi, Li and Jiang [4] on pedigrees with n nodes and at most mo heterozygous loci in each node. In particular, the 1-MRHI problem can be solved in O(nm042m0) time. We propose an O(n2m0) algorithm to find a node as the root of the pedigree tree so as to further reduce the time complexity to O(momin(t R)), where tR is the number of feasible haplotype configuration combinations in all trios in the pedigree tree when R is the root. If the pedigree has few generations, the 1-MRHI problem can be solved in O(min{nm042m0, nm0 l+12μR+l}) time, where μR is the number of heterozygous loci in the root, and 1 is the maximum path length from the root to the leaves in the pedigree tree. Experiments on both real and simulated data confirm the out-performance of our algorithm when compared with other popular algorithms. In most real cases, our algorithm gives the same haplotyping results but runs much faster. In some real cases, other algorithms give an answer which has the least number of recombinations, while our algorithm gives a more credible solution and runs faster.
UR - http://www.scopus.com/inward/record.url?scp=33646181254&partnerID=8YFLogxK
U2 - 10.1007/11567752_7
DO - 10.1007/11567752_7
M3 - Conference contribution
AN - SCOPUS:33646181254
SN - 3540294015
SN - 9783540294016
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 100
EP - 112
BT - Transactions on Computational Systems Biology II
PB - Springer Verlag
T2 - International Workshop on Bioinformatics Research and Applications, IWBRA 2005
Y2 - 22 May 2005 through 24 May 2005
ER -