Multiple Accessions' Distinguishment, Hierarchical Decision Tree and MAD-HiDTree
Mutagenesis studies and HapMap projects typically generate hundreds or even thousands of individual lines (accessions). Investigating the sequence variation among these accessions in depth can explain the molecular mechanisms behind their phenotypic variation, and this is usually done by association mapping between genotypic data and quantitative phenotypic data. Because of the limits of sequencing technology, the genotypes of some genetic variants cannot be called clearly and are recorded as Unknown. Heterozygous (heterogeneous) genotypes are also very common; because they segregate in the descendants, they reduce the specificity of the accession they belong to.
Several association tools have been developed that address narrow-sense (additive-only) genetic effects, dominance effects, and even epistatic effects. However, a method for distinguishing or identifying an individual accession among the many accessions produced by a mutagenesis or HapMap project is still lacking.
Homozygous (homogeneous) genotypes are passed stably to the descendants, so the different genotypes observed at a homozygous marker can serve as a decision that splits the accession set into sub-sets. Several such markers, applied in succession, can form a hierarchical decision tree that distinguishes each accession from all the others.
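As a minimal illustration of the partitioning idea, the sketch below splits a toy accession set by the genotype at one marker. The genotype encoding ("AA" and "BB" for homozygous calls, "AB" for a heterozygous call, "NA" for Unknown), the accession and marker names, and the function `partition_by_marker` are assumptions made for this example and are not taken from MAD-HiDTree. Setting aside accessions with a heterozygous or Unknown call as unresolved is likewise an assumption.

```python
from collections import defaultdict

# Assumed encoding for this sketch (not MAD-HiDTree's file format).
HOMOZYGOUS = {"AA", "BB"}


def partition_by_marker(genotypes, marker):
    """Split accessions by their homozygous genotype at one marker.

    genotypes: dict mapping accession -> dict mapping marker -> call.
    Returns (subsets, unresolved), where unresolved holds accessions
    whose call at the marker is heterozygous or Unknown.
    """
    subsets = defaultdict(list)
    unresolved = []
    for accession, calls in genotypes.items():
        call = calls.get(marker, "NA")
        if call in HOMOZYGOUS:
            subsets[call].append(accession)
        else:
            unresolved.append(accession)
    return dict(subsets), unresolved


# Hypothetical toy data: four accessions, two markers.
toy = {
    "acc1": {"m1": "AA", "m2": "AA"},
    "acc2": {"m1": "AA", "m2": "BB"},
    "acc3": {"m1": "BB", "m2": "AB"},
    "acc4": {"m1": "NA", "m2": "BB"},
}

subsets, unresolved = partition_by_marker(toy, "m1")
print(subsets)     # {'AA': ['acc1', 'acc2'], 'BB': ['acc3']}
print(unresolved)  # ['acc4']
```

In this toy case marker m1 separates acc3 from acc1 and acc2, while acc4 cannot be placed by m1 and would need another marker.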
Based on this concept, approaches for set partitioning and hierarchical tree construction were adopted, and a tool named MAD-HiDTree was developed for distinguishing and identifying multiple accessions.
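One way such a tree could be built is a greedy recursion: at each node, choose the marker that splits the current sub-set most evenly, then recurse into each child sub-set. The sketch below follows that idea under stated assumptions. The toy genotype matrix, the "AA"/"BB" encoding, the helper names `split` and `build_tree`, and the selection criterion (smallest largest child) are all hypothetical; this is not a description of the algorithm MAD-HiDTree actually uses. For simplicity, a marker is skipped at a node if any accession in the current sub-set has a non-homozygous call for it.

```python
# Assumed encoding for this sketch (not MAD-HiDTree's file format).
HOMOZYGOUS = {"AA", "BB"}


def split(matrix, accessions, marker):
    """Group accessions by call at a marker.

    Returns None if any call in the sub-set is not homozygous.
    """
    groups = {}
    for acc in accessions:
        call = matrix[acc][marker]
        if call not in HOMOZYGOUS:
            return None
        groups.setdefault(call, []).append(acc)
    return groups


def build_tree(matrix, accessions, markers):
    """Greedily build a nested-dict decision tree; leaves are accession lists."""
    if len(accessions) <= 1:
        return accessions
    best = None
    for marker in markers:
        groups = split(matrix, accessions, marker)
        if groups is None or len(groups) < 2:
            continue  # marker is unusable or does not separate anything
        score = max(len(g) for g in groups.values())
        if best is None or score < best[0]:
            best = (score, marker, groups)
    if best is None:
        return accessions  # cannot be separated further with these markers
    _, marker, groups = best
    remaining = [m for m in markers if m != marker]
    return {
        "marker": marker,
        "children": {
            call: build_tree(matrix, accs, remaining)
            for call, accs in groups.items()
        },
    }


# Hypothetical toy matrix: four accessions, three markers.
matrix = {
    "acc1": {"m1": "AA", "m2": "AA", "m3": "AA"},
    "acc2": {"m1": "AA", "m2": "BB", "m3": "AA"},
    "acc3": {"m1": "BB", "m2": "AA", "m3": "AA"},
    "acc4": {"m1": "BB", "m2": "BB", "m3": "BB"},
}

tree = build_tree(matrix, list(matrix), ["m1", "m2", "m3"])
print(tree["marker"])  # m1
```

Here two markers (m1, then m2) are enough to give every accession its own leaf, so m3 is never selected. A real tool would also have to decide how to treat Unknown and heterozygous calls, which this sketch simply avoids.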
MAD-HiDTree is simple to use: only a genotype matrix file is required. Once the file is submitted, MAD-HiDTree returns three text files containing the selected marker list, the partition result for the output sub-sets, and the hierarchical decision tree.
1. Yang, Jian, et al. "GCTA: a tool for genome-wide complex trait analysis." The American Journal of Human Genetics 88.1 (2011): 76-82.
3. Said, Amir, and William A. Pearlman. "A new, fast, and efficient image codec based on set partitioning in hierarchical trees." IEEE Transactions on circuits and systems for video technology 6.3 (1996): 243-250.
4. Zhang, W., Dai, X., Wang, Q., Xu, S., and Zhao, P. X. "PEPIS: a pipeline for estimating epistatic effects in quantitative trait locus mapping and genome-wide association studies." PLoS Computational Biology 12.5 (2016).