Changes between Version 18 and Version 19 of ImputationTool

Sep 19, 2011 5:12:30 PM (10 years ago)



  • ImputationTool

    v18 v19  
    115115=== The !TriTyper Format ===
    116116!TriTyper is a binary format for storing genotype information, including insertion, deletion and expression data, that provides very efficient read/write/seek methods.
     118=== Filtering ===
     119In the '''ttpmh''' mode, !ImputationTool applies the following filtering between a study and a reference dataset:
     121When comparing the GWAS data to the reference, !ImputationTool performs the following filtering steps:
     123* assesses alleles and swaps SNP if needed
     124               ref: C/T
     125               GWAS: A/G --> needs to be swapped and inverted (complemented to the other strand) to become C/T
     126* checks Hardy-Weinberg equilibrium (p-value <= 0.0001), MAF (< 0.01) and call rate (< 0.95). If a SNP meets any of these conditions, it is removed
     127* checks if SNP is present in reference data, if not, SNP is removed from GWAS data
     128* checks if SNP has null alleles, if so, SNP is removed from GWAS data
     129* checks if allele frequency is comparable to reference. If not (>25% difference), SNP is removed from GWAS data.
     130* Assesses if the haplotype structure is comparable between reference and GWAS data. This is performed by pairwise comparison of r-squared between SNPs in both reference and GWAS. For SNPs in LD (r-squared > 0.1), the allele frequencies are compared. SNPs are removed from the GWAS data when the major allele differs more often than it is identical.
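
The allele, quality-control and allele-frequency checks above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the !ImputationTool implementation: the function names (`align_alleles`, `passes_qc`, `frequency_comparable`) are hypothetical, the ">25% difference" rule is read as an absolute frequency difference of 0.25, and strand-ambiguous SNPs (A/T, C/G) get no special handling.

```python
# Hypothetical sketch of the per-SNP checks; names and interpretation are assumptions.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def align_alleles(study, ref):
    """Describe how the study alleles map onto the reference alleles.

    Returns "same", "swap", "complement" or "complement+swap", or None when
    the alleles cannot be reconciled (including null or unknown alleles).
    """
    study, ref = tuple(study), tuple(ref)
    if any(a not in COMPLEMENT for a in study + ref):
        return None  # null allele such as "0": SNP would be removed
    if study == ref:
        return "same"
    if study[::-1] == ref:
        return "swap"
    flipped = tuple(COMPLEMENT[a] for a in study)
    if flipped == ref:
        return "complement"
    if flipped[::-1] == ref:
        return "complement+swap"
    return None


def passes_qc(hwe_p, maf, call_rate):
    """Keep a SNP only if it fails none of the thresholds listed above."""
    return hwe_p > 0.0001 and maf >= 0.01 and call_rate >= 0.95


def frequency_comparable(study_freq, ref_freq, max_diff=0.25):
    """Compare the frequency of the same allele in study and reference.

    Assumes ">25% difference" means an absolute difference above 0.25.
    """
    return abs(study_freq - ref_freq) <= max_diff


if __name__ == "__main__":
    # The example from the list: GWAS A/G against reference C/T.
    print(align_alleles(("A", "G"), ("C", "T")))
    print(passes_qc(hwe_p=0.5, maf=0.2, call_rate=0.99))
    print(frequency_comparable(0.30, 0.40))
```

The haplotype-structure check is left out of the sketch because it needs pairwise r-squared values computed from the genotype data itself.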