Changes between Version 2 and Version 3 of TrioAwarePhasingPipeline


Timestamp:
Sep 26, 2010 9:12:22 PM (14 years ago)
Author:
Yurii Aulchenko
Comment:

change in structure in accord with other pipelines

  • TrioAwarePhasingPipeline

    v2 v3  
    1 = TrioAwarePhasingPipeline =
     1[[TOC()]]
    22
    3 Call improvement and phasing
     3= Trio aware phasing =
    44
    5 This is a ''phase 2'' project.
     5In this project, we will establish an infrastructure and perform phasing of the sequence data.
    66
    7 '''Background.''' We plan to use phased genotypes from GvNL for further imputations. In this, we need high quality of both genotypes and the phasing.
     7== Summary ==
    88
    9 '''Problems.''' It is well recognized that at 12x coverage there is a substantial chance that a heterozygous genotype will not be called (estimated roughly as ~1%). Furthermore, for a given individual a certain proportion of the genome will not be covered well; the genotypes in these regions cannot be called or will be called with low quality. The effects of such errors and missing data on further imputations may be large. Another factor affecting the quality of further imputations is the quality of phasing.
     9'''Status''': under development
    1010
    11 '''Proposed solution.''' All of the above problems can be addressed in the same framework. Basically, phasing information provides us with the means to fill in missing genotypes and correct erroneously called ones. For example, if a person's coverage is low in certain regions, we can use information from a first-degree relative to figure out what genotypes are there. Sequencing errors can be detected in much the same way. Thus, phasing and imputation provide us with an attractive opportunity to minimize sequencing errors and the proportion of missing data.
     11'''Contributors''': Yurii, TBA, TBA
    1212
    13 This work package ''aims to'':
     13'''Timeline''': January 2011 – April 2011 (phased genotypes using already existing solutions) - July 2011 (phased genotypes using own solution) - December 2011 (release of software)
    1414
    15 · Improve quality of sequence genotypes data by fixing errors and filling in missing values
     15'''Resources''': !PostDoc at 1.0 fte + BI/programmer at 0.5 fte (ideally, the same as the one on MendelianQcPipeline) + experienced supervisor at 0.1 fte
    1616
    17 · Phase the genotypes
     17'''Depends on''': availability of QC'ed VCF data (ChipBasedQcPipeline and MendelianQcPipeline)
    1818
    19 ''Detailed workflow'' is summarized in a separate document.
     19'''Other projects depending on this''':  imputations / ImputationPipeline (hard), population genetics (LD, hard), functional variants / SnpAnnotationPipeline (final catalogue, soft), novel variants discovery (final catalogue, soft).
    2020
    21 ''Estimated costs'': 12 months of experienced !PostDoc at 1.0 fte + BI/programmer at 0.5 fte + supervisor at 0.1 fte.
     21== Aims and Deliverables ==
    2222
    23 ''Suggested timeline:'' January 2011 – April 2011 (phased genotypes using already existing solutions) -  July 2011 (phased genotypes using own solution) - December 2011 (release of software)
     23 * In concert with developments from MendelianQcPipeline, improve the quality of sequence genotype data by fixing errors and filling in missing values
     24 * Phase the genotypes
    2425
    25 ''Depends on:'' availability of QC’ed genotypes from phase 1
     26== Idea ==
    2627
    27 ''Other projects depending on this:'' imputations / ImputationPipeline (hard), population genetics (LD, hard), functional variants / SnpAnnotationPipeline (final catalogue, soft), novel variants discovery (final catalogue, soft).
     28The principal idea, that is, which questions should be addressed (without saying how), is summarized in TrioAwarePhasingPipelineIdea.
    2829
    29 '''Major deliverables'''
    3030
    31  * Novel methods and software
    32  * Improved genotypes
    33  * Phasing information
     31== Workflow ==
     32
     33An automated workflow will be provided on the TrioAwarePhasingPipelineWorkflow page.
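
The "Proposed solution" paragraph in v2 describes using a first-degree relative to fill in a poorly covered genotype and to detect sequencing errors. A minimal sketch of that idea for a single biallelic site in a trio is given below. It is an illustration only, not the pipeline's method: genotypes are assumed to be coded as alternate-allele counts (0, 1, 2), and the function names `child_genotypes` and `check_or_fill_child` are hypothetical.

```python
from itertools import product

# Alleles (0 = reference, 1 = alternate) that a parent can transmit,
# keyed by the parent's genotype coded as an alternate-allele count.
TRANSMITTABLE = {0: (0,), 1: (0, 1), 2: (1,)}


def child_genotypes(father, mother):
    """Return the set of child genotypes consistent with Mendelian inheritance."""
    return {f + m for f, m in product(TRANSMITTABLE[father], TRANSMITTABLE[mother])}


def check_or_fill_child(father, mother, child=None):
    """Fill in a missing child genotype or flag a Mendelian inconsistency.

    `child=None` stands for a genotype that could not be called
    (for example, because of low coverage).
    """
    possible = child_genotypes(father, mother)
    if child is None:
        if len(possible) == 1:
            # The parents determine the child's genotype uniquely.
            return next(iter(possible)), "imputed"
        return None, "ambiguous"
    if child in possible:
        return child, "ok"
    # The called genotype cannot arise from these parents: likely a calling error.
    return child, "mendelian_error"
```

For example, `check_or_fill_child(0, 2)` fills in a heterozygous child, while `check_or_fill_child(0, 0, 1)` flags the called heterozygote as inconsistent with both parents being homozygous reference. Real data would also need genotype qualities and phasing across neighbouring sites to resolve the ambiguous cases, which this sketch does not attempt.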