| 18 | | Gunzip fasta file. Build BWA index. Tar-gzip the results. |
| 19 | | |
| 20 | | == Split fastq file == |
| 21 | | |
| 22 | | [[Image(splitFastq.png, 50%)]] |
| 23 | | |
| 24 | | Splits a large gzipped fastq file into several smaller files with the Unix command 'split'. The results are uploaded to the directory specified in 'gridOutputDir'. |
| 25 | | |
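The splitting step can be sketched in Python as follows. This is a minimal illustration, not the workflow's own implementation (which uses the Unix command 'split'); the function name `split_fastq`, the `reads_per_chunk` parameter and the `chunk_` file prefix are assumptions made for the example. It keeps each 4-line fastq record whole, which corresponds to calling 'split' with a line count that is a multiple of four.

```python
import gzip
from itertools import islice
from pathlib import Path


def split_fastq(src, out_dir, reads_per_chunk=1000000, prefix="chunk_"):
    """Split a gzipped fastq file into smaller gzipped files.

    Hypothetical helper: every chunk holds at most `reads_per_chunk`
    reads, and 4-line fastq records are never cut in half.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    lines_per_chunk = 4 * reads_per_chunk  # one fastq record = 4 lines
    chunks = []
    with gzip.open(src, "rt") as handle:
        while True:
            # Read the next block of whole records from the input.
            block = list(islice(handle, lines_per_chunk))
            if not block:
                break
            target = out_dir / f"{prefix}{len(chunks):04d}.fastq.gz"
            with gzip.open(target, "wt") as out:
                out.writelines(block)
            chunks.append(target)
    return chunks
```

The returned list of chunk paths can then be uploaded to the directory given in 'gridOutputDir'.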
| 26 | | == Alignment with BWA on each split file == |
| 27 | | |
| 28 | | [[Image(BWAparam.png, 50%)]] |
| 29 | | |
| 30 | | Runs BWA with adjustable parameter settings. |
| 31 | | * Match sequence reads to a reference database |
| 32 | | * Convert sai to sam |
| 33 | | * Convert sam to bam |
| 34 | | * Sort bam file |
| 35 | | * Index sorted bam file |
| 36 | | * Tar-gzip all results, including the intermediate files |
| 37 | | |
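The steps above can be sketched as a list of commands for one split file. This is an illustration under stated assumptions, not the workflow's actual script: it assumes single-end reads (`bwa samse`), the samtools 0.1.x command syntax (where `samtools sort` takes an output prefix), and default BWA parameters. The function name `bwa_commands` and the file-name scheme are hypothetical.

```python
def bwa_commands(reference, fastq, stem):
    """Build the command lines for aligning one split fastq file.

    Hypothetical helper; assumes single-end reads and samtools 0.1.x syntax.
    Nothing is executed here, the commands are only assembled.
    """
    sai, sam, bam = f"{stem}.sai", f"{stem}.sam", f"{stem}.bam"
    sorted_prefix = f"{stem}_sorted"
    sorted_bam = f"{sorted_prefix}.bam"
    return [
        # Match sequence reads to the reference (writes a .sai file).
        ["bwa", "aln", "-f", sai, reference, fastq],
        # Convert sai to sam.
        ["bwa", "samse", "-f", sam, reference, sai, fastq],
        # Convert sam to bam.
        ["samtools", "view", "-bS", "-o", bam, sam],
        # Sort the bam file (second argument is the output prefix).
        ["samtools", "sort", bam, sorted_prefix],
        # Index the sorted bam file (writes <sorted>.bam.bai).
        ["samtools", "index", sorted_bam],
        # Tar-gzip all results, including the intermediate files.
        ["tar", "-czf", f"{stem}.tar.gz", sai, sam, bam,
         sorted_bam, f"{sorted_bam}.bai"],
    ]
```

Each entry can be passed to `subprocess.run(cmd, check=True)` in order; adjustable BWA parameters would be added to the first command.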
| 38 | | == Merge bam files == |
| 39 | | |
| 40 | | [[Image(MergeIndexSNPcall.png, 50%)]] |
| 41 | | |
| 42 | | * Download all bai, bam, sam and tar.gz files from the gridInputDirectory |
| 43 | | * Extract the tar.gz files if they are present |
| 44 | | * Gunzip the reference file (fasta format) |
| 45 | | * Merge all _sorted.bam files |
| 46 | | * Build index on this merged file |
| 47 | | * Call SNPs and make selection. Output in pileup format. |
| 48 | | * Convert pileup format to bed format |
| 49 | | |
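The last step, converting pileup to bed, can be sketched as below. This is a minimal illustration, not the workflow's own converter: it assumes tab-separated pileup lines whose first two columns are the chromosome and a 1-based position, and it writes only the three mandatory bed columns. The function name `pileup_to_bed` is hypothetical.

```python
def pileup_to_bed(pileup_lines):
    """Yield one 3-column bed line per pileup line.

    Pileup positions are 1-based; bed intervals are 0-based and
    half-open, so position p becomes the interval [p - 1, p).
    """
    for line in pileup_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip empty or malformed lines
        chrom, pos = fields[0], int(fields[1])
        yield f"{chrom}\t{pos - 1}\t{pos}"
```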
| 50 | | == SNP calling with varscan, determine coverage == |
| 51 | | |
| 52 | | [[Image(Coverage_Varscan_BaseCoverage.png)]] |
| 53 | | |
| 54 | | * Creates a pileup file (with samtools pileup -f) and sends the output to Varscan, which calls SNPs, indels and copy number variations. |
| 55 | | * Calculates coverage per 50kbp |
| 56 | | * Calculates coverage per base |
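The relation between the two coverage measures can be sketched as follows. This is an illustration only, not the workflow's own code: it assumes per-base coverage is available as (chromosome, 1-based position, depth) tuples and reports the mean depth per fixed-size window, counting positions without any reads as depth 0. The function name `window_coverage` is hypothetical.

```python
from collections import defaultdict


def window_coverage(per_base, window=50000):
    """Aggregate per-base depth into mean depth per window.

    `per_base` is an iterable of (chrom, pos, depth) with 1-based pos.
    Returns a dict keyed by (chrom, window_start) with the mean depth,
    where uncovered positions in a window count as depth 0.
    """
    totals = defaultdict(int)
    for chrom, pos, depth in per_base:
        totals[(chrom, (pos - 1) // window)] += depth
    return {
        (chrom, index * window + 1): total / window
        for (chrom, index), total in totals.items()
    }
```

Because every window is divided by the full window size, the mean for the last, shorter window of a chromosome is understated; dividing by the actual window length would require the chromosome sizes.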
| | 19 | '''Status:''' Implemented on grid. Source code is made available. |