| 14 | | === results === |
| 15 | | Here is all the data that has gone through any kind of processing at UMCG |
| 16 | | * /bam/umcg/ |
| 17 | | ** A4 trio complete bam files |
| 18 | | ** pilot chromosomes 19, 20, X, Y, MT bam files |
| 19 | | * /snp/hg18 |
| 20 | | ** Pilot cleaned-up VCF files from BGI on hg18 (sorted, updated to VCF 4.0) |
| 21 | | * /snp/hg19 |
| 22 | | ** Pilot initial unfiltered calls from UMCG |
| 23 | | ** Lifted-over files from BGI |
| | 5 | === /target/gpfs2/gcc/groups/gonl/sftp/ === |
| | 6 | Root of the SFTP. |
| | 7 | |
| | 8 | === /target/gpfs2/gcc/groups/gonl/sftp/A4 === |
| | 9 | Contains all the information about the A4 test trio, including all the raw and aligned data. |
| | 10 | |
| | 11 | === /target/gpfs2/gcc/groups/gonl/sftp/BGI === |
| | 12 | Contains all the data coming from BGI, including their variant calls. The data is organized by batch in the batchX subfolders. Each of the subfolders typically contains the following: |
| | 13 | * batchX/ |
| | 14 | ** A set of compressed archives containing the plain-text data, together with md5 files, for download. The archives are named as follows: timestamp.BGI.batchX.data_type.hg1X.data_format.tar.bz2. All plain-text data should be available as a compressed archive, including but not limited to: CNV, InDel, InDel annotations, SNP, SNP annotations. Some of these are available in multiple formats; see the BGI data page for more about the BGI data and its formats. |
| | 15 | ** md5 checksum files for all files. |
| | 16 | * batchX/bam OR batchX/alignment |
| | 17 | ** The BAM files aligned by BGI |
| | 18 | * batchX/CNV |
| | 19 | ** CNVs in CNV Detector format. To download the data for all samples, use the compressed archive in batchX/ |
| | 20 | * batchX/indel |
| | 21 | ** InDels in samtools pileup format. To download the data for all samples, use the compressed archive in batchX/ |
| | 22 | * batchX/indel_annotation |
| | 23 | ** InDel annotations in GFF format. To download the data for all samples, use the compressed archive in batchX/ |
| | 24 | * batchX/SNP |
| | 25 | ** SNPs in SOAPsnp format. To download the data for all samples, use the compressed archive in batchX/ |
| | 26 | * batchX/SNP_annotation |
| | 27 | ** SNP annotations in GFF format. To download the data for all samples, use the compressed archive in batchX/ |
| | 28 | * batchX/vcf_format/CNV |
| | 29 | ** CNVs in VCF format. To download the data for all samples, use the compressed archive in batchX/ |
| | 30 | * batchX/vcf_format/indel |
| | 31 | ** InDels in VCF format. To download the data for all samples, use the compressed archive in batchX/ |
| | 32 | * batchX/vcf_format/SNP |
| | 33 | ** SNPs in VCF format. To download the data for all samples, use the compressed archive in batchX/ |
| | 34 | |
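The archive naming pattern listed for batchX/ (timestamp.BGI.batchX.data_type.hg1X.data_format.tar.bz2) can be split into its fields programmatically. The following is a minimal sketch only, assuming that the timestamp, batch and data_format fields contain no dots and that the genome build is always of the form hg1X. The function name `parse_archive_name` and the example file name are hypothetical and do not come from the SFTP itself.

```python
import re
from typing import NamedTuple, Optional


class ArchiveName(NamedTuple):
    """Fields of an archive name following the documented pattern."""
    timestamp: str
    batch: str
    data_type: str
    genome_build: str
    data_format: str


# Assumption: timestamp, batch and data_format contain no dots;
# data_type may (e.g. if a batch uses an unusual name).
_ARCHIVE_RE = re.compile(
    r"^(?P<timestamp>[^.]+)"
    r"\.BGI"
    r"\.(?P<batch>batch[^.]+)"
    r"\.(?P<data_type>.+)"
    r"\.(?P<genome_build>hg1\d)"
    r"\.(?P<data_format>[^.]+)"
    r"\.tar\.bz2$"
)


def parse_archive_name(filename: str) -> Optional[ArchiveName]:
    """Return the parsed fields, or None if the name does not fit the pattern."""
    match = _ARCHIVE_RE.match(filename)
    if match is None:
        return None
    return ArchiveName(**match.groupdict())


if __name__ == "__main__":
    # Hypothetical file name, used only to illustrate the pattern.
    print(parse_archive_name("20110101.BGI.batch1.SNP.hg19.vcf.tar.bz2"))
```

Returning None for names that do not fit, rather than raising, is a deliberate choice here: some names differ from batch to batch, so a caller will probably want to report the mismatches and continue.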
| | 35 | NOTES: |
| | 36 | * Unless specified otherwise, all data is aligned to hg19 |
| | 37 | * Some folder and file names are inconsistent between batches, because the original names found on the BGI HD have been kept. |
| | 38 | |
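Because folder names vary between batches (for example, the BAM files are in batchX/bam OR batchX/alignment), scripts that walk the batches may need to try more than one name. A minimal sketch, assuming only the two folder names listed above exist; the helper `find_bam_dir` is hypothetical:

```python
from pathlib import Path
from typing import Optional

# The two folder names documented for BGI-aligned BAM files.
BAM_DIR_CANDIDATES = ("bam", "alignment")


def find_bam_dir(batch_dir: Path) -> Optional[Path]:
    """Return the first existing BAM subfolder of batch_dir, or None."""
    for name in BAM_DIR_CANDIDATES:
        candidate = batch_dir / name
        if candidate.is_dir():
            return candidate
    return None
```

For example, `find_bam_dir(Path("/target/gpfs2/gcc/groups/gonl/sftp/BGI/batchX"))` returns whichever of the two folders is present for that batch, with batchX replaced by the actual batch folder name.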
| | 39 | === /target/gpfs2/gcc/groups/gonl/sftp/pilot === |
| | 40 | Data from the pilot, including aligned BAMs and SNPs. |
| | 41 | |
| | 42 | === /target/gpfs2/gcc/groups/gonl/sftp/resources === |
| | 43 | GoNL resources tarball (Thanks Freerk!) |
| | 44 | |
| | 45 | === /target/gpfs2/gcc/groups/gonl/sftp/upload === |
| | 46 | Everyone has write permission in this directory. Use it for data exchange. |