Accepted input files

VarSome Clinical accepts either of the following input file types for analysis:

  • FASTQ files, from Illumina or MGI sequencers only. Files must be gzip-compressed and have the .fastq.gz extension.

Paired-end reads must be provided in two separate files, e.g. sample_R1.fastq.gz and sample_R2.fastq.gz.

We expect files that conform to Illumina's naming convention. Multiple files are supported, but the names of the two files in each paired-end pair must be identical except for the read number (_N or _RN). When parsing the names of paired-end read files, we look for "foo_1.bar.fastq" and "foo_2.bar.fastq", where foo and bar must be the same for both files of the pair and the only difference is the number. Alternatively, we also handle "foo_R1.bar" and "foo_R2.bar".

  • VCF files in a standards-compliant format, regardless of the sequencing platform. VCF files from the IonTorrent platform are also supported directly. Please check this link for more details about the requirements for submitted VCFs.
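
To check FASTQ file names before uploading, the paired-end naming rule described above can be approximated with a short script. This is only an illustrative sketch, not VarSome Clinical's actual parser: the regular expression, the function name pair_fastq_files and the handling of unmatched files are all assumptions made for the example.

```python
import re
from collections import defaultdict

# Assumed pattern (not the platform's real one): a prefix ("foo"), then
# _1/_2 or _R1/_R2, then an optional suffix ("bar") starting with "." or "_",
# ending in .fastq.gz.
PAIR_RE = re.compile(
    r"^(?P<prefix>.+)_(?P<tag>R?)(?P<read>[12])"
    r"(?P<suffix>(?:[._].*)?)\.fastq\.gz$"
)


def pair_fastq_files(names):
    """Group file names into (read 1, read 2) pairs.

    Returns two sorted lists: the complete pairs, and the names that
    could not be paired (no match, or the mate file is missing).
    """
    groups = defaultdict(dict)
    unpaired = []
    for name in names:
        m = PAIR_RE.match(name)
        if m is None:
            unpaired.append(name)
            continue
        # Two files belong together only if everything except the
        # read number is identical.
        key = (m["prefix"], m["tag"], m["suffix"])
        groups[key][m["read"]] = name

    pairs = []
    for reads in groups.values():
        if "1" in reads and "2" in reads:
            pairs.append((reads["1"], reads["2"]))
        else:
            unpaired.extend(reads.values())
    return sorted(pairs), sorted(unpaired)


if __name__ == "__main__":
    example = [
        "sample_R1.fastq.gz",
        "sample_R2.fastq.gz",
        "foo_1.bar.fastq.gz",
        "foo_2.bar.fastq.gz",
        "lonely_R1.fastq.gz",
    ]
    pairs, unpaired = pair_fastq_files(example)
    print("pairs:", pairs)
    print("unpaired:", unpaired)
```

Any file reported as unpaired either does not follow the expected naming pattern or is missing its mate, and is worth renaming or re-exporting before submission.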