Launching new analysis

To start a new analysis, click on 'Launch analysis', then select one of the available options:

  • Analysis from FASTQ
  • Analysis from VCF
  • Tumour - Normal analysis

From FASTQ

Fill in the form:

  • Sample Identifier: Write the name of your sample here. The sample ID must not contain information about patient identity.
  • Description: A description of your analysis. This field is optional.
  • Select files to use: Click on "Select File(s) to use" to choose the sample file(s) that you want to analyse. For paired-end data you will usually select two FASTQ files. FASTQ files must be compressed and have the .gz extension.
  • Phenotype and Disease Information: A description of your sample phenotype/disease using HPO terms. These terms can be used to create a gene list and to filter the variant analysis for the genes matching the selected phenotypes and/or diseases.
  • Launch: Choose whether to analyse i) a single sample; ii) several independent samples (each is shown as a separate analysis; they are simply launched together for convenience); or iii) a multi-sample analysis, which runs a joint variant calling tool on all samples at once. Joint calling can increase sensitivity and decrease the false positive rate compared with analysing each sample separately. The results of all samples are displayed together in a single table (e.g. for a family trio).
  • Sample type: Are you analysing data from Germline (e.g. blood sample) or from a Tumour (e.g. biopsy)? Different pipelines will be used for each option.
  • Capture/Amplicon Kit: Select the capture, amplicon or whole genome library preparation method or kit corresponding to your analysis. This information is important if you are uploading FASTQ files: the capture/amplicon kit details are used to calculate the coverage of the coding regions included in the kit, which is shown later in the Quality Control (QC) Report. If you do not know which kit was used to prepare the capture/amplicon library, choose "generic other capture kit" or "generic other amplicon kit".
  • Ethnicity: If needed, specify one of the populations defined by gnomAD. For example, when European is chosen, the allele frequency shown in the final results corresponds to the European population.
  • Sequencer: The Illumina sequencer model (e.g. MiSeq, HiSeq 2500 ...) used to produce the data. This information is reported later in the Quality Report and in the Sample information, and can be useful, for example, when writing the diagnostic report.
  • Reference Genome: Choose the reference genome the reads will be aligned to.
  • Sample usage: This option is only available if you have access to both Clinical and Research accounts.
  • Variant list: By default, only the high-quality variants are reported. If you also need the variants that did not pass the quality filters, choose "Variant list will contain all variants". Selecting this option increases the number of annotated variants and could slow down the analysis.
  • Full or Gene list analysis: Do you need to restrict your in-silico variant analysis to a gene list? If you choose "Gene list analysis", the final results will only contain variants found in gene transcripts contained in that list (500 bp upstream and downstream of each transcript are included).
  • Click Start to launch your analysis.
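
The form requires FASTQ files to be gzip-compressed and to carry the .gz extension. As a convenience, the following is a minimal local sketch for checking files before selecting them. It is not part of the platform: the function names `is_gzipped` and `ensure_gz` are hypothetical, and the only assumption made about the upload form is the .gz requirement stated above.

```python
import gzip
import shutil
from pathlib import Path

GZIP_MAGIC = b"\x1f\x8b"  # every gzip stream starts with these two bytes


def is_gzipped(path: Path) -> bool:
    """Return True if the file starts with the gzip magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def ensure_gz(path: Path) -> Path:
    """Return the path of a gzip-compressed file whose name ends in .gz.

    A file already named *.gz is returned unchanged after a content check.
    Any other file is compressed next to the original as <name>.gz.
    """
    if path.suffix == ".gz":
        if not is_gzipped(path):
            raise ValueError(f"{path} is named .gz but is not gzip-compressed")
        return path
    target = path.with_name(path.name + ".gz")
    with open(path, "rb") as src, gzip.open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return target
```

For a paired-end sample, calling `ensure_gz` on each of the two read files (for example `sample_R1.fastq` and `sample_R2.fastq`, names chosen here for illustration) yields the two .gz files to select in the form.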

From VCF

Fill in the form:

  • Sample information: Write the name of your sample here. The sample ID must not contain information about patient identity.
  • Description: A description of your analysis. This field is optional.
  • Select files to use: Click on "Select File(s) to use" to choose the sample file(s) that you want to analyse. For VCF uploads this is usually a single file. Multi-sample VCFs are allowed. The VCF files must be compressed and have the .gz extension.
  • Ethnicity: If needed, specify one of the populations defined by gnomAD. For example, when European is chosen, the allele frequency shown in the final results corresponds to the European population.
  • Sample type: Are you analysing data from Germline (e.g. blood sample) or from a Tumour (e.g. biopsy)? Different pipelines will be used for each option.
  • Reference Genome: Choose the reference genome that was used to create the VCF file.
  • Sample usage: This option is only available if you have both Clinical and Research accounts.
  • Full or Gene list analysis: Do you need to restrict your in-silico variant analysis to a gene list? If you choose "Gene list analysis", the final results will only contain variants found in gene transcripts contained in that list (500 bp upstream and downstream of each transcript are included).
  • Phenotype and Disease Information: A description of your sample phenotype/disease using HPO terms. These terms can be used to create a gene list and to filter the variant analysis for the genes matching the selected phenotypes and/or diseases.
  • Click Start to launch your analysis. Your samples will be uploaded to VarSome’s servers and the analysis will start automatically.
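
The "Gene list analysis" option keeps variants that fall within a listed transcript or within 500 bp upstream or downstream of it. The sketch below illustrates that padding rule only. It is not the platform's implementation: the `Transcript` type, the function name and the 1-based inclusive coordinates are assumptions made for the example.

```python
from typing import Iterable, NamedTuple

PADDING_BP = 500  # flanking region stated in the form description


class Transcript(NamedTuple):
    """Hypothetical transcript record with 1-based inclusive coordinates."""
    chrom: str
    start: int
    end: int


def in_padded_transcript(
    chrom: str,
    pos: int,
    transcripts: Iterable[Transcript],
    padding: int = PADDING_BP,
) -> bool:
    """Return True if pos lies inside any transcript extended by `padding` bp."""
    return any(
        t.chrom == chrom and t.start - padding <= pos <= t.end + padding
        for t in transcripts
    )
```

With a transcript spanning positions 1000 to 2000 on chr1, a variant at position 500 or 2500 is retained under this rule, while one at 499 or 2501 is not.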

Tumour-normal analysis

This will run a tumour-normal analysis, using the normal sample as a control and showing variants found only in the tumour sample. Fill in the fields as described above, giving the information for the tumour and the normal sample in their respective sections.
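
Conceptually, "variants found only in the tumour sample" can be pictured as a set difference between two call sets. The sketch below shows that idea only, with variants represented as hypothetical (chromosome, position, ref, alt) tuples. It is an illustration under that assumption and does not describe the pipeline actually used, since somatic callers typically evaluate both samples together rather than subtracting finished call sets.

```python
from typing import Iterable, List, Tuple

# Hypothetical variant key: (chromosome, position, reference allele, alternate allele)
Variant = Tuple[str, int, str, str]


def tumour_only(tumour: Iterable[Variant], normal: Iterable[Variant]) -> List[Variant]:
    """Return tumour variants that are absent from the normal sample, in input order."""
    normal_set = set(normal)
    return [variant for variant in tumour if variant not in normal_set]
```

A variant present in both samples is treated as germline and dropped, so only tumour-specific calls remain in the returned list.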