Launch new analysis

To start a new analysis, click on 'Launch analysis'. You can select one of the available options:

  • Germline Analysis from FASTQ
  • Germline Analysis from VCF
  • Somatic Analysis from FASTQ (AMP Included)
  • Somatic Analysis from VCF (AMP Included)
  • Tumor - Normal analysis (AMP Included)

Analyses starting from FASTQ

Overview of the Launch analysis page:

Uploaded FASTQ files are recognized in pairs (or groups) that share the same filename prefix and are split into individual samples, which are then available for launching analyses. The preselected list of samples is displayed in alphabetical order.
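
The grouping by filename prefix can be pictured with a minimal sketch. The exact rule the platform applies is not documented here, so the read markers below (_R1/_R2 or _1/_2, optionally followed by _001) are an assumption based on common Illumina-style naming, and the function name is hypothetical:

```python
import re
from collections import defaultdict

# Assumed naming convention (not confirmed by the platform): read markers
# such as _R1/_R2 or _1/_2, optionally followed by _001, before .fastq.gz/.fq.gz.
READ_SUFFIX = re.compile(r"_R?[12](?:_\d{3})?\.(?:fastq|fq)\.gz$")


def group_fastq_by_prefix(filenames):
    """Group FASTQ filenames into samples keyed by their shared prefix."""
    samples = defaultdict(list)
    for name in filenames:
        match = READ_SUFFIX.search(name)
        # Files without a recognizable read marker form their own sample.
        prefix = name[: match.start()] if match else name
        samples[prefix].append(name)
    # Sort the samples and the files inside each sample alphabetically.
    return {prefix: sorted(files) for prefix, files in sorted(samples.items())}
```

Under these assumptions, NA12878_R1.fastq.gz and NA12878_R2.fastq.gz would be grouped into one sample named NA12878.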

All listed samples can be removed or edited, except for the first one on the list.

If you would like to edit a preselected sample, click on the arrow icon next to it to expand the sample submenu:

Click on the rubbish bin icon (🗑) next to a sample to remove it if you do not want to include it in your analysis.

Click on “Clear preselected samples” to clear all the preselected samples from the launching menu.

  • Germline analysis from FASTQ

    • Select files to use: Click on "Select File(s) to use" to choose the sample(s) that you want to analyse. This is usually two files if you are uploading paired-end FASTQ files. Files must be compressed and have the .gz extension.
    • Sample Identifier: Write the name of your sample here. The sample ID should not contain information about patient identity. 
    • Description: A description of your analysis. This field is optional.
    • Select VCF file with CNVs to use (optional): Along with the FASTQ files of your sample, you can choose a VCF file that contains only CNVs for the same sample to be annotated. The CNVs file will be displayed as a sub-analysis under the main analysis of the sample.
    • Phenotype names from (optional): To select phenotypes of interest, start typing a term and the associated phenotypes will appear as a list that can be selected. Phenotypes shown in the dropdown list can be limited based on their source. When selecting “All” you will get terms from the HPO, MONDO and OMIM® databases, while when selecting “Only OMIM®” they will be retrieved solely from the OMIM® database. These terms can be used to create a gene list and filter the variant analysis for the genes matching the selected phenotypes. For more details see: Phenotype Matching.

    • Launch: Choose whether to analyse:
      • A single sample.
      • Several independent samples, with the same settings (each of them will be shown as a separate analysis; they are only launched together for convenience).
      • Family Trio. Select this option to display the three sample submenus: child, mother and father. You can also run this analysis using a previously analyzed sample (click on “Use an Existing Analysis” and fill in with one of your analyzed samples).
      • Couple for carrier risk analysis. Select this option to display the two sample submenus: male and female. You can also run this analysis using a previously analyzed sample (click on “Use an Existing Analysis” and fill in with one of your analyzed samples).
      • Generic multi-sample analysis combining several samples (e.g. extended family): You will need to fill in the files and identifiers for the samples that you would like to include in your multi-sample analysis.
      Please note that samples launched together using any option other than "A single sample" must be sequenced using the same assay.
    • Assay: Select the capture, amplicon, whole genome library preparation method or kit corresponding to your analysis. If you are uploading FASTQ files, this is important information. The assay’s details will be used to calculate the coverage of the coding regions included in the kit. This information will be shown later in the Quality Control (QC) Report. If you do not know what kit was used to prepare the capture library, you should choose "Generic capture kit". 
    • Run analysis in targeted mode: Yes/No. In targeted mode, variants are called only in the assay's target regions. In untargeted mode, variants found outside the target regions (off-target sequences) are also included.
    • Ethnicity: If needed, specify one of the ethnicities proposed by gnomAD. E.g. when the European ethnicity is chosen, the allele frequency shown in the final results will correspond to the European population.
    • Sources to be used for mode of inheritance: Decide which source you want to be used to identify the mode of inheritance of a gene, which consequently affects the Germline Variant Classification. “All” includes OMIM®, CGD, GenCC, Gene2Phenotype and ClinGen disease validity, while “only OMIM®” includes only data from OMIM®. If no information about the mode of inheritance is available in the aforementioned databases, the mode of inheritance is taken from Domino.
      The PM2, BS2 and BP1 rules may change between the two modes of annotation because the Germline Variant Classification uses different thresholds to evaluate these rules depending on the mode of inheritance.
    • Sequencer: Select the sequencing technology used to obtain the FASTQ files. Two options are available: Illumina or MGI.
    • Reference Genome: Choose the reference genome the reads will be aligned to (hg19 or hg38). When users run an analysis from FASTQ files on VarSome Clinical, using either hg19 or hg38, any mitochondrial sequences will be aligned to the rCRS.
    • Variant list: Do you really want to see all the variants? If you also need the variants that did not pass the quality filters, then you should choose "Variant list will contain all variants". By default, only the high quality variants will be reported. Selecting "Variant list will contain all variants" will increase the number of annotated variants and could slow down the analysis.
    • Full or Gene list analysis: Do you need to restrict your in-silico variant analysis to a gene list? If you choose "Gene list analysis", the final results will only contain variants found in gene transcripts contained in that list (500 bp upstream and downstream of each transcript are included). 
    • Tags: Tags are used to organize your analyses. Use this field to add a tag (tags that don’t already exist will be created) to your analysis. See article Sample tags.
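
The 500 bp flank used by "Gene list analysis" can be illustrated with a small sketch. It only shows how a transcript region might be padded and how a variant position could be tested against it; the names, the 1-based inclusive coordinates and the example region are assumptions for illustration, not the platform's implementation:

```python
from typing import NamedTuple

PADDING_BP = 500  # flank added on each side of a transcript, as described above


class Region(NamedTuple):
    chrom: str
    start: int  # 1-based, inclusive (assumed convention)
    end: int    # 1-based, inclusive (assumed convention)


def pad_region(region, padding=PADDING_BP):
    """Extend a transcript region by `padding` bp on both sides (start floored at 1)."""
    return Region(region.chrom, max(1, region.start - padding), region.end + padding)


def variant_in_gene_list(chrom, pos, transcripts, padding=PADDING_BP):
    """Return True if the position falls inside any padded transcript region."""
    return any(
        r.chrom == chrom and r.start <= pos <= r.end
        for r in (pad_region(t, padding) for t in transcripts)
    )
```

With a hypothetical transcript at chr7:1000-2000, a variant at position 2500 would be kept, while one at position 2501 would be excluded.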

Click Start to launch your analysis.

  • Somatic analysis from FASTQ (AMP Included)

Most fields in the launch page dialogue are common to germline and somatic analyses; the ones that differ are listed below:

    • Launch: Choose whether to analyse i) a single sample; ii) several independent samples (each of them will be shown as a separate analysis; they are only launched together for convenience). Currently, somatic samples cannot be analyzed as multi-sample analyses.
    • Tissue Type: Specify the tissue type of your sample. This field is optional; however, we encourage you to fill it in, as this information is considered for the annotation with cancer databases.
    • Cancer Type: Specify the sample’s type of cancer. This field is optional; however, we encourage you to fill it in, as this information is considered for the annotation with cancer databases.
    • Age: Specify the individual’s age (in years). This field is optional; however, we encourage you to fill it in, as this information is considered for the annotation with cancer databases.
    • Sex: Specify the individual’s sex. This field is optional; however, we encourage you to fill it in, as this information is considered for the annotation with cancer databases.

Analyses starting from VCF

    • Germline analysis from VCF

      • Select files to use: Click on "Select File(s) to use" to choose the sample(s) that you want to analyse. Files must be compressed and have the .gz extension.
      • Sample Identifier: Write the name of your sample here. The sample ID should not contain information about patient identity. 
      • Description: A description of your analysis. This field is optional.
      • Select VCF file with CNVs to use (optional): Along with the main VCF file of your sample, you can choose a VCF file that contains only CNVs for the same sample to be annotated. The CNVs file will be displayed as a sub-analysis under the main analysis of the sample.
      • Phenotype names from (optional): To select phenotypes of interest, start typing a term and the associated phenotypes will appear as a list that can be selected. Phenotypes shown in the dropdown list can be limited based on their source. When selecting “All” you will get terms from the HPO, MONDO and OMIM® databases, while when selecting “Only OMIM®” they will be retrieved solely from the OMIM® database. These terms can be used to create a gene list and filter the variant analysis for the genes matching the selected phenotypes. For more details see: Phenotype Matching.
      • Launch: Choose whether to analyse:
        • A single sample.
        • Several independent samples, with the same settings (each of them will be shown as a separate analysis; they are only launched together for convenience).
        • Family Trio.
        • Couple for carrier risk analysis.
        • Generic multi-sample analysis combining several samples (e.g. extended family). The VCF files will be merged into a single multi-sample VCF file.
      • Assay: Select the capture, amplicon, whole genome library preparation method or kit corresponding to your analysis. If you are uploading FASTQ files, this is important information. The assay’s details will be used to calculate the coverage of the coding regions included in the kit. This information will be shown later in the Quality Control (QC) Report (see Section 5.1 “Variant list” options). If you do not know what kit was used to prepare the capture library, you should choose "Generic capture kit".
      • Ethnicity: If needed, specify one of the ethnicities proposed by gnomAD. E.g. when the European ethnicity is chosen, the allele frequency shown in the final results will correspond to the European population.
      • Reference Genome: Choose the reference genome the reads will be aligned to. For VCF-based analyses run against hg19, if the VCF file contains variants reported on "chrM" (the name of the mitochondrial sequence in the hg19 genome), then those will be annotated with respect to the NC_001807.4 sequence, the original mitochondrial sequence of hg19. If the variants are instead reported on "MT" (the name of the mitochondrial sequence in the hg38 genome), then they will be annotated with respect to the rCRS.
      • Full, Gene or Region list analysis: Do you need to restrict the variant annotation to a gene list? If you choose "Gene list analysis", the final results will only contain variants found in gene transcripts contained in that list (500 bp upstream and downstream of each transcript are included).
      • Tags: Tags are used to organize your analyses. Use this field to add a tag (tags that don’t already exist will be created) to your analysis. See article Sample tags.
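
To see in advance which mitochondrial reference would apply to an hg19 VCF, the contig names used by its records can be inspected. The sketch below simply encodes the chrM/MT rule described under "Reference Genome"; the function and constant names are hypothetical and not part of any VarSome Clinical API:

```python
# Mapping taken from the Reference Genome description above (hg19 VCF analyses).
MITO_REFERENCE_HG19 = {
    "chrM": "NC_001807.4",  # original hg19 mitochondrial sequence
    "MT": "rCRS",           # revised Cambridge Reference Sequence
}


def mito_reference_for(contig):
    """Return the mitochondrial reference for a contig, or None if not mitochondrial."""
    return MITO_REFERENCE_HG19.get(contig)


def mito_contigs_in_vcf(lines):
    """Collect the mitochondrial contig names used by the records of a VCF."""
    found = set()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        chrom = line.split("\t", 1)[0]  # CHROM is the first tab-separated column
        if chrom in MITO_REFERENCE_HG19:
            found.add(chrom)
    return found
```

The lines can come from a decompressed VCF, for example by iterating over a file opened with gzip.open(path, "rt").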

      Click Start to launch your analysis.

      • Somatic analysis from VCF (AMP Included)

      Most fields in the launch page dialogue are common to germline and somatic analyses; the ones that differ are listed below:

        • Launch: Choose whether to analyse i) a single sample; ii) several independent samples (each of them will be shown as a separate analysis; they are only launched together for convenience). Currently, somatic samples cannot be analyzed as multi-sample analyses.
        • Tissue Type: Specify the tissue type of your sample. This field is optional; however, we encourage you to fill it in, as this information is considered for the annotation with cancer databases.
        • Cancer Type: Specify the sample’s type of cancer. This field is optional; however, we encourage you to fill it in, as this information is considered for the annotation with cancer databases.
        • Age: Specify the individual’s age (in years). This field is optional; however, we encourage you to fill it in, as this information is considered for the annotation with cancer databases.
        • Sex: Specify the individual’s sex. This field is optional; however, we encourage you to fill it in, as this information is considered for the annotation with cancer databases.

Tumor-normal analysis

A tumor-normal analysis requires a tumor sample and a normal sample used as a control, both coming from the same individual. The results only show variants that are present in the tumor sample and absent from the control (AMP annotation is included). Fill in the fields as described above, giving the related information for the tumor and normal samples in the respective sections.
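
Conceptually, the result of a tumor-normal analysis is the set of tumor variants that are not found in the control. The sketch below illustrates only that idea with a plain set difference over (chromosome, position, ref, alt) tuples. Production somatic callers typically rely on statistical models rather than exact matching, so this is an illustration under simplified assumptions and not the method used by the platform:

```python
def tumor_only_variants(tumor, normal):
    """Return variants seen in the tumor sample but not in the matched normal.

    Each variant is a (chrom, pos, ref, alt) tuple; tumor order is preserved.
    """
    normal_keys = set(normal)
    return [variant for variant in tumor if variant not in normal_keys]
```

For example, if both samples carry a hypothetical variant ("chr17", 7579472, "G", "C") and only the tumor carries ("chr12", 25398284, "C", "T"), only the second one would be reported.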