Launching new analysis

To start a new analysis, click on 'Launch analysis'. Germline and somatic analyses are launched from separate options in the user interface. You can select one of the available options:

  • Germline Analysis from FASTQ
  • Germline Analysis from VCF
  • Somatic Analysis from FASTQ (AMP Included)
  • Somatic Analysis from VCF (AMP Included)
  • Tumor - Normal analysis (AMP Included)

 

Analysis from FASTQ

 

Overview of the Launch analysis page:

Uploaded FASTQ files are recognized in pairs (or groups) that share the same filename prefix, and are split into individual samples that are available for launching analyses. The preselected list of samples is displayed in alphabetical order.
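
To check in advance how a set of files is likely to be grouped, the prefix-based pairing described above can be sketched as follows. This is only an illustration: the exact rule the platform uses is not documented here, and the naming convention encoded in `READ_SUFFIX` (an `_R1`/`_R2` or `_1`/`_2` read marker, optionally followed by a chunk number) is an assumption, as are the example filenames.

```python
import re
from collections import defaultdict

# Assumed naming convention (not confirmed by the platform): a read marker
# "_R1"/"_R2" or "_1"/"_2", optionally followed by a chunk number such as
# "_001", and then ".fastq.gz" or ".fq.gz".
READ_SUFFIX = re.compile(r"_R?[12](?:_\d+)?\.(?:fastq|fq)\.gz$")


def group_fastq_by_prefix(filenames):
    """Group FASTQ filenames by the part of the name before the read marker."""
    groups = defaultdict(list)
    for name in filenames:
        # Names that do not match the pattern are kept as their own group.
        prefix = READ_SUFFIX.sub("", name)
        groups[prefix].append(name)
    # Return the samples in alphabetical order, with sorted file lists.
    return {prefix: sorted(files) for prefix, files in sorted(groups.items())}


# Hypothetical filenames, for illustration only.
files = [
    "sampleB_R2.fastq.gz",
    "sampleA_R1.fastq.gz",
    "sampleB_R1.fastq.gz",
    "sampleA_R2.fastq.gz",
]
samples = group_fastq_by_prefix(files)
for sample, members in samples.items():
    print(sample, members)
```

Each key of `samples` corresponds to one entry in the preselected list, and each value holds the files that would be treated as belonging to that sample.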

All listed samples can be removed or edited, except for the first one on the list.

To edit a preselected sample, click on the arrow icon next to it to expand the sample submenu.

 

Click on the rubbish bin icon (🗑) next to a sample to remove it from your analysis.

Click on “Clear preselected samples” to clear all the preselected samples from the launching menu.


  • Germline analysis from FASTQ

  • Select files to use: Click on “Select File(s) to use” to choose the file(s) of the sample(s) that you want to analyse. This is usually two files if you are uploading paired-end FASTQ files. Files must be compressed and have the .gz extension.
  • Sample Identifier: Write the name of your sample here. The sample ID should not contain information about patient identity. 
  • Description: A description of your analysis. This field is optional.
  • Select VCF file with CNVs to use (optional): Along with the FASTQ files of your sample, you can choose a VCF file that contains only CNVs for the same sample to be annotated. The CNV file will be displayed as a sub-analysis under the main analysis of the sample.
  • Phenotype and Disease Information: A description of your sample phenotype/disease using HPO terms. These terms can be used to create a gene list and filter the variant analysis for the genes matching the selected phenotypes and/or diseases. For more details see: Phenotype Matching.
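
Since files must be compressed and carry the .gz extension, it can help to verify this locally before uploading. The sketch below is one possible pre-upload check, not part of the platform: the function name `looks_like_gzipped_fastq` and the demo filenames are hypothetical. It checks the extension, the two-byte gzip signature, and that the first decompressed line starts with `@`, as FASTQ headers do.

```python
import gzip
import tempfile
from pathlib import Path

GZIP_MAGIC = b"\x1f\x8b"  # the first two bytes of any gzip stream


def looks_like_gzipped_fastq(path):
    """Return True if the file ends in .gz, is gzip data and starts with '@'."""
    path = Path(path)
    if path.suffix != ".gz":
        return False
    with open(path, "rb") as handle:
        if handle.read(2) != GZIP_MAGIC:
            return False
    # FASTQ records begin with a header line that starts with '@'.
    with gzip.open(path, "rt") as handle:
        return handle.readline().startswith("@")


# Demo with throwaway files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "sample_R1.fastq.gz"
    with gzip.open(good, "wt") as handle:
        handle.write("@read1\nACGT\n+\nIIII\n")
    # Plain text that was only renamed to .gz, never compressed.
    renamed = Path(tmp) / "sample_R2.fastq.gz"
    renamed.write_text("@read1\nACGT\n+\nIIII\n")
    results = {p.name: looks_like_gzipped_fastq(p) for p in (good, renamed)}

print(results)
```

A file that was merely renamed to `.gz` fails the signature check, which is a common cause of rejected uploads.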


  • Launch: Choose whether to analyse:
    • A single sample.
    • Several independent samples, with the same settings (each of them is shown as a separate analysis; they are only launched together for convenience).
    • Family Trio. Select this option to display the three sample submenus: child, mother and father. You can also run this analysis using a previously analyzed sample (click on “Use an Existing Analysis” and fill in one of your analyzed samples).
    • Couple for carrier risk analysis. Select this option to display the two sample submenus: male and female. You can also run this analysis using a previously analyzed sample (click on “Use an Existing Analysis” and fill in one of your analyzed samples).
    • Generic multi-sample analysis combining several samples (e.g. extended family): You will need to fill in the files and identifiers for the samples that you would like to include in your multi-sample analysis.
    Please note that samples launched together with any of the options involving more than one sample (options ii to v in the list above, i.e. all except a single sample) must be sequenced using the same assay.
  • Assay: Select the capture, amplicon or whole-genome library preparation method or kit corresponding to your analysis. This information is important if you are uploading FASTQ files: the assay’s details are used to calculate the coverage of the coding regions included in the kit, which is shown later in the Quality Control (QC) Report. If you do not know what kit was used to prepare the capture/amplicon library, you should choose "generic other capture kit" or "generic other amplicon kit".
  • Ethnicity: If needed, specify one of the ethnicities proposed by gnomAD. For example, when the European ethnicity is chosen, the allele frequency shown in the final results will correspond to the European population.
  • Sequencer: This is the Illumina sequencer model (e.g. MiSeq, HiSeq 2500 ...) used to produce the data. This information will be reported later in the Quality Report and in the Sample information. It can be useful, for example, when writing the diagnostic report.
  • Reference Genome: Choose the reference genome the reads will be aligned to.
  • Sample usage: This option is only available if you have access to both ClinicalDiagnostic and Research VarSome Clinical accounts. 
  • Variant list: Do you really want to see all the variants? By default, only the high-quality variants are reported. If you also need the variants that did not pass the quality filters, choose "Variant list will contain all variants". Selecting this option increases the number of annotated variants and could slow down the analysis.
  • Full or Gene list analysis: Do you need to restrict your in-silico variant analysis to a gene list? If you choose "Gene list analysis", the final results will only contain variants found in gene transcripts contained in that list (500 bp upstream and downstream of each transcript are included). 
  • Tags: Tags are used to organize your analyses. Use this field to add a tag (tags that don’t already exist will be created) to your analysis. See article Sample tags.

Click Start to launch your analysis.
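
To make the effect of the "Gene list analysis" option concrete, the sketch below shows which variant positions would fall inside a transcript extended by 500 bp on each side. It is an illustration under stated assumptions, not the platform's implementation: the transcript coordinates and gene names are invented, and treating the window boundaries as inclusive is an assumption.

```python
PADDING_BP = 500  # from the text: 500 bp upstream and downstream

# Hypothetical transcript coordinates, for illustration only.
transcripts = [
    {"gene": "GENE_A", "chrom": "chr1", "start": 10_000, "end": 12_000},
    {"gene": "GENE_B", "chrom": "chr2", "start": 50_000, "end": 55_000},
]


def in_gene_list(chrom, pos, transcripts, padding=PADDING_BP):
    """Return True if pos lies within any transcript extended by `padding`."""
    for t in transcripts:
        if t["chrom"] != chrom:
            continue
        # Boundaries are treated as inclusive here (an assumption).
        if t["start"] - padding <= pos <= t["end"] + padding:
            return True
    return False


# Hypothetical variant positions as (chromosome, position) pairs.
variants = [("chr1", 9_600), ("chr1", 9_400), ("chr2", 55_500), ("chr3", 100)]
kept = [v for v in variants if in_gene_list(v[0], v[1], transcripts)]
print(kept)
```

In this example the variant at chr1:9,600 is kept because it lies within 500 bp of the transcript start, whereas chr1:9,400 lies 600 bp away and is excluded.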

  • Somatic analysis from FASTQ (AMP Included)

     

    In the launch page dialogue, most fields are common to germline and somatic analyses; the ones that differ are listed below:

    • Launch: Choose whether to analyse i) a single sample; ii) several independent samples (each of them is shown as a separate analysis; they are only launched together for convenience). Currently, somatic samples cannot be analyzed as multi-sample analyses.
    • Tissue Type: Specify the tissue type of your sample. This field is optional; however, we encourage you to fill it in, as this information is considered for the annotation with Cancer databases.
    • Cancer Type: Specify the sample’s type of cancer. This field is optional; however, we encourage you to fill it in, as this information is considered for the annotation with Cancer databases.
    • Age: Specify the individual’s age (in years). This field is optional; however, we encourage you to fill it in, as this information is considered for the annotation with Cancer databases.
    • Sex: Specify the individual’s sex. This field is optional; however, we encourage you to fill it in, as this information is considered for the annotation with Cancer databases.

    Analyses starting from VCF

    • Germline analysis from VCF

    • Select files to use: Click on “Select File(s) to use” to choose the file(s) of the sample(s) that you want to analyse. Files must be compressed and have the .gz extension.
    • Sample Identifier: Write the name of your sample here. The sample ID should not contain information about patient identity. 
    • Description: A description of your analysis. This field is optional.
    • Select VCF file with CNVs to use (optional): Along with the VCF file of your sample, you can choose a VCF file that contains only CNVs for the same sample to be annotated. The CNV file will be displayed as a sub-analysis under the main analysis of the sample.
    • Phenotype and Disease Information: A description of your sample phenotype/disease using HPO terms. These terms can be used to create a gene list and filter the variant analysis for the genes matching the selected phenotypes and/or diseases. For more details see: Phenotype Matching.
    • Launch: Choose whether to analyse:
      • A single sample.
      • Several independent samples, with the same settings (each of them is shown as a separate analysis; they are only launched together for convenience).
      • Family Trio.
      • Couple for carrier risk analysis.
      • Generic multi-sample analysis combining several samples (e.g. extended family).
    • Assay: Select the capture, amplicon or whole-genome library preparation method or kit corresponding to your analysis. This information is important if you are uploading FASTQ files: the assay’s details are used to calculate the coverage of the coding regions included in the kit, which is shown later in the Quality Control (QC) Report (see Section 5.1 “Variant list” options). If you do not know what kit was used to prepare the capture/amplicon library, you should choose "generic other capture kit" or "generic other amplicon kit".
    • Ethnicity: If needed, specify one of the ethnicities proposed by gnomAD. For example, when the European ethnicity is chosen, the allele frequency shown in the final results will correspond to the European population.
    • Reference Genome: Choose the reference genome the reads will be aligned to.
    • Full, Gene or Region list analysis: Do you need to restrict the variant annotation to a gene list? If you choose "Gene list analysis", the final results will only contain variants found in gene transcripts contained in that list (500 bp upstream and downstream of each transcript are included). 
    • Tags: Tags are used to organize your analyses. Use this field to add a tag (tags that don’t already exist will be created) to your analysis. See article Sample tags.

    Click Start to launch your analysis.

    • Somatic analysis from VCF

    In the launch page dialogue, most fields are common to germline and somatic analyses; the ones that differ are listed below:

    • Launch: Choose whether to analyse i) a single sample; ii) several independent samples (each of them is shown as a separate analysis; they are only launched together for convenience). Currently, somatic samples cannot be analyzed as multi-sample analyses.
    • Tissue Type: Specify the tissue type of your sample. This field is optional; however, we encourage you to fill it in, as this information is considered for the annotation with Cancer databases.
    • Cancer Type: Specify the sample’s type of cancer. This field is optional; however, we encourage you to fill it in, as this information is considered for the annotation with Cancer databases.
    • Age: Specify the individual’s age (in years). This field is optional; however, we encourage you to fill it in, as this information is considered for the annotation with Cancer databases.
    • Sex: Specify the individual’s sex. This field is optional; however, we encourage you to fill it in, as this information is considered for the annotation with Cancer databases.

    Tumor-normal analysis

    A tumor-normal analysis requires a tumor sample and a normal sample as a control, both coming from the same individual. The results only show variants that are present in the tumor sample and absent from the control (AMP annotation is included). Fill in the fields as described above, giving the related information for the tumor and the normal sample in the respective sections.
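
The idea of reporting only variants that are present in the tumor and absent from the control can be illustrated with a simple set difference. This is a conceptual sketch only: it is not how the platform (or any particular variant caller) computes tumor-normal results, and the variant records and the key of chromosome, position, reference and alternate allele are assumptions made for illustration.

```python
def variant_key(variant):
    """Identify a variant by chromosome, position, reference and alternate allele."""
    return (variant["chrom"], variant["pos"], variant["ref"], variant["alt"])


def tumor_only(tumor_variants, normal_variants):
    """Return the tumor variants whose key does not occur in the normal sample."""
    normal_keys = {variant_key(v) for v in normal_variants}
    return [v for v in tumor_variants if variant_key(v) not in normal_keys]


# Hypothetical variant calls from the same individual, for illustration only.
tumor = [
    {"chrom": "chr7", "pos": 1000, "ref": "A", "alt": "T"},
    {"chrom": "chr12", "pos": 2000, "ref": "C", "alt": "G"},
    {"chrom": "chr17", "pos": 3000, "ref": "G", "alt": "A"},
]
normal = [
    {"chrom": "chr12", "pos": 2000, "ref": "C", "alt": "G"},
]

somatic = tumor_only(tumor, normal)
for v in somatic:
    print(v["chrom"], v["pos"], v["ref"], v["alt"])
```

The variant shared by both samples (chr12:2000 C>G in this example) is treated as inherited and dropped, leaving only the calls specific to the tumor.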