CNV/SV Variant Table and Cards

CNV/SV Variant Table

The CNV/SV analysis Variant Table contains the following information:


    • Length: the length in bp of the region considered as a structural variation.
    • Call quality: three quality control metrics are collected for CNV analyses starting from FASTQ data (e.g. WES or gene panel). Each CNV call is marked with green ticks and red “X”s, so you can see at a glance which quality checks it has passed and which it has failed. For CNV results of analyses starting from either VCF or WGS data, the first and second metrics show a grey dash. From left to right, the metrics are:
      • Test sample coverage: this quality control metric ensures a minimum coverage of the test sample in the CNV call region when calling duplications. Green: duplications with coverage equal to or higher than the minimum coverage threshold* (please note that all deletions pass this filter and are therefore always green). Red: duplications with coverage lower than the minimum coverage threshold.
      • Number of reference samples: this metric ensures that a minimum number of samples from the reference set reach the minimum coverage threshold* in the CNV call region. Green: at least two reference samples have coverage higher than the minimum coverage threshold in the CNV call region. Red: fewer than two reference samples have sufficient coverage in the CNV call region.
      • CNV call overlapping camouflaged region: this is to check whether an overlap exists between the region of the CNV and the Camouflaged Regions. Camouflaged regions contain duplicated genomic sequences where confidently aligning short reads to a unique location is not possible.  Green: no overlap. Red: overlap with a Camouflaged Region (Ebbert et al., 2019).

 

*Minimum coverage threshold (number of reads): the lower of 10 and the sample median coverage divided by 10.

 

  • User CNV classification: custom classification of CNV variants according to ACMG and AMP rules. For user-submitted VCFs with CNVs, only variants with a copy number value can be manually classified.
  • Estimated Copy Number: estimated copy number of the CNV call, calculated from the ratio of observed to expected reads, assuming a diploid state.
  • Type: type of CNV, either deletion or duplication.
  • Genes: genes overlapping the CNV call region.
  • Number of genes: number of genes overlapping the CNV call region.
  • Quality Score: a measure of statistical support for each CNV call. Specifically, it is the log10 of the likelihood ratio of the data under the CNV call versus the null (normal copy number). The higher the Quality Score, the more confident one can be about the presence of a CNV. While it is difficult to give an ideal threshold, and the scores for short calls may be unconvincing, the most obvious large calls should be easily flagged by ranking them according to this score.
    • For WGS CNV analyses, the Quality Score is given by Delly for single-sample analyses, or by ExomeDepth for multi-sample analyses.
    • For CNVs from VCF, provided for annotation only, the Quality Score displays the QUAL value from the VCF (if included).

  • ACMG CNV class and CNV rules: the ACMG CNV classification and the set of triggered ACMG rules. These rules are displayed in clickable bubble icons that include the rule’s description and explanation for triggering.
  • Number of exons: number of exons overlapping the CNV region.
  • Reads expected, reads observed and reads ratio: these columns contain, for each CNV call, the number of reads expected, the number of reads observed, and the read ratio. The read ratio is calculated by dividing the number of observed reads by the number of expected reads. Since the number of expected reads is calculated from the reference set of samples, it is important to have an appropriate reference set, as mentioned in the QC report section. Given a good reference set of samples that correlate well with each other, and with all three call quality metrics passing the filters, the read ratio can be used to rank the variants according to the strength of the signal.

  • Frequency: frequency of overlapping CNVs in the same genomic region. The gnomAD database is used to get the general population frequencies for a given structural variant. Depending on the type of variant, the frequencies are calculated as follows:
    • Deletions: we use gnomAD variants if they fully overlap with the given variant.
    • Duplications in coding regions: we compare at the gene level, and we use those gnomAD variants that encompass the same coding genes as the given variant.
    • Duplications in non-coding regions: we use gnomAD variants if they cover at least 85% of the variant region.
  • Cytoband: the cytoband of each CNV is displayed. Long CNVs spanning more than one cytoband are displayed as a range.
  • Zygosity: this column is displayed for analyses conducted with Delly (WGS single sample CNV analysis).
  • Delly precision: this column is displayed for analyses conducted with Delly (WGS single sample CNV analysis). It takes two possible values: “PRECISE” or “IMPRECISE”. The Delly caller leverages both read depth and split read support to call CNVs. CNVs called based only on read depth data are IMPRECISE, while calls based on both read depth and split reads are PRECISE.
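
To make the relationship between the reads ratio, Estimated Copy Number and Quality Score columns more concrete, here is a minimal sketch. The read ratio and the log10 likelihood ratio follow the definitions given above. The copy number formula (ploidy multiplied by the read ratio) is an assumption about how "assuming a diploid state" translates into a number, and the function names are hypothetical.

```python
import math


def read_ratio(reads_observed: float, reads_expected: float) -> float:
    """Observed reads divided by expected reads for one CNV call."""
    if reads_expected <= 0:
        raise ValueError("reads_expected must be positive")
    return reads_observed / reads_expected


def estimated_copy_number(ratio: float, ploidy: int = 2) -> float:
    """Assumed formula: scale the read ratio by the diploid baseline."""
    return ploidy * ratio


def quality_score(likelihood_cnv: float, likelihood_null: float) -> float:
    """log10 of the likelihood of the data under the CNV call vs the null."""
    return math.log10(likelihood_cnv / likelihood_null)
```

Under these assumptions, a call with 50 observed reads against 100 expected reads has a read ratio of 0.5, which corresponds to an estimated copy number of 1 (a heterozygous deletion).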

CNV/SV Cards

  • Genes: the gene information for all the genes overlapping the CNV region is available at the right side of the window under the “Gene” option.


  • CNV Details: summary information about the selected variant (position, type, overlapping genes, etc.).
  • Sample View: Sample's region browser which presents information about the overlapping transcripts in the CNV region, conservation scores per position and SNVs of the sample. For further information please refer to section Sample View.
  • Transcripts: A list of all the affected transcript positions that overlap with each CNV is displayed on the right of the Variant Table, under the "Transcripts" tab. Transcripts can be filtered based on coding status and/or gene name. 

 

  • CNV Classification: In this tab we show the ACMG classification for each CNV and the set of triggered ACMG rules. Click on “Show full detail” to find out the criteria not met.
  • Publications: publications related to the selected CNV or genes, sourced from PubMed as well as from VarSome community users, and verified by our curation team. Variants, diseases, phenotypes, chemical compounds and drugs (if present) are tagged by our internal AI tool.
  • CNV Browser: an interactive browser showing a wider region around the position of the CNV call, as well as its location at the chromosome level. The user can zoom in and out using the mouse scroll wheel and select among different chromosomes, genomic positions, samples and CNV calls. Data points represent read ratios (observed/expected read counts). They are colored blue if they fall within the purple-shaded area (the 95% confidence interval) and red if they do not. The genomic location of the call is indicated by coordinates and annotated with overlapping gene structures (exons/introns). The coverage track, at the bottom of the interactive plot, shows the trend of the coverage on a logarithmic or linear scale across all cohort samples. Hovering over the browser displays useful CNV call information, including genomic location and span, as well as links to the same region in other analyses of the same cohort. You can find further information regarding CNV visualization.





  • CNV plot: we provide a CNV plot, showing how the observed read depth in the area of the CNV differs from the expected. The CNV plots are generated using a modified version of the ExomeDepth tool. You can find further information regarding CNV visualization.
  • Known CNVs: we display only the relevant CNVs for the classification according to the following criteria: 
    • CNV deletions: for gnomAD variants, we retain those that fully overlap with the given CNV. For CNVs coming from clinical sources (DECIPHER, dbVar, ClinVar CNVs), we use the overlapping CNVs if they are benign and the contained CNVs if they are pathogenic.
    • CNV duplications: we keep only the CNVs encompassing the same coding genes. If the CNV is non-coding, we retain the CNVs that have at least 85% overlap.
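
The population-frequency matching criteria above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual implementation: the `Cnv` class and the function names are hypothetical, coordinates are treated as 1-based and inclusive, "fully overlap" is read as "the known CNV spans the whole given CNV", "the same coding genes" is read as an identical gene set, and only known CNVs of the same type are compared. The separate benign/pathogenic rule for clinical sources is not covered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cnv:
    chrom: str
    start: int                      # 1-based, inclusive (assumed convention)
    end: int                        # inclusive
    cnv_type: str                   # "deletion" or "duplication"
    coding_genes: frozenset = frozenset()


def overlap_fraction(query: Cnv, known: Cnv) -> float:
    """Fraction of the query CNV region covered by the known CNV."""
    if query.chrom != known.chrom:
        return 0.0
    overlap = min(query.end, known.end) - max(query.start, known.start) + 1
    return max(overlap, 0) / (query.end - query.start + 1)


def is_relevant(query: Cnv, known: Cnv, min_fraction: float = 0.85) -> bool:
    """Decide whether a known CNV is kept for the given (query) CNV."""
    if query.cnv_type != known.cnv_type:
        return False
    if query.cnv_type == "deletion":
        # Deletions: the known CNV must span the whole query region.
        return (query.chrom == known.chrom
                and known.start <= query.start
                and query.end <= known.end)
    if query.coding_genes:
        # Coding duplications: compare at the gene level.
        return query.coding_genes == known.coding_genes
    # Non-coding duplications: at least 85% of the query region is covered.
    return overlap_fraction(query, known) >= min_fraction
```

For instance, a non-coding duplication at positions 100 to 199 is matched by a known duplication at 110 to 250 (90% of the region covered), but not by one at 120 to 250 (80% covered).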

Warnings are displayed under the Variant Table to inform the user about the reliability of the sample's CNV calls in two cases: (1) when the correlation of the sample to its reference samples is low, and (2) when the number of reference samples is low.

Searching through CNV results

As you inspect the CNV results of your sample, you can search for a known SNV or small INDEL, or one previously detected in the main analysis of the sample, and see whether it overlaps with any detected CNV.
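
Conceptually, this search is an interval-overlap check between the small variant and each CNV call. The sketch below illustrates the idea for exported results. The dictionary keys and the function name are hypothetical, and coordinates are assumed to be 1-based and inclusive.

```python
def overlapping_cnvs(cnv_calls, chrom, start, end=None):
    """Return the CNV calls that overlap an SNV (start only) or a small INDEL."""
    end = start if end is None else end
    return [
        call for call in cnv_calls
        if call["chrom"] == chrom
        and call["start"] <= end
        and start <= call["end"]
    ]


# Example with made-up calls:
calls = [
    {"chrom": "1", "start": 1000, "end": 5000, "type": "deletion"},
    {"chrom": "2", "start": 200, "end": 900, "type": "duplication"},
]
hits = overlapping_cnvs(calls, "1", 1500)
```

Here `hits` contains only the deletion on chromosome 1, because position 1500 falls inside its region.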

Reads alignment visualization for CNVs

You can view the alignment of the reads in the regions of the detected CNVs in JBrowse.

Once you have selected a variant in the Variant Table, you can see the alignment of the reads by clicking on the JBrowse icon at the top left of the screen. The CNV call region is highlighted in yellow.

Browsing through the samples of a CNV analysis

You can browse through the samples analyzed under the same CNV/SV analysis by opening the results page of one sample and using the red arrows to move to the next or previous sample.

You also have the option to download your filtered CNV results, as is possible for SNP/small INDEL analyses, from the upper right corner of the Variant Table.