Which reference genome is being used to align the reads?

Reference genome

The hg19 and hg38 reference genomes are used for the standard chromosomes. The preferred reference genome can be selected each time a new analysis is launched.

The following alternate loci were removed from hg19:

  • chr4_ctg9_hap1
  • chr6_apd_hap1
  • chr6_cox_hap2
  • chr6_dbb_hap3
  • chr6_mann_hap4
  • chr6_mcf_hap5
  • chr6_qbl_hap6
  • chr6_ssto_hap7
  • chr17_ctg5_hap1
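
To illustrate what removing these contigs amounts to, here is a minimal Python sketch that copies a FASTA stream while skipping the alternate loci listed above. It is not VarSome Clinical's own code: the function name `drop_alt_loci`, the constant `REMOVED_HG19_ALT_LOCI`, and the toy input sequences are hypothetical, and the sketch assumes each contig name is the first token on its FASTA header line.

```python
import io

# Alternate loci listed above as removed from hg19.
REMOVED_HG19_ALT_LOCI = {
    "chr4_ctg9_hap1",
    "chr6_apd_hap1",
    "chr6_cox_hap2",
    "chr6_dbb_hap3",
    "chr6_mann_hap4",
    "chr6_mcf_hap5",
    "chr6_qbl_hap6",
    "chr6_ssto_hap7",
    "chr17_ctg5_hap1",
}


def drop_alt_loci(fasta_in, fasta_out, excluded=REMOVED_HG19_ALT_LOCI):
    """Copy FASTA records from fasta_in to fasta_out, skipping excluded contigs.

    Hypothetical helper for illustration only.
    """
    keep = True
    for line in fasta_in:
        if line.startswith(">"):
            # Assumption: the contig name is the first whitespace-delimited
            # token after '>' on the header line.
            fields = line[1:].split()
            name = fields[0] if fields else ""
            keep = name not in excluded
        if keep:
            fasta_out.write(line)


# Toy input with made-up sequences; only the contig names matter here.
demo = io.StringIO(">chr6\nACGT\n>chr6_cox_hap2\nTTTT\n>chrM\nGGCC\n")
out = io.StringIO()
drop_alt_loci(demo, out)
print(out.getvalue())
```

In the toy input, the `chr6_cox_hap2` record is dropped while `chr6` and `chrM` pass through unchanged.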

Mitochondrial genome

With regard to the mitochondrial genome, when an analysis is launched in VarSome Clinical from FASTQ sample(s), using either hg19 or hg38, any mitochondrial sequences are aligned to the standard mitochondrial genome (GenBank accession: NC_012920.1), which is the one included in the hg38 human genome. For more details, please see Mitochondrial genome versions.

Pseudoautosomal regions

The pseudoautosomal regions are homologous regions on the X and Y chromosomes. They are identical between the two chromosomes and contain multiple genes. If both copies were available during alignment, reads from these regions could be mis-mapped, leading to erroneous variant calls. For this reason, the pseudoautosomal regions of the Y chromosome are masked to ensure a reliable analysis.
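
As a rough illustration of masking, the Python sketch below replaces given intervals of a sequence with `N` so that an aligner can no longer place reads there. This is an assumption about one common approach (hard-masking), not a description of how VarSome Clinical implements it. The function name `mask_intervals` and the toy sequence and coordinates are hypothetical; real pseudoautosomal coordinates depend on the assembly and should be taken from its official documentation.

```python
def mask_intervals(sequence, intervals, mask_char="N"):
    """Return sequence with each 1-based, inclusive (start, end) interval
    replaced by mask_char.

    Hypothetical helper for illustration only.
    """
    masked = list(sequence)
    for start, end in intervals:
        if start < 1 or end > len(sequence) or start > end:
            raise ValueError(f"invalid interval: ({start}, {end})")
        # Convert 1-based inclusive coordinates to a 0-based Python slice.
        masked[start - 1:end] = mask_char * (end - start + 1)
    return "".join(masked)


# Toy example: made-up 10-base "chromosome" with made-up regions at both ends.
toy_chr_y = "ACGTACGTAC"
toy_regions = [(1, 3), (9, 10)]
print(mask_intervals(toy_chr_y, toy_regions))  # NNNTACGTNN
```

With the Y copy masked in this way, reads originating from either chromosome's pseudoautosomal region can only align to the X copy.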