Advice on reusing a standard set of reference samples in multiple CNV analyses
It is recommended that all samples processed in a VarSome Clinical CNV analysis are sequenced on the same instrument, with the same experimental methodology, and in the same batch. However, for practical and cost-related reasons, it may not be possible to re-sequence control samples in every batch, or each time one or more test samples require sequencing. In that case, a set of previously sequenced control samples can be re-used, although the resulting CNV calls are likely to be less accurate.
In-house benchmarking studies have attempted to quantify this loss in performance using a dataset of 96 samples (66 samples with CNV calls and 30 controls). Comparing a complete CNV run against multiple runs that used a fixed, limited pool of 10 control samples from which references were selected showed a single-digit drop in sensitivity and a small number of changes in CNV limits/breakpoints.
It is important to always check which samples were selected as references, and how many, in the analysis QC report, available from the Analysis Actions Menu. This is because the reference samples should not carry the same CNV as the test sample being queried; otherwise that CNV will not be called. Ideally, each test sample analyzed should have 5-10 reference samples of high correlation (>0.97 for panels, >0.98 for exomes). For further details on assessing CNV run and call quality, users may consult CNV Quality Control: tools and guidelines.
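As a rough illustration of the reference-set guideline above, the following Python sketch counts how many selected references exceed the correlation threshold for a given assay type and flags test samples that fall below the suggested minimum of 5. It is not part of VarSome Clinical: the function name, the data layout, and the example correlation values are all hypothetical, and it assumes the correlations have been copied by hand from the analysis QC report.

```python
from dataclasses import dataclass

# Thresholds taken from the guideline above; the constant names are hypothetical.
MIN_CORRELATION = {"panel": 0.97, "exome": 0.98}
MIN_REFERENCES = 5  # lower end of the suggested 5-10 range


@dataclass
class ReferenceCheck:
    n_high_correlation: int  # references above the assay threshold
    passes: bool             # True if at least MIN_REFERENCES qualify


def check_reference_set(correlations, assay):
    """Count references whose correlation exceeds the threshold for `assay`."""
    threshold = MIN_CORRELATION[assay]
    n_good = sum(1 for r in correlations if r > threshold)
    return ReferenceCheck(n_good, n_good >= MIN_REFERENCES)


# Illustrative values only, not real QC data.
example = [0.991, 0.985, 0.983, 0.979, 0.975, 0.990]
print(check_reference_set(example, "exome"))
print(check_reference_set(example, "panel"))
```

With these made-up values, the same reference set would satisfy the guideline for a panel but not for an exome, which shows why the assay type matters when reading the QC report. The check covers only the count and correlation criteria; whether a reference carries the same CNV as the test sample still has to be assessed separately.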