This document describes how to monitor the storage space you use in VarSome Clinical and how to reduce it.
When running analyses on VarSome Clinical, the different types of data, such as FASTQ files, BAM/BAI files, or the classification/annotation results, are stored in data centers managed by Saphetor or on third-party clouds (e.g. GCP, AWS). Storing these data incurs direct costs to Saphetor, which are then passed through to its customers. Since Saphetor doesn’t make any profit on storage, it is in everyone’s best interest to minimize the storage space used and the associated storage costs.
In this document, "results" refers to the analysis results table where users can view the classification/annotation of the variants on VarSome Clinical (see example below).
If you purchase VarSome Clinical services from a local distributor, and not directly from Saphetor, they will set up and manage your storage preferences. Please contact your distributor to check and edit your settings.
Storage fees are calculated at the end of each calendar month, based on the volume of data stored in VarSome Clinical for your organization. They reflect the storage costs of Saphetor’s data centers or cloud providers. There is no storage fee for on-premises installations.
The volume of data stored for your organization is defined as all the files you have uploaded into VarSome Clinical to run analyses, plus the results of the analyses you have run, that have not yet been deleted or archived. If a new sample (VCF or FASTQ file) has been analyzed during a given month, we will only charge storage fees from the day the analysis was launched in VarSome Clinical. We don’t charge storage for VCF or FASTQ files before the analysis is launched. If the uploaded VCF or FASTQ files are not analyzed within 30 days, they will be deleted. More information about your storage fees and your volume of GB stored is available at "How are my storage fees calculated?".
Saphetor is not responsible for monitoring or managing the storage costs of any customer. This should be done by each end customer, possibly in coordination with the local distributor.
Managing the storage space
There are 5 different storage options, allowing you to limit the volume of sample data stored, and hence the related costs:
- Option 1: Keep all the data.
  - The Storage Fee applies to all the data available in the client’s account. This is the default option, with no time limit, for all VarSome Clinical customers who enrolled on our platform before September 20th 2024.
  - CNV analyses can be run.
  - Algorithmic filters, gene lists and dynamic filters can be used.
  - Read alignment visualization (JBrowse and IGV) can be used.
- Option 2: Remove FASTQ; keep the BAM and results.
  - Space occupied is approximately 25% lower than option 1.
  - CNV analyses cannot be run anymore.
  - Algorithmic filters, gene lists and dynamic filters can be used.
  - Read alignment visualization (JBrowse and IGV) can be used.
- Option 3: Remove BAM; keep the FASTQ and results.
  - Space occupied is approximately 50% lower than option 1.
  - CNV analyses cannot be run anymore.
  - Algorithmic filters, gene lists and dynamic filters can be used.
  - Read alignment visualization (JBrowse and IGV) cannot be used.
- Option 4: Remove all the raw data (FASTQ and/or BAM); keep the results only.
  - Space occupied is approximately 75% lower than option 1, as the annotation and classification results do not take up much space.
  - CNV analyses cannot be run anymore.
  - Algorithmic filters, gene lists and dynamic filters can be used.
  - Read alignment visualization (JBrowse and IGV) cannot be used.
- Option 5: Archive the sample.
  - No Storage Fee will be charged anymore.
  - Sample results/annotations are not available for browsing in VarSome Clinical.
  - CNV analyses cannot be run.
  - Algorithmic filters, gene lists and dynamic filters cannot be used.
  - Read alignment visualization (JBrowse and IGV) cannot be used.
  - The archived sample is still cross-referenced with other samples, and sample links are still shown in the tab with cross-referenced samples, along with the patient's phenotypes.
  - Custom Variant Classifications set up initially for the archived sample will remain available when browsing other active analyses. The phenotypes assigned to the archived sample are still available in the sample information.
  - All sample data (FASTQ, BAM/BAI and results) are deleted and are no longer available. Only the VCF file is stored, for possible sample re-annotation when unarchiving the sample.
  - The client may unarchive the sample. When doing so, the re-annotation fees (50% of the normal fees) will apply. For more details, please contact sales@varsome.com.
  - Non-token based users can unarchive their samples. VarSome Clinical will annotate the sample (VCF file) against the latest annotation data, which may cause annotation and classification differences between the original sample and the unarchived sample. Un-archiving an analysis will be charged as a re-annotation. Your archiving settings will use the un-archiving date as the new creation date.
Options 1 to 4 can be used for a certain amount of time (in days, months or years), based on the specific needs of the customer. The minimum time period is 1 day.
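The trade-offs between the five options can be summarized in a small lookup table. The Python sketch below is illustrative only: the class, field, and function names are hypothetical and are not part of VarSome Clinical, and the space fractions are the approximate figures quoted above, not exact billing values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageOption:
    label: str
    billed_fraction: float    # approximate billed space relative to option 1
    cnv_analysis: bool        # can CNV analyses be run?
    filters: bool             # algorithmic filters, gene lists, dynamic filters
    read_alignment_view: bool  # JBrowse / IGV read alignment visualization


# Values transcribed from the option list above. Option 5 uses 0.0 because no
# storage fee is charged, even though the VCF file is still kept.
STORAGE_OPTIONS = {
    1: StorageOption("Keep all the data", 1.00, True, True, True),
    2: StorageOption("Remove FASTQ; keep BAM and results", 0.75, False, True, True),
    3: StorageOption("Remove BAM; keep FASTQ and results", 0.50, False, True, False),
    4: StorageOption("Keep results only", 0.25, False, True, False),
    5: StorageOption("Archive the sample", 0.00, False, False, False),
}


def estimate_billed_gb(full_gb: float, option: int) -> float:
    """Rough estimate of billed volume, given the volume under option 1."""
    return full_gb * STORAGE_OPTIONS[option].billed_fraction


if __name__ == "__main__":
    for number, opt in STORAGE_OPTIONS.items():
        print(number, opt.label, estimate_billed_gb(100.0, number))
```

The actual volume depends on your files, so treat these numbers as an order of magnitude when comparing options.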
Setting up VarSome Clinical storage preferences
Effective September 20th 2024, new VarSome Clinical accounts are created with the following default storage preferences:
- Keep FASTQ and BAM files for 1 month
- Keep results for months 1 to 3
- Archive the sample after 3 months
If these default settings are not changed, data will be deleted at the end of the set time period and will not be retrievable.
Please treat this as an important matter: your data may be lost once the default storage period in VarSome Clinical has expired.
If you wish to change these default storage preferences, your designated VarSome Clinical Group Supervisor will have to modify them from your VarSome Clinical account, as per the instructions on this page.
They can be modified within VarSome Clinical to define for how long the customer wants to store the FASTQ and BAM/BAI files and when their analyses should be archived.
Only the group supervisor (one per account) has permission to modify these preferences.
If you do not have a group supervisor yet, please contact support@varsome.com and let them know who should be your group supervisor.
In the VarSome Clinical platform, the group supervisor can hover over their username at the top right of the window, then select "Preferences":
The group supervisor will then be able to set up the storage preferences for each type of file (FASTQ vs BAM/BAI):
This shows the time periods from the day of the analysis.
Example 1:
The customer selects “keep BAM/BAI for 3 months” + “keep FASTQ for 9 months” + “archive sample after 9 months”.
- Months 1 to 3: all data (BAM/BAI, FASTQ and annotations/classification) will be available in the client’s account and will be invoiced. This corresponds to storage option 1.
- Months 4 to 9: only FASTQ files and annotations/classification will be available. This corresponds to storage option 3. Monthly storage fees should be approximately 50% of those of the previous months.
- Month 10 onwards: samples will be archived. This corresponds to storage option 5. No storage fees will be charged anymore.
Here is how this selection should be made in VarSome Clinical Preferences:
Example 2:
The customer purchases a token with 1 year of storage included and selects “keep FASTQ and BAM/BAI for 1 year” + “archive sample after 2 years”.
- Year 1: all data (BAM/BAI, FASTQ and annotations/classification) will be available in the client’s account. This corresponds to storage option 1. No storage fees will be invoiced during that first year since they were already included in the token price charged upfront.
- Months 13 to 24: only annotations/classification will be available. This corresponds to storage option 4.
- Month 25 onwards: sample will be archived. This corresponds to storage option 5. No storage fees will be charged anymore.
Here is how the selection should be made in VarSome Clinical Preferences:
Example 3:
The customer selects “keep FASTQ for 1 day” + “keep BAM/BAI for 1 day” + “archive sample after 1 day”.
- Day 1: all data (BAM/BAI, FASTQ and annotations/classification) will be available in the client’s account and will not be invoiced. This corresponds to storage option 1.
- Day 2: samples will be archived. This corresponds to storage option 5. No storage fees will be charged.
This is the cheapest option for storage fees, but you have to complete your work in 1 day.
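The three examples above follow the same rule: the retention periods chosen for FASTQ, BAM/BAI and archiving determine which storage option applies at any point after the analysis. The Python sketch below is a minimal illustration of that rule. The function name is hypothetical, and it assumes that periods are counted from 1 starting on the day of the analysis and that each retention period is inclusive; VarSome Clinical may compute the exact cut-off dates differently.

```python
def storage_option_for_period(period: int, keep_fastq: int, keep_bam: int,
                              archive_after: int) -> int:
    """Return the storage option (1 to 5) in effect during a given period.

    All arguments use the same unit (days, months or years) and `period`
    is 1-based: period 1 is the first day/month/year after the analysis.
    """
    if period > archive_after:
        return 5  # sample archived
    has_fastq = period <= keep_fastq
    has_bam = period <= keep_bam
    if has_fastq and has_bam:
        return 1  # all data kept
    if has_bam:
        return 2  # FASTQ removed, BAM and results kept
    if has_fastq:
        return 3  # BAM removed, FASTQ and results kept
    return 4      # results only


if __name__ == "__main__":
    # Example 1: BAM/BAI for 3 months, FASTQ for 9 months, archive after 9 months.
    for month in (1, 3, 4, 9, 10):
        print(month, storage_option_for_period(month, keep_fastq=9, keep_bam=3,
                                               archive_after=9))
```

Running the sketch with the settings of Example 2 (12, 12 and 24 months) or Example 3 (1 day for each setting) reproduces the timelines described above.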
WARNING: Please bear in mind that updating the storage preferences might lead to the immediate deletion of data. For example, if you select to keep FASTQ and BAM/BAI files for one year, all the FASTQ and BAM/BAI files of samples analyzed more than a year ago will be removed. If BAM/BAI files are deleted, functionality that depends on those files, e.g. read alignment visualization (JBrowse/IGV), will no longer be available.
A warning will be displayed in VarSome Clinical when updating your preferences.
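Before changing your preferences, it can help to check which samples are already older than the new retention period, since their files would be removed as soon as the change is applied. The Python sketch below assumes you keep your own list of sample names and analysis dates (for example, exported to a spreadsheet). The function name and the sample data are hypothetical, and one year is approximated as 365 days.

```python
from datetime import date, timedelta


def samples_past_retention(analysis_dates: dict[str, date], keep_days: int,
                           today: date) -> list[str]:
    """Return the samples analyzed more than `keep_days` days before `today`."""
    cutoff = today - timedelta(days=keep_days)
    return sorted(name for name, analyzed in analysis_dates.items()
                  if analyzed < cutoff)


if __name__ == "__main__":
    # Hypothetical samples, for illustration only.
    samples = {
        "sample_A": date(2023, 1, 10),
        "sample_B": date(2024, 6, 1),
    }
    # Samples whose FASTQ and BAM/BAI files would be removed if the retention
    # period were set to one year on 20 September 2024.
    print(samples_past_retention(samples, keep_days=365, today=date(2024, 9, 20)))
```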
Deleting / archiving / un-archiving data
Within VarSome Clinical, it is also possible to delete FASTQ and BAM files immediately, or to archive analyses, without waiting for the time period defined in the storage preferences.
For each single sample analysis that was launched from a FASTQ file, the user can click on the three horizontal bars on the right side of the sample name. They will then find the following options:
Delete FASTQ/BAM files and Archive sample data
Un-archive sample data