Abstract
Whole-exome/genome sequencing (WES/WGS) is poised to become a cornerstone of diagnostic genetic testing in clinical practice, at population scale. The Cloud-e-Genome project, started in late 2013, addresses three architectural requirements in support of WES-based diagnosis, namely (i) scalability of the storage and computing resources required to extract variants from sequences, (ii) flexibility in the design and evolution of WES processing pipelines, and (iii) reproducibility of the results. Our approach involves using a scientific workflow model to program the pipelines for flexibility, deploying the workflows on the Azure cloud for scalability, and recording the provenance of workflow execution for reproducibility. In this discussion paper we elaborate on our design choices, the associated challenges, and the expected benefits.
Original language | English |
---|---|
Title of host publication | 22nd Italian Symposium on Advanced Database Systems, SEBD 2014 |
Publisher | Universita Reggio Calabria and Centro di Competenza (ICT-SUD) |
Pages | 201-208 |
Number of pages | 8 |
ISBN (Electronic) | 9781634391450 |
Publication status | Published - 2014 |
Event | 22nd Italian Symposium on Advanced Database Systems, SEBD 2014 - Castellammare di Stabia, Italy; Duration: 16 Jun 2014 → 18 Jun 2014 |
Publication series
Name | 22nd Italian Symposium on Advanced Database Systems, SEBD 2014 |
---|---|
Conference
Conference | 22nd Italian Symposium on Advanced Database Systems, SEBD 2014 |
---|---|
Country/Territory | Italy |
City | Castellammare di Stabia |
Period | 16/06/14 → 18/06/14 |
Bibliographical note
Publisher Copyright: Copyright © (2014) by Universita Reggio Calabria & Centro di Competenza (ICT-SUD). All rights reserved.
ASJC Scopus subject areas
- Software