Universitat Politècnica de Catalunya. Departament d'Estadística i Investigació Operativa
Universitat Politècnica de Catalunya. ADBD - Anàlisi de Dades Complexes per a les Decisions Empresarials
2015-11-15
© The Author 2015. Published by Oxford University Press
Motivation: Designing an RNA-seq study depends critically on its specific goals, technology and underlying biology, which renders general guidelines inadequate. We propose a Bayesian framework to customize experiments so that goals can be attained and resources are not wasted, with a focus on alternative splicing. Results: We studied how read length, sequencing depth, library preparation and the number of replicates affect the cost-effectiveness of single-sample and group comparison studies. Optimal settings varied strongly according to the target organism or tissue (potential 50–500% cost cuts) and, interestingly, short reads outperformed long reads for standard analyses. Our framework learns key characteristics for study design from the data, and predicts if and how to continue experimentation. These predictions matched several follow-up experimental datasets that were used for validation. We provide default pipelines, but the framework can be combined with other data analysis methods and can help assess their relative merits.
Peer Reviewed
Postprint (published version)
Article
English
UPC subject areas::Mathematics and statistics::Operations research; AMS classification::90 Operations research, mathematical programming
Oxford University Press
https://academic.oup.com/bioinformatics/article/31/22/3631/241556
http://creativecommons.org/licenses/by-nc-nd/4.0/
Open Access
Attribution-NonCommercial-NoDerivatives 4.0 International