Falco: a quick and flexible single-cell RNA-seq processing framework on the cloud.

Yang, Andrian and Troup, Michael and Lin, Peijie and Ho, Joshua W K (2017) Falco: a quick and flexible single-cell RNA-seq processing framework on the cloud. Bioinformatics, 33 (5). pp. 767-769. ISSN 1367-4811 (PP OA)

Abstract

Single-cell RNA-seq (scRNA-seq) is increasingly used in a range of biomedical studies. Nonetheless, current RNA-seq analysis tools are not specifically designed to efficiently process scRNA-seq data due to their limited scalability. Here we introduce Falco, a cloud-based framework that enables parallelization of existing RNA-seq processing pipelines using the big data technologies Apache Hadoop and Apache Spark, for massively parallel analysis of large-scale transcriptomic data. Using two public scRNA-seq datasets and two popular RNA-seq alignment/feature quantification pipelines, we show that the same processing pipeline runs 2.6 to 145.4 times faster using Falco than on a highly optimized standalone computer. Falco also allows users to utilize low-cost spot instances of Amazon Web Services, providing a ∼65% reduction in cost of analysis.

AVAILABILITY AND IMPLEMENTATION

Falco is available via a GNU General Public License at https://github.com/VCCRI/Falco/

CONTACT

j.ho@victorchang.edu.au

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Item Type: Article
Subjects: R Medicine > R Medicine (General)
Depositing User: Repository Administrator
Date Deposited: 02 Jan 2017 23:44
Last Modified: 15 Jan 2018 00:04
URI: https://eprints.victorchang.edu.au/id/eprint/529
