Cloud Computing for Genome-wide Association Analysis

With the increasing availability and affordability of genome-wide genotyping and sequencing technologies, biomedical researchers face growing computational challenges in managing and analyzing large quantities of genetic data. Previously, this data-intensive research required computing and personnel resources accessible only to large institutions. Cloud computing allows researchers to analyze their data without a local computing infrastructure. We evaluated the feasibility of cloud computing for association analysis of genome-wide data. Our approach used the MapReduce model, which divides the analysis into independent units and distributes the work to a computing cloud. We evaluated our approach by modeling the relationships between genetic variants and disease in a simulated genome-wide association study. We generated several data sets of 100,000 subjects and varying numbers of genetic variants, and demonstrated that our analysis approach is scalable and provides an attractive alternative to establishing and maintaining a local computing cluster.
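The MapReduce decomposition described in the abstract can be illustrated with a minimal single-machine sketch. It is not the authors' implementation: the allelic chi-square test, the function names (`allelic_chi_square`, `score_variant`, `run_association_scan`), and the top-hits reduce step are all assumptions chosen for illustration. The point is only that each variant can be scored independently (the map step) and the per-variant results then combined (the reduce step); in a cloud setting the map calls would be distributed across workers.

```python
from functools import reduce


def allelic_chi_square(genotypes, is_case):
    """1-df chi-square statistic for a 2x2 allele-by-status table.

    genotypes: minor-allele counts (0, 1, or 2) per subject.
    is_case:   booleans per subject, True for cases.
    """
    n_case = sum(is_case)
    n_ctrl = len(is_case) - n_case
    case_minor = sum(g for g, c in zip(genotypes, is_case) if c)
    ctrl_minor = sum(g for g, c in zip(genotypes, is_case) if not c)
    case_major = 2 * n_case - case_minor
    ctrl_major = 2 * n_ctrl - ctrl_minor
    total = 2 * len(is_case)
    denom = ((case_minor + ctrl_minor) * (case_major + ctrl_major)
             * (2 * n_case) * (2 * n_ctrl))
    if denom == 0:
        # Monomorphic variant or a missing group: no test is possible.
        return 0.0
    diff = case_minor * ctrl_major - case_major * ctrl_minor
    return total * diff ** 2 / denom


def score_variant(item, is_case):
    """Map step: score one variant independently of all others."""
    variant_id, genotypes = item
    return variant_id, allelic_chi_square(genotypes, is_case)


def keep_top_hits(top_k):
    """Reduce step: fold per-variant scores into the top_k strongest hits."""
    def reducer(hits, result):
        ranked = sorted(hits + [result], key=lambda h: h[1], reverse=True)
        return ranked[:top_k]
    return reducer


def run_association_scan(variants, is_case, top_k=10):
    """Run the map step over all variants, then reduce to the top hits."""
    mapped = map(lambda item: score_variant(item, is_case), variants.items())
    return reduce(keep_top_hits(top_k), mapped, [])


# Toy data (hypothetical): 4 cases followed by 4 controls, two variants.
status = [True] * 4 + [False] * 4
variants = {
    "rs_null": [1] * 8,               # same genotype in every subject
    "rs_assoc": [2] * 4 + [0] * 4,    # minor allele seen only in cases
}
print(run_association_scan(variants, status, top_k=2))
```

Because `score_variant` depends only on its own variant and the shared case/control labels, the built-in `map` here could be replaced by any distributed map without changing the reduce logic.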

International Congress on Computer Applications and Computational Science, vol. 1, pp. 377-383, 2011

Conference: 2011 2nd International Congress on Computer Applications and Computational Science, Bali, Indonesia.

James W. Baurley, Christopher K. Edlund, Bens Pardamean
