Faculty Use Cases

Below are example use cases from research projects that use AWS at Emory.

Tackling Wilms Tumor

Dr. Andy Hong and his lab are focused on identifying new therapeutic targets in high-risk pediatric cancers, such as Wilms Tumor, the most common kidney cancer in children. While research efforts over the years have identified genetic alterations in Wilms Tumor, only a limited number of patient-derived models and in vitro functional studies exist. Click here to read the full use case.

Big Data Annotation Project 

The team at the Nell Hodgson Woodruff School of Nursing Center for Data Science, led by Dr. Xiao Hu and Del Bold, developed a web platform to host individual data annotation projects. Because the projects share common underlying data structures and a consistent graphical user interface, new ones can be launched quickly. AWS at Emory enabled the team to deploy an internet-facing web application so that data annotators from outside Emory can be recruited to use the tool. Click here to read the full use case.

Emory Clinical Biomarkers Laboratory

The Emory Clinical Biomarkers Laboratory, directed by Dr. Dean P. Jones, operates a high-resolution metabolomics platform to analyze clinical samples, e.g. plasma and tissue samples from HIV patients, in order to identify metabolic biomarkers and improve the understanding of disease mechanisms. Under the technical guidance of Karan Uppal, PhD, the laboratory's Director of Computational Metabolomics, AWS is used to support various data processing workflows, including: 1) data extraction from instrument files using the R packages apLCMS, xcms, and xMSanalyzer; 2) quality evaluation using xMSanalyzer; 3) annotation using xMSannotator; 4) statistical, network, and pathway analysis using R- and Python-based tools. The lab generally uses AWS EC2 spot instances (with low risk of being outbid) for computing to maximize cost savings and uses S3 for data transfer and sharing. Overall, the lab is enjoying the flexibility and scalability of AWS for most of its computation needs.

Emory Integrated Computational Core

Rich Johnston, PhD, director of the Emory Integrated Computational Core (EICC), recently mapped and called 1027 genomes in less than a week with AWS using PEMapper and PECaller. Each of these 30x genomes was about 60 to 100 GB in compressed size, and mapping one would typically take 12 hours on a 64-core workstation. While the on-premises Emory Department of Genetics’ HPC compute cluster could only handle 2 genomes per day, Rich was able to launch 200 AWS EC2 instances in parallel to process 200 genomes per day. The whole workflow was scripted and took advantage of AWS APIs to enable the automation. S3 was used to stage the genomes and handle the transfer of the results. In short, AWS’s scalability allowed Rich to meet his project deadline. Rich is available to share his AWS experience; you can contact him at eicc@emory.edu.
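The throughput figures above can be checked with a little arithmetic. The following is a minimal sketch, not the EICC's actual workflow script: the function name days_to_process is hypothetical, the only inputs are the numbers quoted in the paragraph, and it assumes a constant daily throughput while ignoring data transfer and instance start-up time.

```python
import math

# Figures quoted in the use case above.
GENOMES = 1027
ON_PREM_PER_DAY = 2        # on-premises HPC cluster
AWS_PER_DAY = 200          # 200 EC2 instances running in parallel
GB_PER_GENOME = (60, 100)  # compressed size range per genome


def days_to_process(genomes: int, genomes_per_day: int) -> int:
    """Whole days needed at a constant daily throughput (hypothetical helper)."""
    return math.ceil(genomes / genomes_per_day)


on_prem_days = days_to_process(GENOMES, ON_PREM_PER_DAY)
aws_days = days_to_process(GENOMES, AWS_PER_DAY)

# Rough range of compressed data staged in S3, in terabytes (1 TB = 1000 GB).
staged_tb = tuple(GENOMES * gb / 1000 for gb in GB_PER_GENOME)

print(f"On-premises: {on_prem_days} days")
print(f"AWS:         {aws_days} days")
print(f"Staged data: {staged_tb[0]:.1f} to {staged_tb[1]:.1f} TB")
```

Under these assumptions the on-premises cluster would need roughly 514 days, while 200 parallel instances finish in 6 days, which is consistent with the "less than a week" reported above.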

Citation

Richard Johnston, Pankaj Chopra, Thomas S. Wingo, Viren Patel, International Consortium on Brain and Behavior in 22q11.2 Deletion Syndrome, Michael P. Epstein, Jennifer G. Mulle, Stephen T. Warren, Michael E. Zwick, and David J. Cutler, “PEMapper and PECaller provide a simplified approach to whole-genome sequencing.” Proc Natl Acad Sci U S A. 2017 Mar 7;114(10)

Machine Learning to Predict Psychosis

Using a machine learning approach on AWS at Emory, Dr. Phil Wolff, Professor in Emory's Psychology Department, studies the prediction of psychosis using semantic density and latent content analysis. Click this link for more information.

Citation

Rezaii, N., Walker, E. & Wolff, P. A machine learning approach to predicting psychosis using semantic density and latent content analysis. npj Schizophr 5, 9 (2019). https://doi.org/10.1038/s41537-019-0077-9

Understanding Transplant Rejection and Infection

The goal of transplantation is for a single transplant to last a lifetime. While most patients do well with their transplants, some develop infections or reject the transplant completely. This story highlights how Dr. Chris Larsen, Emory transplant surgeon and immunologist, and his team use AWS at Emory to help predict those outcomes.

Studying the Microbiome through Genomic Analysis

Studying the microbiome involves analyzing complex genetic data from bacteria and other microorganisms, often by processing a large number of gene sequences with powerful compute resources. Follow this link to learn how Dr. Irene Yang, Assistant Professor in the Nell Hodgson Woodruff School of Nursing, used AWS at Emory for her microbiome research.

Multi-omics through AWS at Emory

The data science demands of biomedical research are expanding rapidly, and the evolution of multi-omics is at the forefront. In this context, multi-omics refers to single-cell RNA sequencing in which multiple parameters of each cell are measured and the measurement is then repeated across thousands of cells. Click this link to learn how Dr. Eliver Ghosn’s team is partnering with AWS at Emory to lead the charge.

Training Large-Scale Deep Neural Networks

Dr. Liang Zhao, an assistant professor in the Department of Computer Science, has focused his research on data mining, artificial intelligence, and machine learning for the past several years. His most recent project centers on training large-scale deep neural networks (DNNs) and applying them to graph-structured data such as large-scale social networks and commercial networks. Due to the complexity and scale of this endeavor, Dr. Zhao chose to partner with AWS at Emory to gain the tools needed for his work.

Predicting Cardiovascular Outcomes with CRADLE

Dr. Joyce Ho is an assistant professor in the Computer Science Department at Emory University. Her NIH-funded research focuses on leveraging modern machine learning algorithms to predict cardiovascular complications in patients with diabetes. Early identification of patients at high risk of developing cardiovascular complications is crucial for providing effective interventions. Building accurate and robust machine learning models to identify high-risk patients requires a large patient population. AWS at Emory enabled Dr. Ho to leverage the Emory Clinical Research Analytics Data Lake Environment (CRADLE) in her research.

Deciphering Glial Development and Its Impacts

Dr. Steven Sloan is an Assistant Professor in the Department of Human Genetics at the Emory University School of Medicine. His lab currently studies glial development and the role these cells play in neurodevelopmental and neuropsychiatric diseases. His team’s work is instrumental in determining how human glia develop as well as how abnormal glia might contribute to neurodevelopmental disorders. This work may prove pivotal in deciphering new mechanisms and therapeutic targets to advance human health.

Keeping Tabs on Big Tech

David Schweidel is the Rebecca Cheney McGreevy Endowed Chair and Professor of Marketing at Emory University’s Goizueta Business School and an expert in customer relationship management and social media analytics. His research focuses on the development and application of statistical models to understand customer behavior and inform managerial decisions. David has leveraged AWS at Emory extensively in his research, including projects focused on language generation, social network analysis, and search engine optimization (SEO) studies. One of his latest efforts explores the complex data privacy issues associated with Apple iOS updates.