Dissecting viruses with HPC
Dr. Rafael Medina Silva and Mauricio Morales in the Emory Department of Pathology and Laboratory Medicine conduct basic as well as translational interdisciplinary research using systems biology approaches to understand the molecular basis of disease, pathogenesis and host responses induced by RNA viruses, such as influenza virus and SARS-CoV-2. Their research projects include studying viral zoonosis and viral immune evasion, and using several experimental and omics tools to dissect the molecular host-pathogen interactions that lead to resilience versus severe disease.
Their research projects involve the use of various omics analysis tools to analyze large datasets, which requires substantial computational power to perform multiple tasks. This can sometimes take several days or more. Hosted on AWS at Emory, the HyPER C3 cluster has been critical in overcoming these challenges by enabling parallel analyses, significantly reducing the time needed to complete each task through efficient distributed computing capacity and fast iteration cycles. In addition, most of their pipelines are based on the Nextflow platform, which allows the integration of multiple tools into a single defined workflow. The HPC provides them with a safe and easy-to-manage environment, typically using virtual environments and/or containers, eliminating the need for complex installations.
The cluster has provided significant advantages in the integration of complex workflows. For example, some of their previous tasks required whole-genome alignment of two mammalian species, a computationally intensive process that can take up to a month on a low-performance computer. By using the HPC, they were able to perform this alignment in less than one week.
Another project has involved the use of AlphaFold, a tool that predicts protein structures based solely on their sequences. This tool requires access to graphics processing units (GPUs) to function properly. The GPU capabilities of the HPC allow these processes to run smoothly, and in cases where additional computational power is needed, the HPC has been able to provide fast and robust support.
According to Mauricio, “We are highly satisfied with the cluster’s performance, reliability, and support. The HPC staff provided prompt assistance with quick responses and helpful debugging guidance. Having support from a software engineer for these types of workflows or architectures significantly reduced complexity, which greatly improved our productivity. The capacities of the HPC have become a crucial tool to support our increasing computing power needs for data analyses. This is already having a significant impact in the preparation of manuscripts and grant proposals.”