How Amazon’s cloud unit is helping researchers analyze genetics
As healthcare becomes increasingly digitized, scientists, physicians and researchers must attempt to decipher unprecedented amounts of data in order to adequately personalize care. The volume of information available to these experts often exceeds their ability to consume and analyze it. Amazon’s cloud unit has been working to fill this gap.
Amazon Web Services recently rolled out the general availability of Amazon Omics, which helps researchers store and analyze omic data such as DNA, RNA, and protein sequences. The service provides customers with the underlying infrastructure they need to make sense of big data so they can spend more time making new scientific discoveries.
AWS generates a significant portion of Amazon’s revenue, bringing in $20.5 billion in the third quarter. The cloud computing business has expanded into healthcare, and while AWS doesn’t disclose revenue forecasts for specific services, the global genomic data analysis market is expected to reach $2.15 billion by 2030, according to a report by Straits Research.
Dr. Taha Kass-Hout, chief medical officer at AWS, said that the vast majority of healthcare data is inherently unstructured, meaning about 97% of it goes unused. Indexing and understanding this information is challenging, especially when researchers are collecting omics data from tens of thousands of patients.
Prior to Amazon, Kass-Hout served two terms under President Barack Obama and was the first Chief Health Information Officer at the US Food and Drug Administration.
Sequencing a human genome can require anywhere from 80 to 150 gigabytes of storage space, Kass-Hout said, and some research projects are looking at petabytes and exabytes of genomic information.
“You’re talking about almost nine Harry Potter volumes if you want to print it on a printer,” Kass-Hout told CNBC. “And that’s just for one person.”
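To put those figures in perspective, the short Python sketch below multiplies the 80 to 150 gigabyte range Kass-Hout cited by a cohort size. The 50,000-genome cohort and the function name are illustrative assumptions, not figures from AWS, and the conversion uses decimal units (1 petabyte = 1,000,000 gigabytes).

```python
# Rough storage estimate for a genomics cohort.
# The per-genome range comes from the figures quoted above; everything
# else (function name, cohort size) is a hypothetical illustration.
GB_PER_GENOME_LOW = 80     # lower bound, gigabytes per sequenced genome
GB_PER_GENOME_HIGH = 150   # upper bound, gigabytes per sequenced genome
GB_PER_PB = 1_000_000      # decimal units: 1 PB = 10**6 GB


def cohort_storage_pb(num_genomes: int) -> tuple[float, float]:
    """Return (low, high) storage estimates in petabytes."""
    low = num_genomes * GB_PER_GENOME_LOW / GB_PER_PB
    high = num_genomes * GB_PER_GENOME_HIGH / GB_PER_PB
    return low, high


if __name__ == "__main__":
    # 50,000 genomes is an assumed cohort size, chosen only for illustration.
    low, high = cohort_storage_pb(50_000)
    print(f"{low:.1f} to {high:.1f} PB")
```

Under these assumptions, a cohort of tens of thousands of patients already lands in the petabyte range, which is consistent with the scale Kass-Hout describes.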
Amazon Omics helps researchers organize their data by providing them with three components that they can use individually or collectively. Omics-enabled object storage helps researchers store and share raw sequence data; Omics Workflows helps run workflows that process raw sequence data at scale; and Omics Analytics simplifies the sequence processing output.
More than a dozen customers and partners have tested a beta version of the service and are already using Amazon Omics.
For Jeffrey Pennington, chief research informatics officer at Children’s Hospital of Philadelphia, the impact is already being felt.
Pennington works in the Department of Biomedicine and Health Informatics, which uses data and technology to solve problems in children’s health. He said the department spent five years building out the infrastructure to analyze omics data, and now it no longer has to build or maintain that infrastructure itself.
“We’re a large pediatric academic medical center, but we’re still not big enough to learn and build everything needed to use omic data productively,” said Pennington. “Our time and energy, our efforts and our financial resources are far better spent putting the puzzle together than creating those pieces in the first place.”
Amazon Omics also encourages collaboration between large research groups, smaller clinical groups, and intelligence and pharmaceutical companies, said Boris Oklander, co-founder and chief technology officer of C2i Genomics.
C2i is a biotechnology company working to use genomic data to develop personalized treatments for cancer. Oklander said the company participated in the beta for Amazon Omics after trying to develop its own data analysis technology.
He said Amazon Omics has created a collaborative ecosystem that eliminates the need for researchers to build a complex technology from scratch.
“We’re just democratizing,” he said. “This kind of service is something that makes it possible [for us] to unlock the value of the investments that different players are making in this space.”
Other big technology companies have developed similar tools. Microsoft’s Azure cloud computing platform introduced Microsoft Genomics in 2018 to help researchers interpret data generated by genomic technologies. Google’s Cloud Life Sciences technology also enables researchers to process biomedical data at scale.
Pennington said the Broad Institute and DNAnexus also offer popular genomic data analysis services, but that they are difficult to maintain and can analyze fewer data types than Amazon Omics.
Given the sensitive and deeply personal nature of omic data, Kass-Hout said protecting privacy and patient data is “job zero” for AWS. He said AWS uses more than 300 security, compliance, and governance services and supports 98 security standards and compliance certifications. In doing so, AWS goes “well beyond” regulatory compliance, Kass-Hout said, and also makes best-practice resources and encryption tools available to its customers.
Customers are also responsible for building secure applications on top of Amazon Omics services that prevent AWS from seeing or using the data.
Kass-Hout said the ultimate purpose of Amazon Omics is to efficiently index information so researchers can focus on making real advances in precision medicine.
“If the past decade has been about the digitization of the healthcare and life sciences industry, I firmly believe that the next decade is about understanding that data in a way that is now [where] we can find new therapeutics, new diagnostics and more targeted therapies,” he said.