ITMI Partners with Cloudera to Advance Machine Learning Initiatives and Save Lives

DQC Bureau

Cloudera announced that Inova Translational Medicine Institute (ITMI), a leading global medical research institute, has deployed Cloudera Enterprise to securely analyze massive collections of clinical and genomic data at unprecedented speed and scale, enabling faster innovation in translational medicine research.


As part of the Inova Center for Personalized Health (ICPH), ITMI’s team of leading scientists, researchers, analysts and collaborators uses machine learning algorithms on terabytes of clinical and genomic information to identify the genetic links to diseases. The team draws discoveries from the data and, in collaboration with the treating physician, develops personalized treatment plans for patients. This approach, known as precision medicine, has the power to help patients live longer, healthier lives.

Genetics plays a role in the majority of the leading causes of death in the United States, including heart disease, cancer and diabetes. The Institute collects clinical data from thousands of Inova patients born in more than 110 countries. Just one person’s unique DNA contains six billion bits of information. Mapping individuals' DNA codes into genome sequences helps scientists determine the causes of diseases and discover transformative treatments. As part of this process, ITMI is also assembling what is expected to be one of the world’s largest whole genome sequence databases connected to patient information in a healthcare system.

“The challenge for ITMI researchers and scientists was to analyze our highly complex, massive collection of raw data faster and more efficiently and translate insights into practical patient care. We’re now able to get answers in minutes and seconds and can find correlations that we couldn’t see before,” said Aaron Black, chief data officer of ITMI. “Our researchers used to spend 80 percent of their time on data wrangling and only a sliver of time on the analytics. We’re in the process of reversing that. We can now accelerate the pace of genomic discovery and dramatically change the way we interact with our research teams. We believe that will improve our ability to provide the right treatments to the right patients and ultimately, improve outcomes. What Cloudera has done is made this imminently possible.”


The Cloudera platform enabled ITMI to streamline its genomic data analysis for discovery. This analysis allows a bioinformatics scientist to study genomic correlations in people with conditions such as arthritis, autoimmune diseases or cancer. In the past, given the massive size of whole genomes, this process could take ITMI about two months. Using Cloudera, ITMI can complete end-to-end data analysis in one week. In the future, ITMI expects to complete these analyses in just hours.

Working with Cloudera, ITMI built a world-class bioinformatics infrastructure for the Institute's rapidly growing collection of genomes paired with clinical records. The infrastructure was designed to store and process this convergence of biological data at speed and scale, well into the future.

A single genome comprises more than three billion DNA base pairs, and ITMI currently tracks approximately 9,000 whole sequenced genomes, a number expected to grow to 15,000. Cloudera’s modern analytic database, powered by Apache Impala (incubating), brings high-performance SQL analytics to big data. With the flexibility, scale and speed Cloudera provides, ITMI’s team will be able to run concurrent, multi-user, high-performance analyses of genomic data gathered from mothers, fathers and infants enrolled in various family-based studies. For example, ITMI has been able to leverage its clinical and genomic analysis expertise to help discover previously undiagnosed congenital anomalies in infants. This is a time-consuming and iterative process, but with tools like Cloudera, ITMI anticipates accelerating these discoveries to help these families.

“Inova’s unique and leading edge big data architecture matches the diversity in their patient community and their breadth of innovation. Cloudera is proud to work with these pioneers in clinical genetics at scale, who are advancing genomic research and personalized healthcare,” said Shawn Dolley, industry leader, health and life science at Cloudera. “ITMI is advancing the way researchers and clinicians can consume and manage genomic and molecular data. Combining clinical and genetic data and layering in machine learning is how we will transform the decisions we make in patient care, disease prevention and precision public health.”
