Understanding Data to Predict Outcomes

The researchers at the George Washington University (GW) Biomedical Informatics Center have made it their mission to leverage biomedical information to predict and improve patient outcomes by developing and using cutting-edge computational technology.

According to Qing Zeng, PhD, director of the GW Biomedical Informatics Center at the GW School of Medicine and Health Sciences, the latest methodologies promise to tame a monstrous pool of available medical data. Once it’s made more manageable, the information could open the door to exciting possibilities, such as predictive algorithms to guide clinical decision making.

One project, using Cerner’s Health Facts©, is a collaboration between Zeng, who is also a professor of clinical research and leadership at the GW School of Medicine and Health Sciences, and a team at Children’s National Health System. Together they are working to develop and analyze a data set containing tens of millions of patient IDs and a rich pool of information, such as demographics, vital signs, medications, procedures, and lab tests.

Initially, the project was designed to predict pediatric ICU deaths, but it then broadened to explore condition severity and outcomes in the ICU.

“Using this data set, we can see patterns that we were not able to see before and catch the variants in practice,” Zeng said. “We could classify people based on if they were ICU patients or non-ICU patients based on the treatments they were receiving, their vital signs. This evolved into trying to predict and determine severity.”

Understanding severity, she explained, helps clinicians decide how to allocate resources, develop the right treatment plan for a patient, and determine whether the patient should be treated in the ICU.

As part of the ICU project, Zeng’s team also applies deep learning to clinical data sets. Deep learning belongs to a broad family of machine learning methods that learn representations of the data rather than relying on task-specific algorithms. The method, she said, comes with some challenges.

Deep learning is a black box model, explained Zeng: a system that can be viewed in terms of its inputs and outputs, without knowledge of its internal workings. “Most people, understandably, like to see outcomes from algorithms,” explained Zeng. “It isn’t clear how [deep learning] works. It just works.” Her team is tackling this problem and has developed a method called “impact score” to help people understand the relationship between patient clinical features and outcome.

The team is also applying deep learning to several other projects, including one looking at patient frailty as a means of predicting postoperative outcomes among the elderly.
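
To make the black box idea concrete, here is a minimal sketch of one generic way to probe an opaque model: nudge one input at a time and watch how the output moves. This is not the team’s “impact score” method, whose details the article does not give. The function black_box_risk, the feature names, the coefficients, and the patient values are all hypothetical and carry no clinical meaning.

```python
import math


def black_box_risk(features):
    """Hypothetical stand-in for an opaque model: returns a score in (0, 1).

    The coefficients are invented for illustration only. Age is deliberately
    ignored by this toy model, so probing it should show no effect.
    """
    z = (0.04 * (features["heart_rate"] - 80)
         - 0.05 * (features["systolic_bp"] - 120)
         + 0.0 * features["age_years"])
    return 1.0 / (1.0 + math.exp(-z))


def perturbation_sensitivity(model, features, deltas):
    """Change in model output when each feature is nudged on its own."""
    baseline = model(features)
    changes = {}
    for name, delta in deltas.items():
        perturbed = dict(features)   # copy so the original record is untouched
        perturbed[name] += delta
        changes[name] = model(perturbed) - baseline
    return changes


if __name__ == "__main__":
    # Made-up record and step sizes, used only to exercise the probe.
    patient = {"heart_rate": 110, "systolic_bp": 95, "age_years": 6}
    deltas = {"heart_rate": 10, "systolic_bp": 10, "age_years": 1}
    for name, change in perturbation_sensitivity(black_box_risk, patient, deltas).items():
        print(f"{name}: {change:+.4f}")
```

A probe like this only treats the model as inputs and outputs, which is the same vantage point the black box description gives a clinician. It shows which features move the prediction for one record, not why the model behaves that way.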

Zeng and her research team have also started using a method called “weakly supervised learning,” a technique for learning from limited and imperfectly labeled data.

“With supervised learning, you are given an outcome or annotation, and you learn from that,” explained Zeng. “Weakly supervised learning has been around [a long time] and it’s a fascinating area. When we were doing it before, we weren’t thinking about trying to see the accuracy of the data set. Now we can see that it’s possible to out-perform the accuracy of the data set.”

Weakly supervised learning, she added, reassures researchers that the data do not have to be entirely correct. Insisting on perfect data often means working with a small sample, because perfect labels require extensive curation.

“In some cases, like with Health Facts©, which has a lot of data and is de-identified, you can’t go back to verify. You need to assume that there is an error rate and you don’t know what that rate is,” said Zeng.
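
The idea that a model can out-perform the accuracy of its own labels can be illustrated with a toy simulation. The sketch below is an assumption-laden example, not the team’s method or data: it generates one synthetic measurement per record, flips 30 percent of the labels at random, fits a simple nearest-centroid classifier to the noisy labels only, and then scores both the noisy labels and the model against the true labels. All names and parameters are hypothetical.

```python
import random


def simulate(n=2000, flip_rate=0.3, seed=0):
    """Synthetic records: one measurement each, a true label, and a noisy label."""
    rng = random.Random(seed)
    xs, true_labels, noisy_labels = [], [], []
    for _ in range(n):
        y = rng.randint(0, 1)
        # Class 1 is centered at +2, class 0 at -2 (arbitrary toy values).
        x = rng.gauss(2.0 if y == 1 else -2.0, 1.0)
        # With probability flip_rate the recorded label is wrong.
        noisy = 1 - y if rng.random() < flip_rate else y
        xs.append(x)
        true_labels.append(y)
        noisy_labels.append(noisy)
    return xs, true_labels, noisy_labels


def fit_centroids(xs, labels):
    """Mean measurement per labeled class; training sees only these labels."""
    sums, counts = [0.0, 0.0], [0, 0]
    for x, y in zip(xs, labels):
        sums[y] += x
        counts[y] += 1
    return sums[0] / counts[0], sums[1] / counts[1]


def predict(x, centroids):
    """Assign the class whose centroid is nearest to x."""
    return 1 if abs(x - centroids[1]) < abs(x - centroids[0]) else 0


def accuracy(predicted, truth):
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)


if __name__ == "__main__":
    xs, true_labels, noisy_labels = simulate()
    centroids = fit_centroids(xs, noisy_labels)   # trained on noisy labels only
    predictions = [predict(x, centroids) for x in xs]
    print("label accuracy:", accuracy(noisy_labels, true_labels))
    print("model accuracy:", accuracy(predictions, true_labels))
```

The effect depends on the errors being random rather than systematic: random flips pull both class averages toward the middle by a similar amount, so the boundary between them barely moves. In a de-identified data set the true labels are not available for this kind of check, which is why the error rate has to be assumed rather than measured.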

Overall, Zeng is positive about the projects that the Biomedical Informatics Center has in the works. “People are very interested in what machine learning, artificial intelligence, and clinical outcome prediction can do,” she said. “It’s a very promising area.”