Hod Lipson | Data Smashing: Uncovering Order in Data Stream
From speech recognition to the discovery of new stars, almost all automated tasks involve comparing streams of data for similarities and outliers. Automated discovery methods, however, have not kept pace with the exponential growth in data. One reason is that most algorithms depend on humans to define what features to compare. Here, we propose a new way to match multiple sources of data streams without any prior learning. We show how this principle can be applied to challenging problems, including the interpretation of EEG patterns in epileptic seizures, the detection of abnormal heartbeats in ECG data and the classification of astronomical objects from light measurements. Our data smashing principles produce results as accurate as algorithms developed by domain experts, and could open the door to understanding increasingly complex observations that experts don’t yet know how to interpret.
David Madigan | Observational Studies: Promise and Peril
Randomized experiments are the gold standard in measuring the effects of interventions in medicine, education, social science and other areas. In reality, researchers often rely on observational studies, leading to vast numbers of contradictory findings published in scholarly journals and widely disseminated through the media. Decision makers and the public assume that a rigorous peer-review process guarantees that these results are valid. This is not always so. Well-intentioned analysts make design choices, run analyses and publish their results, overlooking the possibility that different choices might have produced entirely different results. I will provide an overview of the current state of the art in observational studies in healthcare and describe some promising research directions.
Olena Mamykina | Predicting Blood-Glucose Levels to Manage Diabetes
Advances in personal health tracking promise to help individuals gain deep insights into their health and behavior. Yet, most health apps still rely on humans to identify trends, make discoveries and take action. In this research, we are building computational models and interactive decision-support tools to help type 2 diabetics improve their nutritional choices. Our decision-support tool forecasts how a planned meal will influence blood-glucose levels based on an individual’s physiology and past data. Early results suggest that this automated prediction tool may produce more accurate assessments than individuals or their healthcare providers can.
Adler Perotte | Predicting Kidney Disease Progression with Large-Scale Patient Data
Columbia University coordinates a global network of health databases known as the Observational Health Data Science and Informatics (OHDSI) collaborative. With hundreds of millions of patient records, OHDSI allows researchers to look for large-scale patterns that can reveal new ways to identify and treat disease. In a recent study, my colleagues and I used observational health data to build a model to predict how likely a patient with stage 3 kidney disease, in which the kidney has lost half of its function, is to progress to stage 4, with up to 90 percent loss. Our model, which incorporated patient lab test results and clinical records, outperformed models that did not include this information. Identifying patients at high risk for disease progression allows doctors to customize treatment to stall or prevent that progression.