CMU-CS-20-139
Computer Science Department, School of Computer Science, Carnegie Mellon University
Sample-Specific Models for Precision Medicine
Benjamin Lengerich
Ph.D. Thesis
December 2020
Modern applications of artificial intelligence are often characterized by training large machine learning (ML) models on large datasets. These datasets are composed of overlapping groups of samples, either explicitly (e.g., the large dataset is created by combining multiple datasets) or implicitly (e.g., the samples belong to latent sub-populations). Population models prefer weakly-predictive global patterns over highly-predictive localized effects. This is a problem because localized effects are critical to understanding complex processes, such as in applications to computational biology (in which samples come from latent cell types) and precision medicine (in which patients come from latent disease subtypes).

In this thesis, we propose that: The performance of intelligent computer systems can be improved by treating different samples as different tasks. This is especially helpful in domains such as computational biology and precision medicine, in which we care about understanding the highly specific context of each sample. We propose to solve this problem by estimating a collection of many small models. For large collections, each model is responsible for only a small number of samples, enabling simultaneous interpretability and accuracy. As we show in this thesis, this framework can be scaled to estimate different model parameters for every sample.

This thesis begins by studying the challenges of characterizing real-world data with population-level models. Next, we develop the methodology of Personalized Regression. Finally, we apply sample-specific inference to computational biology and precision medicine by: (1) Identifying Discriminative Subtypes of Cancers from Histopathology Images and (2) Cell-Specific Transcriptomic Regulatory Network Inference.
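The contrast between a single population model and per-sample parameters can be illustrated with a small sketch. This is not the Personalized Regression method developed in the thesis; it swaps in plain kernel-weighted least squares as a stand-in, and the synthetic data, the function names (`fit_population`, `fit_sample_specific`), the `context` covariate, and the `bandwidth` value are all assumptions made for illustration. Two latent sub-populations have opposite slopes, so the pooled fit is expected to average them away, while the per-sample fits should stay close to each sample's own slope.

```python
import numpy as np


def fit_population(X, y):
    """Ordinary least squares: one coefficient vector shared by all samples."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


def fit_sample_specific(X, y, context, bandwidth=0.2):
    """One coefficient vector per sample via kernel-weighted least squares.

    Illustrative stand-in only: samples whose (hypothetical) context
    covariate is close to sample i receive more weight when estimating
    sample i's coefficients.
    """
    n, d = X.shape
    coefs = np.empty((n, d))
    for i in range(n):
        dist2 = np.sum((context - context[i]) ** 2, axis=1)
        w = np.exp(-dist2 / (2.0 * bandwidth ** 2))  # Gaussian kernel weights
        sw = np.sqrt(w)
        # Weighted least squares: scale rows by sqrt(weight), then solve OLS.
        coefs[i] = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    return coefs


# Synthetic data (assumed for illustration): two latent groups, opposite slopes.
rng = np.random.default_rng(0)
n = 200
group = rng.integers(0, 2, size=n)
context = group[:, None] + 0.05 * rng.normal(size=(n, 1))  # noisy group proxy
x = rng.normal(size=n)
X = np.column_stack([x, np.ones(n)])  # columns: feature, intercept
true_slope = np.where(group == 0, 2.0, -2.0)
y = true_slope * x + 0.1 * rng.normal(size=n)

pop_coef = fit_population(X, y)
ss_coefs = fit_sample_specific(X, y, context)

pop_err = np.mean(np.abs(pop_coef[0] - true_slope))
ss_err = np.mean(np.abs(ss_coefs[:, 0] - true_slope))
print(f"mean slope error, population model:       {pop_err:.3f}")
print(f"mean slope error, sample-specific models: {ss_err:.3f}")
```

The sketch requires that a context covariate separating the sub-populations is observed, which is a strong simplification of the latent-subtype settings described above.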
103 pages
Thesis Committee:
Srinivasan Seshan, Head, Computer Science Department