Application and Characterization of the Multiple Instance Learning Framework in Flow Cytometry
Abstract
For decades, flow cytometry has enabled single-cell profiling based on selected biomarkers, and it is widely used in both clinical and research settings. A major limitation of most conventional flow cytometry analyses is their dependence on a largely manual gating process, in which biomarkers are selected sequentially to isolate phenotype-associated cell populations. This approach is both labor-intensive and prone to bias. To address this challenge, we apply a series of multiple instance learning (MIL) frameworks to automated flow cytometry data analysis. Our models demonstrate strong performance across diverse biomedical applications, including cancer subtyping based on tumor-infiltrating immune cells, HIV survival stratification, acute myeloid leukemia (AML) minimal residual disease prediction, and COVID-19 severity assessment. We further examine how network architecture affects predictive performance and the detection of rare but clinically significant cell populations. Notably, our models use attention mechanisms to directly identify phenotype-associated cell subsets, providing an interpretable, data-driven alternative to fully manual gating. These findings underscore the potential of MIL as a scalable and interpretable framework for flow cytometry, with broad applications in precision medicine and translational immunology.
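To make the attention idea concrete, the following is a minimal sketch of generic attention-based pooling in a multiple instance learning setting. It is not the architecture described in this article. It assumes each sample is a "bag" of cells, each cell is a vector of marker intensities, and the sample carries a single label. The function name `attention_mil_pool`, the parameters `V` and `w`, and the random data are all hypothetical and chosen only for illustration; in practice the parameters would be learned jointly with a classifier.

```python
import math
import random


def attention_mil_pool(cells, V, w):
    """Pool a bag of cells into one sample-level vector using attention.

    cells: list of n_cells lists, each with n_markers floats (one bag)
    V:     list of n_hidden lists, each with n_markers floats (hypothetical weights)
    w:     list of n_hidden floats (hypothetical weights)
    Returns (embedding, weights), where weights[k] is the attention on cell k.
    """
    # Unnormalized attention score per cell: w . tanh(V x)
    scores = []
    for cell in cells:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, cell))) for row in V]
        scores.append(sum(wi * hi for wi, hi in zip(w, hidden)))

    # Softmax over the cells in the bag (shifted by the max for stability)
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Sample-level embedding: attention-weighted average of the cells
    n_markers = len(cells[0])
    embedding = [
        sum(a * cell[j] for a, cell in zip(weights, cells))
        for j in range(n_markers)
    ]
    return embedding, weights


# Illustrative bag: 200 synthetic "cells" with 4 synthetic "markers"
random.seed(0)
cells = [[random.gauss(0.0, 1.0) for _ in range(4)] for _ in range(200)]
V = [[random.gauss(0.0, 1.0) for _ in range(4)] for _ in range(8)]
w = [random.gauss(0.0, 1.0) for _ in range(8)]

embedding, weights = attention_mil_pool(cells, V, w)

# Cells with the largest attention weights are candidates for inspection
top_cells = sorted(range(len(weights)), key=weights.__getitem__, reverse=True)[:5]
```

In a sketch like this, the attention weights sum to one over the cells in a sample, so ranking cells by weight gives a data-driven shortlist that could be compared against manually gated populations.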