Unify: Learning Cellular Evolution with Universal Multimodal Embeddings


Abstract

Integrating single-cell RNA-sequencing (scRNA-seq) data across species is hindered by evolutionary divergence, technical batch effects, and the reliance on one-to-one orthologs. We present Unify, a transfer learning method that learns universal cell embeddings by defining functionally coherent, multimodal macrogenes. Macrogenes are built by combining RNA expression with embeddings from protein language models and general-purpose language models. Because macrogenes are defined by functional coherence rather than by one-to-one orthology, Unify supports cross-species comparisons beyond strict gene-level homology. Unify corrects batch effects while preserving conserved biological signals across vast evolutionary distances. It also enables more accurate prediction of perturbation responses across species, for example from mouse to human. Applied to species separated by over 700 million years of evolution, Unify reconstructs more accurate multi-species cell-type evolutionary trees and uncovers convergent gene programs. Together, these results establish Unify as a powerful method for comparative single-cell genomics and evolutionary biology.
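
The abstract does not spell out how macrogenes are constructed, so the following is only an illustrative sketch of the general idea, not Unify's actual algorithm. It assumes that each gene in each species already has a fixed-length embedding (here random toy vectors standing in for protein language model outputs), clusters the pooled gene embeddings with plain k-means to obtain shared "macrogene" centroids, and then projects each species' expression matrix onto those centroids through soft gene-to-macrogene weights. All names (`kmeans`, `macrogene_weights`, `to_macrogene_space`, `toy_species`), the softmax weighting, the `temperature` parameter, and the toy data are assumptions made for illustration.

```python
import numpy as np


def kmeans(points, k, n_iter=20):
    """Plain Lloyd's k-means with deterministic farthest-point initialisation."""
    centroids = [points[0]]
    for _ in range(1, k):
        # Squared distance from every point to its nearest chosen centroid.
        d = np.min([np.sum((points - c) ** 2, axis=1) for c in centroids], axis=0)
        centroids.append(points[np.argmax(d)])
    centroids = np.array(centroids, dtype=float)
    for _ in range(n_iter):
        dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)
    return centroids, labels


def macrogene_weights(gene_emb, centroids, temperature=1.0):
    """Soft assignment of genes to macrogenes (softmax over negative squared distance)."""
    dists = ((gene_emb[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    logits = -dists / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    w = np.exp(logits)
    return w / w.sum(axis=1, keepdims=True)


def to_macrogene_space(expr, weights):
    """Project a (cells x genes) count matrix into (cells x macrogenes)."""
    return np.log1p(expr) @ weights


# Toy data: two species with different gene counts but three shared
# (hypothetical) functional groups in embedding space.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [5.0, 0.0], [0.0, 5.0]])


def toy_species(n_genes, n_cells):
    groups = np.arange(n_genes) % len(centers)
    emb = centers[groups] + 0.1 * rng.standard_normal((n_genes, 2))
    expr = rng.poisson(2.0, size=(n_cells, n_genes)).astype(float)
    return emb, expr


mouse_emb, mouse_expr = toy_species(n_genes=6, n_cells=4)
human_emb, human_expr = toy_species(n_genes=9, n_cells=5)

# Macrogene centroids are learned on the pooled genes of both species.
centroids, _ = kmeans(np.vstack([mouse_emb, human_emb]), k=3)

mouse_cells = to_macrogene_space(mouse_expr, macrogene_weights(mouse_emb, centroids))
human_cells = to_macrogene_space(human_expr, macrogene_weights(human_emb, centroids))
print(mouse_cells.shape, human_cells.shape)  # (4, 3) (5, 3)
```

Although the two species have different numbers of genes and no ortholog table is used, both sets of cells end up with the same three macrogene features, so they can be compared or fed to a shared downstream model. A real pipeline would also need to handle normalization, batch correction, and the learning of the cell embedding itself, none of which is attempted here.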
