Bioinformatics Seminar
Time: 11AM
Venue: Davis Auditorium and Online
8 July 2025
Machine Learning Analysis of Spatial Transcriptomics Data for Digital Pathology Applications
Quan Nguyen
Institute for Molecular Bioscience (IMB), The University of Queensland
For over a century, cancer patient tissue biopsies have been stained with Haematoxylin and Eosin (H&E) for pathologists to examine using light microscopy. Our research focuses on improving H&E analysis through multiple approaches. Recently, spatial transcriptomic (ST) imaging and sequencing data have enabled us to link tissue morphological features in an H&E image with thousands of unseen gene expression values, opening a new horizon for understanding tissue biology and achieving breakthroughs in digital pathology. We developed STimage as a comprehensive suite of models for predicting gene expression and classifying tissue regions and cell types. For robustness, STimage predicts gene expression based on parameter distributions rather than fixed point estimates, and it estimates uncertainty arising from the data (aleatoric) and from the model (epistemic). STimage achieves interpretability by analysing model attribution at the single-cell level, in the context of histopathological annotation and functional genes, and by characterising the latent representation. Using diverse datasets from three cancers and one chronic disease, we assessed the model’s performance on in-distribution and out-of-distribution samples, across platforms, data types, and sample preservation methods. Further, we implemented an ensemble approach, incorporating pre-trained foundation models, to improve performance and reliability, especially in cases with small training datasets. Finally, we showed that STimage-predicted values, based solely on imaging input, could be used to stratify patients into survival groups. Overall, STimage enables the prediction of molecular and cellular information from histopathological images, opening a new direction to advance digital pathology applications.
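
For readers unfamiliar with the aleatoric/epistemic distinction mentioned in the abstract, the sketch below shows one common way to separate the two when several models each predict a distribution for the same gene at the same tissue spot. It is a generic illustration based on the law of total variance, not STimage's actual implementation; the function name `decompose_uncertainty` and the example numbers are hypothetical.

```python
from statistics import fmean, pvariance


def decompose_uncertainty(means, variances):
    """Split predictive variance for one gene at one spot (illustrative only).

    means[i] and variances[i] are the mean and variance of the distribution
    predicted by ensemble member i. These inputs are assumed, not taken from
    any real model output.
    """
    if not means or len(means) != len(variances):
        raise ValueError("means and variances must be non-empty and equal in length")

    # Aleatoric: average noise that the members attribute to the data itself.
    aleatoric = fmean(variances)
    # Epistemic: disagreement between members about the expected value.
    epistemic = pvariance(means)

    return {
        "mean": fmean(means),
        "aleatoric": aleatoric,
        "epistemic": epistemic,
        "total": aleatoric + epistemic,
    }


# Three hypothetical ensemble members predicting expression of one gene.
result = decompose_uncertainty(means=[4.0, 5.0, 6.0], variances=[2.0, 2.5, 1.5])
print(result)
```

Under this decomposition, a large epistemic term suggests the models disagree (for example on out-of-distribution samples), while a large aleatoric term suggests the measurement itself is noisy regardless of how much training data is available.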