A pioneering collaboration is unlocking the secrets of disease by zooming in on individual cells. Using single-cell and spatial genomics, scientists are discovering new cell types, revealing new drug targets, and harnessing artificial intelligence to transform how we understand and treat complex conditions – from COPD to liver and kidney disease.
In the quest to solve the mysteries of human disease, a new technology is handing researchers unprecedented detail.
While decoding the human genome has advanced our understanding, DNA only tells part of the story. Researchers may know that a gene variant is linked to an illness but often do not know how and when it affects the disease process.
Now, new cutting-edge techniques are enabling scientists to zoom in on single cells – the fundamental units of biology – and gain a far clearer picture of the molecular intricacies of disease.
That promise – and its power for drug discovery and development – is at the heart of a new collaboration between GSK and the Teichmann Laboratory at the Cambridge Stem Cell Institute, a world-leading centre for these new techniques.
“We have been able to isolate the cell type that's responsible for the organ dysfunction and the timing of its involvement. That's incredibly powerful for drug discovery and what we need to know to prevent and change the course of disease,” says Kaivan Khavandi, SVP & Global Head, Respiratory, Immunology & Inflammation R&D at GSK.
Advances in single-cell and spatial genomics, along with computational power, are allowing researchers to untangle the differences between individual cells, making it possible to selectively sequence the genes that are “switched on” in a given cell and build a detailed picture of the cell from that data.
“In the past, metaphorically speaking, we've basically been trying to work with a smoothie and then trying to work out what fruit went into the smoothie,” says David Michalovich, VP of Translational Sciences at GSK’s Respiratory, Immunology and Inflammation Research Unit. “But now we're able to understand precisely what fruit is in the smoothie, and can work out how many bananas, apples, blueberries, blackberries, and so on, are in there. Translating that back to our work at GSK, by using these cutting-edge scientific approaches, we're now able to get to that very fine review of what cells are in a tissue.”
Studying tissues at different stages of disease, cell by cell, makes it possible to identify the processes involved in unprecedented detail. In turn, that work generates new ideas and stronger evidence for drug-mediated mechanisms that might stop or delay disease processes at the molecular level. Confirming a drug’s mechanism using single-cell and spatial genomics could double the chances of its eventual success in clinical trials, on top of the already-significant confidence boost from establishing a genetic link.
Deep new insights
This new approach has unearthed a vast trove of new information on the operations of the human body, says Sarah Teichmann, who splits her time between her roles as Professor at the University of Cambridge’s Stem Cell Institute and VP of Translational Research at GSK’s Respiratory, Immunology and Inflammation Research Unit.
“Almost every single tissue that we looked at had new and unexpected cell types and cell states,” says Teichmann. Before the introduction of single-cell and spatial genomics, a few hundred cell types were known. Teichmann and her community of hundreds of collaborators around the world, who are working to build a cell-by-cell model of the healthy human body, have discovered thousands more.
GSK’s newest partnership with the Teichmann Laboratory focuses on discovering and developing drugs for lung conditions such as COPD, as well as liver and kidney diseases. It builds on earlier collaborations, such as a research project that identified previously unknown cell types present in asthma, generating new potential targets for novel drugs.
Khavandi says the types of disease his teams are focusing on could benefit enormously from the insights of single-cell and spatial genomics because, historically, they have been so poorly understood and challenging to treat.
“This collaboration now gives us access to the world-leading laboratory for single-cell technologies and data sets, enabling us to apply them to our drug development,” he says.
Single-cell technologies in drug discovery
Single-cell and spatial genomics are already creating valuable insights for GSK during drug development. With its collaborators at the Teichmann Lab, GSK scientists are tracking the cell-by-cell effects of a drug on patients with liver disease. That work has revealed which particular type of diseased liver cell is most strongly targeted by the drug.
“As a scientist, you get really excited about this,” says Michalovich. “We’re bringing multiple evidence streams together from large-scale human genetics all the way to the single-cell data, and they’re all converging on this set of genes. That gives us confidence that those are the ones you really want to focus on for drug development.”
This information sheds light on which individuals are most likely to benefit from the drug. It also provides clues about what kinds of drug combinations might be most effective: knowing what is happening at the cellular level means that researchers can combine drugs that target different types of diseased cells rather than going after the same targets.
“When evaluating and ranking multiple research programmes and potential medicines, we select those backed by strong scientific evidence from human data,” says Khavandi, who chairs GSK’s investment board from early concepts to advanced development stages. “By connecting genetic information with detailed insights into how cells work, we can clearly see how treatments might target the root causes of diseases and address them effectively. This is very powerful.”
Diagnostics also stand to benefit from these new capabilities. Studying cells up close enables researchers to identify the molecules that are secreted by disease-state cells, and which of those go on to be present in the blood and urine and could therefore be measured by straightforward tests.
A key role for artificial intelligence
Single-cell sequencing generates expression profiles (measurements of gene activity) across tens of thousands of genes for each individual cell. The challenge lies not just in the data volume, but in its structure: these are highly sparse, high-dimensional datasets where the relationships between genes, cell types and disease states are non-linear and context-dependent. Extracting biological meaning from this complexity is challenging, but modern machine learning algorithms are proving a valuable addition to GSK's computational toolkit.
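To make "sparse, high-dimensional" concrete, here is a minimal sketch in Python. The gene names and counts are invented for illustration and are not GSK data. It assumes an expression matrix with one row per cell and one column per gene, and shows how most entries can be zero and how only the non-zero ones need to be stored.

```python
# Toy expression matrix: rows are cells, columns are genes.
# Gene names and counts are invented purely for illustration.
GENES = ["GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"]
counts = [
    [0, 3, 0, 0, 0],
    [0, 0, 0, 7, 0],
    [1, 0, 0, 0, 0],
]


def to_sparse(matrix):
    """Keep only non-zero entries as {(cell_index, gene_index): count}."""
    return {
        (i, j): value
        for i, row in enumerate(matrix)
        for j, value in enumerate(row)
        if value != 0
    }


def sparsity(matrix):
    """Return the fraction of entries in the matrix that are zero."""
    total = sum(len(row) for row in matrix)
    zeros = sum(1 for row in matrix for value in row if value == 0)
    return zeros / total


sparse_counts = to_sparse(counts)
zero_fraction = sparsity(counts)
```

In a real dataset the same idea applies across tens of thousands of genes per cell, which is why specialised sparse storage formats are commonly used in practice.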
The breakthrough has come from foundation models – large-scale AI systems built on the transformer architecture, which uses an 'attention mechanism' to weigh the importance of different pieces of information relative to each other. While the first attention mechanisms were introduced nearly a decade ago, recent advances in model architecture and computational hardware have enabled these systems to scale dramatically.
“What makes these modern attention-based architectures powerful for single-cell analysis is their ability to learn which genes matter in context,” says Finnian Firth, a Director of Artificial Intelligence and Machine Learning at GSK. “The models can now identify that certain genes are only significant when expressed alongside specific combinations of others; they can help identify coordinated programs that define cell identity and disease states across tens of thousands of measurements.”
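The attention mechanism described above can be sketched in a few lines of Python. This is a generic scaled dot-product self-attention over made-up two-dimensional "gene" vectors, not GSK's models. Real transformers also apply learned projections to produce queries, keys and values, which are omitted here as a simplifying assumption.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Each output is a weighted average of the value vectors.
        out = [
            sum(w * v[c] for w, v in zip(weights, values))
            for c in range(len(values[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Invented embeddings for three hypothetical genes (illustration only).
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Self-attention: every vector attends to every vector, including itself.
outputs, weights = attention(embeddings, embeddings, embeddings)
```

Each row of `weights` shows how much one item draws on each of the others, which is the sense in which the model weighs pieces of information relative to each other.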
This computational power has become essential for GSK's analysis of single-cell data at scale, enabling researchers to extract biological meaning from the huge volumes of information contained within each cell.
“There's so much data to handle that it’s now beyond the stage the human scientists can manage,” says Michalovich. “We really need the AI and machine learning tools that help us to recognise patterns in the data and surface them for scientists to look at further.”
For example, GSK uses these AI methods to compare expression data from healthy lung tissue with samples from COPD patients, revealing differences in cell type composition, gene expression patterns and cellular interactions that characterise disease progression. The models can also integrate single-cell data with medical imaging and clinical measurements such as blood tests, revealing how differences at the cellular level manifest across different data modalities and scales – from molecular signatures to tissue architecture to patient outcomes.
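As a simple illustration of what a difference in cell type composition means, the following Python sketch compares the proportion of each cell type in two samples. It is plain arithmetic, not the AI methods the article describes, and the labels `type_1` to `type_3` and their counts are hypothetical.

```python
from collections import Counter


def composition(labels):
    """Return the fraction of cells assigned to each cell type."""
    counts = Counter(labels)
    total = len(labels)
    return {cell_type: n / total for cell_type, n in counts.items()}


def composition_shift(healthy, disease):
    """Return the disease-minus-healthy change in each cell type's fraction."""
    h, d = composition(healthy), composition(disease)
    return {
        cell_type: d.get(cell_type, 0.0) - h.get(cell_type, 0.0)
        for cell_type in sorted(set(h) | set(d))
    }


# Hypothetical per-cell labels for two samples (invented for illustration).
healthy_labels = ["type_1"] * 6 + ["type_2"] * 3 + ["type_3"] * 1
disease_labels = ["type_1"] * 4 + ["type_2"] * 3 + ["type_3"] * 3

shift = composition_shift(healthy_labels, disease_labels)
```

A positive value in `shift` means that cell type makes up a larger share of the disease sample than of the healthy one; a negative value means a smaller share.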
“The granularity of the measurement means that we can detect subtle differences – between cell states, for instance – that simply don't get picked up in other datasets,” says Firth.
And while cutting-edge technology – both in sequencing and artificial intelligence – plays a crucial role in single-cell and spatial genomics, the insights gained from these methods wouldn’t be possible without a contribution that only humans can provide, notes Michalovich.
“This is real people being highly altruistic, providing their samples, and data and information that allows us to really understand disease processes,” he says. “We're able to really bring those insights directly from people with disease, and that means we're more likely to be able to treat the right people, in the right way.”





