Google unveils AI tool to probe the mysteries of the human genome

A Google logo is seen at the company’s research center in Mountain View, California, the United States, May 13, 2025. — Reuters

Google on Wednesday unveiled an artificial intelligence tool that its scientists say will help unravel the mysteries of the human genome and could one day lead to new treatments for diseases.

The AlphaGenome deep learning model has been hailed by outside researchers as a “breakthrough” that would allow scientists to study and even simulate the roots of difficult-to-treat genetic diseases.

While the first complete map of the human genome in 2003 “gave us the book of life, reading it remained a challenge,” Pushmeet Kohli, vice president of research at Google DeepMind, told reporters.

“We have the text,” he said, which is a sequence of three billion nucleotide pairs represented by the letters A, T, C and G that make up DNA.

However, “understanding the grammar of this genome – what is coded in our DNA and how it governs life – is the next critical frontier for research,” said Kohli, co-author of a new study in the journal Nature.

Only about 2% of our DNA contains instructions for making proteins, which are the molecules that build and operate the body.

The remaining 98% was long considered “junk DNA,” with scientists struggling to understand what it was used for.

However, this “non-coding DNA” is now thought to act as a conductor, directing the functioning of genetic information in each of our cells.

These sequences also contain many disease-associated variants. It is these sequences that AlphaGenome aims to understand.

A million letters

The project is one part of Google’s AI-driven scientific work, which also includes AlphaFold, the protein-structure model whose developers shared the 2024 Nobel Prize in Chemistry.

AlphaGenome’s model was trained on data from public projects measuring non-coding DNA in hundreds of different cell types and tissues in humans and mice.

A DNA double helix is visible in an undated artist’s illustration released by the National Human Genome Research Institute to Reuters on May 15, 2012. — Reuters

The tool is able to analyze long DNA sequences and then predict how each pair of nucleotides will influence different biological processes within the cell.

This includes when genes start and stop and how much RNA (molecules that carry genetic instructions inside cells) is produced.

There are already other models that pursue a similar goal. However, they must make compromises, either by analyzing much shorter DNA sequences or by decreasing the level of detail of their predictions, known as resolution.

Ziga Avsec, a DeepMind scientist and lead author of the study, said long sequences – up to a million DNA letters – were “necessary to understand the complete regulatory environment of a single gene.”

And the high resolution of the model allows scientists to study the impact of genetic variants by comparing the differences between mutated and non-mutated sequences.
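The comparison described here can be pictured with a small sketch. It is not AlphaGenome's code or API: `toy_predict` is a hypothetical stand-in that simply marks G and C letters, and all function names are assumptions made for illustration. The idea it shows is the one in the text, which is to score a variant by predicting a signal for the non-mutated and mutated sequences and subtracting one from the other, letter by letter.

```python
from typing import Callable, List


def toy_predict(sequence: str) -> List[float]:
    """Hypothetical stand-in for a trained model: one score per DNA letter.

    The "signal" here is just 1.0 for G or C and 0.0 otherwise. A real model
    would predict biological measurements instead.
    """
    return [1.0 if base in "GC" else 0.0 for base in sequence]


def apply_variant(reference: str, position: int, alt_base: str) -> str:
    """Return the reference with a single letter substituted (0-based position)."""
    if not 0 <= position < len(reference):
        raise IndexError("variant position is outside the sequence")
    return reference[:position] + alt_base + reference[position + 1:]


def variant_effect(
    predict: Callable[[str], List[float]],
    reference: str,
    position: int,
    alt_base: str,
) -> List[float]:
    """Per-letter difference between mutated and non-mutated predictions."""
    ref_track = predict(reference)
    alt_track = predict(apply_variant(reference, position, alt_base))
    return [alt - ref for ref, alt in zip(ref_track, alt_track)]


reference = "ATGCATTA"
# Swap the A at position 4 for a G and see where the predicted signal changes.
effect = variant_effect(toy_predict, reference, 4, "G")
print(effect)
```

In this toy case the only change appears at the mutated letter. With a model that reads long stretches of DNA at once, a single substitution could in principle shift predictions far from the variant itself, which is why both sequence length and resolution matter.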

“AlphaGenome can accelerate our understanding of the genome by helping to map where functional elements are found and what their roles are at the molecular level,” said Natasha Latysheva, co-author of the study.

The model has already been tested by 3,000 scientists in 160 countries and can be used by anyone for non-commercial purposes, Google said.

“We hope researchers will expand it with more data,” Kohli added.

‘Breakthrough’

Ben Lehner, a University of Cambridge researcher who was not involved in the development of AlphaGenome but has tested it, said the model “works very well indeed.”

“Identifying the precise differences in our genomes that make us more or less susceptible to thousands of diseases is a key step toward developing better treatments,” he explained.

However, AlphaGenome “is far from perfect and there is still much work to be done,” he added.

“AI models are only as good as the data used to train them” and existing data is not very suitable, he said.

Robert Goldstone, head of genomics at the UK’s Francis Crick Institute, cautioned that AlphaGenome was “not a silver bullet for all biological questions.”

This is partly because “gene expression is influenced by complex environmental factors that the model cannot see,” he explained.

However, the tool still represents a “breakthrough” that would allow scientists to “study and simulate the genetic roots of complex diseases,” Goldstone added.
