New AI decodes DNA, revolutionises genomics

GROVER predicts DNA sequences, extracts contextual information, and identifies gene promoters and protein binding sites.

Scientists have developed GROVER, an AI model that decodes human DNA and could transform genomics and personalised medicine.

Scientists have developed GROVER, an AI model trained to decode human DNA. This tool, created by a team at the Biotechnology Center of Dresden University of Technology, treats DNA as text, learning its rules and context to draw out functional information from sequences. The work, published in Nature Machine Intelligence, has the potential to revolutionise genomics and accelerate personalised medicine.

Understanding DNA’s complex language has been a longstanding challenge. While only 1–2% of the genome consists of genes that code for proteins, the rest contains sequences with multiple functions, many of which remain a mystery. Dr. Anna Poetsch and her team believe AI can help make sense of these non-coding regions. GROVER, trained on a reference human genome, has shown that it can predict DNA sequences and extract contextual information, for example by identifying gene promoters and protein binding sites.

GROVER’s development involved creating a DNA dictionary. Using a method inspired by compression algorithms, the team analysed the genome to find common multi-letter combinations, fragmenting the DNA into ‘words’ that improved GROVER’s predictive accuracy. This approach distinguishes GROVER from previous attempts and enhances its ability to decode the genetic language.
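To make the dictionary idea concrete, here is a minimal sketch of one well-known compression-derived technique, byte-pair encoding, applied to a toy DNA string. The article does not spell out the team's exact procedure, so this is an assumption about the general approach rather than GROVER's actual code. The function name `learn_dna_vocabulary`, the toy sequence, and the number of merges are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def learn_dna_vocabulary(sequence, num_merges):
    """Greedily merge the most frequent adjacent token pair, BPE-style.

    Illustrative only: assumes a byte-pair-encoding-like scheme, which
    may differ from the procedure used to build GROVER's dictionary.
    """
    tokens = list(sequence)  # start from single nucleotides
    merges = []              # multi-letter 'words' learned so far
    for _ in range(num_merges):
        # Count every adjacent pair of tokens in the current segmentation.
        pair_counts = Counter(zip(tokens, tokens[1:]))
        if not pair_counts:
            break
        (left, right), count = pair_counts.most_common(1)[0]
        if count < 2:
            break  # nothing repeats, so no further merge is useful
        merges.append(left + right)
        # Rewrite the token list, fusing each occurrence of the chosen pair.
        merged = []
        i = 0
        while i < len(tokens):
            if i + 1 < len(tokens) and tokens[i] == left and tokens[i + 1] == right:
                merged.append(left + right)
                i += 2
            else:
                merged.append(tokens[i])
                i += 1
        tokens = merged
    return merges, tokens


# Toy example: a short made-up sequence, not real genomic data.
toy_sequence = "ATATATGCGCATAT"
words, segmented = learn_dna_vocabulary(toy_sequence, num_merges=2)
print("learned words:", words)
print("segmentation:", segmented)
```

Each round adds one new 'word' to the dictionary, so the number of merges controls how large and how coarse the vocabulary becomes. A real genome-scale version would need a far more efficient implementation than this quadratic toy loop.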

Dr. Poetsch and her colleagues are optimistic about GROVER’s impact on genomics. By understanding the rules of DNA through a language model, they hope to uncover deeper biological meanings, advancing both genomics and personalised medicine. GROVER promises to unlock the layers of genetic code, revealing crucial information about human biology, disease predispositions, and treatment responses.