NEW DELHI, May 12:
Researchers have released a draft of the first human pangenome – a new, usable reference for genomics that combines the genetic material of 47 individuals from different ancestral backgrounds to allow for a deeper, more accurate understanding of worldwide genomic diversity.
The pangenome was produced by the Human Pangenome Reference Consortium (HPRC), a government-funded collaboration between more than a dozen research institutions in the United States and Europe, launched in 2019.
By adding 119 million bases – “letters” in DNA sequences – to the existing genomics reference, the pangenome represents human genetic diversity in a way that is not possible with a single reference genome.
Researchers from The Rockefeller University, US, who were involved in the project, called the single reference genome a “flawed tool”.
One of its biggest problems, they said, was that about 70 per cent of its data came from a single man of predominantly African-European background whose DNA was sequenced during the Human Genome Project, the first effort to capture all of a person’s DNA.
As a result, they said, it can tell us little about the 0.2 to 1 per cent of genetic sequence that makes each of the seven billion people on this planet different from one another. This creates an inherent bias in biomedical data that is believed to be responsible for some of the health disparities affecting patients today.
Researchers from the University of California, Santa Cruz (UCSC), US, also part of the collaboration, said that this reference was nearly 20 years old and fundamentally limited in that it could not represent the wealth of genetic variation present in the human population.
They said that the human pangenome was highly accurate and more complete, and that it dramatically increased the detection of variants in the human genome.
A collection of research papers on the human pangenome has been published in the journals Nature and Nature Biotechnology.
“We are introducing more diversity and equity into the reference by sampling diverse human beings and including them in this structure that everyone can use,” said UCSC’s Associate Professor of Biomolecular Engineering, Benedict Paten, who is the senior author on the main marker paper.
“One genome isn’t enough to represent everybody – the pangenome will ultimately be something that is inclusive and representative,” said Paten.
The pangenome is a reference that combines the genomes of 47 individuals from various ancestral backgrounds. It looks like a linear reference in areas where the sequences have the same bases, and expands to show the areas where there are differences. It represents many different versions of the human genome sequence at the same time, and gives scientists a more accurate point of comparison for variation that is present in some populations but not others, the UCSC researchers said.
The pangenome was made possible, they said, through the development of advanced computational techniques to align the multiple genome sequences into one usable reference, in a structure called a pangenome graph.
Paten and researchers in the UCSC Computational Genomics lab helped develop the algorithmic methods needed to create this pangenome graph structure.
Because of the methods used in this project, all of the genomes within the pangenome reference are of extremely high quality and accuracy, covering more than 99 per cent of each human genome with more than 99 per cent accuracy, they said. (PTI)