A More Diverse Human Genome: The 'Pangenome'
Last year, gene researchers made news by announcing the first complete sequence of the human genome.
That effort has now been expanded, with researchers using that success as a springboard to create a comprehensive and sophisticated collection of genome sequences that more accurately captures human diversity.
The new “pangenome” includes the genome sequences of 47 different people, according to a report published May 10 in the journal Nature.
Since everyone carries a paired set of chromosomes, the reference holds data from 94 distinct genome sequences, according to the report.
Researchers with the international Human Pangenome Reference Consortium want to increase the number of participants to 350 by mid-2024, boosting the pangenome's contents to include 700 distinct genome sequences.
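The arithmetic behind those figures is simple: each participant contributes two chromosome sets, one inherited from each parent. A minimal sketch of that calculation follows; the function and constant names are illustrative and do not come from the consortium.

```python
# Each person carries a paired set of chromosomes, so one participant
# contributes two distinct genome sequences (illustrative constant).
SEQUENCES_PER_PERSON = 2

def distinct_sequences(participants: int) -> int:
    """Return the number of genome sequences contributed by the participants."""
    return participants * SEQUENCES_PER_PERSON

current = distinct_sequences(47)    # participants in the published pangenome
planned = distinct_sequences(350)   # consortium's stated target
print(current, planned)
```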
A genome is the set of DNA instructions that helps each living creature develop and function, the researchers explained in background notes.
The genomes of any two people are on average more than 99% identical, but the small differences contribute to each person's uniqueness, including their susceptibility to disease and their response to medical treatments.
To understand these genomic differences, scientists create reference human genome sequences to which they can refer as a “standard.”
The original reference human genome sequence is nearly 20 years old. It has been regularly updated as gene technology improves and researchers uncover more regions of the human genome. It was completed last year thanks to technological advances that allowed scientists to fill in gaps where information was missing.
But that sequence is limited in its representation of the diversity of the human species, because it consists of genomes from only about 20 people. In fact, most of the reference sequence is derived from just one person's genetics.
“Everyone has a unique genome, so using a single reference genome sequence for every person can lead to inequities in genomic analyses,” explained co-researcher Adam Phillippy, a senior investigator in the Computational and Statistical Genomics Branch with the National Human Genome Research Institute's Intramural Research Program, in Bethesda, Md.
“For example, predicting a genetic disease might not work as well for someone whose genome is more different from the reference genome,” Phillippy said in an institute news release.
Building upon that first complete human genome, researchers used advanced computational techniques to align the additional genome sequences and construct a new human pangenome reference.
While the previous reference genome sequence was single and linear, the new pangenome represents many different versions of the human genome sequence at the same time. This gives researchers a wider range of options for using the pangenome in analyzing other human genome sequences.
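One common way to picture the difference is as a graph: stretches of DNA are nodes, and each person's sequence is a path through them. The sketch below is a toy illustration under that assumption, not the consortium's actual data model or software; the segment names, sample names, and DNA strings are all made up.

```python
# Toy sequence graph: each node is a short stretch of DNA (made-up data).
segments = {
    "s1": "ACGT",
    "s2": "TTGACC",  # extra stretch carried by only some genomes
    "s3": "GGA",
}

# Each genome sequence is a walk through the graph (hypothetical names).
paths = {
    "linear_reference": ["s1", "s3"],
    "sample_A_copy1": ["s1", "s2", "s3"],
    "sample_A_copy2": ["s1", "s3"],
}

def spell(path_name: str) -> str:
    """Join the segments along one path into a plain DNA string."""
    return "".join(segments[node] for node in paths[path_name])

def segments_missing_from(path_name: str) -> set:
    """Return the segments in the graph that the given path never visits."""
    return set(segments) - set(paths[path_name])

print(spell("linear_reference"))
print(spell("sample_A_copy1"))
print(segments_missing_from("linear_reference"))
```

In this toy example, the single linear path skips segment s2 entirely, so an analysis that compares new genomes only against that path has no place to record it. A graph that stores every path keeps the extra stretch available for comparison.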
“By using the pangenome reference, we can more accurately identify larger genomic variants called structural variants,” said co-researcher Mobin Asri, a doctoral student at the University of California, Santa Cruz. “We are able to find variants that were not identified using previous methods that depend on linear reference sequences.”
Until now, researchers have been unable to identify the majority of structural variants that exist in each human genome, because they only had a single sequence for reference.
“The human pangenome reference will enable us to represent tens of thousands of novel genomic variants in regions of the genome that were previously inaccessible,” said co-researcher Wen-Wei Liao, a doctoral student at Yale University in New Haven, Conn. “With a pangenome reference, we can accelerate clinical research by improving our understanding of the link between genes and disease traits.”
The Human Pangenome Reference Consortium's effort is projected to cost about $40 million over five years, as researchers continue to add more genome sequences and improve the quality of the pangenome reference. The work is funded by the National Human Genome Research Institute (NHGRI), a part of the U.S. National Institutes of Health.
“Basic researchers and clinicians who use genomics need access to a reference sequence that reflects the remarkable diversity of the human population. This will help make the reference useful for all people, thereby helping to reduce the chances of propagating health disparities,” said NHGRI director Dr. Eric Green.
The Human Pangenome Reference Consortium has more about the new pangenome.
SOURCE: National Human Genome Research Institute, news release, May 10, 2023