Non-coding DNA is central to understanding human gene regulation and complex diseases, and analyzing evolutionarily constrained sequences can reveal the functional relevance of regulatory elements in the human genome. Comparative studies between the human genome and other mammalian genomes have revealed a large number of constrained genes and regulatory elements. However, because of the short evolutionary distance between primates, identifying primate-specific constrained sequence elements is highly challenging. The selective constraints unique to the phylogenetic branch that ultimately led to the emergence of the human species remain largely undetermined.

The article was published in Nature.
Kyle Kai-How Farh, Vice President of Artificial Intelligence at Illumina, said: "We have discovered tens of thousands of regulatory elements that have only emerged in recent evolutionary history, which are unique to primates and humans and do not exist in other mammals."
Whole Genome Comparison of 239 Primate Species
Primate-Constrained Protein-Coding Sequences
Increasing the number of primate species in the multiple sequence alignment (MSA) to 239 extends the phylogenetic branch length 2.8-fold compared with the 43 primate species previously available in the Zoonomia study. The researchers used phyloP to evaluate per-base constraint in the MSA across the entire genome and found that 3.1% of the bases in the human genome are constrained across primates, while at the same threshold 7.1% of bases are constrained across 240 mammalian species.
At the same time, the research team used phastCons to detect 157 Mb of constrained sequence elements in primates and found that protein-coding DNA (including exons, start codons, and stop codons) was highly enriched in phastCons elements; cis-regulatory elements (CREs), comprising transcribed regions, accessible chromatin, and non-coding DNA occupied by transcription factors, were also significantly enriched. Codon constraint exhibits a periodic pattern that can distinguish exons from nearby intronic sequence at the nucleotide level (Figure 1e). The researchers identified 179,329 exons with evidence of constraint in primates; 99% of these are also broadly constrained in non-primate mammals and vertebrates, while 2,178 exons are constrained specifically in primates. Most of these primate-specific constrained exons (72%) are annotated as protein-coding in the homologous regions of the mouse genome, suggesting that they are not newly evolved coding sequences but pre-existing ones that came under selective constraint in primates. These results indicate that the evolution of new protein-coding genes or exons from existing sequence is rare, while an increase in the functional importance of previously existing exons is more common but still a rare event.
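The periodic codon pattern mentioned above can be illustrated with a toy calculation. The sketch below is not the method used in the study; it only assumes that per-base constraint scores are available as a list of numbers and that the third (wobble) position of each codon tends to be less constrained. The function names and the score values are hypothetical.

```python
from statistics import mean

def phase_means(scores):
    """Mean constraint score at each of the three codon phases (0, 1, 2)."""
    if len(scores) < 3:
        raise ValueError("need at least one full codon (3 scores)")
    return [mean(scores[phase::3]) for phase in range(3)]

def periodicity_contrast(scores):
    """Gap between the most and least constrained phase.

    A large gap suggests a 3-base periodic pattern (exon-like);
    a gap near zero suggests no periodicity (intron-like).
    """
    means = phase_means(scores)
    return max(means) - min(means)

# Hypothetical per-base scores: in the toy exon every third base
# (the wobble position) is weakly constrained; the toy intron is flat.
toy_exon = [2.0, 2.2, 0.3] * 4
toy_intron = [0.2] * 12

print(f"exon contrast:   {periodicity_contrast(toy_exon):.2f}")
print(f"intron contrast: {periodicity_contrast(toy_intron):.2f}")
```

Because the contrast takes the maximum and minimum over all three phases, it does not require knowing the reading frame in advance.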

Constrained Cis-Regulatory Elements (CREs) in Primates
The researchers estimated the average sequence constraint in primates and mammals across a high-resolution map of 1.2 million DNase I hypersensitive sites (DHS) from 438 cell types. The results showed that 42% of DHS elements exhibited evidence of sequence constraint across species that diverged about 100 million years ago, while 11% showed significant evidence of constraint in primates but lacked such evidence in mammals or vertebrates (Fig. 2a, b).
Within these DHS elements, occupancy by transcription factors protects the DNA from DNase I cleavage, producing footprints that mark transcription factor binding sites (TFBS) at nucleotide resolution. Among 3.6 million TFBS footprints, 30% show evidence of broad mammalian constraint and 8% show primate-specific constraint. Notably, 66% of primate-specific constrained DHS elements harbor TFBS that are conserved in mammals, indicating that the regulatory function initially evolved in a common ancestor (Figure 2c).
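The split between broadly constrained and primate-specific elements can be expressed as a simple classification. The following is a minimal sketch, assuming each element already carries two yes/no calls (constrained in primates, constrained in mammals); how those calls are made is outside its scope. The labels, function names, and toy data are hypothetical and do not reproduce the study's pipeline or its percentages.

```python
from collections import Counter

def classify(primate_constrained, mammal_constrained):
    """Label one element from two boolean constraint calls."""
    if mammal_constrained:
        return "mammal-constrained"
    if primate_constrained:
        return "primate-specific"
    return "no evidence"

def category_fractions(elements):
    """Fraction of elements in each category.

    `elements` is a list of (primate_constrained, mammal_constrained) pairs.
    """
    counts = Counter(classify(p, m) for p, m in elements)
    total = len(elements)
    return {label: n / total for label, n in counts.items()}

# Hypothetical toy data: 10 elements with made-up constraint calls.
toy_elements = (
    [(True, True)] * 4      # constrained in both clades
    + [(True, False)] * 1   # constrained in primates only
    + [(False, False)] * 5  # no constraint evidence
)

fractions = category_fractions(toy_elements)
for label, fraction in fractions.items():
    print(f"{label}: {fraction:.0%}")
```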

Figure 2. Identification of non-coding regulatory elements with primate-specific constraint. Source: Nature
Next, the researchers examined whether mutations that disrupt primate-constrained regulatory elements are under selection in modern human populations. Loss-of-function mutations in the predicted target genes of primate-specific elements were significantly fewer than expected (Fig. 3a). In addition, increased mutational constraint was observed in non-coding primate-specific constrained elements (Fig. 3b), indicating that primate-specific constrained regulatory elements have important cis-regulatory functions in humans.

Figure 3. Characteristics of constrained regulatory elements. Source: Nature
The researchers also identified 74.6 million sites in the human genome that are fully constrained across the 239 primate species. Further analysis revealed that fine-mapped variants for clinical phenotypes and complex traits are enriched in all categories of distal accessible chromatin elements and footprints, including those with primate-specific constraint (Figure 4a). Variants with a high impact on gene expression tend to reside in more deeply constrained DHS elements and footprints, while variants with a smaller effect on gene expression tend to reside in elements with more recent constraint (Figure 4b). Of the fine-mapped variants in CREs, 12% are constrained only in primates and not in placental mammals, including 93 potentially pathogenic regulatory mutations linked to human complex traits and clinical phenotypes.

The research team identified hundreds of thousands of constrained non-coding sequence elements by analyzing the genomes of 239 primate species, including 187 newly assembled primate genomes. These CREs are a unique evolutionary record, offering a view into the mechanisms behind the recent evolution of species. The study found that many human CREs that previously showed no evidence of sequence constraint are in fact constrained only in primates, substantially expanding the number of known constrained non-coding elements in the human genome.
