Nature | AI-Designed 'Synthetic' Genome Editor OpenCRISPR-1 Could Reshape the Gene Editing Landscape

Aug 16, 2025 14:30 CST Updated 14:30
Profluent

Protein Designer

Written by | Magnolia Oar

The CRISPR-Cas system, represented by SpCas9, has brought revolutionary breakthroughs to gene editing and shows great potential in both basic biomedical research and clinical translation. Beyond classical gene editing mediated by the Cas9 nuclease, researchers have also developed a series of derivative tools based on Cas9. For example, by fusing base deaminases, reverse transcriptases, and other functional components to Cas9, they have built precise genome editing systems that can perform various single-base conversions as well as fragment insertions and deletions: base editors (BEs) and prime editors (PEs) [1]. However, existing CRISPR-Cas systems are still limited by the inherent constraints of natural protein evolution, and patent disputes centered on the Cas9 nuclease also restrict the broad application of gene editing technology [2]. Researchers have modified Cas9 through directed evolution and structure-guided protein engineering, but these strategies depend on clear structural hypotheses and mechanistic studies and are therefore significantly limited. In recent years, the rapid development of artificial intelligence (AI) has brought revolutionary progress to protein design [3-4]. It has also shown researchers the possibility of an "artificial" CRISPR system, offering a new direction for breaking through the bottleneck of gene editing technology.

On July 30, 2025, Ali Madani's team at the American AI protein design company Profluent published a paper in Nature titled "Design of highly functional genome editors by modelling CRISPR–Cas sequences". Using artificial intelligence and large language models, the study designed millions of Cas9-like effector proteins. Among them, several novel Cas9-like proteins, represented by OpenCRISPR-1, have low sequence similarity to the classical SpCas9 nuclease (differing at around 400 amino acid sites) yet show comparable activity and specificity. This new tool brings fresh ideas and vitality to the field of gene editing. Because it is free and open source for both research and commercial use, it is expected to offer a new route around the Cas9 patent restrictions.

Generative protein language models are pre-trained on vast numbers of natural protein sequences with diverse phylogenetic origins and functions, which makes them difficult to apply directly to designing proteins of a specific type and function. To develop CRISPR gene-editing tools more efficiently, the researchers used extensive data mining to build the CRISPR–Cas Atlas, a dataset containing more than 1.2 million CRISPR-Cas operons. They fine-tuned the ProGen2 large language model on this dataset to adapt it to CRISPR-Cas protein design and then used it to design Cas-like proteins. The generated sequences expand the diversity of natural CRISPR-Cas proteins 4.8-fold. Subsequent analysis of sequence novelty and predicted structure showed that the designed proteins rival natural evolution in sequence novelty, and that AlphaFold2 (AF2) predicts structures with high confidence for more than 80% of them, suggesting that these artificially designed proteins may be biologically functional.

Given the widespread use of type II CRISPR-Cas9 in gene editing, the researchers used the large language model, together with nearly 240,000 natural Cas9 sequences, to design nearly a million Cas9-like sequences. These artificially designed Cas9-like proteins are comparable to natural proteins in sequence diversity and novelty, yet differ substantially from them in sequence, with an overall sequence identity of only 56.8%. The study also found that the sequence lengths of the designed proteins closely match those of natural proteins, and that for most designed proteins the AF2-predicted structure has high confidence and contains the core structural domains of native Cas9, including the HNH and RuvC nuclease domains, the PAM-interacting domain, and the REC domain. Notably, this high structural conservation is not clearly tied to sequence conservation: some designed proteins share less than 40% sequence identity with natural proteins, yet their structures remain highly conserved. Beyond the protein itself, researchers can also design functional gRNA components with large language models.
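To make the notion of sequence identity used above concrete, here is a minimal Python sketch. It assumes the two protein sequences have already been aligned (gaps written as "-") and counts identity only over columns where neither sequence has a gap. The function names and the toy sequences are hypothetical and are not taken from the paper, which may define identity with a different denominator.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: inputs are two rows of a pairwise alignment of equal length,
    with '-' marking gaps. Other identity definitions (for example, dividing
    by the full alignment length) would give different values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def count_substitutions(aligned_a: str, aligned_b: str) -> int:
    """Number of ungapped columns where the two residues differ."""
    return sum(
        1
        for a, b in zip(aligned_a, aligned_b)
        if a != "-" and b != "-" and a != b
    )


# Toy, made-up aligned fragments (not real Cas9 sequences).
natural = "MDKKYSIGLD"
designed = "MDKR-SLGLD"

identity = percent_identity(natural, designed)
substitutions = count_substitutions(natural, designed)
```

In this toy pair, nine columns are ungapped and seven of them match, so the identity is 7/9 and there are two substitutions. Producing the alignment itself is a separate step that this sketch does not cover.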

Next, the researchers designed Cas9-like proteins with a generation strategy constrained by either the N-terminal fragment of SpCas9 or its C-terminal PAM-interacting domain (PID). The resulting proteins are compatible with the SpCas9 sgRNA system and are therefore suitable for functional validation in human cells. From 350,000 artificially designed Cas9-like proteins, the researchers selected 209 for functional validation. Several of them showed activity comparable to, or even better than, SpCas9 at three endogenous sites in HEK293T cells, and their activity correlated with language model (LM) scores. The researchers then evaluated in depth 48 novel proteins built by combining N-terminal fragments with PID domains. These proteins differ substantially in sequence from both natural Cas9 and the engineered Cas9 variants in patent databases. Several of them showed high editing activity and specificity, and some even outperformed SpCas9. To validate the effectiveness of large language models for Cas9-like protein design, the researchers also measured the activity of various natural Cas9 proteins and of Cas9-like proteins generated by other design strategies. The results confirmed that the language-model-based design approach offers significant advantages.
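One way to picture the reported relationship between language model scores and editing activity is a rank correlation. The sketch below computes a Spearman correlation in plain Python. The choice of Spearman is an assumption made for illustration (the article does not say which statistic the study used), and the numbers are made-up toy values, not data from the paper.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy values: a per-protein LM score (higher = more likely
# under the model) and a measured editing rate in percent.
lm_scores = [-1.9, -1.2, -0.8, -1.5, -0.6]
editing_pct = [5.0, 30.0, 42.0, 38.0, 55.0]

rho = spearman(lm_scores, editing_pct)
```

A value of `rho` near 1 would mean that proteins ranked higher by the model also tend to rank higher in measured activity. With these toy values, only two proteins swap places between the two rankings.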

Among the characterized Cas9-like proteins, PF-CAS-182 (later renamed OpenCRISPR-1) not only shows on-target editing activity comparable to SpCas9 but also has significantly lower off-target editing; SITE-Seq off-target analysis further confirms its high editing specificity. The study also notes that OpenCRISPR-1 has lower immunogenicity. Sequence analysis shows that OpenCRISPR-1 contains 1,380 amino acids and differs markedly from SpCas9 (403 sites mutated) and from known natural proteins (at least 182 sites mutated). AF2 structure prediction shows that these sequence differences are mostly concentrated on the solvent-exposed surface of the protein, with a small number located at the protein-nucleic acid interface. In addition, the REC1 and HNH domains of OpenCRISPR-1 contain two loop insertions, which may help optimize the interaction between the protein and the sgRNA as well as conformational stability. The researchers further validated the high editing activity of OpenCRISPR-1 at additional endogenous sites and built an efficient base editor by fusing it with the base deaminase ABE8.20. They also used large language models to design variants of the base deaminase TadA, successfully generating "artificial" base deaminases with high editing activity. Finally, they designed sgRNA components tailored to Cas9-like proteins and found that at least 31 artificially designed sgRNAs showed higher activity than the classical SpCas9 sgRNA.

In summary, this study used a protein large language model with targeted training to develop a new class of Cas9 systems, represented by OpenCRISPR-1, whose editing activity and specificity match or even surpass those of the classical SpCas9. On this basis, the researchers also designed sgRNAs and base deaminases, the core component of base editors, achieving design of every component of the gene editing system. The study provides a new strategy for developing further novel gene editing tools, and the free and open-source nature of OpenCRISPR-1 offers new possibilities for breaking through the patent barriers that restrict the widespread application of gene editing technologies.

Original link
https://doi.org/10.1038/s41586-025-09298-z


Plate Maker: Eleven



References


1. Pacesa, M., Pelea, O. & Jinek, M. Past, present, and future of CRISPR genome editing technologies. Cell 187, 1076–1100 (2024).
2. Court reignites CRISPR patent dispute. Nat. Biotechnol. 43, 835 (2025).
3. Nijkamp, E., Ruffolo, J. A., Weinstein, E. N., Naik, N. & Madani, A. ProGen2: exploring the boundaries of protein language models. Cell Syst. 14, 968–978 (2023).
4. Ruffolo, J. A. & Madani, A. Designing proteins with language models. Nat. Biotechnol. 42, 200–202 (2024).


Reprint Notice


[Original Article] This is a BioArt original article. Sharing and forwarding by individuals is welcome, but reprinting without permission is prohibited. The copyright of all published works belongs to BioArt. BioArt reserves all legal rights, and violators will be held accountable.







