“Previously, we provided data products to pharmaceutical companies. In the course of our interactions with them, we observed that the sales of innovative drugs are no longer limited to the ‘product-based sales’ model driven by the general relationship-building services typical of conventional medicines. Instead, there is a greater need to understand the pharmacology and treatment regimens of innovative drugs, thereby linking pharmaceutical companies’ medical perspectives and promotional strategies with physicians’ diagnostic and therapeutic approaches. Faced with such highly personalized ‘sales solutions,’ pharmaceutical commercialization teams have not yet fully adapted, resulting in persistently low efficiency.”
Gu Fei, Founder and CEO of Zero Hypothesis, was interviewed by VCBeat.
“Number of Newly Approved Drugs,” “Number of Marketing Authorization Applications,” and “Number of First-Time Marketing Authorization Applications”—these three key indicators clearly demonstrate that China’s innovative drug industry is developing rapidly with strong momentum, on the verge of an explosion in new drug approvals.

Research Report: The Tide of the Innovation Era Arrives, China Enters a Harvest Period for New Drugs - 20220104 (Image provided by Zero Hypothesis)
Clinical trial data show that the number of Phase I clinical trial projects increased by 118, 54, and 172 in 2019, 2020, and 2021, respectively. In recent years, numerous niche sectors in China, such as mRNA tumor neoantigen vaccines and AI-enabled CNS drugs, have seen their first pipeline candidates receive IND acceptance, with a large number of innovative drug pipelines poised for launch.
“About five or six years ago, through our communications with clients, we recognized that the ‘inflection point’ for innovative drugs was inevitably approaching, and the ensuing shift in business models would present a significant opportunity,” shared Gu Fei, offering his insights on the innovative drug industry.
In recent years, domestically developed innovative drugs have reached the market one after another. The substantial money, effort, and time invested in R&D have made it increasingly urgent to recoup those costs and achieve positive cash flow. However, the challenges of commercializing innovative drugs are only beginning to receive due attention.
Academic content for generic drugs is simpler than that for innovative drugs, making it easier for clinicians to make decisions. Traditional pharmaceutical companies have therefore consistently adopted a “product-based sales” model for generic drugs. However, the clinical application scenarios for innovative drugs are more complex and involve a broader range of considerations, so “product-based sales” is insufficient to support physicians’ decision-making. Consequently, a “solution-based” sales system is becoming the primary model for promoting innovative drugs.
With the transformation of sales models, the complexity of related information processing and workflows has increased exponentially. Pharmaceutical companies are in urgent need of a large number of medical representatives equipped with specialized knowledge of new drugs, while frontline sales teams also seek robust academic support to ensure efficient information delivery and concept guidance to physicians.
Behind these pain points, it is evident that pharmaceutical companies currently need to establish a scalable, flexible, and efficient “academic support system for innovative drugs.” This system should ensure the automation and high efficiency of the entire workflow of academic content—from design, production, and circulation to distribution and secondary utilization—thereby enabling marketing, sales, and other relevant personnel to communicate with physicians efficiently and accurately, and to rapidly reach peak sales targets within limited timeframes.
Gu Fei’s original intention in founding Zero Hypothesis was to build an automated production line for the large-scale generation and dissemination of professional medical content, centered on a “generative AI engine for medical academic content,” thereby providing comprehensive content and solution support for the commercialization of innovative drugs.
Can recently hyped general-purpose large language models, such as ChatGPT, be applied to academic content on innovative drugs? In Zero Hypothesis’s own empirical tests, GPT-4-powered ChatGPT could provide only relatively broad guidance on the use of innovative drugs, falling considerably short of the accuracy required for case-specific analysis and practical application. How, then, can AIGC in this domain overcome its “lack of fit” with real clinical needs?
Academic content on innovative drugs is fundamentally different from generalized public information, most notably in terms of data sources, model training methods, and application scenarios.
The continuous acquisition of complete, high-quality data is a critical foundation for the accuracy of large language models. The medical sector’s stringent demand for precision necessitates that source data used for AI training in this field possess strong “evidentiary” value. Generally, such data comprises publicly and semi-publicly available sources, as well as internal data held by enterprises or institutions, which are often inaccessible to general-purpose large language model developers.
Furthermore, optimizing AI models requires professional personnel to perform expert annotation of datasets, thereby enhancing the accuracy of the generated information. For innovative drugs, physicians with clinical experience in this field or medical professionals within pharmaceutical companies can carry out dataset annotation. However, it is difficult for general-purpose large language models to assemble experts from various specialized fields to annotate relevant information one by one, which consequently undermines the professionalism and credibility of the generated content.
Overall, medical scenarios involve human life and health; therefore, applications in this field, particularly AI applications in the academic domain of innovative drugs, have extremely high requirements for precision. On one hand, all data must be based on specialized disease knowledge graphs; on the other hand, ensuring the accuracy and reliability of each data node is essential to guarantee the usability of the generated information.
AIGC that integrates general-purpose large language models with specialized small models will generate immense value. Large language models can facilitate the generation of non-medical information and enable more comfortable human-computer interaction. For professional medical content, particularly in the innovative drug sector where precision is paramount, academic small models tailored to innovative drugs and medical devices, coupled with supporting AI engineering, will serve as the optimal support infrastructure.
Through rationalized division of labor and collaboration between large and small models, intelligent and precise services can be better delivered to diverse professional users in this field.
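The division of labor between large and small models can be pictured as a simple router. The following is a minimal sketch, not Zero Hypothesis’s implementation: the keyword set, the `is_medical` check, and the two stand-in model functions are all hypothetical placeholders for whatever classifier and models a real system would use.

```python
from typing import Callable

# Hypothetical keyword list standing in for a real medical-intent classifier.
MEDICAL_TERMS = {
    "dose", "dosage", "regimen", "indication",
    "contraindication", "adverse", "trial",
}


def is_medical(query: str) -> bool:
    """Return True if the query contains any of the placeholder medical terms."""
    words = {w.strip(".,?!").lower() for w in query.split()}
    return bool(words & MEDICAL_TERMS)


def general_model(query: str) -> str:
    """Stand-in for a general-purpose large language model."""
    return f"[general] {query}"


def specialist_model(query: str) -> str:
    """Stand-in for a specialized small model trained on medical content."""
    return f"[specialist] {query}"


def route(
    query: str,
    general: Callable[[str], str] = general_model,
    specialist: Callable[[str], str] = specialist_model,
) -> str:
    """Send medical queries to the specialist model and everything else to the general one."""
    handler = specialist if is_medical(query) else general
    return handler(query)
```

In this sketch the general model handles conversational or non-medical requests, while anything touching dosing, regimens, or trials is passed to the specialist model, mirroring the split the article describes.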
Faced with “a thousand treatment regimens for a thousand patients,” innovative pharmaceutical companies are overwhelmed by complexity, while clinicians are dazzled by the array of options.
Starting from the core demands of innovative drug commercialization, namely “high-quality integration” and “automated generation” of medical content, Zero Hypothesis leverages a self-developed specialized small language model to build a first-class “Generative AI Engine for Medical Academic Content.” This engine automatically analyzes and deconstructs primary source materials, such as medical literature, guidelines and consensus statements, conference reports, clinical studies, and case reports, as well as secondary source materials, including product dossiers, conference presentation slides, and WeChat articles. By integrating knowledge graphs, sample sets, and parameter sets, it constructs a library of professional academic content elements. Subsequently, based on specific scenarios such as new patient acquisition, case discussions, and patient follow-ups, the engine automatically generates the specialized academic content required by pharmaceutical companies at various stages of commercialization.
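The flow described above (deconstruct sources into tagged elements, then assemble them per scenario) can be illustrated with a toy data structure. This is only a sketch under stated assumptions: the `ContentBlock` and `ElementLibrary` classes, the `SCENARIO_TAGS` mapping, and the tag names are all hypothetical, and plain tag matching stands in for the knowledge graphs and generative model the article mentions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContentBlock:
    text: str
    source_type: str   # "primary" (literature, guidelines) or "secondary" (slides, dossiers)
    tags: frozenset    # hypothetical labels, e.g. disease and scenario tags


class ElementLibrary:
    """Toy library of tagged content elements."""

    def __init__(self) -> None:
        self._blocks: list[ContentBlock] = []

    def add(self, text: str, source_type: str, tags: set[str]) -> None:
        self._blocks.append(ContentBlock(text, source_type, frozenset(tags)))

    def select(self, required_tags: set[str]) -> list[ContentBlock]:
        """Return blocks that carry every required tag, in insertion order."""
        required = frozenset(required_tags)
        return [b for b in self._blocks if required <= b.tags]


# Hypothetical mapping from a commercialization scenario to the tag it needs.
SCENARIO_TAGS = {
    "new_patient_acquisition": {"new_patient"},
    "case_discussion": {"case"},
    "patient_follow_up": {"follow_up"},
}


def generate(library: ElementLibrary, scenario: str, disease: str) -> str:
    """Assemble content for a scenario and disease, listing primary sources first."""
    blocks = library.select(SCENARIO_TAGS[scenario] | {disease})
    # sorted() is stable, so blocks keep insertion order within each source type.
    ordered = sorted(blocks, key=lambda b: b.source_type != "primary")
    return "\n".join(b.text for b in ordered)
```

For example, after adding a slide excerpt and a guideline excerpt that both carry the tags `nsclc` and `new_patient`, calling `generate(library, "new_patient_acquisition", "nsclc")` returns the guideline text followed by the slide text.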
Through this engine, users can efficiently and rapidly generate content required for scenarios such as product strategy, market sales, clinical academia, and physician engagement. This facilitates the internal accumulation, organization, and utilization of medical content within innovative pharmaceutical and medical device companies. Furthermore, it enables key opinion leaders to quickly access the latest comprehensive information on complex and controversial issues, significantly enhancing the efficiency of their clinical practice and academic research, while fostering bidirectional resource linkage between physicians and enterprises.
In terms of efficiency, the use of the Zero Hypothesis AI platform can reduce the time cost involved in steps such as literature search, screening and filtering, content extraction, and editing from days or even weeks to just one day or even a few hours.
After five years of iterative development, Zero Hypothesis’s AI platform now covers 10 therapeutic areas and 45 specific disease subtypes, features a tagging system with over 120,000 labels, and has built more than 1 million knowledge points and content blocks, gradually strengthening its professional “moat.”
Zero Hypothesis’s product portfolio primarily comprises Express, Promote, and KnowS. The first two are mainly targeted at pharmaceutical companies, while the third is primarily designed for clinicians.
Generally speaking, the medical affairs department of a pharmaceutical company is primarily responsible for the comprehensive aggregation and processing of academic content, serving as the “academic content engine” of the enterprise; meanwhile, the sales department needs to facilitate physicians’ understanding of the drug’s mechanism of action and clinical application scenarios by disseminating relevant academic information. Based on these two premises, Zero Hypothesis Express and Promote were launched.
Express helps the medical affairs department organize content on innovative drugs across stages such as theory, R&D, and clinical trials in a way that is digital, efficient, cost-effective, and scalable. Promote then automatically generates sales communication scripts tailored to specific interaction scenarios, based on the academic content produced by Express.
“One-way efforts” are often less efficient than “two-way engagement,” which is why Zero Hypothesis created KnowS. This service provides automated content support tailored to scenarios such as frontier academic hotspots and critical clinical issues faced by physicians. On one hand, it strengthens clinicians’ academic advancement in their respective fields; on the other, it helps lower the communication barriers between doctors and pharmaceutical companies, potentially serving as a key leverage point for reaching consumer-end audiences.
In the long run, KnowS enables Express and Promote to form a true information closed loop, while increasingly raising the information barriers and switching costs between the two products.
Compared with traditional medical media and pharmaceutical IT companies, Zero Hypothesis achieves high-quality data deconstruction, reconstruction, and automated content generation in areas related to innovative drugs, including academia, clinical practice, and marketing. This ensures that every stakeholder encountered and every scenario faced during the commercial translation of innovative drugs receives accurate content support.
Currently, more than 20 leading global pharmaceutical companies have begun adopting Zero Hypothesis’s digital commercial solutions. The footprints of Zero Hypothesis can be seen in the strategies of top-tier global pharmaceutical giants such as Pfizer, Novartis, Roche, Bristol Myers Squibb, Sanofi, AstraZeneca, GSK, and Takeda, while biotech firms including BeiGene, Biogen, FibroGen, and Sumitomo Pharma have also chosen Zero Hypothesis.
The 45 subcategories across 10 therapeutic areas now covered by Zero Hypothesis include oncology drugs for breast cancer, ovarian cancer, non-small cell lung cancer, glioma, and acute lymphoblastic leukemia, as well as treatments for immune-mediated diseases such as atopic dermatitis, neurological disorders such as Alzheimer’s disease, psychiatric conditions such as bipolar disorder, and rare diseases such as spinal muscular atrophy.
With over 1 million knowledge points and content blocks, plus more than 100,000 data sources, Zero Hypothesis has helped users reduce content delivery time by 90% while improving delivery quality by over 60%. Zero Hypothesis estimates that, with the automated production of academic content, 60% of pharmaceutical companies’ commercialization activities hold significant potential for digitalization. By leveraging its AI engine and the “1+3” solution, which pairs that engine with three core products, pharmaceutical companies can restructure their commercialization budgets and achieve substantial cost reductions.
“With the end in mind, we are confident in becoming the most efficient platform connecting pharmaceutical companies and physicians in the commercialization of innovative drugs,” said Mr. Gu Fei.