From Neural Language Mechanisms to Mandarin Brain-Computer Interfaces: An Exclusive Interview with Prof. Yuanning Li of ShanghaiTech University

Oct 31, 2023 09:49 CST Updated 09:49

A brain-computer interface (BCI) directly connects the human brain with external devices. It monitors and records central nervous system activity and translates that activity into signals or commands the devices can understand, enabling direct communication and control between brain and machine.

Over the past decade, continuous technological innovation has brought BCIs into a critical phase of commercialization. Companies such as Neuralink have shown the market the many ways this technology could be turned into commercial products.

Traditional motor BCIs can help patients with paraplegia control electric wheelchairs and prosthetic limbs. Closed-loop BCIs based on electrical stimulation can be used to regulate emotions and treat mental disorders. Speech BCIs can restore the ability to communicate in patients with paralysis and speech impairments. Compared with motor BCIs, however, speech BCIs deal with a more complex object of study, and basic research on the neural mechanisms of language still has many gaps.

In recent years, international research on language brain-computer interfaces has achieved breakthroughs, exemplified by a series of papers published in Nature by Edward Chang’s team at the University of California, San Francisco, and Krishna Shenoy’s team at Stanford University, which have preliminarily realized the decoding and synthesis of speech and text from brain signals for English.

The Chinese market plays a pivotal role both in developing BCI technology and in expanding its application scenarios. However, Chinese is a tonal language: tones distinguish meaning, and its writing system and syntactic structures differ significantly from those of English. Consequently, English-based decoding mechanisms and algorithms cannot be applied directly to Chinese.

Human speech and writing systems share the same physiological basis, yet languages across different cultures and ethnic groups exhibit substantial specificity. This raises fundamental, unresolved scientific questions in language cognition: How does the human language system both share similar brain network architectures and underlying input features, while also displaying highly differentiated linguistic specificity, thereby transforming perceptual inputs into distinct linguistic information? What are the significant differences between Chinese and English? What are the neural signatures of Chinese speech processing? And how should the brains of 1.3 billion users operating on the “Chinese language system” interact with the external world through new media?

Professor Li Yuanning of ShanghaiTech University seeks answers at the intersection of neuroscience and computer science.

1 Joint Research

In layman’s terms, Li Yuanning’s experimental approach is to study the fundamental neural mechanisms underlying listening and speaking in Chinese by recording intracranial electroencephalography (iEEG) signals. He then applies these mechanisms to investigate how human language can be decoded, reconstructed, and synthesized from neural signals. On the engineering side, this work can lay the foundation for Chinese BCI systems and for synthesizing speech directly from thought.

Specifically, he collaborates with neurosurgeons and makes use of the opportunities that neurosurgical procedures present. With high-density electrodes implanted intracranially, he directly records the electrophysiological activity of neural populations and applies machine learning and artificial intelligence methods to analyze and model how the brain processes language. This line of research advances our understanding of fundamental questions in neuroscience. It also has significant clinical value: it helps elucidate the neural mechanisms of language disorders such as aphasia and alexia, which can improve their diagnosis and treatment, and it helps surgeons precisely preserve critical language and cognitive functions during neurosurgery. In addition, a deeper understanding of the neural mechanisms of human language can inspire more robust and efficient next-generation AI models for natural language processing, as well as novel brain-computer interaction and neuromodulation systems.

In June this year, Li Yuanning’s team published an article titled “Decoding and synthesizing tonal language speech from brain activity” in Science Advances, a journal in the Science family. The work was a collaboration with the teams of Professors Wu Jinsong and Lu Junfeng from the Department of Neurosurgery at Huashan Hospital Affiliated to Fudan University, and Professors Ming Dong and Xu Minpeng from Tianjin University.

Li Yuanning’s team developed a deep learning model designed specifically for Mandarin Chinese tones. Combining this model with high-density electrocorticography (ECoG) recorded in clinical settings, they analyzed the neural activity associated with Mandarin tones and syllable structures. Ultimately, they achieved the world’s first end-to-end synthesis of tonal, monosyllabic Mandarin speech directly from neural activity.

Figure: An end-to-end multi-stream neural network synthesizes Chinese speech from intracranial electroencephalography (iEEG).
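The idea of handling tones and syllables in separate streams can be illustrated with a toy sketch. This is not the team's model, which is a deep network that synthesizes audio; it only shows, under simple assumptions, how the outputs of two hypothetical decoders (one for tone, one for the base syllable) might be combined into a single tonal syllable. The vocabulary, the scores, and the function names are all made up for illustration.

```python
import math

TONES = [1, 2, 3, 4]        # the four lexical tones of Mandarin
SYLLABLES = ["ma", "mi"]    # hypothetical toy vocabulary of base syllables


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def decode_tonal_syllable(tone_scores, syllable_scores):
    """Combine two independent streams into one (syllable, tone) guess."""
    tone_probs = softmax(tone_scores)
    syllable_probs = softmax(syllable_scores)
    tone = TONES[tone_probs.index(max(tone_probs))]
    syllable = SYLLABLES[syllable_probs.index(max(syllable_probs))]
    # Assumes the two streams are independent, so probabilities multiply.
    confidence = max(tone_probs) * max(syllable_probs)
    return syllable, tone, confidence


# Made-up scores standing in for the outputs of two trained decoders.
syllable, tone, confidence = decode_tonal_syllable(
    tone_scores=[0.1, 0.2, 2.5, 0.3],
    syllable_scores=[1.8, 0.4],
)
print(f"{syllable}{tone}")  # prints "ma3"
```

In this toy setup the same base syllable "ma" maps to four different outputs depending on the tone stream, which is the property that makes tone decoding essential for Mandarin.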

For the first time, this study directly decoded and synthesized Chinese, a tonal language, from intracranial electroencephalography (iEEG). For potential future “implantable Chinese brain-computer interfaces,” it provides a feature engineering and processing approach grounded in neuroanatomical and electrophysiological characteristics, and it proposes a widely applicable deep learning framework, laying a foundation in both theory and technology.

Traditional BCI approaches based on motor decoding let users type on a keyboard or write with a cursor by thought, converting brain signals into text only indirectly. In contrast, this study directly captures and decodes the electrical signals in the brain that control speech production, achieving true end-to-end synthesis from brain signals to speech: “what you think is what you get.” This approach holds greater promise for helping patients with speech impairments regain fast, efficient, natural language expression.

2 Connecting the Dots

Li Yuanning, now an Assistant Professor at the School of Biomedical Engineering at ShanghaiTech University, did not receive formal training in biology or medicine; he did not even select biology as an elective for the National College Entrance Examination. Yet today, he can guide you with the expertise of a seasoned navigator through the brain’s anatomical structures, exploring the unique neural circuits underlying language perception and production.

Li Yuanning pursued his undergraduate studies at Beihang University, majoring in Electronic Information Engineering. His daily routine primarily revolved around deriving mathematical formulas and writing code. During an interdisciplinary lecture series organized by the School of Advanced Engineering, Li Yuanning casually attended a talk on computational neuroscience delivered by Tao Letian from Peking University’s School of Life Sciences. However, this presentation did not produce any dramatic, lightning-bolt moment of inspiration. “I didn’t remember anything, nor did I understand much; all that stuck with me was the notion that mathematics could be applied to studying the brain.” This sums up Li Yuanning’s recollection of the lecture.

Although the talk did not directly shape Li Yuanning’s research direction, it gave him a starting point from which, more than a decade later, he could look back and connect the dots.

After completing his undergraduate studies, Li Yuanning enrolled at Carnegie Mellon University in the United States to pursue a degree in Electrical and Computer Engineering. During his master’s program, a course titled Neural Signal Processing captured his attention—not only because the instructor, Professor Byron Yu, was highly experienced and had studied under Krishna Shenoy, the aforementioned pioneer in brain-computer interfaces, but also because the course textbook, Pattern Recognition and Machine Learning, is considered a classic in the field of computer science.

Why does neuroscience need machine learning? Li Yuanning’s curiosity was sparked, and it grew quickly. When the course covered how machine learning methods could be used to study basic brain functions, such as how the motor cortex encodes movement and controls arm motion in three-dimensional space, he suddenly realized that the knowledge he had accumulated in computer science, including supervised learning, unsupervised learning, Gaussian processes, and statistical inference, had led him step by step, like a game of Chinese checkers, to a fascinating application: neuroscience.

After spending a year in Professor Byron Yu’s laboratory, he chose to focus his doctoral research on neural computation and machine learning. This field is part of Carnegie Mellon University’s unique joint doctoral program, which adopts a computational perspective with neuroscience as its subject of study, aiming to cultivate researchers who specialize in the integration of brain science and artificial intelligence. During his doctoral studies, he received research training within a multidisciplinary team comprising cognitive neuroscientists, neurosurgeons, statisticians, and psychologists, collaborating on computational cognitive neuroscience research based on invasive intracranial electroencephalography (iEEG).

After earning his Ph.D., Li Yuanning joined the laboratory of Professor Edward Chang at the University of California, San Francisco (UCSF), to conduct postdoctoral research. Professor Chang is a member of the U.S. National Academy of Medicine and a leading authority in the fields of language neuroscience and neuroengineering. In this role, Li utilized invasive intracranial electrocorticography (ECoG) techniques combined with artificial intelligence methods to record, analyze, and model human auditory and language-related cognitive functions.

It was also during his tenure at UCSF that he began collaborating with the teams of Professor Wu Jinsong and Professor Lu Junfeng from the Department of Neurosurgery at Huashan Hospital, Fudan University. Using Chinese and English as comparative subjects, they explored the neural mechanisms underlying the universality and specificity of language perception and expression. This has become one of his ongoing research themes.

3 Cross-Integration

On July 28 this year, Li Yuanning shared his latest research findings at a thematic forum co-hosted by the Chinese Society for Neuroscience and the Tianqiao and Chrissy Chen Institute.

When you hear him present to Chinese and international scholars in near-native English, you immediately see that his mastery of, and passion for, language is no accident. The two linguistic systems and their underlying logic resemble two operating systems running on a single piece of hardware: independent yet mutually influential.

It is not only the humanities and the sciences that collide and intersect. During his seven years at Carnegie Mellon University, the unique atmosphere of Pittsburgh gave him countless inspirations. Home to both Carnegie Mellon University and the University of Pittsburgh, the city is a leading hub for computer science and neuroscience in the United States and worldwide.

The Center for the Neural Basis of Cognition, jointly established by the two universities, brings together more than 100 researchers from over ten departments, including Carnegie Mellon University’s Departments of Statistics, Psychology, Computer Science, Electrical and Computer Engineering, and Biomedical Engineering, as well as the University of Pittsburgh Medical Center’s Departments of Neuroscience, Neurology, Neurosurgery, and Psychiatry.

These more than 100 researchers and their doctoral students conduct neuroscience research from their respective areas of expertise. Spanning molecular and cellular biology, neural circuits, animal models, computation, psychology, cognition, and statistics, they have fostered a unique and vibrant atmosphere for interdisciplinary academic exchange between medicine and engineering in Pittsburgh.

Since returning to China, Li Yuanning has continued to advance his research through the same kind of lively interdisciplinary exchange. Shanghai’s world-class neurosurgical clinical center and robust neuroscience community have laid a solid foundation for this work: based in the School of Biomedical Engineering, he collaborates primarily with frontline neurosurgeons, while his own students come from backgrounds in computer science or electronics.

In these multi-party discussions, he acts as a “translator,” finding the language in which each side communicates most fluently, a role that fits well with his research on Chinese speech.

This integration is also reflected in Li Yuanning’s observations on technology and industry since his return to China. A cohort of innovative Chinese BCI companies is now collaborating openly with universities and clinical institutions. In his view, because BCI is a frontier field in which many projects originate in academia, more frequent exchange and more transparent competition are more conducive to technological iteration and development.

Beyond applying artificial intelligence to decode neural signals and synthesize speech and text, Li Yuanning and his collaborators also focus on the interdisciplinary integration of AI models with human cognitive processes. His latest findings reveal similarities between biological auditory processing and the deep neural network models used for Chinese and English speech recognition. The results offer a novel approach to understanding neural coding in the auditory cortex and provide a biological perspective on the interpretability of large-scale self-supervised pre-trained deep neural networks. This work is forthcoming in Nature Neuroscience, a sister journal of Nature. He plans to continue exploring the integration of pre-trained large language models with human language cognition.

4 Event Introduction

In the future, VCBeat’s Orange Fruit Bureau and the Tianqiao and Chrissy Chen Institute for Neuroscience’s science communication column, “AI Asks the Brain,” will jointly launch an online live dialogue series featuring Professor Li Yuanning. If you wish to learn more about neuroscience, brain-computer interfaces, and Professor Li Yuanning’s research interests, you are welcome to search for and follow the WeChat Channels account “Next Question” or scan the QR code to contact the author and reserve your spot for viewing.