Non-Invasive, AI-Enabled 'Semantic Decoder' Translates Brain Activity to Text
By John K. Waters
Researchers at the University of Texas at Austin may have found a way to use AI to help people who are mentally conscious but unable to physically speak (such as those debilitated by strokes) communicate intelligibly again—without putting chips in their heads.
A research team led by Jerry Tang, a doctoral student in computer science, and Alex Huth, an assistant professor of neuroscience and computer science, both at UT Austin, has developed a non-invasive "decoder" that reconstructs continuous language from cortical semantic representations recorded using functional magnetic resonance imaging (fMRI). The researchers reported their results in the journal Nature Neuroscience.
The work relies in part on a transformer model, similar to the ones that power OpenAI’s ChatGPT and Google’s Bard.
"Non-invasive" is the key word here. Unlike other language decoding systems currently in development, this system does not require subjects to have surgical implants. The system, which the researchers call a "semantic decoder," is trained on information from fMRI scans of the brains of subjects as they listen to hours of podcasts. The subject then listens to a new story or imagins telling a story, and the machine generates text from brain activity alone, the researchers said.
"For a noninvasive method, this is a real leap forward compared to what’s been done before, which is typically single words or short sentences," Huth said in a statement. "We’re getting the model to decode continuous language for extended periods of time with complicated ideas."
The result is not a word-for-word transcript, the researchers said. The decoder captures "the gist of what is being said or thought, albeit imperfectly."
"About half the time, when the decoder has been trained to monitor a participant’s brain activity, the machine produces text that closely (and sometimes precisely) matches the intended meanings of the original words," they said.
The researchers offered two examples of this phenomenon: A participant listening to a speaker say, "I don’t have my driver’s license yet" had their thoughts translated as, "She has not even started to learn to drive yet." Listening to the words, "I didn’t know whether to scream, cry, or run away. Instead, I said, 'Leave me alone!'" was decoded as, "Started to scream and cry, and then she just said, 'I told you to leave me alone.'"
In addition to having participants listen to or think about stories, the researchers asked subjects to watch four short, silent videos while in the scanner. The semantic decoder was able to use their brain activity to accurately describe certain events from the videos, they found.
The results for individuals on whom the decoder had not been trained were unintelligible, the researchers said, and if participants on whom the decoder had been trained later resisted in some way—for example, by thinking other thoughts—the results were similarly unusable.
The researchers also took pains to address likely questions about the potential misuse of this kind of technology. The paper explains that the decoder's training data came only from volunteers who were willing to share their thoughts.
"We take very seriously the concerns that it could be used for bad purposes and have worked to avoid that," Tang said. "We want to make sure people only use these types of technologies when they want to and that it helps them."
The system is not currently practical for use outside the laboratory because it depends on time in an fMRI machine, the researchers admitted, but they saw potential for the technology in portable brain-imaging systems, such as functional near-infrared spectroscopy (fNIRS).
"fNIRS measures where there’s more or less blood flow in the brain at different points in time, which, it turns out, is exactly the same kind of signal that fMRI is measuring," Huth said. "So, our exact kind of approach should translate to fNIRS. He added that the resolution with fNIRS would be lower.
This work was supported by the Whitehall Foundation, the Alfred P. Sloan Foundation, and the Burroughs Wellcome Fund. The study’s other co-authors are Amanda LeBel, a former research assistant in the Huth lab, and Shailee Jain, a computer science graduate student at UT Austin.
Huth and Tang have filed a PCT patent application related to this work.
John K. Waters is the editor in chief of a number of Converge360.com sites, with a focus on high-end development, AI and future tech. He's been writing about cutting-edge technologies and the culture of Silicon Valley for more than two decades, and he's written more than a dozen books. He also co-scripted the documentary film Silicon Valley: A 100 Year Renaissance, which aired on PBS. He can be reached at email@example.com.