In a breakthrough straight out of the world of science fiction, a team of researchers has used artificial intelligence (AI) to turn brain signals into computer-generated speech.
The feat was accomplished with the assistance of five epilepsy patients. All had been outfitted with various types of brain electrodes as part of their seizure treatment. This allowed the researchers to conduct very sensitive brain monitoring, called electrocorticography.
The end result represents a major leap forward towards the goal of brain-to-computer communication, investigators said.
Prior efforts in this direction “focused on simple computer models that were able to produce audio that sounded kind of similar to the original speech, but not intelligible in any way,” explained study author Nima Mesgarani. He’s an associate professor with Columbia University’s Zuckerman Mind Brain Behavior Institute, in New York City.
However, the new study used “state-of-the-art” AI “to reconstruct sounds from the brain that were much more intelligible compared to previous research,” Mesgarani said. “This is a huge milestone, and we weren’t sure we could reach it.”
Brain activity was tracked while each participant listened to short stories and number lists, as read to them by four different speakers. Brain signal patterns that were recorded while the patients listened to the numbers were then fed into a computer algorithm blindly, meaning without any indication of which pattern matched which number.
An artificial intelligence program designed to mimic the brain’s neural structure then went to work “cleaning up” the sounds produced by the algorithm. This is the same technology used by Amazon Echo and Apple Siri, the team noted.
The final product was a sequence of robotic-voiced audio tracks — both male and female — that articulated each number between zero and nine.
On playback, a select group of 11 listeners found that the computer-generated sounds were recognizable roughly 75 percent of the time, which the team said is a much higher success rate than previously achieved.
“Our algorithm is the first to generate a sound that is actually intelligible to human listeners,” said Mesgarani. And that, he added, means that longstanding efforts to properly decode the brain are finally coming to fruition.
“Our voices help connect us to our friends, family and the world around us, which is why losing the power of one’s voice due to injury or disease is so devastating. This could happen due to various reasons such as ALS [amyotrophic lateral sclerosis] or stroke,” resulting in what is known as “locked-in syndrome,” he added.
“Our ultimate goal is to develop technologies that can decode the internal voice of a patient who is unable to speak,” said Mesgarani.
Such innovations also mean better brain-computer interfacing, which would open up whole new platforms for man-machine communication, he added.
In that regard, Mesgarani said that future tests will focus on more complex words and sentence structure. “Ultimately,” he said, “we hope the system could be part of an implant, similar to those worn by some epilepsy patients, that translates the wearer’s imagined voice directly into words.”
Dr. Thomas Oxley is director of innovation strategy with the Mount Sinai Hospital Health System’s department of neurosurgery, in New York City. The ability of AI to read a person’s brain “raises significant ethical issues around privacy and security that research leaders need to be cognizant of,” he noted.