Thanks to a digital avatar, Ann has been able to speak with facial expressions for the first time in 18 years.
Researchers at the University of California, San Francisco (UCSF) and the University of California, Berkeley (UC Berkeley) have developed a brain-computer interface (BCI) that, thanks to artificial intelligence, has enabled a woman named Ann, who was severely paralyzed after a stroke, to speak again through a digital avatar. The researchers state in an article published in the journal Nature that this is the first time that speech and facial expressions have been synthesized from brain signals. The system can also decode these signals into text at nearly 80 words per minute, a vast improvement over commercially available technology.
A brain-computer interface translates the study participant's brain signals into the speech and facial movements of an animated avatar.
Dr. Edward Chang, professor of neurological surgery at UCSF, has been working on this BCI technology for more than a decade and hopes that this latest research breakthrough will lead, in the near future, to a U.S. Food and Drug Administration (FDA)-approved system that enables speech from brain signals. “Our goal is to reestablish a full, bodily form of communication, which is really the most natural way to talk to others. These advances bring us much closer to making it a real solution for patients”, says Chang, a member of UCSF's Weill Institute for Neurosciences.
Chang's team previously demonstrated that it was possible to decode brain signals into text in a man who had also suffered a brainstem stroke many years earlier. The current study demonstrates something more ambitious: decoding brain signals into the richness of speech, along with the movements that animate a person's face during conversation.
Chang implanted a thin rectangle of 253 electrodes on the surface of the woman's brain, over areas that his team found to be critical for speech. The electrodes intercepted brain signals that, but for the stroke, would have gone to the muscles of the tongue, jaw and larynx, as well as the face. A cable, attached to a port fixed to her head, connected the electrodes to a bank of computers.
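To make the setup concrete, the sketch below shows how a short window of activity from such an electrode array might be represented before decoding. Only the channel count (253) comes from the article; the sampling rate and window length are assumptions for illustration.

```python
import numpy as np

# Illustrative only: the channel count (253) matches the article, but the
# sampling rate and window length here are assumed, not reported values.
N_CHANNELS = 253          # electrodes in the implanted array
SAMPLE_RATE_HZ = 200      # assumed rate for band-limited neural features
WINDOW_SECONDS = 1.0      # assumed analysis window per decoding step

# One decoding step would consume a (channels x time) slice of neural activity.
window = np.random.randn(N_CHANNELS, int(SAMPLE_RATE_HZ * WINDOW_SECONDS))
print(window.shape)  # (253, 200)
```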
For weeks, the participant worked with the team to train the system's artificial intelligence (AI) algorithms to recognize her unique brain signals for speech. To do this, she repeated different phrases from a conversational vocabulary of 1,024 words over and over until the computer recognized the patterns of brain activity associated with the sounds. Instead of training the AI to recognize whole words, the researchers created a system that decodes words from phonemes. These are the subunits of speech that form spoken words in the same way that letters form written words. “Hello”, for example, contains four phonemes: “HH”, “AH”, “L” and “OW”.
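As a minimal sketch of the phoneme idea, the toy lexicon below composes decoded phoneme sequences back into words. The lexicon and helper function are hypothetical; a real system would use a full pronunciation dictionary and a language model rather than a hand-made mapping.

```python
# Toy pronunciation lexicon: phoneme sequence -> word. This is only to show how
# phoneme subunits compose into words, the way letters compose written words.
LEXICON = {
    ("HH", "AH", "L", "OW"): "hello",
    ("TH", "AE", "NG", "K", "S"): "thanks",
}

def phonemes_to_word(phonemes):
    """Return the word spelled by a decoded phoneme sequence, if known."""
    return LEXICON.get(tuple(phonemes), "<unknown>")

print(phonemes_to_word(["HH", "AH", "L", "OW"]))  # hello
```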
With this method, the computer only needed to learn 39 phonemes to decode any English word. This improved the accuracy of the system and made it three times faster. “Accuracy, speed and vocabulary are crucial”, asserts Sean Metzger, who developed the text decoder along with Alex Silva, both graduate students in the Joint Bioengineering Program at UC Berkeley and UCSF. “It's what gives a user the ability, over time, to communicate almost as fast as we do and have much more naturalistic and normal conversations”.
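One way to picture why 39 phoneme classes help is that the decoder's output layer stays small regardless of vocabulary size. The sketch below, written as a hypothetical PyTorch classifier with assumed feature and hidden sizes, shows an output of just 39 units rather than one unit per word.

```python
import torch
from torch import nn

N_PHONEMES = 39      # phoneme classes reported in the article
N_FEATURES = 253     # assumed: one feature per electrode for a single window

# Decoding phonemes instead of whole words keeps the output layer small:
# 39 units here versus 1,024+ units for a word-level classifier.
phoneme_head = nn.Sequential(
    nn.Linear(N_FEATURES, 128),   # assumed hidden size
    nn.ReLU(),
    nn.Linear(128, N_PHONEMES),
)

features = torch.randn(1, N_FEATURES)             # one window of neural features
probs = torch.softmax(phoneme_head(features), -1)
print(probs.shape)  # torch.Size([1, 39])
```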
To create the voice, the team devised an algorithm to synthesize speech, which they customized to sound like Ann's voice before the injury, using a recording of Ann speaking at her wedding. In addition, the team animated the avatar with the help of software that simulates and animates facial muscle movements, developed by Speech Graphics, a company that makes AI-based facial animations.
The researchers created customized machine-learning processes that allowed the company's software to mesh with the signals sent by the woman's brain as she attempted to speak, converting them into the movements of the avatar's face: the jaw opening and closing, the lips protruding and puckering, and the tongue rising and falling, as well as facial expressions of happiness, sadness and surprise.
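As a hedged illustration of that last step, the sketch below maps decoded articulatory and expression values onto blendshape-style weights that could drive an avatar rig. The blendshape names and the mapping are hypothetical, not Speech Graphics' actual interface.

```python
import numpy as np

# Hypothetical blendshape channels for the avatar rig; the names are
# illustrative, not the real Speech Graphics interface.
BLENDSHAPES = ["jaw_open", "lip_pucker", "tongue_up", "smile", "frown", "brow_raise"]

def signals_to_blendshapes(decoded):
    """Clamp decoded articulatory/expression values to valid blendshape weights in [0, 1]."""
    return {name: float(np.clip(decoded.get(name, 0.0), 0.0, 1.0)) for name in BLENDSHAPES}

# Example: one decoded frame of attempted speech plus an expression cue.
frame = {"jaw_open": 0.7, "lip_pucker": 0.2, "tongue_up": 0.4, "smile": 0.9}
print(signals_to_blendshapes(frame))
```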
“We are compensating for the connections between the brain and vocal tract disrupted by the stroke. When the subject first used this system to speak and move the avatar's face at the same time, I knew this was going to be something that would have a real impact”, explains Kaylo Littlejohn, a graduate student working with Chang and Dr. Gopala Anumanchipalli, professor of electrical and computer engineering at UC Berkeley.
Looking ahead, the researchers want to create a wireless version that does not require the user to be physically connected to the BCI. “Giving people like Ann the ability to freely control their own computers and phones with this technology would have profound effects on their independence and social interactions”, concludes David Moses, assistant professor of neurological surgery at UCSF.
Source: iSanidad.