Sapporo radio personality Kazuya Yonezawa takes to the airwaves once a month, speaking to his listeners in a calm and low voice. But the 60-year-old’s words aren’t coming from his vocal cords.
They are being reconstructed by speech synthesis software called Voistar.
Yonezawa has amyotrophic lateral sclerosis, the debilitating disease also known as Lou Gehrig’s disease. ALS weakens the muscles and gradually deprives the body of all motor skills, including speech.
But thanks to technology that can reconstruct voices taken away by disease or surgery, patients like Yonezawa can resume communicating in computer-generated voices that sound strikingly similar to their own.
That allows Yonezawa to do his radio show each month, tapping away at his keyboard, mainly to discuss his battle with ALS and current events.
Before he was diagnosed, Yonezawa was known for having a talkative personality.
“I’ve come to realize how precious verbal communication is,” he said. Rather than stick to a script, he improvises during the show and tries to interview guests as well.
Voistar users read out 40 to 1,000 text samples and record the audio data before their conditions worsen or before they undergo laryngectomies.
Based on that data, the software analyzes their voice tone and speaking patterns so their keyboard strokes can be rendered as natural speech when they start using it.
Satoshi Watanabe of Human Techno System Tokyo Co., which sells the product, recalled the moment in 2007 when he was contacted by Izumi Maki, a professor at Osaka University of Arts who was dealing with hypopharyngeal cancer. Maki wanted to continue his lectures in his own voice after surgery to remove his larynx.
When the software was completed and Maki tried it, his voice came out with a Kansai dialect similar to his own. This was well received by his students, and the professor even used the software to give a speech at his son’s wedding.
“We were a chatty couple, and we were even able to have arguments, thanks to the software,” his wife Keiko, 69, said.
Voistar costs anywhere from ¥360,000 to ¥950,000, depending on data capacity, and is spreading by word of mouth.
“We want to keep improving the software to make it more financially accessible, and reconstruct people’s speech as naturally as possible,” Watanabe said.
Similar software is available free on the Internet.
MyVoice reconstructs speech by combining 124 pre-recorded sounds. It was developed by occupational therapist Musashi Honma, 56, of Tokyo Metropolitan Neurological Hospital, and systems engineer Takaki Yoshimura, 53, of Sasebo, Nagasaki Prefecture. Both have speech impediments.
“For some,” Honma said, “recording their voices empowers them to keep fighting.”