In sci-fi movies, it’s common to see a character having spontaneous conversations with robots and humanoids, like C-3PO of “Star Wars” fame.
In reality, technology hasn’t reached that level yet. But big-name Internet firms like Apple, Google, Amazon and Microsoft are inching closer, brushing up voice assistant functions and installing them in their products.
Seeing huge potential for the future, Japanese companies have also been developing the technology. NTT Docomo Inc. and Yahoo Japan Corp., the top two companies competing in this field, are providing voice-controlled functions for smartphones that can conduct online searches, for instance, and are incorporating the technology into other products like toys.
“At the beginning, it was able to have maybe just one conversation, but it can keep natural conversations now,” said Docomo President Kaoru Kato at a news conference last month.
Kato was referring to Ohanas, a toy communication robot developed by Docomo and toymaker Tomy Co., which the companies claim is capable of having natural conversations with humans.
Scheduled to hit the market in October, it is the first project that Docomo, the country’s largest carrier, has unveiled as part of a plan to collaborate with other firms using its voice assistant technology.
“Docomo has been developing this natural-language dialogue platform and providing it as Shabette Concier (for smartphones),” said Kato. “We thought this (technology) should be used for not only smartphones but also for other things.”
Shabette Concier, which means “talk to me concierge,” is a voice assistant service introduced in 2012 that enables users to speak to a sheep character in the smartphone to make requests for online searches, ask for weather forecasts or just to chat, much like Apple’s Siri, which comes with iPhones.
Docomo’s Shabette Concier has seen about 30 million installs, and the carrier said it has gained a significant amount of voice data from users, which is key to improving the voice-related technology, including correctly recognizing what users are saying and deciding how to respond.
The voice assistant technology has three key elements: hearing, thinking and speaking.
While Docomo uses a different firm’s speaking technology and has partnered with other firms to codevelop a hearing engine, the carrier itself develops the thinking component, which acts as the brain of the voice agent.
The natural-language dialogue platform is a cloud-based service, so it can interact with devices that connect to the Internet like the Ohanas robot.
Docomo said that while the platform’s voice assistant comes with preprogrammed comments based on what users say, its thinking engine can also work out how to respond by itself and give responses that it comes up with on the spot.
“This is the main part of the natural dialogue platform. We managed to develop technology to make the machines answer based on what they hear from people,” said Takeshi Yoshimura, manager of Docomo’s big data group and service innovation department.
Yet this means the devices might say something unpredictable, so companies interested in the voice technology tend to prefer that responses be controllable, Docomo said.
But Yoshimura said Docomo’s voice technology understands what users are asking even if they vary their phrases.
For instance, if they say they are going to Kyoto, they might use Japanese phrases like “Kyoto ni ikuyo” or “Kyoto ni ikimasu.” The voice assistant will understand what they mean either way.
“Users don’t have to worry about how to speak to the device, so they can speak more casually to it,” Yoshimura said.
While voice assistant technology fits well with Docomo’s carrier business, Yahoo Japan has been developing its own technology, taking advantage of its position as an Internet giant.
Yahoo Japan started working on voice-based Internet search technology in 2009, around the time its rival Google announced a voice-controlled search function, according to Jumpei Miyake of Yahoo’s data and science solutions group.
“We thought this would be critical in the future, so then-President (Masahiro) Inoue urged that we should start developing (the technology),” Miyake said.
“We are developing voice assistant technology because a huge amount of text and voice data is essential to improve the technology, and Yahoo’s Internet search engine has that kind of data,” he said.
The firm first introduced voice-based Internet search in 2011 for iOS and now also provides a voice assistant app for Android smartphones, which works with a variety of Yahoo’s services.
“The strength of our voice assistant is that it can provide information from Yahoo’s various services, like weather forecasts, shopping and news,” said Miharu Fujii, manager of the search development division at Yahoo.
For example, if users ask the app to direct them to Shibuya from Shinjuku, it will display a route on Yahoo’s map service along with what transportation to take, based on the firm’s transfer guidance, Fujii said.
Fujii said the company plans to improve the app so that it can make more elaborate suggestions.
“Let’s say users want to go somewhere tomorrow and the voice assistant may suggest a museum or a sports event. If they choose a museum, it will then tell them what’s exhibited at museums near their home,” Fujii said.
Miyake said it’s true the voice control function isn’t popular in Japan yet. But he’s optimistic about the future, saying children who can’t really type text will grow up using voice-controlled functions.
According to a 2013 report by the U.S.-based market research firm BCC Research, the global market for voice recognition technology was $47 billion in 2011 but is projected to top $113 billion in 2017.
Also, there will be more wearable devices in the coming years that people will control with their voice, Miyake said.
This section, appearing on the second Monday of each month, features new technologies that are still under research and development but are expected to hit the market in coming years.