Despite technological advances, automatic speech recognition has lagged behind, but now an Israeli company offers a better solution for phones, GPS and other devices.
More devices than ever use automatic speech recognition (ASR), allowing users to “tell” their device what to do, such as dialing a phone number by speaking a name rather than tapping a button.
However, ASR technology often doesn’t work well, says Zvi Hava, the CEO of Petah Tikva-based NovoSpeech. “Current ASR solutions are unable to attain 100 percent real-time accuracy of all words spoken by a person, because of background noise, accents and vocabulary,” says Hava.
“In general, these systems succeed only when they are in low noise-controlled environments or when they are either ‘trained’ to recognize a voice, or when the task involved requires some basic and simple commands with a limited vocabulary.”
NovoSpeech has developed a technology that can overcome these problems, Hava says, “enabling devices to clearly recognize what is being said and responding, even if there is a lot of background noise.”
The system can adapt to individual speaking styles, accents and dialects without the training samples that nearly all other ASR systems require. The technology is also language-independent, again without training.
Despite advances in processing power, software algorithms and microphones, speech-to-text technologies have achieved only marginal penetration of a large and growing market, from mobile phones to home appliances to assistive devices, says Hava.
“Some analysts estimate that speech recognition has achieved a mere 15-20 percent of its market potential, but we believe that NovoSpeech can make a major impact in this field.”
The system’s secret lies in its algorithms, which are based on a unique encoding engine that analyzes natural-language sentences and offers extremely high accuracy under real-life conditions.
For example, drivers using a GPS device with voice input might need to change their route or destination on the fly – and if the device “hears” the information wrong, the resulting directions may not be accurate.
“One of the problems with speech recognition is that you often need to use it under noisy conditions. With NovoSpeech technology installed, the background noise that would usually interfere with a driver’s giving instructions is ignored by the device, making it less likely for drivers to get lost.”
The same goes for cell phone voice dialing; with NovoSpeech, the odds of the phone failing to dial a number – or worse, dialing the wrong number – drop dramatically, even for calls made outdoors.
Controlling appliances, remediating speech
NovoSpeech has already had a major success in audio remote-control devices. In January, it signed an agreement with a remote-control manufacturer to integrate NovoSpeech’s speech-recognition engine into the company’s infrared device for controlling home appliances via iPhone.
“Audio remote controls need to be able to operate in noisy environments, like living rooms where TVs and stereos are playing,” says Hava, adding that the manufacturer was quite impressed by the accuracy rate of more than 85 percent that NovoSpeech’s ASR engine achieved when integrated with the company’s iPhone app.
NovoSpeech is also in discussions with large manufacturers of Bluetooth device players, for use in embedded speech-recognition engines suitable for voice commands in noisy car environments.
Perhaps the most promising application of NovoSpeech technology is in the area of speech therapy. Previous attempts to use ASR systems to provide feedback to clients were hampered by the systems’ inability to understand non-fluent speakers.
“A successful speech therapy requires precise understanding of what is being said, whether the client is working with a human or a machine,” says Hava. “In either case, there are often many things – nuances, inflections and pronunciation – that get missed because of the limitations of the human trainer, who has a hard time understanding what is being said. Our system ensures crystal-clear reception, leading to much more accurate, objective and effective feedback for clients.”
Of particular note, NovoSpeech technology could, for the first time, make it feasible to produce a home-based speech therapy system.
NovoSpeech is a member of the Trendlines Group’s Mofet B’Yehuda Innovation Accelerator, which is its major funding source. The company was established in 2008 by Hava and others. Dr. Yossef Ben Ezra recently joined as the company’s CTO.