Will we converse with phones and computers in the future?
Victor Melfi starts the conversation with his phone the same way he might begin small talk with a stranger: by asking about the weather.
“What’s the weather in Boston?” he asks the smartphone, which responds that there’s a light freezing rain this winter afternoon.
“What about Schenectady?”
“Looks like clear skies today.”
OK, nothing special here — Siri, the virtual assistant on iPhones, could follow that conversation. But now Melfi, the chief strategy officer for VoiceBox Technologies, an early provider of voice-command systems in cars, tries to push the app’s boundary.
“Get me a stock price for Google,” he instructs, and the phone vocalizes the day’s quote. Melfi follows up with a request for the phone to play “something by the Beatles,” and again the phone complies.
Then Melfi switches to a higher gear.
“What was the volume?”
“The volume for Google is 1,439,000.”
That last exchange separates this VoiceBox system, which is not yet on the market, from Siri. The Bellevue company’s software was able to track the conversation across contexts: stock quotes and music. Volume applies to both music and stocks, but because Melfi used “was” in his request, the program knew he was referring to stocks, the earlier context, and not to the volume of the music that had just started playing.
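How VoiceBox's software actually does this is not described here. As a rough illustration only, the Python sketch below shows one way a program could keep a history of conversation contexts and use verb tense to decide which one "volume" refers to. The names `Context`, `Dialogue`, `remember`, and `resolve_volume` are hypothetical, and the tense rule is a deliberately simplified assumption, not the company's method.

```python
from dataclasses import dataclass, field


@dataclass
class Context:
    domain: str  # e.g. "stocks" or "music"
    entity: str  # e.g. "Google" or "the Beatles"


@dataclass
class Dialogue:
    # Contexts in the order they came up, oldest first.
    history: list = field(default_factory=list)

    def remember(self, domain, entity):
        """Record that the conversation just touched this domain."""
        self.history.append(Context(domain, entity))

    def resolve_volume(self, utterance):
        """Guess which earlier context a question about 'volume' refers to.

        Assumption (for illustration): past tense ("was") points to a
        recorded fact such as trading volume; anything else points to
        the playback volume of the music.
        """
        words = utterance.lower().rstrip("?.!").split()
        wanted = "stocks" if "was" in words else "music"
        # Walk backward so the most recent matching context wins.
        for ctx in reversed(self.history):
            if ctx.domain == wanted:
                return ctx
        return None


# Replaying the demo: weather, then a stock quote, then music.
dialogue = Dialogue()
dialogue.remember("weather", "Boston")
dialogue.remember("stocks", "Google")
dialogue.remember("music", "the Beatles")

print(dialogue.resolve_volume("What was the volume?"))
print(dialogue.resolve_volume("What is the volume?"))
```

In this toy version, the past-tense question resolves to the Google stock context even though music is the most recent topic, while the present-tense question resolves to the music. A real system would need far more than a single keyword to make that call.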
As natural language understanding, or NLU, gains traction, consumers might find themselves in conversations like Melfi’s more often. People have been able to tap, swipe, squeeze, and tilt devices for years, but only recently could they talk to electronics. As the young technology becomes mainstream, companies like VoiceBox and Redmond startup Nuiku are betting that NLU will become a critical component of connected computing, as long as the awkwardness of talking to a phone can be overcome.
Many people were introduced to NLU in 2011 when Apple rolled out Siri, the first widespread program that would respond to questions by mining data on the Internet and the user’s phone to find answers. Siri was followed by Microsoft’s Cortana, which could be one of the company’s defining products going forward. Cortana will be present on all devices running Windows 10 when the operating system is released later this year, and Microsoft announced this week that Cortana will be available on iOS and Android phones.
Throw in products such as Amazon’s Echo, an always-listening virtual assistant that doubles as a Bluetooth speaker, and NLU is entering the public consciousness. But the technology is young enough that novelty is part of its allure — most digital devices still can’t understand or produce speech.
Software that processes natural language does have a tangible advantage over other input methods: simplicity. This concept is the backbone of Nuiku’s products, which are designed for working professionals, particularly those in sales. The company’s app, which more than 1,000 organizations have begun using since its spring 2014 release, allows salespeople to dictate notes, which are then uploaded to Salesforce’s customer relationship management platform. Nuiku is also expanding into original equipment manufacturing. CEO Sean Thompson is in negotiations with companies such as Microsoft and SAP to handle NLU capabilities for their own business apps.
“The whole thing we started with is this notion that the business-application user experience sucks,” Thompson says. “It’s all about efficiency. … In Salesforce, we can take an utterance and handle all the menu-based database functions.”
Nuiku’s products can be customized for industry jargon — something Siri and Cortana can’t handle — and are able to make suggestions based on analysis of stockpiled data. “That’s exercising this notion of assistant becoming coach,” Thompson says.
While Nuiku is focusing on NLU for business professionals, VoiceBox is betting on voice being a popular input method for a more widespread audience as cars, phones, and homes become connected in the cloud. The company made a name for itself in 2005 when it built a prototype similar to Amazon’s Echo, an act that caught Toyota’s attention. In 2008, the automaker rolled out a voice-control interface designed by VoiceBox. But VoiceBox is thinking beyond the car, which is why Melfi’s demo took place on a phone. (VoiceBox was to announce a “flagship partner in the mobile industry” in March, according to Melfi, but that announcement has been put on hold.)
Melfi says the first breakthrough in NLU was when users could control programs with natural speech instead of hierarchical commands (saying, “Play ‘The Thunder Rolls,’” instead of, “Music; Garth Brooks; ‘The Thunder Rolls’”). “The next value proposition for NLU is, how do you get that to work in an environment of connected devices? Luckily, the vision on which we developed the company was the Internet of Things.”
VoiceBox already has a foothold in the automotive realm; Toyota, Renault, and Fiat Chrysler are a few automotive companies that use VoiceBox. Melfi says the company will now focus on NLU for in-home consumer goods, and then will graduate to mobile interfaces.
The future of NLU is tightly correlated with the number of applications connected via the cloud. As networks grow more complex, speech commands could be the easiest way to access, store, and communicate data. “Natural language just means it’s faster and easier to do entry,” says Bern Elliot, an analyst with the research firm Gartner. Ease of use might help NLU proliferate, but the paradigm of speech as digital input is still very new. After Microsoft announced Cortana would be on Windows 10’s desktop version, people snickered on social media about cacophonous offices in which everyone is speaking to their computers.
Elliot expects that won’t be the case, but he does expect to see NLU on virtually every device in the coming years, with interactions that feel increasingly human.
“Because speech is such a naturally emotional medium, when you start talking to the machine, there’s a tendency, for instance, to raise your voice when you get frustrated. A good algorithm will detect that, and may say, ‘I’m sorry.’”
Editor’s note: This story was revised to better reflect the status of VoiceBox’s agreement with a mobile partner. A version of this story was published in the June 2015 issue of 425 Business.