Here at The Dock, we’re fascinated by new technologies that impact the market.
One of the emerging technologies the Marketing Innovation Team has been researching lately is voice-enabled user interfaces: hardware devices such as smartphones and smart speakers that integrate virtual assistants you can speak to.
I know some of you out there might be saying to yourselves that this isn’t new technology, and you’re right; IBM’s Shoebox, for example, was developed in 1961 and was capable of recognising 16 spoken words.
Now, though, the technology has advanced considerably, and companies have reached the stage where they can bring it to market, making it available to consumers at scale and at relatively affordable prices.
As companies promote their offerings to gain market share, the hype around the technology has naturally increased. Along with that heightened expectation comes the question: has voice technology advanced enough to be useful?
Dadong Wan, who leads Accenture Labs here at The Dock, focusing on delivering R&D projects in AI, recently weighed in on the subject of voice recognition with the Sunday Business Post, saying:
“There’s a huge amount of opportunity; we are really just scratching the surface in terms of interacting with systems. When we talk to a Siri or a Google Home, the assistant only understands so much. There are fundamental problems to tackle in order to have meaningful interactions. Looking forward, what really excites me is the opportunity to have intelligent machines that can explain the why.”
‘The why’ in this instance denotes the gulf between where we are now and where we need to get to in order to realise the promise of this technology.
At present, virtual assistants are useful insofar as they can answer simple questions or respond to basic commands.
Gaps start to appear when we move from information retrieval to situations where the end point is not pre-determined. For example, if a user were to ask Alexa for the weather forecast on a particular day, she would be able to respond with relative ease. If the same user were to ask Alexa which shoes they should buy, Alexa would struggle.
This is because, although voice-enabled user interfaces can understand words and phrases, they can’t understand context or help us work through a decision-making process.
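That gap can be made concrete with a minimal sketch of keyword-based intent matching, the rough pattern behind handling simple, bounded queries. This is purely illustrative (no real assistant works exactly this way, and the intent names and keywords are assumptions): a factual query overlaps a known intent, while an open-ended, contextual question matches nothing.

```python
# Illustrative intent table: each intent maps to a set of trigger keywords.
# These names and keywords are hypothetical, chosen for the example.
INTENTS = {
    "get_weather": {"weather", "forecast", "temperature"},
    "set_timer": {"timer", "alarm", "remind"},
}

def match_intent(utterance: str) -> str:
    """Return the intent whose keywords best overlap the utterance,
    or 'unknown' when no keyword matches at all."""
    words = set(utterance.lower().split())
    best_intent, best_score = "unknown", 0
    for intent, keywords in INTENTS.items():
        score = len(words & keywords)  # count of shared keywords
        if score > best_score:
            best_intent, best_score = intent, score
    return best_intent

# A bounded, factual query maps cleanly to an intent...
print(match_intent("What is the weather forecast for Friday"))  # get_weather
# ...but an open-ended, contextual question falls through.
print(match_intent("Which shoes should I buy"))  # unknown
```

The shoe question fails not because the words are unrecognised, but because there is no pre-determined intent to resolve it to; answering it would require context and a back-and-forth decision process the keyword lookup simply does not model.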
For the first time in the history of user interfaces, the expectation is on machines to learn how we communicate. This is really significant.
Previously, if we wanted to engage with technology in any meaningful or useful way, we needed to learn a new set of behaviours. Typing on a keyboard, scrolling on a smartphone, or downloading an app - none of those things comes naturally to us; we learned how to do them because we had to. That’s not the case with voice-first technology.
We don’t have to learn how to speak; it’s the most natural way we communicate. Being able to communicate in a hands-free, eyes-free way is much more intuitive for us.
As the technology continues to improve, becomes more useful, and moves beyond a system for information retrieval, it’s likely that we’ll naturally gravitate towards it.
Ultimately, it’s too early to say how dominant this technology will be; it’s much more likely that voice enabled user interfaces will be an addition to how we already engage with technology and not a complete replacement.
The convenience factor is too strong to ignore, however; having an intelligent virtual assistant that can cater to your needs is very compelling. Watch this space.