A voice user interface (VUI) lets you, the user, control a device and carry out a task with your voice.
The concept isn’t new. Phone operators have been infuriating their customers with it for decades. But with the advent of smart virtual assistants powered by artificial intelligence, VUI is fast becoming the next big tech disruption. The prediction that 50% of all internet searches will be voice searches by 2020 is just one indication of its potential impact.
So, as the technologists take to the R&D battlefield to get more of their latest offerings into our homes, what do you need to know about VUI?
Following our introductory post on EdTech trends, this article takes a closer look at ‘voice’. We’ll track its emergence and explain its successes so far.
Where are we now?
In the midst of a robot uprising! OK, not quite, but ever since Apple introduced Siri in 2011, VUI technology has taken off. Over the last seven years, industry leaders have launched competing systems, planting their flags in this uncharted territory.
Alibaba, China’s $500bn e-commerce group, was the latest to join the race with its AliGenie – and the competition looks set to intensify.

Fortunately, most of us moved on from flirting with Siri some time ago. After all, the shelf life of silly questions isn’t that long. While the novelty of the early versions of VUI may have worn off, new features and functions continue to drive high rates of adoption.
Another important factor is the increasing range of affordable VUI-enabled hardware. According to a report by voicebot.ai, almost 20% of adults in the US have access to smart speakers, such as Amazon Echo, Google Home and Apple HomePod.
These devices give users complete hands-free control. And multi-room support – now an industry standard – means that you can walk around your house, consulting your calendar and finding new recipes as you go!
As you can see for yourself in the graph below, there’s nothing here that we can’t already do on our mobile devices. The key to the smart speaker’s success is convenience. Voice allows for a screen-free, invisible interface. Simplicity sells.

How did we get here?
Often the line between success and failure is governed by timing. Improvements in internet speeds, processing power and microphone quality have all converged at the right time.
Developers have drawn on advances in the interconnected areas of automatic speech recognition (ASR) and text-to-speech (TTS).
Work in the field of natural language understanding (NLU), a branch of AI, has also led to significant progress in the effectiveness and accuracy of human-computer interaction. NLU overcomes the syntactic limitations of computer language by using algorithms to reduce human speech to a structured ontology.
What does this mean exactly? Well, instead of processing language at word-level, AI enables computer software to pick out features of speech such as intent, timing, locations and sentiments.
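To make that idea concrete, here is a deliberately tiny sketch in Python. It is not how Siri, Alexa or any real NLU engine works: production systems rely on trained statistical models, whereas this toy uses hand-written keyword lists and a regular expression. The names `parse`, `Frame` and the keyword tables are all made up for illustration. What it does show is the shape of the output: a sentence goes in, and a structured record of intent, timing, location and sentiment comes out.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical keyword tables, invented for this illustration only.
INTENT_KEYWORDS = {
    "get_weather": ("weather", "forecast", "rain"),
    "find_recipe": ("recipe", "cook"),
    "check_calendar": ("calendar", "schedule", "meeting"),
}
TIME_WORDS = ("today", "tonight", "tomorrow")
POSITIVE_WORDS = {"great", "love", "thanks"}
NEGATIVE_WORDS = {"hate", "awful", "annoying"}


@dataclass
class Frame:
    """Structured result: the features picked out of one utterance."""
    intent: Optional[str]
    time: Optional[str]
    location: Optional[str]
    sentiment: str


def parse(utterance: str) -> Frame:
    """Turn a sentence into a Frame using simple keyword rules."""
    words = re.findall(r"[a-z']+", utterance.lower())

    # Intent: the first intent whose keywords appear in the utterance.
    intent = next(
        (name for name, keys in INTENT_KEYWORDS.items()
         if any(key in words for key in keys)),
        None,
    )

    # Timing: the first known time word, if any.
    time = next((word for word in words if word in TIME_WORDS), None)

    # Location: naive rule, the capitalised word(s) that follow "in".
    match = re.search(r"\bin ([A-Z][a-z]+(?: [A-Z][a-z]+)*)", utterance)
    location = match.group(1) if match else None

    # Sentiment: count of positive words minus count of negative words.
    score = (sum(w in POSITIVE_WORDS for w in words)
             - sum(w in NEGATIVE_WORDS for w in words))
    sentiment = "positive" if score > 0 else "negative" if score < 0 else "neutral"

    return Frame(intent, time, location, sentiment)


if __name__ == "__main__":
    print(parse("What's the weather in Paris tomorrow?"))
```

Ask it about the weather in Paris tomorrow and it returns a frame with the intent `get_weather`, the time `tomorrow` and the location `Paris`. The word order and exact phrasing barely matter, which is the point: the software is working with meaning-bearing features rather than a fixed command syntax.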
What’s still to come?
Voice technology is changing the face of design. A new modality calls for a new, multidisciplinary approach and a deep understanding of how natural human language works.
One important step in the advancement of voice design was the release of Amazon’s Alexa Skills Kit (ASK). This digital toolbox gives designers access to APIs, documentation and code samples to promote the development of new functions or ‘skills’. In December 2017, the number of skills on offer surpassed the 15,000 mark.
That’s quantity, but how about quality? Well, a quick search of Alexa’s language learning skills doesn’t come back with much to get excited about. Most are tools for translation or ‘word of the day’ apps.
However, don’t be too quick to write it off as a passing fad. As developers refine their approach and harness the steady advances in AI, fully conversational human-computer interactions aren’t far off. Voice is here to stay and is likely to provide ever greater opportunities for language learning, both in and outside the classroom.