It started with K.I.T.T.
When I was four, going on five, a TV show called Knight Rider premiered in the UK. I don’t remember much about the first episode, but I do remember one thing very clearly: my dad’s uncontrolled excitement. It built throughout the day (‘It’s going to be amazing! Amazing!’), and he made absolutely sure I was sat next to him when the credits rolled and that iconic theme song began.
All that fuss, all that excitement, for a talking car.
Dad was right, though; it was pretty amazing. I loved it and remained a fan for most of my childhood (OK, I admit it; I’m still a fan). There was The Hoff, of course – all leather jackets, open shirt buttons and swagger – but the real star of the show was K.I.T.T – Knight Industries Two Thousand – the ‘advanced, artificially intelligent, self-aware and nearly indestructible car’.
Whatever it was that we loved about K.I.T.T, it must have stuck. Over thirty years later, two of the most successful companies in the world (Apple and Google) are in a head-to-head race to bring K.I.T.T’s spiritual successor – the driverless car – to market. It’s taken three decades and millions of dollars, but we can now talk confidently about driverless cars in terms of when rather than if. And, as a little-known and hard-to-spot side effect, the ramifications for the teaching of languages, especially English, could be huge.
The Cambridge connection
In 2013, ELTjam cofounder Tim Gifford had a few meetings with a small, Cambridge-based company called VocalIQ. VocalIQ was founded by a group of researchers who were working on what Tim described at the time as ‘some awesome speech recognition stuff’. We’d founded ELTjam with the hope of helping to bring exciting new technology into mainstream ELT, and I remember hubristically thinking that VocalIQ would be a great place to start. If we’d had a pile of funding capital sitting around, we might even have considered investing (they didn’t receive their £1.28m seed round until June 2014, close to a year later). Looking back, that investment would have been a smart play: just over a month ago, VocalIQ were acquired by Apple for an undisclosed amount. ‘We saw them first,’ I mumbled to myself.
VocalIQ’s website no longer exists, but the company is described on Crunchbase like this:
VocalIQ was formed in March 2011 to exploit technology developed by the Spoken Dialogue Systems Group at University of Cambridge, UK. Still based in Cambridge, the company builds a platform for voice interfaces, making it easy for everybody to voice enable their devices and apps. Example application areas include smartphones, robots, cars, call-centres, and games.
It’s unclear when the Cambridge University Dialogue Systems Group’s website was last updated, but, according to the information it currently displays, the group aims to
design systems that can be trained on real dialogue data and which explicitly model the uncertainty present in human-machine interaction.
Currently the dialogue systems group is working on the EU funded Probabilistic Adaptive Learning And Natural Conversational Engine (PARLANCE) project. The goal of this project is to design and build mobile applications that approach human performance in conversational interaction, specifically in terms of the interactional skills needed to do so.
Apple is known for many things, but among the most significant are its breakthroughs in user interface (UI) design, including the Apple Mac’s graphical user interface (GUI) and the iPhone touchscreen. With the release of its virtual assistant Siri in 2011, Apple revealed its vision for the future of UI design: the voice.
The acquisition of VocalIQ’s technology will certainly help Apple further optimise the voice UI of its devices, but the real motivation behind the purchase may have had more to do with cars than virtual assistants. In October, MacRumors reported that VocalIQ had a particular interest in using voice-based technology to make driving safer. In a now-unpublished blog post, VocalIQ are quoted as describing how
a “conversational voice-dialog system” in a car’s navigation system could prevent drivers from becoming distracted by looking at screens.
The specifics around Apple’s move into the electric and/or driverless car market are sketchy, but rumours of a 2019 release date circulate, backed up by evidence such as the company’s hiring of motor industry veterans over the summer. It’s also unclear whether Apple will follow Google’s pursuit of a 100% driverless car or attempt some kind of voice-mediated electric car in the interim. In either case, if Apple’s track record in creating paradigm-shifting devices is anything to go by, the car’s release is likely to change driving forever. And it may, as a by-product, change a few other things too.
Beyond speech recognition
It goes almost without saying that any mobile application that (in the words of Cambridge’s Dialogue Systems Group) approached human performance in conversational interaction would have an impact on the teaching of languages. But what might that impact look like? The answer lies in the difference between speech recognition and natural language processing.
It can be handy to think of speech recognition as the equivalent of dictation: you speak, the system recognises what you’ve said and then reproduces it. For example, I can dictate an email into my iPhone, and it will reproduce it in text form with considerable accuracy. I can also tell my phone to ‘Call Laurie’, which it will do by recognising my simple voice command. You’ll have spotted voice recognition features in some of your favourite digital language-learning products, including Duolingo. You may also have spotted that, in many of those products, they rarely work with any level of accuracy. This has led to a sense (in ELT at least) that voice recognition is kind of a gimmick – a marketing ploy at best – and not a serious tool in language learning. In my 11 years in and on the peripheries of the ELT publishing industry, I’ve never had a serious conversation with a publisher about the inclusion of voice recognition as a product feature (others have, of course).
Natural language processing (NLP) is different. With NLP, the system doesn’t simply attempt to reproduce what you say; it attempts to derive meaning from it. It allows you to have something that starts to sound like an actual conversation with a machine. This two-minute clip from Hound (a personal voice assistant product designed to go up against Apple’s Siri) shows how far the technology has come. What hits you first is the speed of the app’s response: it’s almost instant. But there’s a naturalness to the interaction that seems both mesmerising and slightly chilling, especially when it asks questions back in order to get all of the information it needs:
Human: What’s the monthly payment on a million-dollar home?
App: What is the down payment?
Human: Hundred thousand.
App: OK, using a down payment of one hundred thousand dollars, what is the mortgage period?
Human: Thirty years and the interest rate is 3.9%.
App: OK, … [goes on to answer correctly]
Human: Show me Asian restaurants, excluding Chinese and Japanese.
App: Here are several Asian restaurants, excluding Chinese restaurants, Japanese restaurants or sushi bars.
It’s a small detail, but notice how the system has recognised that, by saying he doesn’t want to eat at a Japanese restaurant, the speaker also probably doesn’t want to eat at a sushi bar. That demonstrates an understanding of context which goes far beyond what basic speech recognition can handle. I watch that section of the video and, while I can’t yet quite imagine a free-flowing conversation between human and machine, I can quite easily see basic A2-level transactional exchanges happening, where the chances of deviation from an expected set of outcomes are slim – think checking into a hotel or ordering a meal. I can also easily imagine an app such as Hound taking the place of a human IELTS or Cambridge Main Suite speaking examiner. The more clearly defined the rubric around a task, the easier it would be to replace the human with a machine.
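For the curious, here is a rough idea of what ‘asking questions back’ can look like in code. This is only a minimal sketch of a technique usually called slot filling, and it is not a description of how Hound actually works: the names MortgageQuery, PROMPTS, next_question and monthly_payment are all made up for illustration, and the sum at the end is the standard fixed-rate repayment formula. The hard part, turning speech into the slot values in the first place, is left out entirely.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MortgageQuery:
    """Hypothetical 'slots' the assistant needs before it can answer."""
    home_price: Optional[float] = None
    down_payment: Optional[float] = None
    years: Optional[int] = None
    annual_rate_percent: Optional[float] = None


# Illustrative follow-up question for each slot, asked in this order.
PROMPTS = {
    "home_price": "What is the price of the home?",
    "down_payment": "What is the down payment?",
    "years": "What is the mortgage period?",
    "annual_rate_percent": "What is the interest rate?",
}


def next_question(query: MortgageQuery) -> Optional[str]:
    """Return the question for the first empty slot, or None if all are filled."""
    for slot, prompt in PROMPTS.items():
        if getattr(query, slot) is None:
            return prompt
    return None


def monthly_payment(query: MortgageQuery) -> float:
    """Standard fixed-rate repayment formula, applied once every slot is filled."""
    principal = query.home_price - query.down_payment
    monthly_rate = query.annual_rate_percent / 100 / 12
    payments = query.years * 12
    if monthly_rate == 0:
        return principal / payments
    return principal * monthly_rate / (1 - (1 + monthly_rate) ** -payments)


# Replaying the exchange above, with the slot values typed in by hand.
query = MortgageQuery(home_price=1_000_000)
print(next_question(query))   # asks for the down payment
query.down_payment = 100_000
print(next_question(query))   # asks for the mortgage period
query.years = 30
query.annual_rate_percent = 3.9   # the speaker volunteered this unprompted
print(next_question(query))   # None: nothing left to ask
print(round(monthly_payment(query), 2))
```

Notice that the speaker in the clip answers one question and volunteers the answer to the next in the same breath. In this sketch that simply means two slots get filled at once, so the follow-up question is never asked. The more tightly a task can be described as a fixed set of slots like this, the easier it is to hand over to a machine.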
The humans vs. the machines
It may seem overdramatic to frame the conversation about technology in education in the stark terms of human vs. machine, but that’s exactly what it is. And there was no better example of that than the reaction to Sugata Mitra’s infamous IATEFL plenary in 2014. Mitra is often depicted as the poster child for the neoliberal takeover of education, and this in part comes from the accusation that he supports the automation of human teacher labour through the use of computers. When I visited Newcastle to interview him, he spoke bluntly in those terms:
“I don’t see why replacement of human beings with machines should be considered as negative.”
[Of hand-pulled rickshaw drivers] “Would it be unkind to replace them with a machine?”
[Of the profession of a postman] “It’s still there, but what for? It’s a job that can be done by a drone. It’s a job that needn’t be done at all.”
“Why would education be considered that one special subject where challenging how things are done because the times are changing doesn’t apply? The teacher can never be replaced by a machine. The school can never disappear.”
Imagine labour as a continuum, with 100% human at one end and 100% machine at the other. In some industries, such as car manufacturing, we’re already seeing the balance tip towards the machine. And it’s easy to see why: jobs which require the same task to be done with the same level of perfection thousands of times in a row with no deviation are ripe for automation (cue cries that the robots are stealing our jobs!). But that’s certainly not what teaching is like: it’s a complex job where an infinite number of possible outcomes exist every time you walk into the classroom.
Yet isn’t driving also a complex task, with a huge number of potential, human-related variables? Where does driving sit on that same automation continuum? We’ve already outsourced one human element – navigation – to GPS devices. And, if you don’t want to drive at all, you can outsource it completely – to an Uber driver that you order on your phone. In an excellent New Yorker article (Two Paths Towards Our Robot Future), Mark O’Connell writes that
the technology that allows you to summon an Uber, after all, and allows the Uber driver to navigate you to your destination, is on a continuum with the technology that will eventually displace the driver entirely.
In the same article, O’Connell highlights an important distinction between Artificial Intelligence (AI) and Intelligence Augmentation (IA). Apple’s plans for an electric car directed partly by voice are an example of IA: technology is used to augment or improve human performance, in this case by allowing drivers to spend more time looking at the road and less time looking at a screen or other instruments. A completely driverless car would be an example of AI: technology that is so autonomous in its behaviour that it removes the need for a human altogether.
Much of the educational technology available to teachers these days is IA. LMSs, apps, IWBs, online workbooks – they’re all intended, in theory, to augment the teacher: to liberate him or her from mundane tasks such as setting and marking homework. But the fallout from Sugata Mitra’s IATEFL talk showed that many EFL teachers are terrified of a world where technology might replace the teacher completely. They’re terrified of AI going mainstream. If we can create a driverless car, can we also create teacherless learning? Software like VocalIQ’s is a step towards that reality, and Apple’s involvement has just sped everything up.
The death of language teaching as we know it
Imagine a world where language-learning apps contain natural language processing capabilities so powerful that they remove the need for a human teacher. Apps that can offer on-the-spot feedback on errors, and do it better than a newly qualified CELTA graduate could. Apps that, from the learner’s perspective, seem as fluent and natural in their interaction as a human teacher. Apps that have a sense of humour. Apps that learn about you as they’re teaching and tailor the lesson content to your needs and interests. Apps that give every learner access to the equivalent of an expensive one-on-one teacher.
Left to its own devices, how long do you think it would take the language-teaching industry to develop apps like that (assuming it even would)? 10 years? 20 years? Now imagine the technology needed coming into being as the by-product of another product, in this case Apple’s electric car. How long would it take then?
Next, imagine that these apps cost next to nothing or – as will probably be the case – they come bundled free with your phone’s operating system. What does that do to the language teaching market? What does that do to private language schools who charge hundreds of pounds for group courses? What does that do to print course books? What does that do to all of the sub-par language-learning apps currently on the market?
Some will claim that, in a world where apps like that existed, there would still be a need and desire for more traditional language teaching. And there will be, for the small group of people who are willing to pay for it. But language teaching isn’t about small, niche markets; it’s about global mass markets measured in hundreds of millions of learners. ELT as we know it doesn’t exist without those numbers.
What is a mass market – and a growing one, especially in emerging markets – is smartphones (specifically low-cost ones). So ask yourself this, whether you’re a teacher, a language-school owner, a publisher, a materials writer or an EdTech company: are the learning experiences that you’re creating – be they lessons, courses, course books or apps – amazing and compelling enough that people would pay for them if they didn’t have to? Would people pay for them if they could get something as ostensibly good for free on their phone? Nothing I’ve ever seen in language teaching is that good. Yet. But it could be.
I don’t own a car, but I love driving. I love everything about it. I love nipping around a big, unknown city in a three-door hatchback, dodging other drivers and getting lost. I love those long, straight American highways that cut through deserts. I love hairpin turns on tricky mountain roads. I’d lose something I loved if driverless cars became ubiquitous. As an industry, our challenge is to make people love learning as much as that. To make learning such a joy that they’d want to do it anyway. In the era of Apple’s electric car, the challenge won’t be how quickly we can get a student from A to B; the challenge will be how amazing we can make the journey.