Meredith Cicerchia is Director of E-Learning for She holds an M.Sc in Applied Linguistics and Second Language Acquisition from the University of Oxford and a B.A. in French language and literature from Georgetown University. You can find her on Twitter (@MereLanguage) or follow’s blog of SLA inspired tips for language learners.

Maybe I was just being naive, but as a language learner, teacher, curriculum developer and researcher, I always thought Language Testing (LT) and Second Language Acquisition (SLA) were closely linked fields. I figured the LT people who spent their days constructing various assessment tools were not only aware of the most recent findings in the SLA research, but used them to inform test design. Similarly, I assumed the psychometric analyses put into test design were well understood by SLA researchers seeking out reliable instruments for their studies.

But several months into a major language-testing project, I understood the reality of the situation: the worlds of LT and SLA could not be further apart. First off, I discovered I had barely scratched the surface in my understanding of language testing during an SLA degree. Secondly, when it came to the reading and listening content we were developing, the inner workings of the test tasks could not have been more foreign. I started asking around and realized I was not the only language-learning advocate surrounded by test developers and assessment gurus who seemed to be “speaking another language.”

So when it comes to LT and SLA, why the rift? To be honest, I still don’t know the answer to this question. The closest I’ve come to understanding it was at the Language Testing Research Colloquium (LTRC) last year in South Korea. A keynote delivered by British Council Aptis test designer Barry O’Sullivan caused quite a stir with the audience, yet I found myself nodding along as O’Sullivan argued that the two fields needed to make more of an effort to work together.

After all, SLA-trained item writers and language teachers often create test content. Test specifications typically reference parameters recognized by both fields. And most importantly, the same individuals who use SLA-based curricula and are taught by SLA-trained teachers sit the exams (and, depending on the stakes, make major life decisions based on the results).

Googling the issue doesn’t really add much to the discussion. According to Dr. Geoff Jordan “The relationship has certainly been theorized a number of times. But yet there remains very little contact between these two critical branches of applied linguistics research.” Cambridge University Press’s Interfaces between Second Language Acquisition and Language Testing Research says trends are being reversed thanks to overlaps in research interests and empirical approaches. Yet Shohamy’s 2000 study cites “limited interfaces” and a lack of relevance of language testing to SLA. What are we to think?

Language Tests from an SLA Perspective

I recall a former professor at Oxford saying exams were a snapshot, and if they caught you at the wrong moment or measured only one skill, you could come away believing you were less fluent than you really were. Maybe that’s why I never fully bought into the language-testing world.

At some point in their studies, most Applied Linguists encounter Selinker’s Interlanguage Continuum, a long and extended line with ups and downs. Selinker depicted language as a life-long journey characterized by many U-shaped curves. When we have understood a rule, we begin to apply it en masse. Eventually, we become aware of exceptions to the rule. Nonetheless, our execution is imperfect and unpredictable for the remainder of the upswing. Therefore, while we may not be 100% correct in demonstrating our knowledge, we have in fact moved forward on our continuum.

But what happens when a language test occurs during an upswing? And if language testers are not using SLA research to inform their choice of constructs and task design, how well do test results correlate with the ability and performance of learners? Of course, there are many occasions on which we need standardized assessment tools, and tests can come in all shapes and sizes. Yet what if there were a simpler, SLA-based approach for the rest of us?

Vocabulary-Based Learning

These days I work in big-data fueled digital language learning on methods that simulate immersion and force a departure from the traditional beginner, intermediate and advanced levels that most of the industry is wedded to. So I got to thinking how SLA researchers typically measure ability level and realized the most common tool is some form of productive/receptive vocabulary test.

Considering vocabulary has long been hailed as one of the best predictors of proficiency across reading, listening, writing and speaking, it does make some sense. Consequently, there are a few startups out there that are imagining language learning in a whole new vocabulary-driven light. They get to know an individual’s working vocabulary and let word look-up and frequency data from exposure to authentic content do the rest. One, for example, is able to achieve this thanks to a robust backend that uses SLA rules governing acquisition from context to turn a measure of working vocabulary into a guideline for sourcing “comprehensible input” from the worldwide web. Bliu Bliu does it by asking learners outright what they do and don’t know. Individuals can bootstrap their way from there, and scaffolding is provided in the form of dictionaries, flashcard-makers and smart review platforms.

New Kinds of Tests

Yet despite new vocabulary-driven approaches to language learning, there hasn’t really been a parallel wave of novel approaches to measurement, at least not to my knowledge. Earlier this year, the University of Ghent developed a research tool that spread like wildfire on social media as a fun, fast vocabulary test that tells you about your language ability in very little time. Its popularity was no surprise given that it appeals to a generation of multi-tasking millennials who will do anything to avoid a three-hour exam. But it was more of a game than a real test.

Can we take mobile exams and smartphone testing seriously? This past April, another startup popular with millennials, Duolingo, announced they were throwing their hat into the certified testing ring with a mobile LT center available to people who use their big-data fueled language learning platform. Duolingo’s lessons work via crowd-sourced translation, so it will be interesting to understand more as their approach to testing develops.


Yet circling back, I still strongly believe that the separation between LT and SLA camps deserves more attention from everyone involved. With new digital approaches to learning we need new, dynamic and complementary assessment measures and more cross-pollination of ideas between the two fields, both in practice and in the research community.

Imagine the benefits if language testers and second language acquisition researchers came together in the digital startup age! We’d not only have enhanced insight into test constructs and new takes on task-construction (of particular importance for tests delivered in the digital medium) but a new generation of learning and testing tools to help life-long language learners meet their goals.

Featured Photo Credit: Mimolalen via Compfight cc. Text added by ELTjam.


  1. Scott,

    I think I probably owe an apology because as usual I wrote rather too much, inviting the tl;dr response it probably deserved – my rather weak defence for that is that I tend to get carried away by certain hobby horses and this was one of them.

    All that being said, I’m not sure that I did in fact miss the point of the ‘thought experiment’ – whether the question was “How would you assess a speaker (or writer) of a language that has no native speakers, putative or otherwise, such as Esperanto?” or “‘Given there are no educated native-speakers of these languages, by what standards would such exams (or are such exams) scored?’” my answer is:

    The conclusion is – as it should be – that the form of assessment is matched to the purposes for which the language will eventually be made use of.

    I then tried to hint that the distinctions between the uses of, say, Sanskrit, Klingon, Swahili and English are carried over into the forms of assessment (formal or informal) used for each.

    Languages with no native speakers tend to be oriented toward accuracy because either they are mostly used for translation purposes (e.g. Sanskrit) or there is a prescribed set of rules without which the community of its speakers could not exist at all (e.g. Esperanto).

    I suggested that English was almost unique in this regard, not because English is unique per se, but that the way in which it is used – in terms of sheer coverage, scope, depth – is unique in modern history (while there have been and still are other lingua francas none can ever have been said to be so truly global in the way that English quite clearly is).

    With respect, I was therefore trying to suggest that the ‘thought experiment’ was not truly viable because it didn’t take context or use into account and the experiment disintegrates once those issues are included or only returns a superficial answer if they are excluded.

    What I have been arguing all along is that there is not only not an issue with basing a syllabus on the repertoire of an (imaginary?) idealized native speaker but that it is also inappropriate to describe it as an ‘arbitrary’ rather than a principled decision to do so, especially with regard to more sophisticated post-intermediate uses of an L2.

    I agree completely that ‘What did you do ___ the weekend?’ is potentially problematic; however, what is important here in terms of assessment is the context and use to which such a sentence is being put. I know four sets of exams well (Cambridge main suite, Cambridge young learners, IELTS and the Pearson PTE-Academic), and in each one of these the use of a ‘wrong’ preposition in that slot would be mitigated by other considerations such as context and task appropriacy, overall fluency, consistency of style etc. (in fact, as you probably know, that particular example would be marked as correct with the options at / in / on in each of those exams to the best of my knowledge).

    English, thankfully in my opinion, does not have a single official body to make declarations of what counts as absolutely correct in such cases (such as Spain’s Real Academia Española has). But even if we were to say that IELTS, as the world’s most popular English language examination (so we are apparently told), stood as a proxy ‘Academy’ of English, there is no point at which its descriptors would lead to a pass or fail over such minutiae in usage.

    To try and avoid another case of tl;dr (if it’s not already well past that point!), I would like to ask more about this:

    “For the moment it’s the native speaker who still calls the shots. But which native speaker, and with what authority, and for how long?”

    I am genuinely fascinated as to why this is a cause of no small concern to some people and arouses such passions, especially when we are talking about English as opposed to other languages (or at least, that’s how it seems to me).

    For instance, I was speaking to a German friend recently, who now teaches EAP in the UK but who has also taught German as a second or foreign language in the past, and she seemed to see no difficulty at all with the idea that a German language exam should be based on the speech of an idealized educated (Hochdeutsch or Standarddeutsch) native speaker – yet German, too, has a wide range of accents, dialects and national models on which that native speaker could potentially be based.

    1. Thanks, Nik, for this interesting discussion (or dialogue, as we seem to be on our own!) By way of drawing it to a close – and returning to the topic that triggered it, I’d re-iterate that SLA research AND language testing are each predicated on the assumption that the goal of language learning is to achieve native-like competence, and that the benchmark for measuring success (or, more often, failure) in both endeavours (i.e. research and testing) is the educated native-speaker. I’d add that not only does this benchmark ignore the inherent multilingualism of the learner, it assumes the existence of what is, for all intents and purposes, a mythical beast, or, as Pennycook (2012) puts it, ‘a folk concept, held in place to signal certain ideas about language’. He adds: ‘The idea of native and non-native speakers really does not do any useful work in thinking about real language use, and does a great deal of harm as a categorisation that cannot escape its roots in nationalism, racism and colonialism’.

      1. Many thanks indeed, I’ve found it very stimulating and hope it was of some interest to you too. While I hope you appreciate I have taken your points seriously and have given them much thought (or as much thought as I am capable of giving!), I find my original position – that the deployment of an educated NS in language syllabuses and assessment is neither detrimental to nor disruptive of a learner’s identity or self-esteem, nor an arbitrary decision but a principled one – is still valid.

        This must be for another occasion (should one arise), but on a final note I’m afraid I find Pennycook’s ideas concerning (Neo-)Imperialism and the English language absolutely preposterous. And I would like to point out that this is despite the fact the great majority of my master’s degree was devoted to the study of World Englishes (especially but not exclusively in West Africa), Contact languages and Postcolonial criticism; while this was only an MA and so I am in nowise claiming expertise in these areas, I do feel at least both generally familiar with and sympathetic to a number of issues that also concern Pennycook – and yet I still find that he is way, way off base in much of what he proposes.

        Anyway, many thanks again and have a good day.

  2. Thanks for your fascinating response to my ‘thought experiment’, Nik. However, I suspect you didn’t understand the point of it, and it’s my fault for wording it badly. It’s not relevant that there are, or that there are not, exams of Esperanto (or of Latin or of Klingon, for that matter). The question should be, ‘Given there are no educated native-speakers of these languages, by what standards would such exams (or are such exams) scored?’

    The answer, presumably, is that there is some normative standard based on someone’s idea of what a native-speaker would be like, if there were such a thing. But this would seem to be neither a valid nor a fair way of testing, given its arbitrariness. Nor would it say very much about the Esperanto (or Latin or Klingon) test-takers’ communicative effectiveness. It would say more about the tester’s own particular biases and dispositions. The standards, effectively, have been manufactured and are based on an ‘airy nothing’.

    Take a more realistic example. Assume a test of (standard) English includes this item: ‘What did you do ___ the weekend?’. As an educated native-speaker of (New Zealand) English, I would, of course, answer ‘in’. If the examiner was an educated native-speaker of British or American English I would probably be marked down. But who, really, has failed here?

    The example is not trivial: language examinees face this problem on a regular basis. As a learner of Spanish, I’ve found that the goal-posts are constantly shifting. Which, for example, is correct: ‘le hablé’ or ‘lo hablé’? And why should it matter? (See leísmo for an explanation, if you’re not a Spanish speaker.) Likewise, there is fierce debate among Catalan speakers (all native speakers, presumably) as to which preposition is correct in certain contexts: ‘per’ or ‘per a’.

    Where, in the end, does accuracy reside? What is ‘accurate’ Klingon really like? And what is ‘accurate’ Catalan really like? Even where there are educated native-speakers of a language, such as Catalan or Spanish or English, who is to decide which of these (possibly millions of) speakers rules? Perhaps, as Humpty Dumpty put it, “The question is, which is to be master – that’s all.”

    For the moment it’s the native speaker who still calls the shots. But which native speaker, and with what authority, and for how long?

  3. Here’s a thought experiment: How would you assess a speaker (or writer) of a language that has no native speakers, putative or otherwise, such as Esperanto? (And, apparently they do exist).

    It’s an interesting question to pose, although if we are specifically referring to Esperanto, then there is no need for a thought experiment as exams for this language already exist and, according to the website of the British association of Esperanto speakers (which is based in Stoke-on-Trent of all places), the third of the three levels available is already tied to the CEFR (The Advanced exam is described as being C1).

    Each of the three levels of the exam awards marks of between 30% and 50% for translation (both Esperanto to English and English to Esperanto) and I was slightly amused to note the following instruction (in English) on the Intermediate level sample paper: “Translate the following passage into good English” (my emphasis). Reading through the online guides for prospective candidates, it is quite clear that this is an assessment that is focused on accuracy. I suppose this is to be expected for a language whose creator(s) can actually be named in person (as opposed to natural languages).

    If we discount Esperanto, there are a number of other examples of languages with no native speakers: Latin, (Ancient) Greek, various ConLangs, e.g. Klingon, Dothraki, Elvish.

    The Klingon Language Institute offers the Klingon Language Certification Program, for which “[t]he questions will either be translation, fill-in-the-blank, or require answers to questions about grammar.” For the curious, here are some sample questions:

    1a. Translate the following sentence: SuS’a’mo’ pum Sorvetlh

    The answer is apparently “That tree fell because of the powerful wind.” – take that Henry Sweet!

    2b. ya ghaH wo’rIv’e’

    Worf is the __________.

    The missing phrase in the blank (corresponding to ya ghaH) is “tactical officer” – again, nineteenth century Classics students puzzling over the manner in which philosophers pull the lower jaws of certain types of farmyard animal has nothing on this, clearly!

    More seriously though, what the results of my attempt at this thought experiment appear to suggest is that assessment is heavily invested in a faith in and adherence to grammatical and lexical accuracy, both of which are related to clearly defined standard forms of the language, regardless of the fact that the language no (longer) has any native speakers, or even whether the native speakers are real or wholly fictitious.

    As far as I am concerned, this should not be surprising. While it can be reasonably argued that this emphasis on accuracy is the result of these exams being based on those made for natural / national languages, I think it also points clearly to the fact that what defines the kind of invented languages referred to above is that they are ultimately created as a leisure time pursuit (albeit a highly rarefied and sophisticated one). That being the case, there is such a heavy emphasis on translation because the primary use of these languages must surely be translating text to be spoken by actors and eventually subtitled on screen (Dothraki, Elvish), explaining to laypeople what you’ve just said (Esperanto, Volapük) or making the dead live by offering translations of literary, religious or historical documents (Latin, Sanskrit).

    The conclusion is – as it should be – that the form of assessment is matched to the purposes for which the language will eventually be made use of.

    In contrast to these languages, there are alternative lingua francas to English in some parts of the world (e.g. Swahili), and there are languages that have been specifically created for the purposes of diplomacy, trading and negotiation and only used in specific frontier and border spaces (see e.g. the work of Peter Mühlhäusler on contact languages and the ecology of language / language of ecology for more on this). I’m afraid I don’t have the details to hand, but I know that there is a hill region of Papua New Guinea on the frontier between two tribal regions and in this zone, the men (it is only men who can go there) use a specific language that belongs to neither one tribe nor the other.

    Although you can take an exam in Swahili, a ‘real’ assessment of efficacy (there’s that word again!) in such languages presumably depends on achieving specific outcomes: conflict is avoided (or successfully provoked, human nature being what it is!), the number of pigs offered as a dowry meets with the satisfaction of all parties involved, you get a good bargain or price for the cassava / vegetable oil / AAA batteries etc. bought or sold in the market and so on.

    To be blunt, though, such a form of assessment does not apply to English – or for that matter, a number of natural national languages used in mainly industrialised nations (e.g. Spanish). English, in my opinion, is definitely unique amongst world languages in this regard.

    English is not only a language of contact and trade, but it is also a language of science, technology, academia, diplomacy and international relations. Given that the latter five uses of English in the modern world each require a high degree of delicacy in the use of grammar and lexis to ensure success, it should hardly be surprising that the ideal of an educated native speaker model is essential as a basis for learning and assessment.

    Of course, Jenkins, Prodromou (and others) are quite right to point out that idioms and proverbs such as ‘to gain Brownie points’ or ‘You can lead a horse to water, but you can’t make it drink’ are basically useless (or worse), but please note that such phrases tend (at least on the whole) not to have a significant bearing on the language of science, technology, academia, diplomacy or international relations (I’m aware the odd exception can be found).

    Quite by chance, I noticed that the latest edition of the ELT Journal includes a number of articles on just this topic of assessment, and the kind of language that should form the basis of it. But even here, I find the position put forward by Christopher Hall in his article Moving beyond accuracy: from tests of English to tests of ‘Englishing’ almost immediately undermined.

    On the one hand, Hall declares that teachers and testers should “question the monolithic position” and that “[w]hat is not helpful […] is the presentation of [standardized native speaker-based] norms as the only ones for successful English usage.” (2014:377); on the other hand, he also concedes that “I recognize the need to test conformity with such varieties under many circumstances (for example in some EAP contexts)” (ibid.).

    As far as I’m concerned, as soon as Hall acknowledges that “conformity”, standards and accuracy still apply to assessment for specific purposes (such as EAP, or even “some EAP” contexts), then the argument is effectively over.

    Education in general is there to set people free, to encourage intellectual development and promote social mobility. This equally applies to education in English language, if not more so. To encourage parochialism is surely to defeat the object of having a more or less standardized international form of communication in the first place.

    For many in the world, this means acquiring a good – more or less – standardized form of the use of English for certain purposes (such as those noted above) and so efforts to avoid assessing (and thereby teaching) such types of language may place a restriction on students before they have even begun learning. And it is to this that I am most strongly opposed: as teachers and materials designers – we do absolutely have to accept certain aspects of the status quo – not necessarily uncritically, but nevertheless we do for the most part have to accept it.

    Education tends towards conservatism of attitude (though not as a rule of politics) precisely because, and especially with regard to the young, it is hard to say what they may need to know in the future. No one wants to experiment with the future life chances of someone else’s child, or someone else’s potential.

    For my own part, I would much rather teach with accuracy and a more standard model in mind now because frankly, I don’t know what kind of English my students may want or need to use in future. To be honest, most of them have no idea either and may not come to a decision until long after they have forgotten my name (which admittedly may not take all that long!).

    If I decide to teach to the language of an idealized educated native speaker model now, and the students choose to reject that model and/or not achieve that model, then they at least know what it is they are working toward should they at a later date decide (or have forced upon them by circumstance) that they need to develop their English for much more delicate and sophisticated uses than hanging out in bars or doing shopping etc. If, on the other hand, I decide on their behalf that ‘They don’t really need to know all that stuff, they just need to be able to communicate well’, then I potentially make it much more difficult (if not actually impossible) for them to develop beyond whatever fossilized form of the language they have become accustomed to using.

    I’d rather go with the option that has more potential (if not actual) choice. And therefore, whether it is English, Latin, Elvish or Swahili, I would as a rule prefer the version of assessment that promotes accuracy in relation to communicative competence across the repertoire of uses for that language.

  4. Nik writes, ‘And it’s with this upper or more advanced repertoire of things to be done in the L2 that I find hard to imagine being defined without reference at some point to the putative educated native speaker.’

    Here’s a thought experiment: How would you assess a speaker (or writer) of a language that has no native speakers, putative or otherwise, such as Esperanto? (And, apparently they do exist).

  5. Scott,

    Thanks for your response, to which I just wanted to make a quick(ish!) reply:

    Point 1:

    While I concede the point that the “(monolingual) educated native speaker” has provided “the default model for assessing learners, both for the purposes of research and for the purposes of proficiency testing” I still find it hard to accept that a deficit view is inevitable – why, for instance, can the results of such assessments not be seen in terms of the progress made toward something rather than the distance away?

    That seems to be a not entirely unreasonable position to take and, I assume, is the one taken by all those ‘can do’ statements that present language proficiency in terms of a repertoire of purposes to which the L2 can be successfully applied and, what’s more, applied without overt reference to L1 speaker competence in that language, at least up until post-intermediate.

    There are certainly things that can only be done in the L2 that require a much greater degree of delicacy and precision of language use – the language skills needed to become part of an academic or professional community for instance. And it’s with this upper or more advanced repertoire of things to be done in the L2 that I find hard to imagine being defined without reference at some point to the putative educated native speaker.

    How else can the differences between, say, result, consequence, outcome, repercussion be interpreted and/or exploited for a communicative purpose without reference to the unmarked use of those words in contexts that most (educated) native speakers would recognize as being appropriate?

    And that, I think, is regardless of whether this is an NS-NNS or an NNS-NNS interaction we’re talking about because the appropriate use of language is determined by the purpose and context of use and not what the first language of one or more of the speakers may or may not be (i.e. a Polish C2 level speaker of English may well be able to accommodate her level to an Italian A2 level speaker if all they are talking about is how much they both like coffee and doughnuts but no amount of accommodation will make it feasible for them to meaningfully discourse about e.g. the influence of Durkheim on Bourdieu).

    Following a short lecture by Jennifer Jenkins during which she referred repeatedly to ‘proficient’ ELF speakers, I asked how she had determined that they were proficient speakers if she was rejecting the notion of native speakers as the benchmark. Perhaps she was just annoyed at the question, but her reply at the time was that she knew that these ELF speakers were proficient because they had all passed CPE at a ‘C’ grade or higher. This was some years ago now so maybe she has a better way of defining levels in ELF, but using CPE as a benchmark did seem to rather undermine her main argument (on that day at least).

    Point 2:

    “Yes, employers and other gate-keepers may well continue to favour bilinguals whose English is native-like over bilinguals who are resourceful translingual communicators, but that is no reason why we – as teachers and materials designers – should accept such a status quo uncritically.”

    I’m afraid I have to disagree here:

    [1] How likely is it that “bilinguals whose English is native-like” are less able to draw on their language resources to translate as (or even more) effectively than “bilinguals who are resourceful translingual communicators”? I can see how someone with relatively modest language skills can be an effective go-between, but am less clear about how someone with superior language skills could be less effective.

    [2] As the use of English as an academic and professional lingua franca seems likely to become more rather than less important for the foreseeable future, doesn’t that mean that the things that the students today are most likely to need to do in the L2 are all those parts of the repertoire that are post-intermediate?

    And again, how is it then possible to define ‘advanced’ levels of competence without reference to delicate and sophisticated manipulation of the language and how, in turn, can that delicacy and sophistication in language use be defined without reference at some point to uses that native speakers also consider to be advanced (such as giving a presentation at a board meeting or publishing a research paper and so on)?
