Why progress halts around B2, and how gamification and AI can bridge the gap
Have you been learning a language for a while now, but feel stuck?
You’re not alone.
This frustrating moment of stagnation is known as the Intermediate Plateau, and over the past year, I’ve been researching how gamified language apps and AI might help us get through it.
Introduction
Whether you’re grinding Duolingo streaks, watching native content, or deep into flashcard decks, many language learners hit a point where their progress just… stalls.
The intermediate plateau typically occurs between CEFR B2 and C1 (upper-intermediate to advanced). It’s a quiet, confidence-eroding stage that leaves learners demotivated, unsure of how to keep going. Worse, it often causes learners to loop: quit, restart, repeat.
But this isn’t just anecdotal. My research, based on 198 learners and a 14,427-word dataset including the full Duolingo Japanese tree and simulated AI tutoring vocab, suggests that the intermediate plateau is both common and structural.
Even more interestingly, the data showed that it’s not the only one.
There are actually two plateaus
While researchers have identified the B2-C1 plateau, my dataset suggests that a second plateau appears earlier, right between A2 and B1 (upper-beginner to intermediate).
This means learners aren’t just at risk of getting stuck once. They’re at risk twice. And each plateau requires a different solution, because what works at A2 no longer works at B1.
This is what the intermediate plateau looks like, sad face included.
Why does this happen?
The root cause is vocabulary frequency, and something known as the Power Law Distribution of Language.
In short:
- The most common 2,000 words cover ~80% of daily speech
- But 80% word knowledge only yields ~50% comprehension
- To get to 98% coverage, the level needed for effortless reading, you need ~8,000 words
This highlights that a word’s frequency is inversely related to the amount of meaning it carries. In very simple terms: the more common a word, the less meaning it carries; the less common a word, the more meaning it carries.
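To make the diminishing returns concrete, here is a minimal sketch of an idealized power law (Zipf) model. It assumes that the word at rank r appears with frequency proportional to 1/r and that the whole vocabulary is 14,427 words, the size of my word list. Both are simplifying assumptions for illustration, not a fit to the actual dataset, and real language has a far larger vocabulary.

```python
from itertools import accumulate

def zipf_coverage(vocab_size: int, exponent: float = 1.0) -> list[float]:
    """Cumulative share of running text covered by the top-k words (k = 1..vocab_size)
    under an idealized Zipf model where frequency is proportional to 1 / rank**exponent."""
    weights = [1.0 / rank ** exponent for rank in range(1, vocab_size + 1)]
    total = sum(weights)
    return [running / total for running in accumulate(weights)]

# Assumption: treat the 14,427-word list as the entire vocabulary.
VOCAB_SIZE = 14_427
coverage = zipf_coverage(VOCAB_SIZE)

def gain(start: int, end: int) -> float:
    """Extra coverage gained by growing a known vocabulary from `start` to `end` words."""
    before = coverage[start - 1] if start > 0 else 0.0
    return coverage[end - 1] - before

# Each additional block of 2,000 words adds less coverage than the block before it.
for start in range(0, 8000, 2000):
    end = start + 2000
    print(f"words {start + 1}-{end}: +{gain(start, end):.1%} coverage")
```

Under these assumptions, the first 2,000 words do most of the work, and every later block of 2,000 adds noticeably less. That is the flattening learners feel.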
Here is a visualisation of the Power Law Distribution based on my dataset.
And here is an example by John Pasden of a text where 80% and 95% of the words are understandable, to illustrate the point made above:
80% comprehension:
“Bingle for help!” you shout. “This loopity is dying!” You put your fingers on her neck. Nothing. Her flid is not weafling. You take out your joople and bingle 119, the emergency number in Japan. There’s no answer! Then you muchy that you have a new befourn assengle. It’s from your gutring, Evie. She hunwres at Tokyo University. You play the assengle. “…if you get this…” Evie says. “…I can’t vickarn now… the important passit is…” Suddenly, she looks around, dingle. “Oh no, they’re here! Cripett… the frib! Wasple them ON THE FRIB!…” BEEP! the assengle parantles. Then you gratoon something behind you…”
95% comprehension:
In the morning, you start again. You shower, get dressed, and walk pocklent. You move slowly, half- awake. Then, suddenly, you stop. Something is different. The streets are fossit. Really fossit. There are no people. No cars. Nothing. “Where is dowargle?” you ask yourself. Suddenly, there is a loud quapen—a police car. It speeds by and almost hits you. It crashes into a store across the street! Then, another police car farfoofles. The police officer sees you. “Off the street!” he shouts. “Go home, lock your door!” “What? Why?” you shout back. But it’s too late. He is gone.
Even a few unknown words can completely derail understanding. According to Schmitt et al., 98% vocabulary coverage is needed for truly comfortable reading.
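If you want to estimate your own coverage of a text, the idea can be sketched in a few lines. This is a rough illustration only: the `token_coverage` function and the naive English tokenization are my assumptions for the example. Japanese has no spaces between words, so it would need a proper morphological tokenizer instead.

```python
import re

def token_coverage(text: str, known_words: set[str]) -> float:
    """Share of running words in `text` that appear in `known_words`.

    Tokenization is deliberately naive (lowercase runs of ASCII letters)
    and only suits simple English examples like the ones above.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    known = sum(1 for token in tokens if token in known_words)
    return known / len(tokens)

# A fragment of the 95% example: "fossit" is the only unknown word.
sample = "The streets are fossit. Really fossit."
known = {"the", "streets", "are", "really"}
print(f"{token_coverage(sample, known):.0%} of the words are known")
```

Coverage counts every occurrence of a word, not just unique words, which is why one repeated unknown word can pull a short passage far below the 98% mark.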
This means early progress is fast, visible, and encouraging. But soon, it flattens. New words become rarer, are harder to remember because they lack natural repetition, and each adds less to overall coverage. At the same time, however, these harder words are necessary to start consuming native content.
So what happens in practice?
Let me introduce the three stages of the intermediate plateau that emerged from the survey data:
- The Lost: Learners who are aware they’ve stopped progressing, but are trying to brute-force through by continuing to use the same tools
- The Searching: Learners who got fed up and started trying all kinds of new things, throwing things at the wall until something sticks
- The Found: Learners who found new strategies that work for them, and began moving again
Recognizing where you are is half the battle. If you’re reading this and nodding along, you’ve probably moved from “Lost” to “Searching”, and that’s progress.
How gamified language apps fit in
Gamified Language Apps (GLAs) like Duolingo, Busuu, and Memrise are optimized for beginner levels. They:
- Provide structure
- Help build early vocab
- Keep motivation up with streaks and rewards
But they rarely evolve to match the needs of intermediate learners, and most users stay in them longer than they should.
The plateau hits hardest when learners need to shift methods but don’t realize it. The danger is that promises of “fluency” may cause learners to believe it’s their fault, and that they should just apply themselves more to the same learning strategies.
Breaking through the plateau
The study identified three game mechanics that remained universally effective across all levels:
1. Progress tracking
At higher levels, gains are subtle. You’re not learning general terms like “dog” anymore; you’re learning more specific words like “leash”, “kennel”, or “canine”. Tracking becomes key to maintaining motivation when progress becomes invisible.
2. Flashcards (spaced repetition)
For retaining low-frequency words that matter to your personal goals (e.g., job-specific terms, favorite genre vocab), flashcards remain vital. These rare words appear less often in natural situations, so you get fewer chances to review them than you do the most common words. But flashcards shouldn’t replace consuming compelling, comprehensible input, because brute-forcing words will: 1) only teach you an approximation of each word’s true meaning, and 2) remove your chances of encountering the words that naturally appear near your current knowledge cluster, which are much easier to learn thanks to surrounding context and personal interest.
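For readers curious how spaced repetition schedules those reviews, here is a minimal Leitner-style sketch. The box intervals, the `Card` class, and the `review` function are illustrative assumptions, not the algorithm any particular app uses.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Assumed review intervals (in days) for each Leitner box; real apps tune these.
INTERVALS_DAYS = [1, 2, 4, 8, 16]

@dataclass
class Card:
    word: str
    box: int = 0                  # index into INTERVALS_DAYS
    due: date = date(2024, 1, 1)  # placeholder start date

def review(card: Card, correct: bool, today: date) -> Card:
    """Move the card up one box on a correct answer, back to the first box on a miss,
    then schedule the next review using that box's interval."""
    if correct:
        card.box = min(card.box + 1, len(INTERVALS_DAYS) - 1)
    else:
        card.box = 0
    card.due = today + timedelta(days=INTERVALS_DAYS[card.box])
    return card

card = Card("kennel")
review(card, correct=True, today=date(2024, 1, 1))   # moves to box 1, due in 2 days
review(card, correct=True, today=date(2024, 1, 3))   # moves to box 2, due in 4 days
print(card)
```

The growing intervals are the point: they supply the repetition that a rare word like “kennel” would not get from natural exposure alone.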
3. Personalization
This is where most tools fall short. Once you reach the end of B2, your vocabulary becomes increasingly shaped by personal context. Some may spend time reading novels or academic texts, while others engage with specific media or daily conversation. Vocabulary growth becomes highly individual, and general-purpose tools struggle to predict what content they should offer.
Engaging with content through a predefined word list is, however, perfectly fine at beginner levels: the first 1,500 to 2,000 words are so common that they are bound to be roughly the same for any new learner.
Where AI tutors can help
Although this is still a largely unexplored field, GPT-powered tutors (GPALTs) can offer high levels of personalization.
They can:
- Adapt to your existing vocabulary
- Generate tailored, context-rich lessons
- Target your interests, not a generic curriculum
In short, they help you stay in your personal vocabulary cluster, allowing you to expand more naturally to neighboring clusters.
However, some things to keep in mind:
- LLMs don’t “think” like humans.
- They can introduce subtle misunderstandings.
- They may not represent true native content.
Still, if used alongside real-world material, AI can help bridge the gap between intermediate and advanced fluency.
Extra for developers: Where language apps should go next
If developers want their users to succeed, especially long term, they need to:
- Build level-specific tools and embrace that focus, instead of stretching their existing methods over higher (or lower) level content
- Let user data transfer between platforms, so learners don’t have to start over or feel like they will “lose” something when they quit an app
- Explore AI for personalization beyond static lists and rigid courses
- Include flashcards, progress tracking and personalization
Final thoughts
Language learning isn’t linear. The intermediate plateau may feel like a failure, but it’s actually a sign of advancement. The challenge is real, but so are the solutions.
This research used the Japanese language as a case study and was based on:
- 198 learners of Japanese
- Analysis of a word list dataset containing:
  - Duolingo’s full Japanese tree
  - An approximation of the vocabulary an AI-generated language teacher would most likely use, based on the entirety of the Japanese side of Wikipedia
  - A reconstruction of the Japanese-Language Proficiency Test (JLPT) vocabulary content, based on word lists from Coto Academy and the Jisho.org API
  - For a total of 14,427 unique words
- Word frequency modeling and gamification mechanic analysis
Feel free to reach out if you have any questions, thoughts, or would like to work together!