Improve what already works.
Any gains prove smart management.
Test-First Approach to MTPE Implementation
Behind this story is one of Europe's fastest-growing mobile game companies*. They already had a localization process that worked. But with AI reshaping the industry, they asked: could we modernize without compromising quality?
AI is reshaping localization workflows industry-wide
Opportunity to improve an existing workflow
Demonstrate responsible innovation with clear metrics
Positive side effect if achievable without quality loss
* Details have been anonymized because of NDA restrictions.
We tested 10 AI engines, validated the best performers with a 1,500-word pilot, then transitioned all languages to MTPE. The same translation team stayed on, now post-editing AI output instead of translating from scratch.
100% traditional → MTPE with mandatory human post-editing
100% linguist retention: same team, new efficiency
All 12 languages
Pilot (February 2025) → Full production (July 2025)
* TQI: Translation Quality Index, an industry-standard metric. Each linguist in our pool carries a TQI score based on their reviewed project work. This lets us verify translation quality objectively, by the numbers.
| Task type | Confirmation | Delivery |
|---|---|---|
| Regular | 3 hours | 1 day |
| Urgent (up to 200 words) | 1 hour | 12–24 hours |
We designed a three-phase approach to make the transition safe and measurable.
We tested different AI engines to find the best match for this game's content.
Gemini, ChatGPT-4o, Grok, Mistral, Microsoft, and 5 more. Each engine was tested on all 12 languages.
AQI (Alconost Quality Index) components: CometQE, nTER, BLEU, chrF, COMET, BERTScore + Linguist Evaluation.
| Quality | AQI | Languages | Description & recommendation |
|---|---|---|---|
| Good | > 85 | 6 of 12 | Near-human quality; fluency and adequacy are strong. Accept as-is or very light post-editing. |
| Acceptable | > 70 | 11 of 12 | Good fluency, minor errors; meaning mostly preserved. Light to moderate post-editing. |
| Borderline | 69 | 1 (MS) | Understandable but noticeable issues. Moderate post-editing needed for fluency. |
AI-generated translations achieved acceptable to good quality. Human post-editing brings output to 95%+ TQI.
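To make the quality bands concrete, here is a minimal sketch of how an AQI score could be mapped to the bands in the table. The function names are hypothetical, and treating every score of 70 or below as "Borderline" is our assumption: the table only shows a single borderline score of 69 and does not state a lower bound.

```python
from collections import Counter


def aqi_band(score: float) -> str:
    """Map an AQI score to a quality band using the table's thresholds."""
    if score > 85:
        return "Good"
    if score > 70:
        return "Acceptable"
    # Assumption: anything at or below 70 is treated as borderline.
    return "Borderline"


def band_counts(scores: dict[str, float]) -> Counter:
    """Count how many languages fall into each band."""
    return Counter(aqi_band(score) for score in scores.values())


if __name__ == "__main__":
    # Placeholder scores for illustration only, not results from the project.
    example_scores = {"lang_a": 90.0, "lang_b": 75.0, "lang_c": 69.0}
    print(band_counts(example_scores))
```

Note that in this sketch each language lands in exactly one band, whereas the table's "11 of 12" figure appears to count every language scoring above 70, including those rated Good.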
Validate with 1,500 words of actual game content.
With engines selected, we ran a real-world test using actual game content.
Linguist effort and MT performance quality
Edit distance measures how much of the machine translation linguists changed during post-editing, so it is one indicator of how well we configured the MT (prompting, settings). A lower distance means less linguist effort and a better MT setup. We also tracked overall quality through a 5-step MTPE process to confirm that 95%+ TQI is achievable.
Pre-translation delivered at target quality. Linguists approved as-is.
Edit distance up to 30%. Quick fixes for fluency or terminology.
Edit distance 61–99%. Meaning issues or adaptation required.
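As an illustration of the edit-distance percentages above, here is a minimal sketch that compares raw MT output with the post-edited text. It assumes a character-level Levenshtein distance normalized by the longer of the two strings. The exact formula used in the project is not stated here, so both the normalization and the function names are assumptions.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the single-character insertions, deletions, and substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # the distance is symmetric; keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def edit_distance_percent(mt_output: str, post_edited: str) -> float:
    """Edit distance as a percentage of the longer string (assumed normalization)."""
    longest = max(len(mt_output), len(post_edited))
    if longest == 0:
        return 0.0
    return 100.0 * levenshtein(mt_output, post_edited) / longest


if __name__ == "__main__":
    # Made-up strings for illustration, not project data.
    mt = "Tap to collect your reward"
    edited = "Tap to claim your reward"
    print(f"{edit_distance_percent(mt, edited):.1f}%")
```

A segment approved as-is scores 0% under this sketch, while a full rewrite approaches 100%. Production tools often compute the distance on words rather than characters, which would shift the percentages.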
We began by switching the top European languages to MTPE.
After a month of careful performance monitoring, Dutch and the top Asian languages followed.
A month later, the remaining languages were successfully transitioned as well.
The upgrade worked: careful testing + clear communication + team continuity = sustained results.
4 major updates delivered on schedule during the MTPE period
No increase in localization issues reported
37% savings after MTPE implementation
SLA maintained: 3-hour confirmation, 1-day delivery
95%+ TQI guaranteed
100% linguist retention
We test AI solutions systematically, guarantee quality at 95%+ TQI, and help you make evidence-based decisions.
Tell us about your game. We’ll share a tailored plan and pricing within one business day.