Help a fellow dev on AI-localization?
A software development team is seeking feedback on their AI-based localization pipeline for HR-related content. They have implemented a methodology using GPT-5-nano for translation and have encountered issues with certain translations that, despite high similarity scores, were flagged by human reviewers. The team is looking for suggestions on improving translation accuracy and understanding metrics that correlate with human acceptance in UI localization.
- The team built an AI-based localization pipeline for their HR software product.
- During a recent Spanish localization run of about 970 strings, approximately 75% passed the automatic check. Two human translators then reviewed the outputs and flagged several problematic translations, including ones that had scored above the approval threshold.
- The team is seeking insights on improving translation accuracy and on metrics that correlate with human acceptance in UI localization.
Opening excerpt (first ~120 words):
We built an AI-based localization pipeline for our software product (HR domain) and would love feedback/suggestions from others working in production MT/localization, so that we can learn and improve.

Current methodology:
- GPT-5-nano forward translation + back-translation
- text-embedding-3-small cosine similarity on source vs. back-translated text
- Threshold: ≥0.92 = auto-approved

On a recent ~970-string Spanish localization run:
- ~75% of strings passed automatically

We then had two human translators review outputs, and both flagged several problematic cases:
- "Add Attachment" → Agregar Adjunto. Better: Adjuntar Archivo
- "Pay Grades" → Grados de Pago. Better: Escalas salariales
- "Sub Unit" → Subunidad. Better: Departamento

All three examples still scored 0.94+ cosine similarity.

Google Translate also…
Excerpt limited to ~120 words for fair-use compliance. The full article is at Ycombinator.
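
The methodology in the excerpt (forward translation, back-translation, embedding cosine similarity, a 0.92 approval threshold) could be sketched roughly as below. This is not the team's code: `round_trip_gate`, `GateResult`, and the `translate`, `back_translate`, and `embed` callables are hypothetical names, and the toy dictionary and letter-count embedding are stand-ins for GPT-5-nano and text-embedding-3-small so the sketch runs offline. It is meant only to illustrate why a literal translation can pass: the gate compares the source with the back-translation and never scores the Spanish wording itself.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

APPROVAL_THRESHOLD = 0.92  # threshold quoted in the excerpt


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


@dataclass
class GateResult:
    source: str
    translation: str
    back_translation: str
    similarity: float
    auto_approved: bool


def round_trip_gate(
    source: str,
    translate: Callable[[str], str],        # hypothetical: source -> target language
    back_translate: Callable[[str], str],   # hypothetical: target -> source language
    embed: Callable[[str], Sequence[float]],  # hypothetical: text -> embedding vector
    threshold: float = APPROVAL_THRESHOLD,
) -> GateResult:
    """Translate, back-translate, and approve if source and back-translation are close."""
    translation = translate(source)
    back = back_translate(translation)
    similarity = cosine_similarity(embed(source), embed(back))
    return GateResult(source, translation, back, similarity, similarity >= threshold)


# Toy stand-ins (assumptions for illustration, not real model behaviour).
FORWARD: Dict[str, str] = {"Pay Grades": "Grados de Pago"}
BACKWARD: Dict[str, str] = {target: source for source, target in FORWARD.items()}


def toy_embed(text: str) -> List[float]:
    """Letter-count vector; a crude placeholder for a real embedding model."""
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    return counts


result = round_trip_gate("Pay Grades", FORWARD.__getitem__, BACKWARD.__getitem__, toy_embed)
print(result)
```

In this toy run the literal rendering "Grados de Pago" back-translates to exactly the source string, so the gate approves it, even though the reviewers in the post preferred "Escalas salariales". With real models the scores would differ (the post reports 0.94+ for the flagged examples), but the structural blind spot is the same.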