Transliteration as Constrained Optimization

Empirical Methods in Natural Language Processing (EMNLP), 2008

Abstract

This paper introduces a new method for identifying named-entity (NE) transliterations in bilingual corpora. Recent work has shown the advantage of discriminative approaches to transliteration: given two strings (ws, wt) in the source and target languages, a classifier is trained to determine whether wt is the transliteration of ws. This paper shows that the transliteration problem can be formulated as a constrained optimization problem, and can thus take into account contextual dependencies and constraints among character bi-grams in the two strings. We further explore several methods for learning the objective function of the optimization problem and show the advantage of learning it discriminatively. Our experiments show that the new framework results in over 50% improvement in translating English NEs to Hebrew.
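To make the idea of scoring a string pair by constrained optimization concrete, here is a toy sketch. It is not the paper's formulation: the function name `best_alignment_score`, the single-character pair weights, and the two constraints (each character used at most once, no crossing links) are all assumptions chosen for illustration, and the target string is written in Latin letters for readability. The objective is the total weight of the selected character pairs, maximized by dynamic programming.

```python
from typing import Dict, Tuple


def best_alignment_score(
    ws: str, wt: str, weights: Dict[Tuple[str, str], float]
) -> float:
    """Return the maximum total weight of character pairs linking ws to wt.

    Assumed constraints (illustrative only): each character takes part in
    at most one link, and links may not cross.
    """
    n, m = len(ws), len(wt)
    # best[i][j] holds the best score using only ws[:i] and wt[:j].
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Weight of linking the last characters; unknown pairs score 0.
            pair = weights.get((ws[i - 1], wt[j - 1]), 0.0)
            best[i][j] = max(
                best[i - 1][j],             # leave ws[i-1] unlinked
                best[i][j - 1],             # leave wt[j-1] unlinked
                best[i - 1][j - 1] + pair,  # link the two characters
            )
    return best[n][m]


# Hypothetical weights; in a learned model these would come from training.
toy_weights = {("d", "d"): 1.0, ("a", "a"): 0.5, ("n", "n"): 1.0}
in_order = best_alignment_score("dan", "dn", toy_weights)
reversed_order = best_alignment_score("dan", "nd", toy_weights)
```

With these made-up weights, the in-order candidate scores higher than the reversed one, because the no-crossing constraint prevents linking both "d" and "n" when their order is swapped. A classifier could then threshold such a score to decide whether wt is a transliteration of ws.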

Bib Entry

@InProceedings{GR_emnlp_2008,
  author = "Dan Goldwasser and Dan Roth",
  title = "Transliteration as Constrained Optimization",
  booktitle = "Empirical Methods in Natural Language Processing (EMNLP)",
  year = "2008"
}