Transliteration for Cross-Lingual Morphological Inflection

Abstract

Cross-lingual transfer between typologically related languages has been proven successful for the task of morphological inflection. However, if the languages do not share the same script, current methods yield more modest improvements. We explore the use of transliteration between related languages, as well as grapheme-to-phoneme conversion, as data preprocessing methods in order to alleviate this issue. We experimented with several diverse language pairs, finding that in most cases transliterating the transfer language data into the target one leads to accuracy improvements, even up to 9 percentage points. Converting both languages into a shared space like the International Phonetic Alphabet or the Latin alphabet is also beneficial, leading to improvements of up to 16 percentage points.

Publication
Proceedings of the 17th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology
Antonios Anastasopoulos
Antonios Anastasopoulos
Assistant Professor

I work on multilingual models, machine translation, speech recognition, and NLP for under-served languages.

Related