Multilingual NLP
Aug 2, 2020
An exciting research direction that we pursue at GMU NLP is building multilingual and polyglot systems. The languages of the world often share similar characteristics, and training systems cross-lingually allows us to leverage these similarities and overcome data scarcity issues.
George Mason NLP
The Natural Language Processing group at George Mason University. We work on multilingual models and on building robust NLP systems, especially for low-resource and endangered languages.
Related
- GlobalBench: A Benchmark for Global Progress in Natural Language Processing
- LIMIT: Language Identification, Misidentification, and Translation using Hierarchical Models in 350+ Languages
- An Efficient Approach for Studying Cross-Lingual Transfer in Multilingual Language Models
- To token or not to token: A Comparative Study of Text Representations for Cross-Lingual Transfer
- Towards a Universal Python: Translating the Natural Modality of Python into Other Human Languages
Posts
A note on evaluating multilingual benchmarks
Antonis Anastasopoulos, December 2019. tl;dr: Be careful when reporting averages for multilingual benchmarks, especially if making claims about multilinguality. Averaging by language family can provide additional insights.
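The point about averaging can be sketched concretely. The snippet below (a minimal sketch; the languages, scores, and family labels are illustrative, not taken from the post) compares a plain average over languages with an average that first pools scores within each language family. When one family dominates the language list, the plain average can hide poor performance on under-represented families.

```python
from collections import defaultdict

# Hypothetical benchmark scores per language, tagged with a family label.
scores = {
    "es": (0.91, "Indo-European"),
    "de": (0.89, "Indo-European"),
    "fr": (0.90, "Indo-European"),
    "sw": (0.62, "Niger-Congo"),
    "yo": (0.55, "Niger-Congo"),
    "fi": (0.74, "Uralic"),
}

# Plain average over languages: dominated by whichever family
# contributes the most languages to the benchmark.
overall = sum(s for s, _ in scores.values()) / len(scores)

# Average within each family first, then across families:
# every family contributes equally, surfacing weaknesses on
# families with fewer languages in the benchmark.
by_family = defaultdict(list)
for score, family in scores.values():
    by_family[family].append(score)
family_means = {fam: sum(v) / len(v) for fam, v in by_family.items()}
family_avg = sum(family_means.values()) / len(family_means)

print(f"per-language average: {overall:.3f}")
print(f"per-family average:   {family_avg:.3f}")
```

With these made-up numbers the per-family average comes out lower than the per-language one, since the two Niger-Congo languages count as much as the three Indo-European ones; that gap is exactly the kind of insight a single headline average obscures.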