Multilingual Language Understanding and Generation
Natural language processing research is converging on more capable models for multilingual understanding and generation. To bridge the gap between languages, researchers are exploring approaches that incorporate visual information and cultural awareness. Key challenges include narrowing the performance gap between high-resource and low-resource languages and mitigating biases in multilingual models. Notable papers include IRLBench, which introduces a benchmark for evaluating large language models in multilingual settings; Cross-Lingual Representation Alignment Through Contrastive Image-Caption Tuning, which aligns sentence representations across languages using visual information; and TransBench and ScholarBench, which highlight the need for more robust evaluation frameworks for machine translation and academic reasoning in multilingual contexts.
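The contrastive image-caption idea above can be sketched briefly. The exact objective used in the cited paper is not reproduced here; this is a minimal sketch assuming a CLIP-style symmetric InfoNCE loss, where captions in different languages that describe the same image are pulled toward that image's embedding and thereby toward each other. All function and variable names are illustrative.

```python
import numpy as np

def info_nce(image_emb, caption_emb, temperature=0.07):
    """Symmetric InfoNCE loss over image/caption embedding matrices.

    Row i of each matrix is assumed to belong to the same image-caption
    pair; off-diagonal rows serve as in-batch negatives.
    """
    # L2-normalize so the dot product is cosine similarity
    img = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    cap = caption_emb / np.linalg.norm(caption_emb, axis=1, keepdims=True)
    logits = img @ cap.T / temperature      # pairwise similarity matrix
    diag = np.arange(len(logits))           # matching pairs on the diagonal

    def xent(l):
        # cross-entropy with the diagonal as the correct class per row
        l = l - l.max(axis=1, keepdims=True)
        log_p = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_p[diag, diag].mean()

    # average both retrieval directions: image->caption and caption->image
    return 0.5 * (xent(logits) + xent(logits.T))
```

Training captions from multiple languages against the same shared image embeddings gives the cross-lingual alignment effect: aligned pairs yield a low loss, while shuffled (mismatched) pairs yield a high one, so minimizing this objective pushes translations of the same caption toward a common region of the embedding space.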
Sources
IRLBench: A Multi-modal, Culturally Grounded, Parallel Irish-English Benchmark for Open-Ended LLM Reasoning Evaluation
Breaking Language Barriers or Reinforcing Bias? A Study of Gender and Racial Disparities in Multilingual Contrastive Vision Language Models