Advances in Cross-Lingual Transfer Learning

Research in natural language processing is increasingly focused on cross-lingual transfer learning: building models that carry knowledge learned in one language over to others. Recent studies explore techniques such as pivot languages, cross-lingual optimization, and retrieval-augmented generation. These approaches show promising gains in translation accuracy and fluency, particularly for low-resource languages. Large language models and multilingual pre-training have been instrumental in achieving state-of-the-art results on cross-lingual tasks. Several studies also highlight the importance of language similarity and cultural nuance in building effective cross-lingual models. Overall, the emphasis is shifting toward scalable and efficient methods for cross-lingual transfer. Noteworthy papers include Semantic Aware Linear Transfer, which reports strong performance in cross-lingual understanding setups, and DeFT-X, which proposes denoised sparse fine-tuning for zero-shot cross-lingual transfer.

Sources

Enhancing Low-Resource Minority Language Translation with LLMs and Retrieval-Augmented Generation for Cultural Nuances

Semantic Aware Linear Transfer by Recycling Pre-trained Language Models for Cross-lingual Transfer

Towards Cultural Bridge by Bahnaric-Vietnamese Translation Using Transfer Learning of Sequence-To-Sequence Pre-training Language Model

Combining the Best of Both Worlds: A Method for Hybrid NMT and LLM Translation

From Unaligned to Aligned: Scaling Multilingual LLMs with Multi-Way Parallel Corpora

Cross-Lingual Optimization for Language Transfer in Large Language Models

FuxiMT: Sparsifying Large Language Models for Chinese-Centric Multilingual Machine Translation

Data-Efficient Hate Speech Detection via Cross-Lingual Nearest Neighbor Retrieval with Limited Labeled Data

Editing Across Languages: A Survey of Multilingual Knowledge Editing

Pivot Language for Low-Resource Machine Translation

Tracing Multilingual Factual Knowledge Acquisition in Pretraining

In-Domain African Languages Translation Using LLMs and Multi-armed Bandits

DeFT-X: Denoised Sparse Fine-Tuning for Zero-Shot Cross-Lingual Transfer

Shared Path: Unraveling Memorization in Multilingual LLMs through Language Similarities

Transfer of Structural Knowledge from Synthetic Languages

LAGO: Few-shot Crosslingual Embedding Inversion Attacks via Language Similarity-Aware Graph Optimization

Semantic Pivots Enable Cross-Lingual Transfer in Large Language Models

Mechanistic Understanding and Mitigation of Language Confusion in English-Centric Large Language Models

Comparative Analysis of Subword Tokenization Approaches for Indian Languages