Cross-Lingual Transfer and Prompt Optimization

Recent work in natural language processing targets more efficient cross-lingual transfer, drawing on multilingual models and unlabeled target-language data. Ranking candidate source languages by the hidden representations of a multilingual model has been reported to improve transfer, with state-of-the-art results on tasks such as part-of-speech tagging and named entity recognition. Interest is also growing in prompt optimization for large language models, particularly for machine translation, where the input component of the prompt plays a crucial role. Hierarchical few-shot example selection and prompt rewriting have both demonstrated improved translation performance. In discourse parsing, unified rhetorical structure parsers have been proposed that handle multiple treebanks in different languages, enabling more efficient and accurate parsing. Noteworthy papers include:

  • Model-Based Ranking of Source Languages for Zero-Shot Cross-Lingual Transfer, which presents an algorithm for ranking source languages using multilingual models and unlabeled target-language data.
  • TreePrompt: Leveraging Hierarchical Few-Shot Example Selection for Improved English-Persian and English-German Translation, which proposes a novel example selection approach that learns language model preferences to identify high-quality examples.
  • Bridging Discourse Treebanks with a Unified Rhetorical Structure Parser, which introduces a unified RST-style discourse parser capable of handling multiple treebanks in different languages.
  • Learning to Rewrite Prompts for Bootstrapping LLMs on Downstream Tasks, which introduces a novel prompt optimization method specifically designed for machine translation tasks.
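
To make the idea of ranking source languages from hidden representations concrete, here is a minimal sketch. It is not the algorithm from the paper listed above: it assumes that each language is summarized by the mean of some sentence-level hidden vectors taken from a multilingual model, and that cosine similarity to the target-language centroid is a reasonable ranking score. The function names, the language labels, and the two-dimensional toy vectors are all hypothetical and stand in for real model outputs.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def mean_vector(vectors: List[Vector]) -> List[float]:
    """Average a non-empty list of equal-length vectors, dimension by dimension."""
    n = len(vectors)
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / n for i in range(dim)]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_source_languages(
    target_reprs: List[Vector],
    source_reprs: Dict[str, List[Vector]],
) -> List[str]:
    """Rank source languages by centroid similarity to the target language.

    Assumption: the vectors are hidden representations of unlabeled
    sentences, already extracted from a multilingual model.
    """
    target_centroid = mean_vector(target_reprs)
    scores = {
        lang: cosine(mean_vector(vecs), target_centroid)
        for lang, vecs in source_reprs.items()
    }
    # Highest similarity first.
    return sorted(scores, key=scores.get, reverse=True)


# Toy data (hypothetical): two 2-dimensional vectors per language.
toy_target = [[1.0, 0.0], [0.8, 0.2]]
toy_sources = {
    "lang_a": [[0.9, 0.1], [1.0, 0.0]],
    "lang_b": [[0.0, 1.0], [0.1, 0.9]],
    "lang_c": [[0.5, 0.5], [0.6, 0.4]],
}
ranking = rank_source_languages(toy_target, toy_sources)
print(ranking)
```

In practice the vectors would come from a chosen layer of a multilingual encoder run over unlabeled text, and the similarity measure could be swapped for another one; the sketch only shows the ranking step.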

Sources

Model-Based Ranking of Source Languages for Zero-Shot Cross-Lingual Transfer

TreePrompt: Leveraging Hierarchical Few-Shot Example Selection for Improved English-Persian and English-German Translation

Bridging Discourse Treebanks with a Unified Rhetorical Structure Parser

Learning to Rewrite Prompts for Bootstrapping LLMs on Downstream Tasks
