Decoding Strategies for Large Language Models

Research on large language models (LLMs) is moving toward more advanced decoding strategies that balance fluency, diversity, and coherence in generated text. One key direction is enhancing the Locally Typical Sampling (LTS) algorithm, which has been shown to struggle with repetition and semantic alignment. Another is mitigating performance degradation in long-context LLMs, for example by addressing the Posterior Salience Attenuation phenomenon. A third is increasing the diversity of generated text with methods that factorize the sampling process into multiple stages, such as intent-based generation. Notable papers in this area include:

  • Advancing Decoding Strategies: Enhancements in Locally Typical Sampling for LLMs, which proposes an improved version of LTS called Adaptive Semantic-Aware Typicality Sampling (ASTS); a sketch of the baseline LTS step appears after this list.
  • Mitigating Posterior Salience Attenuation in Long-Context LLMs with Positional Contrastive Decoding, which introduces a training-free method called Positional Contrastive Decoding (PCD) to alleviate attention score degradation.
  • Intent Factored Generation: Unleashing the Diversity in Your Language Model, which presents a simple method for increasing sample diversity in LLMs while maintaining performance; a sketch of the two-stage idea appears below.
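
The ASTS modifications from the first paper are not detailed here, but the Locally Typical Sampling baseline it builds on is standard: at each step, keep the tokens whose surprisal is closest to the entropy of the next-token distribution until a target probability mass is covered, then sample from that renormalized set. A minimal per-step sketch in PyTorch (the function name and the `mass` threshold are illustrative choices, not taken from the paper):

```python
import torch

def locally_typical_sampling_step(logits: torch.Tensor, mass: float = 0.95) -> int:
    """One decoding step of baseline locally typical sampling.

    Keeps the smallest set of tokens whose surprisal (-log p) is closest to
    the entropy of the next-token distribution and whose cumulative
    probability exceeds `mass`, then samples from that set.
    (Baseline LTS only; the ASTS enhancements are not reproduced here.)
    """
    log_probs = torch.log_softmax(logits, dim=-1)   # shape: (vocab,)
    probs = log_probs.exp()

    # Entropy of the next-token distribution.
    entropy = -(probs * log_probs).sum()

    # Rank tokens by how far their surprisal deviates from the entropy.
    deviation = (-log_probs - entropy).abs()
    _, sorted_idx = torch.sort(deviation)
    sorted_probs = probs[sorted_idx]

    # Smallest prefix of the ranking whose probability mass exceeds `mass`.
    cutoff = int((sorted_probs.cumsum(dim=-1) < mass).sum().item()) + 1

    kept_idx = sorted_idx[:cutoff]
    kept_probs = sorted_probs[:cutoff]
    kept_probs = kept_probs / kept_probs.sum()

    # Sample a token id from the renormalized typical set.
    choice = torch.multinomial(kept_probs, num_samples=1)
    return int(kept_idx[choice].item())
```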

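The intent-factored approach of the third paper is only summarized above, so the sketch below illustrates the general two-stage idea it points to: first sample a short, high-level intent at a higher temperature to spread samples across semantically distinct approaches, then generate the full answer conditioned on that intent at a lower temperature. The `sample_text` helper, prompt wording, and temperatures are hypothetical placeholders rather than the paper's exact procedure.

```python
from typing import Callable, List

def intent_factored_generation(
    prompt: str,
    sample_text: Callable[[str, float], str],   # placeholder LLM completion call
    num_samples: int = 4,
    intent_temperature: float = 1.2,
    answer_temperature: float = 0.7,
) -> List[str]:
    """Illustrative two-stage sampling sketch (not the paper's exact procedure)."""
    answers = []
    for _ in range(num_samples):
        # Stage 1: sample a concise intent at a higher temperature so that
        # different samples commit to different approaches.
        intent = sample_text(
            f"{prompt}\nIn one sentence, propose a distinct approach to answering this.",
            intent_temperature,
        )
        # Stage 2: generate the full answer at a lower temperature,
        # conditioned on the prompt and the sampled intent.
        answer = sample_text(
            f"{prompt}\nApproach: {intent}\nNow answer by following this approach.",
            answer_temperature,
        )
        answers.append(answer)
    return answers
```
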
Sources

Advancing Decoding Strategies: Enhancements in Locally Typical Sampling for LLMs

Mitigating Posterior Salience Attenuation in Long-Context LLMs with Positional Contrastive Decoding

Intent Factored Generation: Unleashing the Diversity in Your Language Model
