Advances in Preference Learning and Behavior Modeling

The field of preference learning and behavior modeling is moving towards more dynamic and adaptable approaches. Researchers are shifting away from traditional methods that rely on static, manually curated behavior vocabularies and are instead developing novel pretraining strategies that automatically construct supervision embeddings. These advances have shown promising results in improving model performance and reducing bias in online preference learning. In parallel, configurable preference tuning and generative reward models enable more fine-grained control and adaptability in modeling human preferences. Notable papers include Bootstrapping Your Behavior, which proposes a pretraining strategy that achieves an average improvement of 3.9% in AUC and a 98.9% gain in training throughput, and Configurable Preference Tuning with Rubric-Guided Synthetic Data, which introduces a framework for endowing language models with the ability to dynamically adjust their behavior based on explicit, human-interpretable directives.
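As a rough illustration of the pairwise preference-learning backbone that several of these works build on (not the specific method of any listed paper), the sketch below implements a standard Bradley-Terry / DPO-style loss in PyTorch. The function and argument names are illustrative assumptions; the cited papers extend this basic setup with rubric conditioning, generative reward modeling, and debiasing mechanisms.

```python
import torch
import torch.nn.functional as F

def dpo_preference_loss(policy_chosen_logps: torch.Tensor,
                        policy_rejected_logps: torch.Tensor,
                        ref_chosen_logps: torch.Tensor,
                        ref_rejected_logps: torch.Tensor,
                        beta: float = 0.1) -> torch.Tensor:
    """Pairwise preference loss in the Bradley-Terry / DPO style (sketch).

    Each argument is the summed log-probability of a response under either
    the policy being tuned or a frozen reference model. The loss pushes the
    policy to rank the chosen response above the rejected one, with beta
    controlling how far the policy may drift from the reference.
    """
    chosen_margin = policy_chosen_logps - ref_chosen_logps
    rejected_margin = policy_rejected_logps - ref_rejected_logps
    # -log sigmoid(beta * margin difference): small when chosen >> rejected.
    return -F.logsigmoid(beta * (chosen_margin - rejected_margin)).mean()


# Toy usage with random log-probabilities for a batch of 4 preference pairs.
torch.manual_seed(0)
logps = {name: torch.randn(4) for name in
         ["policy_chosen", "policy_rejected", "ref_chosen", "ref_rejected"]}
loss = dpo_preference_loss(logps["policy_chosen"], logps["policy_rejected"],
                           logps["ref_chosen"], logps["ref_rejected"])
print(f"preference loss: {loss.item():.4f}")
```

Approaches such as configurable preference tuning can be read as conditioning this kind of objective on an explicit, human-interpretable directive (e.g., a rubric prepended to the prompt), so the tuned behavior becomes adjustable at inference time rather than fixed by a single static preference dataset.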

Sources

Bootstrapping Your Behavior: A New Pretraining Strategy for User Behavior Sequence Data

Debiasing Online Preference Learning via Preference Feature Preservation

Configurable Preference Tuning with Rubric-Guided Synthetic Data

Toward Explainable Offline RL: Analyzing Representations in Intrinsically Motivated Decision Transformers

DCRM: A Heuristic to Measure Response Pair Quality in Preference Optimization

GRAM: A Generative Foundation Reward Model for Reward Generalization

SENIOR: Efficient Query Selection and Preference-Guided Exploration in Preference-based Reinforcement Learning

FEAST: A Flexible Mealtime-Assistance System Towards In-the-Wild Personalization

Context Matters: Learning Generalizable Rewards via Calibrated Features
