The field of reinforcement learning from human feedback (RLHF) is moving toward more efficient and scalable methods. Recent research has focused on leveraging human gaze modeling, developing lightweight reward models, and probing the robustness of reward oracles in RL-based tractography. These advances aim to reduce computational cost, improve performance, and increase the reliability of RLHF pipelines. Implicit human feedback, such as non-invasive electroencephalography (EEG) signals, is also being explored as a way to provide a continuous training signal without explicit user intervention. Notable papers include:
- Enhancing RLHF with Human Gaze Modeling, which demonstrates the potential of human gaze modeling to improve RLHF efficiency.
- Tiny Reward Models, which presents a family of small, bidirectional masked language models that rival the capabilities of larger reward models; a minimal sketch of a reward model in this style appears after the list.
- Exploring the robustness of TractOracle methods in RL-based tractography, which introduces a novel RL training scheme called Iterative Reward Training (IRT); an illustrative alternating training loop appears after the list.
- Aligning Humans and Robots via Reinforcement Learning from Implicit Human Feedback, which proposes a framework that uses non-invasive EEG signals to provide continuous, implicit feedback; an illustrative EEG-to-reward mapping appears after the list.
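
To make the reward-model idea concrete, here is a minimal sketch of a small bidirectional encoder with a scalar reward head trained on pairwise preferences. The checkpoint name, [CLS] pooling, and Bradley-Terry loss are common RLHF reward-modeling conventions assumed for illustration; they are not details taken from the Tiny Reward Models paper.

```python
# Minimal sketch: small bidirectional encoder as a reward model.
# Checkpoint, pooling, and loss are illustrative assumptions, not the paper's recipe.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class TinyRewardModel(nn.Module):
    def __init__(self, encoder_name: str = "prajjwal1/bert-tiny"):
        super().__init__()
        # Bidirectional masked-LM backbone: every token attends over the full
        # prompt + response, unlike a causal decoder.
        self.encoder = AutoModel.from_pretrained(encoder_name)
        self.reward_head = nn.Linear(self.encoder.config.hidden_size, 1)

    def forward(self, input_ids, attention_mask):
        hidden = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        pooled = hidden[:, 0]                        # [CLS]-style pooling
        return self.reward_head(pooled).squeeze(-1)  # one scalar reward per pair

def preference_loss(r_chosen, r_rejected):
    # Bradley-Terry pairwise loss commonly used to train RLHF reward models.
    return -torch.nn.functional.logsigmoid(r_chosen - r_rejected).mean()

tokenizer = AutoTokenizer.from_pretrained("prajjwal1/bert-tiny")
model = TinyRewardModel()

prompt = "Explain photosynthesis."
chosen = "Plants convert light into chemical energy stored as sugars."
rejected = "I don't know."
batch = tokenizer([f"{prompt} {chosen}", f"{prompt} {rejected}"],
                  padding=True, truncation=True, return_tensors="pt")
rewards = model(batch["input_ids"], batch["attention_mask"])
loss = preference_loss(rewards[0:1], rewards[1:2])
```

Because each (prompt, response) pair is scored in a single forward pass of a small encoder, inference stays cheap, which is the usual motivation for shrinking reward models.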
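
The IRT scheme itself is specific to the tractography paper, but the name suggests alternating between policy optimization and reward-oracle retraining. The loop below is one plausible reading of that idea; every helper is a toy placeholder rather than the paper's TractOracle/IRT code.

```python
# One plausible reading of "Iterative Reward Training": alternate between
# optimizing the tracking policy against a frozen reward oracle and retraining
# the oracle on streamlines produced by the updated policy, so the oracle stays
# accurate on the data the policy actually generates. All helpers are toy stubs.
import numpy as np

rng = np.random.default_rng(seed=0)

def train_policy_with_rl(policy: float, oracle: float) -> float:
    # Placeholder RL phase: pull the (scalar) policy toward what the oracle rewards.
    return policy + 0.1 * (oracle - policy)

def collect_streamlines(policy: float, n: int) -> np.ndarray:
    # Placeholder rollouts: noisy samples around the current policy.
    return policy + rng.normal(scale=0.5, size=n)

def label_plausibility(streamlines: np.ndarray) -> np.ndarray:
    # Placeholder labels, e.g. anatomical plausibility scores in tractography.
    return (np.abs(streamlines - 1.0) < 0.5).astype(float)

def retrain_oracle(oracle: float, labels: np.ndarray) -> float:
    # Placeholder oracle refresh fit to the freshly labelled data.
    return 0.5 * oracle + 0.5 * float(labels.mean())

policy, oracle = 0.0, 1.0
for _ in range(5):
    policy = train_policy_with_rl(policy, oracle)       # 1) RL against frozen oracle
    streamlines = collect_streamlines(policy, n=1_000)  # 2) generate fresh trajectories
    labels = label_plausibility(streamlines)            # 3) label them
    oracle = retrain_oracle(oracle, labels)             # 4) update the reward oracle
print(policy, oracle)
```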
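
Likewise, the EEG-based framework is only summarized here, but the general pattern of converting a decoded brain signal into a scalar reward can be sketched. The decoder, window dimensions, and reward mapping below are toy assumptions, not the paper's pipeline.

```python
# Toy sketch of implicit feedback: decode an error-related signal from an EEG
# window after each agent action and map it to a scalar reward.
import numpy as np

def decode_error_probability(eeg_window: np.ndarray) -> float:
    # Placeholder decoder; in practice this would be a classifier trained on
    # error-related potential features. Here: a logistic squash of the mean
    # amplitude so the example runs end to end.
    return float(1.0 / (1.0 + np.exp(-eeg_window.mean())))

def implicit_reward(eeg_window: np.ndarray) -> float:
    # High decoded error probability -> negative reward, low -> positive,
    # giving the agent a continuous signal without explicit button presses.
    p_error = decode_error_probability(eeg_window)
    return 1.0 - 2.0 * p_error   # maps [0, 1] error probability to [-1, 1]

# Example: one simulated 64-channel, 250-sample EEG window after an action.
rng = np.random.default_rng(1)
window = rng.normal(size=(64, 250))
print(implicit_reward(window))
```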