Personality Modeling and Control in Large Language Models

Research on personality in large language models is moving towards more nuanced and controllable models, with a focus on aligning model behavior with psychological theory. Recent work draws on prototype theory and the Big Five personality traits to improve the accuracy and interpretability of personality modeling. Interest is also growing in methods for controlling and steering model behavior, for example to generate text with desired personality attributes. Two notable papers in this area are "Cognitive Alignment in Personality Reasoning: Leveraging Prototype Theory for MBTI Inference", which presents a framework for MBTI inference that operationalizes prototype theory within a language model-based pipeline, and "Activation-Space Personality Steering: Hybrid Layer Selection for Stable Trait Control in LLMs", which proposes a pipeline for extracting hidden state activations from transformer layers and identifying trait-specific optimal layers for robust injection.
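To make the activation-steering idea mentioned above more concrete, here is a minimal sketch of one common approach: difference-of-means steering with a simple per-layer separation score. It is not the method of any paper listed here. The function names, the scoring rule, and the toy activations are all assumptions made for illustration, and a real pipeline would read activations from an actual transformer rather than from hand-written lists.

```python
from statistics import mean

def mean_vector(rows):
    """Column-wise mean of a list of equal-length activation vectors."""
    return [mean(col) for col in zip(*rows)]

def steering_vector(high, low):
    """Difference of means: mean(high-trait) minus mean(low-trait) activations."""
    mu_high, mu_low = mean_vector(high), mean_vector(low)
    return [h - l for h, l in zip(mu_high, mu_low)]

def separation(high, low):
    """Hypothetical layer score: Euclidean norm of the difference-of-means vector."""
    return sum(x * x for x in steering_vector(high, low)) ** 0.5

def pick_layer(acts_by_layer):
    """Return the layer whose high/low activations are furthest apart."""
    return max(acts_by_layer, key=lambda layer: separation(*acts_by_layer[layer]))

def inject(hidden, vector, alpha=1.0):
    """Add the scaled steering vector to a single hidden state."""
    return [h + alpha * v for h, v in zip(hidden, vector)]

# Toy data (assumed): layer index -> (high-trait activations, low-trait activations)
toy = {
    0: ([[1.0, 0.0], [1.0, 0.0]], [[1.0, 0.0], [1.0, 0.0]]),
    1: ([[2.0, 1.0], [4.0, 1.0]], [[0.0, 1.0], [0.0, 1.0]]),
}

best = pick_layer(toy)                           # layer with the largest separation
vec = steering_vector(*toy[best])                # trait direction at that layer
steered = inject([0.5, 0.5], vec, alpha=0.5)     # nudge one hidden state along it
```

In this toy setup the two groups are identical at layer 0 and differ only at layer 1, so layer 1 is selected. The `alpha` parameter controls how strongly the hidden state is pushed along the trait direction.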

Sources

Cognitive Alignment in Personality Reasoning: Leveraging Prototype Theory for MBTI Inference

Modeling the Construction of a Literary Archetype: The Case of the Detective Figure in French Literature

ECO Decoding: Entropy-Based Control for Controllability and Fluency in Controllable Dialogue Generation

Multi-Personality Generation of LLMs at Decoding-time

Activation-Space Personality Steering: Hybrid Layer Selection for Stable Trait Control in LLMs

Decoding Emergent Big Five Traits in Large Language Models: Temperature-Dependent Expression and Architectural Clustering
