Do LLMs Have Distinct and Consistent Personality?
TRAIT: Personality Testset designed for LLMs with Psychometrics

Yonsei University, Seoul National University, Allen Institute for AI, NCSOFT
NAACL 2025 Findings

*Indicates Equal Contribution

TRAIT is a personality test for LLMs based on trusted questionnaires (John et al., 1999; Jones and Paulhus, 2014) and a large-scale commonsense knowledge graph (West et al., 2022). LLMs show a discrepancy between their self-assessed personality and their actual decision making.

Abstract

Recent advancements in Large Language Models (LLMs) have led to their adoption as conversational agents across various domains. We wonder: can personality tests be applied to these agents to analyze their behavior, as they are to humans? We introduce TRAIT, a new benchmark of 8K multi-choice questions designed to assess the personality of LLMs. TRAIT is built on two small, psychometrically validated human questionnaires, the Big Five Inventory (BFI) and the Short Dark Triad (SD3), expanded with the ATOMIC10× knowledge graph to cover a variety of real-world scenarios. TRAIT also outperforms existing personality tests for LLMs in reliability and validity, achieving the highest scores across four key metrics: Content Validity, Internal Validity, Refusal Rate, and Reliability. Using TRAIT, we reveal two notable insights into the personalities of LLMs: 1) LLMs exhibit distinct and consistent personalities, which are highly influenced by their training data (e.g., data used for alignment tuning), and 2) current prompting techniques have limited effectiveness in eliciting certain traits, such as high psychopathy or low conscientiousness, suggesting the need for further research in this direction.

Benchmark Design


We construct TRAIT through a three-step pipeline with Human-AI collaboration. Starting from small-scale validated questionnaires (BFI and SD3), we expand them into diverse personality descriptions, then create detailed scenarios using the ATOMIC knowledge graph, and finally transform them into multi-choice questions with carefully designed options.
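The resulting items can be represented and scored with a simple structure. The sketch below is a hypothetical illustration, not TRAIT's actual code: the field names (`TraitItem`, `trait_score`, the option-to-pole mapping) and the scoring rule are assumptions. The idea is that each multi-choice question pairs a scenario with options aligned to the high or low pole of a trait, and a model's trait score is the fraction of high-aligned options it selects.

```python
from dataclasses import dataclass


@dataclass
class TraitItem:
    """One hypothetical TRAIT-style item: a scenario plus labeled options."""
    trait: str              # e.g. "Agreeableness"
    scenario: str           # situation expanded from the knowledge graph
    question: str           # decision the model is asked to make
    options: dict[str, str] # option letter -> "high" or "low" trait alignment


def trait_score(items: list[TraitItem], answers: list[str]) -> float:
    """Percentage of answers aligned with the 'high' pole of the trait."""
    high = sum(1 for item, ans in zip(items, answers)
               if item.options[ans] == "high")
    return 100.0 * high / len(items)


# Toy example with two items for one trait (contents are illustrative)
items = [
    TraitItem("Agreeableness",
              "A coworker asks you to cover their shift on short notice.",
              "What do you do?",
              {"A": "high", "B": "low"}),
    TraitItem("Agreeableness",
              "A stranger's groceries spill in the parking lot.",
              "What do you do?",
              {"A": "low", "B": "high"}),
]
print(trait_score(items, ["A", "B"]))  # both high-aligned -> 100.0
```

Note that the high-aligned option letter varies across items, which mirrors how validated questionnaires shuffle option order to avoid positional bias.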

Results 1: Current LLMs' Personality is distinct.


TRAIT reveals that LLMs exhibit distinct and consistent personalities. Alignment-tuned models like GPT-4 and Claude show higher agreeableness (>85%) and conscientiousness, while displaying lower scores in Dark Triad traits compared to pre-trained models, suggesting the significant impact of alignment tuning on model personality.

Results 2: SFT changes personality a lot, but DPO doesn't.


Alignment tuning significantly changes model personality, primarily through supervised instruction tuning rather than preference tuning. Changes include increased agreeableness (+22.9%) and decreased Dark Triad traits (-81.1%), which closely aligns with characteristics typically desired in teaching assistants.

Results 3: Prompting has limited effect on personality.


While prompting can effectively elicit most personality traits (average success rate of 85.2%), alignment-tuned models show notable resistance to expressing high psychopathy (79.8%) and high neuroticism (72.3%), compared to their ability to express other traits.

TRAIT's Validity and Reliability


TRAIT demonstrates superior validity and reliability compared to existing personality tests for LLMs, achieving the highest scores in content validity, internal validity, and reliability metrics while maintaining a zero refusal rate in multiple-choice questions.
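To make one of these metrics concrete, internal consistency is commonly measured in psychometrics with Cronbach's alpha: alpha = k/(k-1) * (1 - sum of item variances / variance of total scores) over a respondents-by-items score matrix. This is the standard textbook formula, shown here as a minimal sketch; it is not necessarily the exact computation TRAIT uses for its reliability metric.

```python
def cronbach_alpha(responses: list[list[float]]) -> float:
    """Cronbach's alpha for a respondents-by-items score matrix.

    alpha = k/(k-1) * (1 - sum(item variances) / variance(total scores)),
    using population variances throughout.
    """
    k = len(responses[0])  # number of items

    def var(xs: list[float]) -> float:
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    item_vars = sum(var([row[i] for row in responses]) for i in range(k))
    total_var = var([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_vars / total_var)


# Binary responses of three respondents on three items (toy data)
data = [[1, 1, 1],
        [0, 0, 0],
        [1, 0, 1]]
print(round(cronbach_alpha(data), 3))  # -> 0.857
```

Higher alpha means the items measure the same underlying construct more consistently; values above roughly 0.7 are conventionally considered acceptable.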

BibTeX

@article{lee2024llms,
  title={Do {LLMs} Have Distinct and Consistent Personality? {TRAIT}: Personality Testset Designed for {LLMs} with Psychometrics},
  author={Lee, Seungbeen and Lim, Seungwon and Han, Seungju and Oh, Giyeong and Chae, Hyungjoo and Chung, Jiwan and Kim, Minju and Kwak, Beong-woo and Lee, Yeonsoo and Lee, Dongha and others},
  journal={arXiv preprint arXiv:2406.14703},
  year={2024}
}