[Submitted on 30 Mar 2025 (v1), last revised 13 Apr 2026 (this version, v2)]

Title:If an LLM Were a Character, Would It Know Its Own Story? Evaluating Lifelong Learning in LLMs

Authors: Siqi Fan and 8 other authors

Abstract: Large language models (LLMs) can carry out human-like dialogue, but unlike humans, they are stateless due to the superposition property. However, during multi-turn, multi-agent interactions, LLMs begin to exhibit consistent, character-like behaviors, hinting at a form of emergent lifelong learning. Despite this, existing benchmarks often fail to capture these dynamics, focusing primarily on static, open-ended evaluations. To address this gap, we introduce LIFESTATE-BENCH, a benchmark designed to assess lifelong learning in LLMs. It features two episodic datasets, Hamlet and a synthetic script collection, both rich in narrative structure and character interactions. Our fact-checking evaluation probes models' self-awareness, episodic memory retrieval, and relationship tracking across both parametric and non-parametric approaches. In experiments on models including Llama3.1-8B, GPT-4-turbo, and DeepSeek R1, we demonstrate that non-parametric methods significantly outperform parametric ones in managing stateful learning. However, all models suffer from catastrophic forgetting as interactions extend, highlighting the need for further advances in lifelong learning.

Submission history

From: Siqi Fan

[v1] Sun, 30 Mar 2025 16:50:57 UTC (1,602 KB)

[v2] Mon, 13 Apr 2026 09:13:11 UTC (1,596 KB)