RecurrentGPT: Interactive Generation of (Arbitrarily) Long Text

Mrinmaya Sachan; Peng Cui; Ryan Cotterell; Tiannan Wang; Wangchunshu Zhou; Yifan Hou; Yuchen Eleanor Jiang; Zhenxin Xiao

arxiv: 2305.13304 · v1 · pith:NE6SQSZAnew · submitted 2023-05-22 · 💻 cs.CL · cs.LG

Wangchunshu Zhou, Yuchen Eleanor Jiang, Peng Cui, Tiannan Wang, Zhenxin Xiao, Yifan Hou, Ryan Cotterell, Mrinmaya Sachan

keywords recurrentgpt · interactive · long · text · language · mechanism · aigc · arbitrarily

The fixed-size context of the Transformer makes GPT models incapable of generating arbitrarily long text. In this paper, we introduce RecurrentGPT, a language-based simulacrum of the recurrence mechanism in RNNs. RecurrentGPT is built upon a large language model (LLM) such as ChatGPT and uses natural language to simulate the Long Short-Term Memory mechanism in an LSTM. At each timestep, RecurrentGPT generates a paragraph of text and updates its language-based long-term and short-term memories, which are stored on the hard drive and in the prompt, respectively. This recurrence mechanism enables RecurrentGPT to generate texts of arbitrary length without forgetting. Since human users can easily observe and edit the natural language memories, RecurrentGPT is interpretable and enables interactive generation of long text. RecurrentGPT is an initial step towards next-generation computer-assisted writing systems beyond local editing suggestions. In addition to producing AI-generated content (AIGC), we also demonstrate the possibility of using RecurrentGPT as interactive fiction that directly interacts with consumers. We call this usage of generative models "AI As Contents" (AIAC), which we believe is the next form of conventional AIGC. We further demonstrate the possibility of using RecurrentGPT to create personalized interactive fiction that interacts directly with readers rather than with writers. More broadly, RecurrentGPT demonstrates the utility of borrowing ideas from popular model designs in cognitive science and deep learning for prompting LLMs. Our code is available at https://github.com/aiwaves-cn/RecurrentGPT and an online demo is available at https://www.aiwaves.org/recurrentgpt.
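
The recurrence described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `RecurrentState`, `retrieve`, `step`, `generate`, and `make_fake_llm` are hypothetical, the word-overlap retrieval is an assumption (the abstract does not say how the long-term memory is queried), the long-term memory is kept in a list rather than written to disk, and the LLM is replaced by a deterministic stub so the example runs offline.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical LLM signature: prompt -> (paragraph, new short-term summary, next plan).
LLM = Callable[[str], Tuple[str, str, str]]


@dataclass
class RecurrentState:
    """Illustrative container for the two natural-language memories."""
    short_term: str = ""                                # carried inside the prompt
    long_term: List[str] = field(default_factory=list)  # stands in for on-disk storage


def retrieve(long_term: List[str], query: str, k: int = 2) -> List[str]:
    """Toy retrieval (an assumption): rank stored paragraphs by word overlap."""
    query_words = set(query.lower().split())
    ranked = sorted(
        long_term,
        key=lambda p: len(query_words & set(p.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def step(state: RecurrentState, plan: str, llm: LLM) -> Tuple[str, str]:
    """One timestep: build a prompt from both memories, generate, update memories."""
    context = retrieve(state.long_term, plan)
    prompt = (
        f"Short-term memory: {state.short_term}\n"
        f"Relevant earlier paragraphs: {context}\n"
        f"Plan for this step: {plan}\n"
    )
    paragraph, new_summary, next_plan = llm(prompt)
    state.long_term.append(paragraph)  # long-term memory grows every step
    state.short_term = new_summary     # short-term memory is overwritten
    return paragraph, next_plan


def generate(num_steps: int, first_plan: str, llm: LLM) -> Tuple[List[str], RecurrentState]:
    """Run the recurrence for a fixed number of timesteps."""
    state = RecurrentState()
    plan = first_plan
    paragraphs: List[str] = []
    for _ in range(num_steps):
        paragraph, plan = step(state, plan, llm)
        paragraphs.append(paragraph)
    return paragraphs, state


def make_fake_llm() -> LLM:
    """Deterministic stand-in for a real LLM call, for illustration only."""
    counter = {"n": 0}

    def fake_llm(prompt: str) -> Tuple[str, str, str]:
        counter["n"] += 1
        n = counter["n"]
        return f"Paragraph {n}.", f"Summary after step {n}.", f"Plan for step {n + 1}."

    return fake_llm


if __name__ == "__main__":
    paragraphs, state = generate(3, "Open the story.", make_fake_llm())
    print(paragraphs)
    print(state.short_term)
```

Because both memories are plain strings, a user could inspect or edit `state.short_term` or the returned plan between calls to `step`, which is the kind of interaction the abstract refers to. The prompt size stays bounded in this sketch because only the short-term summary and at most `k` retrieved paragraphs enter it, however many paragraphs have been generated.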

Forward citations

Cited by 11 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding
cs.CL 2023-08 unverdicted novelty 8.0

LongBench is the first bilingual multi-task benchmark for long context understanding in LLMs, containing 21 datasets in 6 categories with average lengths of 6711 words (English) and 13386 characters (Chinese).

Multi-Head Recurrent Memory Agents
cs.LG 2026-07 unverdicted novelty 7.0

The paper proposes Multi-Head Recurrent Memory (MHM) with a select-then-update strategy to improve memory retention in long-context recurrent agents.

The Moltbook Files: A Harmless Slopocalypse or Humanity's Last Experiment
cs.CL 2026-05 unverdicted novelty 7.0

An AI-agent social platform generated mostly neutral content whose use in fine-tuning reduced model truthfulness comparably to human Reddit data, suggesting limited unique harm but flagging tail risks like secret leaks.

MetaPS: Adaptive Programmatic Strategy Selection for Market Agents
cs.AI 2026-06 unverdicted novelty 6.0

MetaPS trains models via simulation rollouts to select from programmatic strategy libraries for market agents, yielding better performance than fixed or direct LLM baselines across model sizes.

Externalizing Research Synthesis and Validation in AI Scientists through a Research Harness
cs.AI 2026-06 unverdicted novelty 6.0

Xcientist externalizes research synthesis and validation in AI scientists via contract-governed artifacts to maintain traceable trajectories and avoid claim drift across three domains.

Unveiling Privacy Risks in Multi-modal Large Language Models: Task-specific Vulnerabilities and Mitigation Challenges
cs.CR 2026-06 unverdicted novelty 6.0

Introduces MM-Privacy dataset and evaluations showing MLLMs leak sensitive data from images in various tasks, highlighting task inconsistency effects.

Scaling Self-Evolving Agents via Parametric Memory
cs.AI 2026-06 unverdicted novelty 6.0

TMEM lets LLM agents evolve their policy mid-episode by absorbing distilled supervision into online LoRA updates, outperforming summary and retrieval baselines on several long-context benchmarks.

DeepSurvey: Enhancing Analytical Depth and Citation Reliability in Automated Survey Generation
cs.AI 2026-05 unverdicted novelty 6.0

DeepSurvey introduces an agentic system for automated survey generation that improves depth through full-text keynotes, cross-paper clustering, and code analysis, while boosting citation reliability via graph expansio...

PAAC: Privacy-Aware Agentic Device-Cloud Collaboration
cs.LG 2026-05 unverdicted novelty 6.0

PAAC aligns planner-executor decomposition with the device-cloud boundary via typed placeholders and on-device sanitization, delivering 15-36% higher accuracy and 2-6x lower leakage than prior device-cloud baselines o...

CSR: Infinite-Horizon Real-Time Policies with Massive Cached State Representations
cs.RO 2026-05 unverdicted novelty 6.0

CSR with ASR enables infinite-horizon real-time LLM policies via stable KV-cache properties and background eviction, delivering 26x lower latency and SOTA recall on embodied benchmarks.

Episodic-Semantic Memory Architecture for Long-Horizon Scientific Agents
cs.AI 2026-05 unverdicted novelty 5.0

A dual-process memory architecture for scientific AI agents maintains 70-85% accuracy over 15,000 messages by using a constant 10-message episodic window and domain-specific semantic consolidation, consuming 62% fewer...