Coding Agent Conversations: 549 Sessions of AI Tool Use
peteromallet/dataclaw-peteromallet
The *Coding Agent Conversations* dataset (ID: `peteromallet/dataclaw-peteromallet`) is a collection of 549 logged sessions in which large language models act as coding assistants. Each session records metadata such as the project name, model version (e.g., Claude Opus 4‑6, Claude Sonnet 4‑6), and timestamps, along with a detailed message stream that includes user prompts, assistant replies, internal "thinking" notes, and explicit tool calls (e.g., file reads). The dataset totals 15.1 B input tokens and 4.6 M output tokens, providing a rich substrate for studying how LLMs perform tool use and agentic coding.
Exported with the open‑source DataClaw tool, the dataset is offered under an MIT license and is formatted as a JSON Lines file (`conversations.jsonl`). Its schema captures per‑message roles, content, timestamps, and a structured `tool_uses` field, while anonymizing file paths and hashing usernames for privacy. The README emphasizes the project's artistic intent: to highlight the restrictive data policies of model providers by making freely shared interaction logs publicly available.
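A record in `conversations.jsonl` can be inspected with nothing but the standard library. The sketch below constructs one hypothetical record matching the schema described above (per-message roles, content, timestamps, and a structured `tool_uses` field); any field names beyond those stated in the card, and the exact nesting, are assumptions for illustration.

```python
import json

# Hypothetical JSONL record illustrating the documented schema: per-message
# role, content, timestamp, and a structured tool_uses field. The exact
# nesting and extra keys ("project", "tool", "input") are assumptions.
line = json.dumps({
    "project": "example-project",
    "model": "claude-opus-4-6",
    "messages": [
        {"role": "user", "content": "Read the config file",
         "timestamp": "2025-01-01T00:00:00Z", "tool_uses": []},
        {"role": "assistant", "content": "Reading the file now.",
         "timestamp": "2025-01-01T00:00:05Z",
         "tool_uses": [{"tool": "read_file", "input": {"path": "<anonymized>"}}]},
    ],
})

# Each line of the JSONL file would be parsed the same way.
session = json.loads(line)
tool_calls = sum(len(m["tool_uses"]) for m in session["messages"])
print(session["model"], tool_calls)  # -> claude-opus-4-6 1
```

Iterating this over the real file (one `json.loads` per line) yields per-session tool-call counts without loading everything into memory.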
Researchers and developers can load the data directly via the `datasets` library (`load_dataset("peteromallet/dataclaw-peteromallet", split="train")`). The collection spans multiple Claude model variants and includes detailed session statistics (message counts, token usage, tool‑use frequency). This makes it valuable for tasks such as evaluating coding‑assistant performance, analyzing tool‑use patterns, or fine‑tuning new agents on realistic, multi‑turn coding dialogues.
Project Ideas
- Create a dashboard that visualizes tool‑use frequency and patterns across different Claude model versions.
- Fine‑tune a smaller open‑source coding assistant on the conversation logs to improve its tool‑calling accuracy.
- Train a classifier that predicts when an assistant will issue a tool call based on the preceding user message and assistant "thinking" notes.
- Develop an evaluation suite that measures coding assistant success rates by comparing assistant actions to the logged tool inputs and outcomes.
- Build a narrative generator that turns raw conversation sessions into readable case studies for documenting AI‑augmented development workflows.
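As a starting point for the dashboard idea above, tool-use frequency per model version can be aggregated with a simple counter. This is a minimal sketch, assuming each session dict carries a `"model"` string and messages with `tool_uses` lists (the field names follow the dataset card; the overall session shape and the `"tool"` key are assumptions).

```python
from collections import Counter

def tool_use_by_model(sessions):
    """Count (model, tool) pairs across all messages of all sessions."""
    counts = Counter()
    for session in sessions:
        for message in session["messages"]:
            for call in message.get("tool_uses", []):
                counts[(session["model"], call["tool"])] += 1
    return counts

# Two hypothetical sessions standing in for parsed conversations.jsonl rows.
sessions = [
    {"model": "claude-opus-4-6",
     "messages": [{"tool_uses": [{"tool": "read_file"}, {"tool": "read_file"}]}]},
    {"model": "claude-sonnet-4-6",
     "messages": [{"tool_uses": [{"tool": "bash"}]}]},
]
print(tool_use_by_model(sessions))
```

The resulting `Counter` maps directly onto a grouped bar chart of tool-call frequency per model variant.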