dataset May 22, 2026

AgentTrove Dataset Overview

AgentTrove (open-thoughts/AgentTrove) is the largest publicly available collection of agentic interaction traces, comprising 1,696,847 rows from 219 source datasets covering code repair, shell scripting, math problem solving, competitive programming, and general computer‑use tasks. The data follows the Terminus‑2/ShareGPT format, with each row containing a full agent trajectory (`messages`) and metadata such as `original_source`, `original_teacher`, `reward`, and `task_id`. The dataset aggregates traces generated by a variety of teacher models (e.g., GLM‑4.7, GPT‑5.1 Nano, Qwen3, Kimi K2.0) and spans multiple domains (codeforces, nl2bash, SWEGym, StackExchange, etc.). At ~1.7 M rows it is four times larger than the Nemotron Terminal Corpus, making it a valuable resource for training and evaluating next‑generation autonomous agents, reinforcement‑learning‑from‑human‑feedback (RLHF) pipelines, and tool‑use benchmarking. The data is stored in Parquet and can be accessed via Hugging Face datasets, Dask, Polars, or ML‑Croissant APIs.

Project Ideas

  1. Fine‑tune a large language model on AgentTrove using RLHF to improve tool‑use accuracy for code generation and shell scripting tasks.
  2. Create a benchmark suite that measures success rates across different teacher models and domains by evaluating the `reward` column on held‑out traces.
  3. Develop a multi‑agent curriculum learning framework that mixes high‑reward and low‑reward trajectories to teach agents when to ask for clarification versus proceeding autonomously.
  4. Leverage the diverse `original_source` labels to train domain‑adaptive adapters that switch reasoning strategies depending on the task type (e.g., math vs. programming).
  5. Use the dataset to train a meta‑learning model that predicts the most effective teacher model for a given new task based on early conversation cues.
← Back to all reports