dataset March 02, 2026

Real Slop: 155k Real LLM Interactions for Dialogue & Safety Research

Real Slop is a Hugging Face dataset released by the user Solenopsisbot that aggregates 155,000 real‑world language model interactions in English. The entries span a variety of model families and are stored in Parquet format, making them readily consumable with the datasets, pandas, polars, and mlcroissant libraries. With a size between 100K and 1M rows, the collection provides a substantial corpus for text‑generation research.

The dataset was compiled with explicit user consent: participants were informed that their requests would be logged, and the data underwent aggressive PII filtering to protect privacy, albeit possibly over‑filtering. A portion of the logs includes tool‑calling events, though early entries lack corresponding tool definitions. Notably, the dataset also contains a significant amount of NSFW conversation that was left unfiltered, reflecting the natural usage patterns of many LLM deployments.

Real Slop is trending because it offers authentic interaction data rather than synthetic prompts, enabling more realistic evaluation of conversational agents, safety mechanisms, and tool‑use capabilities. Researchers can leverage the diverse model responses, the mixed presence of tool calls, and the unfiltered NSFW content to benchmark robustness, develop content moderation models, and study how different LLMs handle real user queries.

Project Ideas

  1. Fine‑tune a conversational model on the Real Slop interactions to improve response realism and diversity.
  2. Create a benchmark suite that evaluates LLMs' ability to interpret and execute tool‑calling instructions using the logged tool call entries.
  3. Develop a PII detection and redaction system by training on the heavily filtered portions of the dataset as a safe baseline.
  4. Train an NSFW content classifier on the unfiltered adult‑language segments to enhance moderation pipelines.
  5. Perform a comparative analysis of user intents and model replies across the various LLMs in the dataset to study response variability.
← Back to all reports