dataset February 13, 2026

Moonworks Lunara Aesthetic II: High‑Quality Image Variation Dataset

The **Moonworks Lunara Aesthetic II** dataset, released by the creator *moonworks*, provides 2,854 paired images designed for research on image editing, image‑to‑image generation, and identity preservation. Each sample contains an original artwork created by Moonworks and a corresponding variant generated by the sub‑10B‑parameter *Lunara* diffusion model, along with prompts, semantic change tags, and a numeric variation set score. The dataset is stored in Parquet format (≈8.4 GB) and can be loaded with the Hugging Face `datasets` library. Streaming mode is optional and lets iteration begin without first downloading the full dataset.

The collection targets the *image‑to‑image* and *text‑to‑image* task categories, emphasizing aesthetic quality, controlled contextual changes, and preservation of the original subject’s identity. It is intended for benchmarking image‑variation models, evaluating how well models maintain identity under aesthetic transformations, and studying diffusion‑based image generation. The accompanying paper (arXiv:2602.01666) details the novel diffusion mixture architecture used to produce the variants. Released under the Apache‑2.0 license, the dataset can be reused in both academic and commercial projects.

The README includes a minimal Colab example that streams the training split, extracts the `original_image` and `variant_image` fields, and visualizes them side by side. Feature metadata lists additional fields such as `base_prompt`, `variation_prompt`, `semantic_category_changes`, `prompt_topic`, and a floating‑point `variation_set` score, which give each pair the context needed for downstream experiments.
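The README example itself is not reproduced here, so the following is only a minimal sketch of the same workflow. It assumes the split is named `train`, that the two image fields decode to objects `matplotlib` can display (such as PIL images), and that `REPO_ID` is replaced with the real Hub repository id, which this report does not state. The helper names `stream_train_split`, `iter_pairs`, and `show_pairs` are hypothetical.

```python
from itertools import islice
from typing import Any, Dict, Iterable, Iterator, Tuple

# Placeholder: the report does not give the Hub repository id.
REPO_ID = "<owner>/<dataset-name>"


def stream_train_split(repo_id: str):
    """Open the (assumed) 'train' split lazily; samples are fetched during iteration."""
    # Imported here so the helpers below can be used without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(repo_id, split="train", streaming=True)


def iter_pairs(
    samples: Iterable[Dict[str, Any]], limit: int
) -> Iterator[Tuple[Any, Any]]:
    """Yield (original_image, variant_image) for at most `limit` samples."""
    for sample in islice(samples, limit):
        yield sample["original_image"], sample["variant_image"]


def show_pairs(pairs: Iterable[Tuple[Any, Any]]) -> None:
    """Plot each pair on its own row: original on the left, variant on the right."""
    import matplotlib.pyplot as plt

    pairs = list(pairs)
    if not pairs:
        return
    _, axes = plt.subplots(len(pairs), 2, squeeze=False, figsize=(8, 4 * len(pairs)))
    for row, (original, variant) in zip(axes, pairs):
        for axis, image, title in ((row[0], original, "original"), (row[1], variant, "variant")):
            axis.imshow(image)
            axis.set_title(title)
            axis.axis("off")
    plt.show()


# Example usage (needs network access and a real repository id):
# dataset = stream_train_split(REPO_ID)
# show_pairs(iter_pairs(dataset, limit=3))
```

Using `islice` stops the stream after `limit` samples, so only the first few records are fetched rather than the whole ≈8.4 GB.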

Project Ideas

  1. Fine‑tune a diffusion model on the paired images to improve identity‑preserving style transfer.
  2. Create an evaluation benchmark that measures how well generated variants retain the original subject using the provided prompts and semantic change tags.
  3. Build an interactive web demo that lets users upload an image and see Lunara‑style aesthetic variations guided by custom text prompts.
  4. Train a regression model to predict the `variation_set` score from the original image and prompt, as a step toward controlling the strength of aesthetic changes.
  5. Develop a classifier that categorizes the type of semantic changes (e.g., lighting, color, composition) based on the `semantic_category_changes` field for automated analysis of variation patterns.
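
As a starting point for idea 2, one common proxy for subject retention is the cosine similarity between embeddings of the original and the variant, averaged over all pairs. This is a sketch under stated assumptions, not a metric defined by the dataset or the paper: `embed` is a hypothetical callable standing in for whatever image encoder you choose, and `identity_retention` is an illustrative name.

```python
import math
from statistics import mean
from typing import Any, Callable, Iterable, Sequence, Tuple

Embedding = Sequence[float]


def cosine_similarity(a: Embedding, b: Embedding) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def identity_retention(
    pairs: Iterable[Tuple[Any, Any]],
    embed: Callable[[Any], Embedding],  # hypothetical encoder: image -> vector
) -> float:
    """Mean original-vs-variant embedding similarity over all pairs."""
    scores = [cosine_similarity(embed(original), embed(variant)) for original, variant in pairs]
    if not scores:
        raise ValueError("at least one pair is required")
    return mean(scores)
```

A higher mean suggests the variants stay closer to their originals in the encoder's feature space. Whether that corresponds to perceived identity depends on the encoder, so the score is best read alongside the prompts and semantic change tags.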