model June 08, 2026

Ideogram 4 (nf4): Open‑Weight 9.3B Text‑to‑Image Diffusion Model with JSON Prompting

Ideogram 4 (nf4) is Ideogram AI's first open‑weight text‑to‑image model, released in June 2026. Built from scratch with 9.3B parameters, it uses a fully single‑stream Diffusion Transformer (DiT) architecture, with the Qwen3‑VL‑8B‑Instruct vision‑language model as its text encoder. It is integrated with the Hugging Face diffusers library, where inference runs through `Ideogram4Pipeline`.

The model stands out for its structured JSON prompting interface, which gives users fine‑grained control over composition, colour palettes, bounding‑box layout, and multilingual text rendering. It generates images at native 2K resolution (up to 2048 × 2048) and supports a wide range of aspect ratios. Ideogram 4 ranks as the top open‑weight model on design‑focused leaderboards such as Design Arena, ContraLabs, and LMArena, with particular strength in text rendering, layout control, and spatial reasoning.
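
To make the idea of a structured prompt concrete, the sketch below assembles a JSON caption with a colour palette and bounding boxes, and picks pixel dimensions for an aspect ratio within the 2048‑pixel limit. Every field name (`description`, `palette`, `elements`, `box`), the normalised 0 to 1 box coordinates, and the rounding to multiples of 16 are assumptions made for illustration; the actual schema is defined in the model's README and may differ.

```python
import json
import re

MAX_SIDE = 2048  # largest side mentioned in the report


def fit_resolution(aspect_w, aspect_h, max_side=MAX_SIDE, multiple=16):
    """Return (width, height) with the longer side set to max_side.

    Rounding down to a multiple of 16 is an assumption, not a documented rule.
    """
    if aspect_w <= 0 or aspect_h <= 0:
        raise ValueError("aspect ratio terms must be positive")
    if aspect_w >= aspect_h:
        width, height = max_side, max_side * aspect_h / aspect_w
    else:
        width, height = max_side * aspect_w / aspect_h, max_side

    def snap(value):
        return max(multiple, int(value) // multiple * multiple)

    return snap(width), snap(height)


def build_caption(description, palette, elements):
    """Serialise a caption using a HYPOTHETICAL schema (not the official one)."""
    for colour in palette:
        if not re.fullmatch(r"#[0-9A-Fa-f]{6}", colour):
            raise ValueError(f"not a 6-digit hex colour: {colour!r}")
    for element in elements:
        x0, y0, x1, y1 = element["box"]  # assumed normalised to the 0..1 range
        if not (0 <= x0 < x1 <= 1 and 0 <= y0 < y1 <= 1):
            raise ValueError(f"invalid bounding box: {element['box']!r}")
    caption = {
        "description": description,
        "palette": list(palette),
        "elements": list(elements),
    }
    # ensure_ascii=False keeps multilingual text readable in the output
    return json.dumps(caption, ensure_ascii=False, indent=2)


if __name__ == "__main__":
    width, height = fit_resolution(16, 9)
    print(width, height)
    print(build_caption(
        "Minimal poster for a summer sale",
        ["#FF6600", "#1A1A1A"],
        [{"type": "text", "text": "SALE", "box": [0.1, 0.1, 0.9, 0.3]}],
    ))
```

Validating colours and boxes before serialising catches malformed prompts early, before any time is spent on generation.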

Ideogram 4 is released under a non‑commercial license (Ideogram 4 Non‑Commercial), and the weights are gated on Hugging Face: users must accept the license and supply an access token before downloading. The README includes a quick‑start guide with a CLI that can call Ideogram's free "magic‑prompt" API to expand plain‑text prompts into the required JSON format, and it also describes local prompt‑upsampling via the `Qwen3‑VL‑8B` head. The model is positioned as a foundation for design‑centric image generation, and the research community is invited to build on its capabilities.

Project Ideas

  1. Create a web‑based design assistant that takes a plain‑text prompt, uses Ideogram's magic‑prompt API to expand it into a structured JSON caption, and returns a 2K‑resolution image for social‑media graphics.
  2. Develop a Python tool that generates custom logos by specifying hex colour palettes and bounding‑box positions in a JSON prompt, leveraging Ideogram 4's precise colour and layout control.
  3. Build a plugin for presentation software (e.g., PowerPoint) that inserts multilingual, text‑rich illustrations using Ideogram 4's JSON prompting via the diffusers `Ideogram4Pipeline`.
  4. Implement a batch generator for illustrated book pages, using JSON captions to define character placement, speech‑bubble boxes, and background elements for consistent page layout.
  5. Set up a research benchmark comparing Ideogram 4's in‑image text rendering on multilingual captions against other open‑weight models, using the 7Bench layout test and X‑Omni OCR evaluation datasets.
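
For the text‑rendering benchmark in idea 5, one possible scoring step is a character error rate (CER) between the text requested in the prompt and the text an OCR system reads back from the generated image. The sketch below is an assumption about how such a comparison could be scored, not the metric used by any of the datasets named above; the OCR step itself is out of scope and the example strings are made up.

```python
import unicodedata


def edit_distance(a, b):
    """Levenshtein distance between two strings, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, 1):
        curr = [i]
        for j, char_b in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,                        # deletion
                curr[j - 1] + 1,                    # insertion
                prev[j - 1] + (char_a != char_b),   # substitution or match
            ))
        prev = curr
    return prev[-1]


def character_error_rate(target, recognised):
    """Edit distance divided by target length, after NFC normalisation.

    Normalising first avoids penalising visually identical characters
    that are encoded differently, which matters for multilingual text.
    """
    target = unicodedata.normalize("NFC", target)
    recognised = unicodedata.normalize("NFC", recognised)
    if not target:
        raise ValueError("target text must not be empty")
    return edit_distance(target, recognised) / len(target)


if __name__ == "__main__":
    # Hypothetical OCR readings of two generated images
    print(character_error_rate("SALE", "SALE"))
    print(character_error_rate("SALE", "SAFE"))
```

Averaging this score per language across models would give a simple, comparable number, though layout accuracy would still need a separate measure.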