May 09, 2026

Qwen3.6‑27B Heretic‑Uncensored Finetune (NEO‑CODE‑Di‑IMatrix‑MAX) – GGUF Quantized Model Report

The Qwen3.6‑27B Heretic‑Uncensored model is a post‑trained, 27‑billion‑parameter causal language model with a built‑in vision encoder. Its safety filtering has been stripped out (“heretic”) and the model further finetuned for unrestricted, creative generation. It is distributed in the GGUF format and quantized with a suite of advanced NEO‑CODE‑Di‑IMatrix‑MAX quantizations (IQ‑4, IQ‑6, Q6_K, Q8_0, etc.), delivering near‑original quality while shrinking the model enough to run on consumer‑grade hardware. Across quantizations, benchmarks show a Same‑Top‑P of 94–99%, a Mean KLD under 0.03 for the best quants, an RMS Δp of roughly 1–1.5%, and perplexity within 0.1% of the full‑precision BF16 baseline. The model supports a native context window of 262k tokens (extendable beyond 1M), and the vision encoder makes it suitable for multimodal tasks. Safety filters have been deliberately removed, yielding an uncensored generation style with a refusal rate of only about 4% on the internal Qwen evaluation set.
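To make the quantization-quality metrics above concrete, here is a minimal sketch of how Mean KLD and RMS Δp can be computed from per-position next-token probability distributions of the full-precision and quantized models. The function name and the plain-list input format are illustrative assumptions, not part of the NEO‑CODE‑Di‑IMatrix‑MAX tooling:

```python
import math

def mean_kld_and_rms_dp(ref_probs, quant_probs):
    """Sketch of two quantization-quality metrics (names assumed):
    - Mean KLD: average KL divergence D(ref || quant) over positions
    - RMS Δp: RMS change in probability of the reference top token
    Each argument is a list of per-position probability distributions."""
    klds, dps = [], []
    for p, q in zip(ref_probs, quant_probs):
        # KL divergence of the quantized distribution from the reference
        klds.append(sum(pi * math.log(pi / qi)
                        for pi, qi in zip(p, q) if pi > 0))
        # Δp: how much probability the reference top token lost (or gained)
        top = p.index(max(p))
        dps.append(p[top] - q[top])
    mean_kld = sum(klds) / len(klds)
    rms_dp = math.sqrt(sum(d * d for d in dps) / len(dps))
    return mean_kld, rms_dp
```

Identical distributions give (0, 0); a Mean KLD under 0.03 and an RMS Δp around 1–1.5%, as reported here, mean the quantized model's token distributions track the BF16 baseline very closely.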

Project Ideas

  1. Long‑form storytelling or world‑building where unrestricted imagination and low‑latency inference are required.
  2. Agentic coding assistants that retain reasoning across many iterative prompts, leveraging the 262k‑token context for complex repo‑level tasks.
  3. Multimodal content creation (image‑to‑text, visual question answering) using the built‑in vision encoder together with the uncensored language head.
  4. Research experiments on safety‑filter removal effects, comparing logical drift and stability against the base Qwen3.6 model.
  5. Deploying on edge devices or low‑cost servers (e.g., CPUs with 8‑16 GB RAM) thanks to the high‑quality GGUF quantizations (IQ‑4, Q6_K, Q8_0).
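For idea 5, a back-of-the-envelope file-size estimate shows why the low-bit quants fit on modest hardware. The bits-per-weight figures below are rough assumptions for IQ4-class and Q8_0 quants (actual GGUF sizes vary with the per-tensor quant mix and metadata):

```python
def gguf_size_gib(n_params_billion: float, bits_per_weight: float,
                  overhead_gib: float = 0.5) -> float:
    """Rough GGUF file-size estimate: parameters x bits-per-weight,
    plus an assumed small overhead for tokenizer/metadata."""
    bytes_total = n_params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 2**30 + overhead_gib

# 27B at ~4.25 bpw (IQ4-class, assumed) -> roughly 14 GiB,
# versus ~8.5 bpw (Q8_0, assumed) -> roughly 27 GiB.
```

Under these assumptions, an IQ‑4 quant of a 27B model lands near the top of a 16 GB machine's RAM budget (before KV-cache), while Q8_0 needs a larger host; the KV cache for long contexts adds further memory on top of the weights.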