Qwen3.6‑27B Heretic‑Uncensored Finetune (NEO‑CODE‑Di‑IMatrix‑MAX) – GGUF Quantized Model Report
DavidAU/Qwen3.6-27B-Heretic-Uncensored-FINETUNE-NEO-CODE-Di-IMatrix-MAX-GGUF
The Qwen3.6‑27B Heretic‑Uncensored model is a post‑trained, 27‑billion‑parameter causal language model with a built‑in vision encoder, whose safety filtering has been removed ("heretic") and which has been further finetuned for unrestricted, creative generation. It is distributed in GGUF format with a suite of advanced NEO‑CODE‑Di‑IMatrix‑MAX quantizations (IQ4, IQ6, Q6_K, Q8_0, etc.) that deliver near‑original quality while shrinking the model to run on consumer‑grade hardware. Reported benchmarks show a Same‑Top‑P of 94–99% across quantizations, a mean KLD under 0.03 for the best quants, an RMS Δp of roughly 1–1.5%, and perplexity within 0.1% of the full‑precision BF16 baseline. The model supports a native context window of 262k tokens (extendable beyond 1M), and the vision encoder makes it suitable for multimodal tasks. Because the safety filters were deliberately removed, generation is "uncensored", with a refusal rate of only ~4% on the internal Qwen evaluation set.
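The quantization-quality metrics cited above can be illustrated in miniature. The sketch below (hypothetical helper names; real evaluations such as llama.cpp's `llama-perplexity --kl-divergence` operate on full-vocabulary logits from actual model runs) computes a mean KL divergence and a same-top-token rate between a baseline model's and a quantized model's next-token distributions:

```python
import math

def mean_kld(base_dists, quant_dists):
    """Mean KL(P_base || P_quant) over token positions; lower values mean
    the quantized model's next-token distributions track the baseline."""
    total = 0.0
    for p, q in zip(base_dists, quant_dists):
        total += sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return total / len(base_dists)

def same_top_rate(base_dists, quant_dists):
    """Fraction of positions where both models rank the same token first
    (the intuition behind a 'Same-Top' percentage)."""
    same = sum(
        max(range(len(p)), key=p.__getitem__) == max(range(len(q)), key=q.__getitem__)
        for p, q in zip(base_dists, quant_dists)
    )
    return same / len(base_dists)

# Toy next-token distributions over a 3-token vocabulary (illustrative only).
base = [[0.70, 0.20, 0.10], [0.50, 0.30, 0.20]]
quant = [[0.60, 0.30, 0.10], [0.55, 0.25, 0.20]]
print(round(mean_kld(base, quant), 4))  # → 0.0169
print(same_top_rate(base, quant))       # → 1.0
```

A mean KLD near zero and a same-top rate near 1.0 are what the report's "Mean KLD under 0.03" and "Same‑Top‑P of 94‑99 %" figures express at full scale.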
Project Ideas
- Long‑form storytelling or world‑building where unrestricted imagination and low‑latency inference are required.
- Agentic coding assistants that retain reasoning across many iterative prompts, leveraging the 262k‑token context for complex repo‑level tasks.
- Multimodal content creation (image‑to‑text, visual question answering) using the built‑in vision encoder together with the uncensored language head.
- Research experiments on safety‑filter removal effects, comparing logical drift and stability against the base Qwen3.6 model.
- Deploying on low‑cost servers or high‑RAM workstations (e.g., CPUs with 16‑32 GB RAM) thanks to the high‑quality GGUF quantizations (IQ4, Q6_K, Q8_0).
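Whether a given quant fits in memory can be estimated from its bits-per-weight. The sketch below uses approximate llama.cpp bits-per-weight figures for each quantization type; the exact values vary with the quant mix per layer, and the 27B parameter count is taken at face value, so treat these as rough assumptions rather than measured file sizes:

```python
# Rough GGUF weight-memory estimate: params * bits-per-weight / 8 bytes.
# Bits-per-weight values are approximate llama.cpp figures (assumption);
# actual files also keep some tensors (e.g. embeddings) at higher precision,
# and the KV cache for long contexts adds further memory on top.
BPW = {"IQ4_XS": 4.25, "Q6_K": 6.56, "Q8_0": 8.5}
N_PARAMS = 27e9  # assumed 27B parameters

def est_gib(quant):
    """Estimated weight storage in GiB for the given quant type."""
    return N_PARAMS * BPW[quant] / 8 / 2**30

for q in BPW:
    print(f"{q}: ~{est_gib(q):.1f} GiB")
```

Even the smallest 4-bit-class quant lands around 13 GiB of weights for a 27B model, which is why a 16 GB-class machine is a realistic floor for CPU deployment.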