HRM-Text-1B: Hierarchical Reasoning Model for Efficient Text Generation
HRM-Text-1B is a 1‑billion‑parameter language model released by Sapient Intelligence. It implements the Hierarchical Reasoning Model (HRM) architecture, a dual‑timescale recurrent design in which a high‑level (slow) Transformer stack and a low‑level (fast) stack iterate over the same input embeddings for H_cycles × (L_cycles + 1) steps, injecting additive state at each cycle. This recurrence lets the effective compute depth scale with the number of cycles rather than with the parameter count, which stays modest. The model is trained from scratch on a mixture of publicly available English text corpora using a PrefixLM objective, with special prefix tokens that enable bidirectional attention over the prompt and causal generation for the response.
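The step count H_cycles × (L_cycles + 1) can be illustrated with a small schedule sketch. It assumes that each high‑level cycle runs the fast stack L_cycles times and then the slow stack once; that ordering is an interpretation of the step count, not a confirmed detail of the implementation. The function name `hrm_step_schedule` and the cycle values used are hypothetical.

```python
def hrm_step_schedule(h_cycles: int, l_cycles: int) -> list[str]:
    """Return the order of stack applications for one forward pass.

    Assumption (not confirmed by the model card): every high-level cycle
    applies the low-level (fast) stack l_cycles times, then the
    high-level (slow) stack once.
    """
    schedule = []
    for _ in range(h_cycles):
        schedule.extend(["L"] * l_cycles)  # fast-stack refinement steps
        schedule.append("H")               # one slow-stack update
    return schedule


# Illustrative cycle counts only; the released configuration may differ.
schedule = hrm_step_schedule(h_cycles=2, l_cycles=3)
print("".join(schedule))  # LLLHLLLH
print(len(schedule))      # 8, i.e. 2 * (3 + 1)
```

Under this reading, each step reuses one of the two stacks, so adding cycles adds compute without adding parameters.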
The checkpoint is a **pre‑alignment** model: it has not been instruction‑tuned, trained with RLHF, or adapted for multi‑turn dialogue. It supports raw text generation via the `text-generation` pipeline in the Transformers library (>= 5.9.0). Prompting can be controlled with condition tags such as `direct`, `cot`, `synth`, and `noisy`. For reasoning or math tasks, the composite condition `synth,cot` (synthetic style followed by chain‑of‑thought) elicits step‑by‑step responses, though quality remains uneven compared with instruction‑tuned models. Classification, extraction, and short‑form QA benefit from the `direct` condition with a few in‑context examples.
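A minimal sketch of composing condition tags follows. The four tag names and the comma‑joined composite form come from the description above; the helper names and, in particular, `HYPOTHETICAL_TEMPLATE` are assumptions made for illustration. The real way a condition is attached to a prompt should be taken from the model's README.

```python
KNOWN_CONDITIONS = ("direct", "cot", "synth", "noisy")

# Placeholder layout only: how the model expects the condition to be
# attached to the prompt is NOT specified here; check the README.
HYPOTHETICAL_TEMPLATE = "[{condition}] {text}"


def compose_condition(*tags: str) -> str:
    """Join condition tags in the given order, e.g. 'synth,cot'."""
    if not tags:
        raise ValueError("at least one condition tag is required")
    for tag in tags:
        if tag not in KNOWN_CONDITIONS:
            raise ValueError(f"unknown condition tag: {tag!r}")
    return ",".join(tags)


def build_prompt(text: str, *tags: str,
                 template: str = HYPOTHETICAL_TEMPLATE) -> str:
    """Fill the (hypothetical) template with a composed condition."""
    return template.format(condition=compose_condition(*tags), text=text)


print(compose_condition("synth", "cot"))  # synth,cot
print(build_prompt("What is 17 * 6?", "synth", "cot"))
```

Validating tags up front keeps typos such as `cott` from silently producing an untrained condition string.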
Technical specifications include a hidden size of 1536, 16 layers per stack, 12 attention heads with a head dimension of 128 (12 × 128 = 1536), SwiGLU activation, gated attention, RoPE positional encoding, and parameter‑less Pre‑RMSNorm. The model operates in bfloat16, supports sequences of up to 4096 tokens, and uses a vocabulary of roughly 65k tokens. The README provides sample code showing how to set `token_type_ids` so that the PrefixLM mask matches training, and the repository links to the open‑sourced data pipeline used for pre‑training. The model is released under the Apache‑2.0 license and is intended as a starting point for downstream fine‑tuning or alignment work.
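To make the role of `token_type_ids` concrete, the sketch below builds a generic PrefixLM attention mask in plain Python. It assumes that `token_type_ids` is 1 for prompt (prefix) tokens and 0 for response tokens; that encoding is an assumption for illustration, and the README's sample code remains the authority on the actual convention. The function name `prefix_lm_mask` is hypothetical.

```python
def prefix_lm_mask(token_type_ids: list[int]) -> list[list[bool]]:
    """Build a PrefixLM mask; mask[i][j] is True if i may attend to j.

    Assumption: 1 marks prompt (prefix) tokens, 0 marks response tokens.
    Prefix tokens attend to each other bidirectionally; every token
    also attends causally to itself and to all earlier positions.
    """
    n = len(token_type_ids)
    mask = []
    for i in range(n):
        row = []
        for j in range(n):
            both_in_prefix = token_type_ids[i] == 1 and token_type_ids[j] == 1
            row.append(both_in_prefix or j <= i)
        mask.append(row)
    return mask


# Three prompt tokens followed by two response tokens.
token_type_ids = [1, 1, 1, 0, 0]
for row in prefix_lm_mask(token_type_ids):
    print("".join("1" if allowed else "0" for allowed in row))
# 11100
# 11100
# 11100
# 11110
# 11111
```

The first three rows show full visibility within the prompt, while the last two rows are strictly causal, which is the pattern a PrefixLM objective trains on.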
Project Ideas
- Fine‑tune HRM-Text-1B on a small instruction dataset to create a lightweight, domain‑specific chatbot.
- Build a few‑shot classification service that uses the `direct` condition with 2–8 in‑context examples for rapid text labeling.
- Develop a math‑reasoning assistant that prefixes prompts with the composite `synth,cot` tags to trigger chain‑of‑thought generation.
- Create a synthetic data generator that leverages the `synth` condition to produce curated‑style text for downstream training pipelines.
- Adapt the model for code‑related tasks by performing supervised fine‑tuning on a curated code corpus, as suggested by early third‑party experiments.
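
For the few‑shot classification idea, prompt assembly might look like the sketch below. The `Text:` / `Label:` layout and the function name `build_few_shot_prompt` are hypothetical choices, not a format documented for the model; only the 2–8 example range comes from the list above. Attaching the `direct` condition is left out because its exact syntax should be taken from the README.

```python
def build_few_shot_prompt(examples: list[tuple[str, str]], query: str,
                          min_examples: int = 2,
                          max_examples: int = 8) -> str:
    """Format labeled examples plus one unlabeled query as a prompt.

    The 'Text:'/'Label:' layout is an illustrative assumption.
    """
    if not min_examples <= len(examples) <= max_examples:
        raise ValueError(
            f"expected {min_examples}-{max_examples} examples, "
            f"got {len(examples)}"
        )
    blocks = [f"Text: {text}\nLabel: {label}" for text, label in examples]
    blocks.append(f"Text: {query}\nLabel:")  # model completes the label
    return "\n\n".join(blocks)


examples = [
    ("The battery died after an hour.", "negative"),
    ("Setup took two minutes and it just works.", "positive"),
]
print(build_few_shot_prompt(examples, "Screen is sharp and bright."))
```

Ending the prompt on an open `Label:` line gives a base model a short, well‑defined continuation to produce.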