Nanbeige · 2026-02
Nanbeige 4.1
Nanbeige 4.1 is a dense decoder architecture that uses grouped-query attention (GQA).
Each Nanbeige 4.1 decoder block combines GQA attention, RMSNorm normalization, a SwiGLU feed-forward network (FFN), and RoPE position encoding. The model has 3B parameters, a 262K context window, and 24 layers; the decoder is dense.
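As a rough illustration of two of the components named above, the following pure-Python sketch shows the standard textbook forms of RMSNorm and a SwiGLU feed-forward layer. It is not taken from the Nanbeige 4.1 code: the function names, the `eps` value, and the toy weight shapes are all assumptions chosen for readability.

```python
import math

def rms_norm(x, weight, eps=1e-6):
    """RMSNorm: divide x by its root mean square, then apply a learned gain."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)  # eps value is an assumption
    return [v * inv_rms * w for v, w in zip(x, weight)]

def silu(v):
    """SiLU (swish) activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))

def matvec(matrix, x):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, x)) for row in matrix]

def swiglu_ffn(x, w_gate, w_up, w_down):
    """SwiGLU FFN: down(silu(gate(x)) * up(x)), without bias terms."""
    gate = [silu(g) for g in matvec(w_gate, x)]
    up = matvec(w_up, x)
    hidden = [g * u for g, u in zip(gate, up)]
    return matvec(w_down, hidden)

# Toy usage with hypothetical sizes: model dim 2, FFN dim 3.
x = rms_norm([3.0, 4.0], [1.0, 1.0])
w_gate = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
w_up = [[0.6, 0.5], [0.4, 0.3], [0.2, 0.1]]
w_down = [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
y = swiglu_ffn(x, w_gate, w_up, w_down)
```

A real implementation would use batched tensor operations; plain lists are used here only to keep the sketch dependency-free.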
Architecture Specifications
Parameters: 3B
Context Window: 262K
Decoder Type: Dense
Attention: GQA
Release Date: 2026-02
Category: Efficient & Small
Organization: Nanbeige
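To show why GQA matters at a 262K context window, the sketch below estimates key/value cache size. Only the 24-layer count and the 262K context come from this page. The head counts, head dimension, 2-byte precision, and the reading of 262K as 262,144 tokens are hypothetical placeholders, not published Nanbeige 4.1 values.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Per-sequence KV cache size; the leading 2 covers keys plus values."""
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_value

def kv_head_for(query_head, num_query_heads, num_kv_heads):
    """In GQA, consecutive query heads share one KV head."""
    group_size = num_query_heads // num_kv_heads
    return query_head // group_size

NUM_LAYERS = 24        # from the model description on this page
SEQ_LEN = 262_144      # assumption: "262K" read as 256 * 1024 tokens
NUM_QUERY_HEADS = 32   # hypothetical
NUM_KV_HEADS = 4       # hypothetical
HEAD_DIM = 128         # hypothetical

gqa = kv_cache_bytes(NUM_LAYERS, NUM_KV_HEADS, HEAD_DIM, SEQ_LEN)
# Multi-head attention is the special case where every query head has its own KV head.
mha = kv_cache_bytes(NUM_LAYERS, NUM_QUERY_HEADS, HEAD_DIM, SEQ_LEN)
print(f"GQA: {gqa / 2**30:.1f} GiB, MHA: {mha / 2**30:.1f} GiB")
```

Under these placeholder values the cache shrinks by the ratio of query heads to KV heads (8x here); the actual saving depends on the model's real head configuration.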
Key Features
Ultra-compact · 262K context · Chinese-focused