Hybrid
NVIDIA · 2026-03
Nemotron 3 Super
Hybrid decoder architecture that combines mostly Mamba-2 layers with grouped-query attention (GQA).
Nemotron 3 Super decoder block architecture. Sequence mixing: mostly Mamba-2 layers plus GQA attention. Normalization: RMSNorm. FFN: SwiGLU. Position encoding: RoPE. Scale: 120B total parameters, 1M-token context, 96 layers. Decoder type: Hybrid.
Architecture Specifications
Parameters: 12B active / 120B total
Context Window: 1M
Decoder Type: Hybrid
Attention: Mostly Mamba-2 + GQA
Active Parameters: 12B
Release Date: 2026-03
Category: Hybrid Architecture
Organization: NVIDIA
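The parameter figures above imply some simple arithmetic that is easy to get wrong by hand. The following is a minimal sketch, not an official tool: the `ModelSpec` class and its method names are hypothetical, "1M" is assumed to mean 1,000,000 tokens, and the memory estimate assumes 2 bytes per parameter (for example 16-bit weights), which the card does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    """Hypothetical container for the figures listed on this card."""
    name: str
    total_params_b: float    # total parameters, in billions
    active_params_b: float   # parameters active per token, in billions
    context_tokens: int      # assumption: "1M" read as 1,000,000 tokens
    layers: int

    def active_fraction(self) -> float:
        # Share of the total parameters used for each token.
        return self.active_params_b / self.total_params_b

    def weight_memory_gb(self, bytes_per_param: int = 2) -> float:
        # Billions of parameters times bytes per parameter gives
        # gigabytes (10^9 bytes). The 2-byte default is an assumption.
        return self.total_params_b * bytes_per_param


spec = ModelSpec(
    name="Nemotron 3 Super",
    total_params_b=120.0,
    active_params_b=12.0,
    context_tokens=1_000_000,
    layers=96,
)

print(f"Active fraction: {spec.active_fraction():.0%}")
print(f"Weights at 2 bytes/param: {spec.weight_memory_gb():.0f} GB")
```

Under these assumptions, about one tenth of the parameters are active per token, while the full 120B still have to be stored. The estimate covers weights only and ignores activations, caches, and runtime overhead.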
Key Features
Mamba-2 SSM
1M context
12B active hybrid