Hybrid
NVIDIA · 2026-03

Nemotron 3 Super

Hybrid decoder architecture that combines mostly Mamba-2 state-space (SSM) layers with grouped-query attention (GQA) layers.

Nemotron 3 Super decoder block architecture: sequence mixing is mostly Mamba-2 with some GQA attention layers; normalization is RMSNorm; the FFN is SwiGLU; position encoding is RoPE. Scale: 120B total parameters (12B active), 1M-token context, 96 layers. Decoder type: Hybrid.
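To make "mostly Mamba-2 + GQA" concrete, here is a minimal sketch of how a 96-layer hybrid stack could interleave the two mixer types. The page does not state how many layers use attention or where they sit, so the `attention_every=8` spacing and the `build_layer_schedule` helper are purely hypothetical illustrations, not the model's actual layout.

```python
from collections import Counter

NUM_LAYERS = 96  # from the spec on this page


def build_layer_schedule(num_layers: int, attention_every: int) -> list[str]:
    """Return one mixer type per layer.

    Every `attention_every`-th layer is GQA attention; all others are Mamba-2.
    The spacing is an assumption for illustration only.
    """
    if attention_every < 2:
        raise ValueError("attention_every must be >= 2")
    return [
        "gqa" if (i + 1) % attention_every == 0 else "mamba2"
        for i in range(num_layers)
    ]


# Hypothetical spacing: one attention layer after every seven Mamba-2 layers.
schedule = build_layer_schedule(NUM_LAYERS, attention_every=8)
counts = Counter(schedule)
print(counts["mamba2"], "Mamba-2 layers,", counts["gqa"], "GQA layers")
```

Under this assumed spacing, most layers carry a fixed-size recurrent state rather than a key-value cache that grows with sequence length, which is the usual motivation for hybrid designs aimed at very long contexts.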


Architecture Specifications

Parameters: 12B active / 120B total
Context Window: 1M
Decoder Type: Hybrid
Attention: Mostly Mamba-2 + GQA
Active Parameters: 12B
Release Date: 2026-03
Category: Hybrid Architecture
Organization: NVIDIA
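The specifications above can be captured in a small record to work out the share of parameters that is active per token. This is only a sketch: the `ModelSpec` class and its field names are hypothetical, and reading "1M" as 10**6 tokens (rather than 2**20) is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    """Hypothetical container for the figures listed on this page."""
    name: str
    total_params: int
    active_params: int
    context_tokens: int
    num_layers: int

    @property
    def active_fraction(self) -> float:
        # Share of the total parameters that is active per token.
        return self.active_params / self.total_params


spec = ModelSpec(
    name="Nemotron 3 Super",
    total_params=120_000_000_000,
    active_params=12_000_000_000,
    context_tokens=1_000_000,  # assumption: "1M" taken as 10**6
    num_layers=96,
)
print(f"{spec.name}: {spec.active_fraction:.0%} of parameters active")
```

With the listed figures, 12B of 120B parameters are active, which is one tenth of the total.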

Key Features

Mamba-2 SSM
1M context
12B active hybrid