Tiny Aya
Cohere · 2026-02
A dense decoder architecture whose attention combines grouped-query attention (GQA), sliding-window attention (SWA), and no positional embeddings (NoPE).
Tiny Aya decoder block:
Attention: GQA with sliding-window attention (SWA) and no positional embeddings (NoPE)
Normalization: RMSNorm
FFN: SwiGLU
Scale: 3.35B parameters, 8,192-token context window, 24 layers
Decoder type: Dense
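To make the block concrete, here is a minimal PyTorch sketch. Only the mechanisms named above (GQA, SWA, NoPE, RMSNorm, SwiGLU) follow the spec; the hidden size (2048), head counts (16 query / 4 KV), the 4,096-token window, and the pre-norm residual placement are illustrative assumptions, not published dimensions.

```python
# Minimal sketch of a Tiny Aya-style decoder block (PyTorch).
# Dimensions below are assumptions for illustration; only the mechanisms
# (GQA + SWA + NoPE, RMSNorm, SwiGLU) come from the spec above.
import torch
import torch.nn as nn
import torch.nn.functional as F


class RMSNorm(nn.Module):
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(dim))
        self.eps = eps

    def forward(self, x):
        # Root-mean-square normalization: no mean subtraction, no bias.
        return x * x.pow(2).mean(-1, keepdim=True).add(self.eps).rsqrt() * self.weight


class GQASlidingWindowAttention(nn.Module):
    """GQA with a sliding-window causal mask and NoPE: no rotary or absolute
    position encoding is applied; order information comes only from the mask."""

    def __init__(self, dim=2048, n_heads=16, n_kv_heads=4, window=4096):
        super().__init__()
        self.n_heads, self.n_kv_heads = n_heads, n_kv_heads
        self.head_dim = dim // n_heads
        self.window = window
        self.q_proj = nn.Linear(dim, n_heads * self.head_dim, bias=False)
        self.k_proj = nn.Linear(dim, n_kv_heads * self.head_dim, bias=False)
        self.v_proj = nn.Linear(dim, n_kv_heads * self.head_dim, bias=False)
        self.o_proj = nn.Linear(n_heads * self.head_dim, dim, bias=False)

    def forward(self, x):
        B, T, _ = x.shape
        q = self.q_proj(x).view(B, T, self.n_heads, self.head_dim).transpose(1, 2)
        k = self.k_proj(x).view(B, T, self.n_kv_heads, self.head_dim).transpose(1, 2)
        v = self.v_proj(x).view(B, T, self.n_kv_heads, self.head_dim).transpose(1, 2)
        # GQA: each group of query heads shares one KV head.
        k = k.repeat_interleave(self.n_heads // self.n_kv_heads, dim=1)
        v = v.repeat_interleave(self.n_heads // self.n_kv_heads, dim=1)
        # SWA: causal mask restricted to the most recent `window` positions.
        i = torch.arange(T, device=x.device)
        mask = (i[:, None] >= i[None, :]) & (i[:, None] - i[None, :] < self.window)
        out = F.scaled_dot_product_attention(q, k, v, attn_mask=mask)
        return self.o_proj(out.transpose(1, 2).reshape(B, T, -1))


class SwiGLU(nn.Module):
    def __init__(self, dim=2048, hidden=5632):
        super().__init__()
        self.gate = nn.Linear(dim, hidden, bias=False)
        self.up = nn.Linear(dim, hidden, bias=False)
        self.down = nn.Linear(hidden, dim, bias=False)

    def forward(self, x):
        # SwiGLU: SiLU-gated linear unit.
        return self.down(F.silu(self.gate(x)) * self.up(x))


class DecoderBlock(nn.Module):
    def __init__(self, dim=2048):
        super().__init__()
        self.attn_norm = RMSNorm(dim)
        self.attn = GQASlidingWindowAttention(dim)
        self.ffn_norm = RMSNorm(dim)
        self.ffn = SwiGLU(dim)

    def forward(self, x):
        x = x + self.attn(self.attn_norm(x))   # pre-norm residual attention
        return x + self.ffn(self.ffn_norm(x))  # pre-norm residual FFN
```

Given embeddings x = torch.randn(2, 128, 2048), DecoderBlock()(x) returns a tensor of the same shape; a stack of 24 such blocks plus embedding and output layers would form the dense decoder described here.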
Architecture Specifications
Parameters: 3.35B
Context Window: 8,192 tokens
Decoder Type: Dense
Attention: GQA + SWA + NoPE
Release Date: 2026-02
Category: Efficient & Small
Organization: Cohere
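The parameter total can be sanity-checked with back-of-envelope arithmetic. Every dimension below (hidden size, head counts, FFN width, vocabulary size) is an assumption chosen for illustration; only the 24 layers and the ~3.35B target come from the table. These guesses land near, not exactly at, 3.35B; a larger vocabulary or wider FFN would close the gap.

```python
# Back-of-envelope parameter count for a hypothetical 24-layer config.
# All dimensions are illustrative assumptions; only the layer count (24)
# and the ~3.35B target come from the spec table above.
d_model, n_layers = 3072, 24
n_heads, n_kv_heads, head_dim = 24, 8, 128
d_ffn, vocab = 8192, 256_000  # SwiGLU hidden size, tokenizer vocab (assumed)

attn = d_model * n_heads * head_dim            # Q projection
attn += 2 * d_model * n_kv_heads * head_dim    # K and V (GQA: fewer KV heads)
attn += n_heads * head_dim * d_model           # output projection
ffn = 3 * d_model * d_ffn                      # SwiGLU: gate, up, down
norms = 2 * d_model                            # two RMSNorms per block
embed = vocab * d_model                        # tied input/output embedding

total = n_layers * (attn + ffn + norms) + embed + d_model  # + final norm
print(f"{total / 1e9:.2f}B parameters")        # ~3.20B under these assumptions
```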
Key Features
No positional embeddings · Massively multilingual · Compact