Qwen3-Next series: a hybrid architecture with Gated DeltaNet × Gated Attention. The models have 80B total parameters with ~3B active per token, and are optimized for long context, high concurrency, and low latency. The Instruct and Thinking variants target production chat and deep reasoning, respectively.