Qwen3-Coder-Next: Run a Frontier-Level Coding Agent Locally on Consumer Hardware

There is a certain irony in spending $200 a month on a cloud coding assistant for a codebase you’ll never let leave your machine. Your intellectual property stays on-premises, but every line you paste into a chat window makes a round trip to a server you don’t control. Until recently, the performance gap between local models and frontier cloud assistants made that trade-off feel unavoidable. Qwen3-Coder-Next, released by Alibaba’s Qwen team on February 3, 2026, is the clearest argument yet that the gap is closing. With 80 billion total parameters but only 3 billion activated per token, it scores 70.6% on SWE-Bench Verified — matching or beating models with 10–20× more active parameters — and it runs on hardware you can buy today. ...

May 11, 2026 · 7 min · 1479 words · Clevis

SmolLM3-3B: The Fully Open Small Language Model That Punches Way Above Its Weight

Three billion parameters. 128,000 token context window. Reasoning mode baked right in. Six languages. And an Apache 2.0 license with the full training blueprint published alongside the weights. If you’ve been waiting for a small language model that you can actually deploy on a $5 VPS, an old MacBook, or a Raspberry Pi cluster without compromising on capability — HuggingFace’s SmolLM3-3B is worth your attention right now. What Is SmolLM3 and Why Does It Matter in 2026? Released by HuggingFace’s SmolLM team on July 8, 2025, SmolLM3-3B is the third major iteration of their “smol” model series. But calling it just “smol” undersells what’s going on here. ...

March 22, 2026 · 9 min · 1721 words · Clevis