Qwen3.5-0.8B: A Multimodal Thinking Model That Fits in 1 Gigabyte
800 million parameters. A 262,000-token context window. Images, video, and text, all handled natively. Thinking mode on demand. Apache 2.0 license. And the entire model weighs in at 1GB on Ollama. That is Qwen3.5-0.8B, the smallest member of Alibaba's Qwen3.5 family, released in February 2026. It is not a general-purpose language model pretending to be multimodal: it was trained with early fusion on multimodal tokens from the start, covering 201 languages and dialects. At this scale, very little else competes with its feature set. ...