Blazing fast.
Sub-second inference*, more than 30% faster than any competing model.
* Inference time varies by model and hardware. See model comparison for more details.
Beautiful.
Production-quality visuals as fast as you can prompt.

Seeing is believing.
We built a free demo so you don't have to take our word for it.
No signup, no credit card.
Run it locally.
Built to fine-tune.
FLUX.2 [klein] runs on your hardware.
Choose from four variants optimized for different use cases.
| Model | Description | License | Inference Time on GB200 (seconds) | Inference Time on RTX 5090 (seconds) | VRAM |
|---|---|---|---|---|---|
| FLUX.2 [klein] 9B | Our distilled model. Outstanding quality at sub-second speed. Great for real-time generation while retaining quality. Marketing launch will focus on this model. | FLUX Non-Commercial License | ~0.5 | ~2 | 19.6 GB |
| FLUX.2 [klein] 9B Base | Our undistilled foundation model. Maximum flexibility and control. Great for fine-tuning. | FLUX Non-Commercial License | ~6 | ~35 | 21.7 GB |
| FLUX.2 [klein] 4B | The fastest variant in the FLUX.2 [klein] family. Built for interactive applications, real-time previews, and latency-critical production use cases. | Apache 2.0 | ~0.3 | ~1.2 | 8.4 GB |
| FLUX.2 [klein] 4B Base | A smaller foundation model with exceptional quality-to-size ratio. Ideal for local deployment, fine-tuning on limited hardware, and efficient generation and editing workflows. | Apache 2.0 | ~3 | ~17 | 9.2 GB |
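
Not sure which variant fits your GPU? The comparison above can be turned into a quick lookup. The sketch below is illustrative only: the `Variant` class and the `variants_that_fit` helper are hypothetical names, not part of any FLUX tooling, and it assumes the VRAM column is the minimum memory a variant needs, with no extra headroom for your own workload. The figures are copied from the table and are approximate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    name: str
    license: str
    gb200_seconds: float    # approximate inference time on GB200, from the table
    rtx5090_seconds: float  # approximate inference time on RTX 5090, from the table
    vram_gb: float          # VRAM figure from the table, assumed to be the minimum needed


# Values transcribed from the comparison table above.
VARIANTS = [
    Variant("FLUX.2 [klein] 9B", "FLUX Non-Commercial License", 0.5, 2.0, 19.6),
    Variant("FLUX.2 [klein] 9B Base", "FLUX Non-Commercial License", 6.0, 35.0, 21.7),
    Variant("FLUX.2 [klein] 4B", "Apache 2.0", 0.3, 1.2, 8.4),
    Variant("FLUX.2 [klein] 4B Base", "Apache 2.0", 3.0, 17.0, 9.2),
]


def variants_that_fit(available_vram_gb, apache_only=False):
    """Return the variants whose VRAM figure fits the budget, fastest first.

    Sorting uses the RTX 5090 timings, since that column is the closest
    match to local hardware.
    """
    fits = [
        v for v in VARIANTS
        if v.vram_gb <= available_vram_gb
        and (not apache_only or v.license == "Apache 2.0")
    ]
    return sorted(fits, key=lambda v: v.rtx5090_seconds)


if __name__ == "__main__":
    # Example: a 12 GB card.
    for v in variants_that_fit(12):
        print(f"{v.name}: ~{v.rtx5090_seconds}s on RTX 5090, {v.vram_gb} GB VRAM, {v.license}")
```

With a 12 GB budget, only the two 4B variants pass the filter. Setting `apache_only=True` narrows the list to the Apache 2.0 models if the license matters for your use case.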