FLUX.2-dev + Turbo (Merged) - GGUF[Q4_K_M]

Overview

This repository contains the 32B FLUX.2-dev model with the Turbo LoRA merged ("baked") directly into the UNET weights, quantized to Q4_K_M format using a custom llama.cpp build patched for the FLUX architecture.
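Conceptually, "baking in" a LoRA means folding its low-rank update into each affected base weight before quantization. Below is a minimal sketch of that fold, assuming standard A/B LoRA factor matrices; the shapes and names are illustrative, not the actual FLUX checkpoint keys:

```python
import torch

def merge_lora_into_weight(W: torch.Tensor, A: torch.Tensor, B: torch.Tensor,
                           scale: float = 1.0) -> torch.Tensor:
    """Fold a LoRA update into a base weight: W' = W + scale * (B @ A).

    W: (out_features, in_features) base weight
    A: (rank, in_features) LoRA down-projection
    B: (out_features, rank) LoRA up-projection
    """
    return W + scale * (B.to(W.dtype) @ A.to(W.dtype))

# Illustrative usage with random tensors (a real merge iterates over the
# checkpoint's layer keys before quantization):
W = torch.randn(4096, 4096)
A = torch.randn(16, 4096)   # rank-16 LoRA (rank chosen for illustration)
B = torch.randn(4096, 16)
W_merged = merge_lora_into_weight(W, A, B, scale=1.0)  # LoRA weight 1.0, as in this repo
```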

Why use this merged version?

If you load a base GGUF model and apply the LoRA dynamically with a LoRA node in ComfyUI, the engine applies "lowvram patches" (dequantizing layers on the fly during generation so the LoRA delta can be added). For a 32B model, this drastically reduces generation speed, even if the model fits in your VRAM.
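For intuition, here is a conceptual contrast (not ComfyUI's actual implementation) between the two paths; in the dynamic case the extra low-rank matmul and patch bookkeeping recur on every layer of every sampling step:

```python
import torch

def forward_dynamic(x, W_q, dequantize, A, B, scale=1.0):
    # Dynamic-LoRA path: the quantized weight is dequantized and the
    # LoRA delta re-applied every time this layer runs.
    W = dequantize(W_q) + scale * (B @ A)
    return x @ W.T

def forward_merged(x, W_q, dequantize):
    # Pre-merged path: the LoRA delta is already inside the stored
    # weight, so only the plain dequantize + matmul remains.
    return x @ dequantize(W_q).T

# Example with full-precision tensors and an identity "dequantize":
x = torch.randn(1, 4096)
W = torch.randn(4096, 4096)
A, B = torch.randn(16, 4096), torch.randn(4096, 16)
y = forward_dynamic(x, W, lambda w: w, A, B)
```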

By using this pre-merged GGUF file:

  • No lowvram patches are applied.
  • The model fits entirely into 24 GB of VRAM (full load: True on an RTX 3090/4090).
  • Maximizes performance when paired with SageAttention or FlashAttention.

File Details

  • Base Model: FLUX.2-dev (32B parameters)
  • LoRA Applied: FLUX.2-dev-Turbo (Weight: 1.0)
  • Format: GGUF
  • Quantization: Q4_K_M
  • File Size: ~17.8 GB
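As a sanity check, the file size matches the quantization level. Q4_K_M mixes 4-bit and 6-bit blocks and averages roughly 4.5 bits per weight (an approximation, not an exact figure):

```python
# Back-of-the-envelope size check for a 32B-parameter model at Q4_K_M.
params = 32e9
bits_per_weight = 4.5               # approximate average for Q4_K_M
size_gb = params * bits_per_weight / 8 / 1e9
print(f"~{size_gb:.1f} GB")         # ~18.0 GB, in line with the ~17.8 GB file
```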

How to use in ComfyUI

  1. Download the flux2-dev-turbo-Q4_K_M.gguf file (a scripted download option is sketched after this list).
  2. Place it in your ComfyUI/models/unet/ directory.
  3. Use the Unet Loader (GGUF) node to load the model.
  4. ⚠️ IMPORTANT: DO NOT use a Load LoRA node for the Turbo LoRA. The weights are already baked into this UNET. Connect it directly to your sampler.
  5. Set up your sampler for Turbo (e.g., 8 steps, CFG 1.0-1.5, depending on the Turbo LoRA requirements).
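If you prefer to script step 1, here is a minimal sketch using huggingface_hub (adjust local_dir to wherever your ComfyUI install lives):

```python
from huggingface_hub import hf_hub_download

# Download the quantized UNET straight into ComfyUI's unet folder.
path = hf_hub_download(
    repo_id="UDream/flux2-dev-turbo-Q4_K_M",
    filename="flux2-dev-turbo-Q4_K_M.gguf",
    local_dir="ComfyUI/models/unet",  # adjust to your ComfyUI install
)
print(path)
```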

Hardware Requirements

  • RAM: 32 GB+ recommended
  • VRAM: ~24 GB (fits comfortably on an RTX 3090 / 4090 alongside a quantized Mistral-3 encoder).
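To confirm your GPU has the headroom before loading, a quick check with PyTorch (assumes CUDA device 0):

```python
import torch

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    total_gb = props.total_memory / 1024**3
    # ~17.8 GB for the UNET plus the text encoder and activations;
    # 24 GB is the comfortable target.
    status = "OK" if total_gb >= 24 else "may need offloading"
    print(f"{props.name}: {total_gb:.1f} GB VRAM ({status})")
else:
    print("No CUDA device found.")
```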

Credits & Acknowledgements

Note: Please adhere to the original FLUX.2-dev non-commercial license when using this model.
