---
license: apache-2.0
library_name: videox_fun
---
# Qwen-Image-2512-Fun-Controlnet-Union
[VideoX-Fun](https://github.com/aigc-apps/VideoX-Fun)
## Model Card
| Name | Description |
|--|--|
| Qwen-Image-2512-Fun-Controlnet-Union-2602.safetensors | Compared with the previous release, this version adds a Gray control condition and was trained for longer. |
| Qwen-Image-2512-Fun-Controlnet-Union.safetensors | ControlNet weights for Qwen-Image-2512. The model supports multiple control conditions such as Canny, HED, Depth, Pose, MLSD and Scribble. |
## Model Features
- This ControlNet is attached to 5 layer blocks. It supports multiple control conditions, including Canny, HED, Depth, Pose, MLSD, Scribble, and Gray, and can be used like a standard ControlNet.
- Inpainting mode is also supported.
- Extracting control images at multiple resolutions improves generalization.
- You can adjust `control_context_scale` for stronger control and better detail preservation; the optimal range is 0.70 to 0.95. For better stability, we highly recommend using a detailed prompt.
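
The Gray condition above is the simplest to prepare. As a minimal standalone sketch (not part of VideoX-Fun, which ships its own preprocessors), this converts an RGB image into a 3-channel grayscale control image using the standard Rec. 601 luma weights:

```python
import numpy as np

def to_gray_control(rgb: np.ndarray) -> np.ndarray:
    """Convert an H x W x 3 uint8 RGB image into a 3-channel grayscale
    control image (Rec. 601 luma: 0.299 R + 0.587 G + 0.114 B)."""
    luma = rgb[..., 0] * 0.299 + rgb[..., 1] * 0.587 + rgb[..., 2] * 0.114
    gray = luma.round().clip(0, 255).astype(np.uint8)
    # ControlNet inputs are typically 3-channel, so replicate the luma plane.
    return np.stack([gray] * 3, axis=-1)

# Example: a 2x2 test image (red, green / blue, white)
img = np.array([[[255, 0, 0], [0, 255, 0]],
                [[0, 0, 255], [255, 255, 255]]], dtype=np.uint8)
print(to_gray_control(img).shape)  # (2, 2, 3)
```

The other conditions (Canny, HED, Depth, Pose, MLSD, Scribble) need dedicated detectors; use the preprocessors bundled with the VideoX-Fun repository for those.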
## Results
<table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
<tr>
<td>Pose + Inpaint</td>
<td>Output</td>
</tr>
<tr>
<td><img src="asset/inpaint.jpg" width="100%" /><img src="asset/mask.jpg" width="100%" /><img src="asset/pose.jpg" width="100%" /></td>
<td><img src="results/pose_inpaint.png" width="100%" /></td>
</tr>
</table>
<table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
<tr>
<td>Pose</td>
<td>Output</td>
</tr>
<tr>
<td><img src="asset/pose2.jpg" width="100%" /></td>
<td><img src="results/pose2.png" width="100%" /></td>
</tr>
</table>
<table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
<tr>
<td>Pose</td>
<td>Output</td>
</tr>
<tr>
<td><img src="asset/pose.jpg" width="100%" /></td>
<td><img src="results/pose.png" width="100%" /></td>
</tr>
</table>
<table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
<tr>
<td>Scribble</td>
<td>Output</td>
</tr>
<tr>
<td><img src="asset/scribble.jpg" width="100%" /></td>
<td><img src="results/scribble.png" width="100%" /></td>
</tr>
</table>
<table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
<tr>
<td>Canny</td>
<td>Output</td>
</tr>
<tr>
<td><img src="asset/canny.jpg" width="100%" /></td>
<td><img src="results/canny.png" width="100%" /></td>
</tr>
</table>
<table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
<tr>
<td>HED</td>
<td>Output</td>
</tr>
<tr>
<td><img src="asset/hed.jpg" width="100%" /></td>
<td><img src="results/hed.png" width="100%" /></td>
</tr>
</table>
<table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
<tr>
<td>Depth</td>
<td>Output</td>
</tr>
<tr>
<td><img src="asset/depth.jpg" width="100%" /></td>
<td><img src="results/depth.png" width="100%" /></td>
</tr>
</table>
<table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
<tr>
<td>Gray</td>
<td>Output</td>
</tr>
<tr>
<td><img src="asset/gray.jpg" width="100%" /></td>
<td><img src="results/gray.png" width="100%" /></td>
</tr>
</table>
## Inference
See the VideoX-Fun repository for full details. First, clone the repository and create the required model directories:
```sh
# Clone the code
git clone https://github.com/aigc-apps/VideoX-Fun.git
# Enter VideoX-Fun's directory
cd VideoX-Fun
# Create model directories
mkdir -p models/Diffusion_Transformer
mkdir -p models/Personalized_Model
```
Then download the weights into `models/Diffusion_Transformer` and `models/Personalized_Model` so that the layout matches:
```
📦 models/
├── 📂 Diffusion_Transformer/
│ └── 📂 Qwen-Image-2512/
├── 📂 Personalized_Model/
│ └── 📦 Qwen-Image-2512-Fun-Controlnet-Union.safetensors
```
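
Before running inference, you can sanity-check that the weights landed in the expected places. This is a hypothetical helper based on the layout above, not part of VideoX-Fun:

```python
from pathlib import Path

# Expected layout, per the directory tree above.
EXPECTED = [
    "models/Diffusion_Transformer/Qwen-Image-2512",
    "models/Personalized_Model/Qwen-Image-2512-Fun-Controlnet-Union.safetensors",
]

def check_model_layout(root: str) -> list:
    """Return the expected model paths that are missing under `root`."""
    base = Path(root)
    return [p for p in EXPECTED if not (base / p).exists()]

missing = check_model_layout(".")
if missing:
    print("Missing model files:", *missing, sep="\n  ")
```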
Then run the example scripts `examples/qwenimage_fun/predict_t2i_control.py` and `examples/qwenimage_fun/predict_i2i_inpaint.py`.