GitHub - ali-vilab/DiffCamera: [SIGGRAPH Asia 2025] DiffCamera: Arbitrary Refocusing on Images

[SIGGRAPH Asia 2025] DiffCamera: Arbitrary Refocusing on Images

Yiyang Wang · Xi Chen · Xiaogang Xu · Yu Liu · Hengshuang Zhao

The University of Hong Kong | The Chinese University of Hong Kong | Tongyi Lab

TODO

[✅] Release the Gradio demo code.
[✅] Release the command-line inference code.
[TODO] Release the checkpoint.
[TODO] Release the code of the dataset simulation pipeline.

Installation

Install with pip:

pip install -r requirements.txt

Download Checkpoints

Download the DiffCamera checkpoint from TODO and put it in checkpoints/.
Download Flux.1-Dev as the backbone model.
[Optional] Download Depth-Anything-V2 (https://huggingface.co/depth-anything/Depth-Anything-V2-Large-hf) to estimate depth maps.

Running Gradio demo locally

Tip: the demo requires a GPU to run.

If you have prepared a depth map, run the following command:

python gradio_demo.py --resume_from_checkpoint $DiffCamera_checkpoint_path --pretrained_model_name_or_path $FLUX_checkpoint_path

If you don't have a depth map, run the following command to enable the depth estimation model:

python gradio_demo.py --resume_from_checkpoint $DiffCamera_checkpoint_path --pretrained_model_name_or_path $FLUX_checkpoint_path --depth_model_path $Depth_anything_v2_large_hf_path

After that, you should see something like Running on local URL: http://127.0.0.1:8888. Click the URL or paste it into your browser to open the Gradio demo interface.
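
The two launch commands above differ only in the optional depth-model flag. A minimal Python sketch of assembling either command, assuming only the three flags shown above; the helper name `build_demo_command` and the example paths are hypothetical and not part of the repository:

```python
import shlex
from typing import List, Optional


def build_demo_command(
    diffcamera_ckpt: str,
    flux_path: str,
    depth_model_path: Optional[str] = None,
) -> List[str]:
    """Assemble the gradio_demo.py command line (hypothetical helper)."""
    cmd = [
        "python", "gradio_demo.py",
        "--resume_from_checkpoint", diffcamera_ckpt,
        "--pretrained_model_name_or_path", flux_path,
    ]
    # Pass the depth model only when no depth map has been prepared.
    if depth_model_path is not None:
        cmd += ["--depth_model_path", depth_model_path]
    return cmd


if __name__ == "__main__":
    # Placeholder paths; replace them with your local checkpoint locations.
    cmd = build_demo_command("checkpoints/xxx.bin", "path/to/FLUX.1-Dev")
    print(shlex.join(cmd))
    # To launch the demo: subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids quoting problems when a checkpoint path contains spaces.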

Running command-line inference code locally

Checklist before running the inference code:

Download the DiffCamera checkpoint (xxx.bin) and place it in the checkpoints/ folder.
Modify scripts/infer_flux.sh. Specify the paths to FLUX.1-Dev, the DiffCamera checkpoint, the depth estimation model, and the input image. You can also modify the bokeh level (blur strength, ranging from 0 to 30) and the focus coordinates. The coordinates are normalized to the range 0 to 1: focus_point_x=0 is the top edge, focus_point_x=1 is the bottom edge, focus_point_y=0 is the leftmost edge, and focus_point_y=1 is the rightmost edge. (Tip: you can also try the Gradio demo locally to get a more intuitive idea of how these parameters affect the refocus result :)
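
The coordinate convention above can be sketched as a small conversion from a pixel position to the normalized focus point. This is only an illustration: the helper names `pixel_to_focus_point` and `clamp_bokeh_level` are hypothetical, and dividing by `size - 1` (so that the last pixel maps to exactly 1) is an assumption rather than something the repository documents.

```python
from typing import Tuple

MAX_BOKEH_LEVEL = 30.0  # upper bound of the blur strength range stated above


def pixel_to_focus_point(row: int, col: int, height: int, width: int) -> Tuple[float, float]:
    """Map a pixel (row, col) to (focus_point_x, focus_point_y) in [0, 1].

    Follows the convention described above: focus_point_x runs from the top
    edge (0) to the bottom edge (1), and focus_point_y runs from the leftmost
    edge (0) to the rightmost edge (1).
    """
    if height < 1 or width < 1:
        raise ValueError("image dimensions must be positive")
    if not (0 <= row < height and 0 <= col < width):
        raise ValueError("pixel lies outside the image")
    # Assumption: normalize by (size - 1) so the last pixel maps to exactly 1.
    focus_point_x = row / (height - 1) if height > 1 else 0.0
    focus_point_y = col / (width - 1) if width > 1 else 0.0
    return focus_point_x, focus_point_y


def clamp_bokeh_level(level: float) -> float:
    """Clamp a requested blur strength to the documented 0 to 30 range."""
    return max(0.0, min(MAX_BOKEH_LEVEL, float(level)))
```

For example, the bottom-left pixel of a 512x512 image, `pixel_to_focus_point(511, 0, 512, 512)`, gives `(1.0, 0.0)`.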

After preparation, run the following command to refocus on a single image.

./scripts/infer.sh

Limitations and Future Work

DiffCamera currently supports only two resolutions: 512x512 and 1024x1024.
DiffCamera currently cannot take a reference image as input to produce a customized refocus result on a blurry subject.
DiffCamera may produce artifacts on challenging scenes where complex details are blurry, i.e., where very little information about the blurry objects is available.
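
Given the resolution limitation, inputs of other sizes need to be brought to one of the two supported resolutions before inference. A minimal sketch of one way to plan that, assuming a center crop to a square followed by a resize is acceptable for your image; the function names are hypothetical, and the repository may handle this differently:

```python
from typing import Tuple

SUPPORTED_SIZES = (512, 1024)  # square resolutions listed above


def center_crop_box(width: int, height: int) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of the largest centered square."""
    if width < 1 or height < 1:
        raise ValueError("image dimensions must be positive")
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return left, top, left + side, top + side


def pick_target_size(width: int, height: int) -> int:
    """Pick the supported size closest to the cropped square's side length."""
    side = min(width, height)
    return min(SUPPORTED_SIZES, key=lambda s: abs(s - side))
```

With Pillow, for instance, the result could be applied as `image.crop(center_crop_box(*image.size)).resize((size, size))`, where `size = pick_target_size(*image.size)`.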

Citation

If you find this codebase useful for your research, please cite it using the following entry.

@article{wang2025diffcamera,
  title={DiffCamera: Arbitrary Refocusing on Images},
  author={Wang, Yiyang and Chen, Xi and Xu, Xiaogang and Liu, Yu and Zhao, Hengshuang},
  journal={SIGGRAPH Asia 2025 Conference Papers},
  year={2025}
}

