Add FLUX.2 Klein Inpaint Pipeline #13050
adi776borate wants to merge 25 commits into huggingface:main
Conversation
I have to be honest, I am not getting good results at all so far, especially when working with bounding box masks, even without a reference. On the other hand, I don't have the same issues with the regular pipeline. If you ask, I can provide examples of the results I get once my GPU frees up.
Pull request overview
This PR adds the Flux2KleinInpaintPipeline to enable image inpainting capabilities for the FLUX.2 Klein model. The pipeline supports both basic text-guided inpainting and experimental reference image conditioning, addressing issue #13005.
Changes:
- Implements a new inpainting pipeline for FLUX.2 Klein with masking support
- Adds optional reference image conditioning for more controlled inpainting
- Extends `Flux2ImageProcessor` with `do_binarize` and `do_convert_grayscale` parameters for mask processing
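For context, a minimal sketch of what `do_binarize` and `do_convert_grayscale` typically mean for a mask processor. The function name, the threshold value, and the array-based input are illustrative assumptions, not the actual `Flux2ImageProcessor` API:

```python
import numpy as np

def preprocess_mask(mask: np.ndarray, do_convert_grayscale: bool = True,
                    do_binarize: bool = True, threshold: float = 0.5) -> np.ndarray:
    # mask: HxWx3 uint8 RGB array (illustrative; the real processor also accepts PIL/torch inputs)
    if do_convert_grayscale and mask.ndim == 3:
        mask = mask.mean(axis=-1)  # collapse channels to a single grayscale plane
    arr = mask.astype(np.float32) / 255.0
    if do_binarize:
        arr = (arr >= threshold).astype(np.float32)  # hard 0/1 mask for inpainting
    return arr

rgb_mask = np.full((8, 8, 3), 200, dtype=np.uint8)
out = preprocess_mask(rgb_mask)
print(out.shape, np.unique(out))  # (8, 8) [1.]
```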
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| `src/diffusers/pipelines/flux2/pipeline_flux2_klein_inpaint.py` | Main pipeline implementation with inpainting logic, mask handling, and reference image support |
| `tests/pipelines/flux2/test_pipeline_flux2_klein_inpaint.py` | Test suite covering basic inpainting functionality including different prompts, output shapes, and strength variations |
| `src/diffusers/pipelines/flux2/image_processor.py` | Enhanced image processor with binarization and grayscale conversion options |
| `src/diffusers/pipelines/flux2/__init__.py` | Export declarations for the new pipeline |
| `src/diffusers/pipelines/__init__.py` | Top-level export declarations |
| `src/diffusers/__init__.py` | Main package export declarations |
| `src/diffusers/utils/dummy_torch_and_transformers_objects.py` | Dummy object for missing dependencies |
I'd missed patchifying the mask latents earlier. With that corrected, here are some new example generations (all below use the 9B model).
@Natans8 this might improve what you observed, but I'd suggest waiting for the maintainers' review.
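For readers following along: FLUX-style pipelines rearrange latents into 2x2 patches before the transformer, so the mask latents must go through the same rearrangement to stay spatially aligned with the image latents. A rough sketch of that operation (an illustration of the idea, not the pipeline's exact helper):

```python
import torch

def patchify(latents: torch.Tensor, patch: int = 2) -> torch.Tensor:
    # (B, C, H, W) -> (B, (H/p)*(W/p), C*p*p): the standard FLUX-style token layout
    b, c, h, w = latents.shape
    latents = latents.view(b, c, h // patch, patch, w // patch, patch)
    latents = latents.permute(0, 2, 4, 1, 3, 5)
    return latents.reshape(b, (h // patch) * (w // patch), c * patch * patch)

x = torch.randn(1, 16, 32, 32)
print(patchify(x).shape)  # torch.Size([1, 256, 64])
```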
Oh, I was waiting to see if your changes fixed the issue that @Natans8 was having. I'll give it a test today and review the PR.
It is still working a little weird to me. These are my results on the latest branch:

```python
pipe = Flux2KleinInpaintPipeline.from_pretrained(
    "black-forest-labs/FLUX.2-klein-4B", torch_dtype=torch.bfloat16
)
inpainted_img = pipe(
    prompt="Remove the person, keep the background unchanged",
    image=image,
    mask_image=mask_pil,  # rectangular mask
    num_inference_steps=4,
    guidance_scale=1.0,
    generator=generator,
    strength=1.0,
).images[0]
```

(result and comparison images omitted)

When trying to tweak the strength or the prompt of the inpaint pipeline I don't get much better results: sometimes it's random trees or rooms insensitive to the contents of the image, sometimes it's a black box, a white box, a slice of a wall. Of course I have to compare how it fares with 9B as well, ideally. There is also the possibility that I'm just doing something incorrectly; I do not have a lot of experience with Diffusers. But I hope this feedback will be useful.
For some reason, the demo examples people always use for inpainting are very simple ones that will never fail, for example the dog sitting on the bench from the original SD. Any inpainting where there's a clear separation between the subject and the background is very simple, and any modern model can do it without issues; you could even use generative fill from Photoshop before the AI era and it would also be good. From what I've tested, Klein almost never changes the rest of the image if you are specific in the edit, and the VAE is also very good, which makes the quality loss minimal. The use case for this pipeline (IMO) is to be able to separate subjects when they're very similar and you can't just prompt for them. In @Natans8's case you can prompt for a specific person, but it's a good enough example because it's not a simple image to inpaint and we can prompt for people. Also, I wouldn't test this with the base model since it's not worth the 50 steps and CFG. I did myself a test pipeline with differential diffusion to see if the model is capable of doing a good edit with restrictions. For all the tests I will use just the prompt
So the model is capable of doing it; now to test this pipeline. With the default strength:
I'll use just the distilled one since the results are similar, so varying the strength I got these:
With 0.9 you can see it's somewhat decent and on par with what I've seen from other inpainting pipelines. I also tried with the prompt
@adi776borate does this match your results? If this looks usable to you, we can continue and review it.
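For anyone reproducing the strength sweep above: in standard diffusers inpaint pipelines, `strength` decides how many of the scheduled steps actually run, which is why low values barely change the masked region. A sketch of the usual `get_timesteps` arithmetic, assuming this pipeline follows the same convention:

```python
def effective_steps(num_inference_steps: int, strength: float) -> int:
    # strength scales how far into the noise schedule denoising starts
    init_timestep = min(int(num_inference_steps * strength), num_inference_steps)
    t_start = max(num_inference_steps - init_timestep, 0)
    return num_inference_steps - t_start  # steps actually executed

for s in (0.5, 0.9, 1.0):
    print(s, effective_steps(4, s))  # 0.5 -> 2, 0.9 -> 3, 1.0 -> 4
```

With only 4 distilled steps, strength below ~0.25 would run zero denoising steps, which may explain some of the "nothing changes" results reported above.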
…ask spatial alignment and remove unused VAE encoding
…76borate/diffusers into feature/flux2-klein-inpaint
asomoza left a comment
left some more comments, don't forget to run:

```shell
make style
make quality
```

so we can run the tests.
```python
# broadcast to batch dimension in a way that's compatible with ONNX/Core ML
timestep = t.expand(latents.shape[0]).to(latents.dtype)
```

```python
latent_model_input = torch.cat([latents, condition_image_latents], dim=1)
```
here we're concatenating `latents` and `condition_image_latents`, but the methods that create them are passed dtypes that can differ:
`latents` uses `prompt_embeds.dtype` while `condition_image_latents` uses `self.vae.dtype`. This means that if the user changes the VAE dtype (for example, to use a higher precision for encoding or decoding), this concatenation will fail.
This is probably something we can also fix in the other PR
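A small sketch of the fix this comment suggests: cast the conditioning latents to the denoising dtype before concatenation. The tensor shapes are arbitrary; `torch.cat` raises on mixed dtypes, which is the failure described:

```python
import torch

latents = torch.randn(1, 4, 8, dtype=torch.bfloat16)                 # follows prompt_embeds.dtype
condition_image_latents = torch.randn(1, 4, 8, dtype=torch.float32)  # follows self.vae.dtype

# align dtypes so a user-chosen VAE dtype (e.g. float32 for precise encode/decode) can't break the cat
condition_image_latents = condition_image_latents.to(latents.dtype)
latent_model_input = torch.cat([latents, condition_image_latents], dim=1)
print(latent_model_input.shape, latent_model_input.dtype)  # torch.Size([1, 8, 8]) torch.bfloat16
```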
Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com>
@asomoza I have made the suggested changes and also modified the batching logic, as this specific test was failing with the earlier code:
Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com>
Hi! If everything appears to be in order, shall we proceed with the tests? |
asomoza left a comment
thanks, left a few comments, we're close to merging now
```python
# 2.2 Preprocess reference image
processed_image_reference = None
if image_reference is not None and not (
    isinstance(image_reference, torch.Tensor) and image_reference.size(1) == self.latent_channels
```
you skip the preprocessing, but at line 1017 you set `processed_image_reference` to `None`, so if the user passes `image_reference` as tensors it skips this and never sets `processed_image_reference`
Added an else block to handle it
```python
else:
    image_latents = torch.cat([image_latents], dim=0)
```
Suggested change:

```diff
- else:
-     image_latents = torch.cat([image_latents], dim=0)
```
the `image_latents` are already the same
There was a problem hiding this comment.
Agreed, removing it
```python
else:
    # multiple images per sample in the batch
    item_ids = [x_ids]
    for _ in range(1, b_i):
        t_offset += scale
        t = torch.tensor([t_offset]).view(-1)
        item_ids.append(
            torch.cartesian_prod(t, torch.arange(height), torch.arange(width), torch.arange(1))
        )
    x_ids = torch.cat(item_ids, dim=0)  # (b_i * h * w, 4)
    x_ids = x_ids.unsqueeze(0).expand(batch_size, -1, -1)
all_image_latent_ids.append(x_ids)
t_offset += scale
```
Yes it is reachable when we pass multiple reference images for a single sample.
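To illustrate the branch under discussion: each extra reference image in a sample gets its own time offset in the position IDs so the transformer can distinguish the references. A standalone sketch of the ID construction, simplified from the quoted code (function name and float dtype are illustrative):

```python
import torch

def reference_ids(num_refs: int, height: int, width: int, scale: int = 1) -> torch.Tensor:
    # Build (t, h, w, 0) position IDs; each reference gets a distinct t offset.
    item_ids, t_offset = [], 0.0
    for _ in range(num_refs):
        t = torch.tensor([t_offset])
        item_ids.append(torch.cartesian_prod(
            t, torch.arange(height).float(), torch.arange(width).float(), torch.arange(1).float()
        ))
        t_offset += scale
    return torch.cat(item_ids, dim=0)  # (num_refs * h * w, 4)

ids = reference_ids(num_refs=2, height=4, width=4)
print(ids.shape, ids[:, 0].unique().tolist())  # torch.Size([32, 4]) [0.0, 1.0]
```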
```python
params = frozenset(
    ["prompt", "image", "image_reference", "mask_image", "height", "width", "guidance_scale", "prompt_embeds"]
)
batch_params = frozenset(["prompt", "image", "mask_image"])
```
Suggested change:

```diff
- batch_params = frozenset(["prompt", "image", "mask_image"])
+ batch_params = frozenset(["prompt", "image", "image_reference", "mask_image"])
```
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
@claude can you help to do a review here?
I'll analyze this and get back to you.
@bot /style
Style bot fixed some files and pushed the changes.
@adi776borate can you run |
I intentionally held off on that. Could you please review #13299 first? I discussed this with @asomoza here.
@asomoza I have addressed your last comments and also removed some redundant dtype casts while I was at it. This should now correctly handle pre-encoded latents as source or reference image input. I think we should also update the docs to tell the user that we expect patchified latents as input.






































What does this PR do?
Fixes #13005
This PR adds the `Flux2KleinInpaintPipeline` for image inpainting using the FLUX.2 [Klein] model with optional reference image conditioning.

Examples
Basic Inpainting
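A usage sketch assembled from the snippets earlier in this thread (the 4B model id, 4 steps, and `guidance_scale=1.0` come from those comments; the CUDA device is an assumption). It is wrapped in a function and not executed here, since calling it downloads the model:

```python
def basic_inpaint(image, mask_image, prompt: str):
    # Basic text-guided inpainting with the new pipeline (sketch).
    import torch
    from diffusers import Flux2KleinInpaintPipeline  # available once this PR is merged/installed

    pipe = Flux2KleinInpaintPipeline.from_pretrained(
        "black-forest-labs/FLUX.2-klein-4B", torch_dtype=torch.bfloat16
    ).to("cuda")  # assumes a CUDA device
    return pipe(
        prompt=prompt,
        image=image,
        mask_image=mask_image,
        num_inference_steps=4,   # distilled Klein checkpoints use few steps
        guidance_scale=1.0,
        strength=1.0,
    ).images[0]

print(callable(basic_inpaint))  # True
```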
Inpainting with Reference Image
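Likewise, a sketch for the experimental reference-conditioned path. The `image_reference` argument name comes from this PR's review comments; the model id and sampler settings come from earlier snippets in the thread, and the CUDA device is an assumption. Again wrapped in a function and not executed here:

```python
def inpaint_with_reference(image, mask_image, image_reference, prompt: str):
    # Reference-conditioned inpainting with the new pipeline (sketch).
    import torch
    from diffusers import Flux2KleinInpaintPipeline  # available once this PR is merged/installed

    pipe = Flux2KleinInpaintPipeline.from_pretrained(
        "black-forest-labs/FLUX.2-klein-4B", torch_dtype=torch.bfloat16
    ).to("cuda")  # assumes a CUDA device
    return pipe(
        prompt=prompt,
        image=image,
        mask_image=mask_image,
        image_reference=image_reference,  # guides what to paint into the masked region
        num_inference_steps=4,
        guidance_scale=1.0,
    ).images[0]

print(callable(inpaint_with_reference))  # True
```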
Known Limitations
Generation quality may vary - Some outputs may contain artifacts. This can often be mitigated with better prompts and tuning hyperparameters (strength, guidance_scale, num_inference_steps).
Reference image conditioning is experimental - Inpainting with
image_referencemay not consistently produce desired results.These limitations may stem either from a bug in the pipeline implementation by me or from inherent constraints of the model. Feedback is appreciated.
Before submitting
documentation guidelines, and here are tips on formatting docstrings.
Who can review?
@asomoza @sayakpaul
Anyone in the community is free to review the PR once the tests have passed.