Feat: flux2 dev support by Pfannkuchensack · Pull Request #9234 · invoke-ai/InvokeAI

Pfannkuchensack · 2026-05-25T20:15:12Z

Summary

Adds end-to-end support for FLUX.2 [dev] alongside the existing FLUX.2 Klein implementation. Dev is a 32B guidance-distilled rectified flow transformer (its guidance input is active) that uses Mistral Small 3.1 (24B) as its sole text encoder instead of Klein's Qwen3, with joint_attention_dim=15360. It shares the 32-channel AutoencoderKLFlux2 VAE and the 4D-RoPE sampling backend with Klein, so most of the existing infrastructure is reused; only the Mistral-specific loaders, configs, and graph wiring are new.

Backend

Taxonomy: Flux2VariantType.Dev, ModelType.MistralEncoder, ModelFormat.MistralEncoder, new MistralVariantType.Small3_1. Added to AnyVariant and variant_type_adapter. ModelRecordChanges.variant union extended.
Probing: dev is detected by context_in_dim = 15360 (main) / vec_in_dim = 5120 / hidden_size = 6144 (LoRA, all formats). Existing Klein and FLUX.1 probes preserved. Main_Diffusers_Flux2_Config also accepts the Flux2Pipeline / Flux2Transformer2DModel class names.
New configs/mistral_encoder.py: Diffusers folder, single-file safetensors, and GGUF configs under BaseModelType.Any / ModelType.MistralEncoder. Folder probe excludes full pipelines (those match Main_Diffusers_Flux2_Config).
New load/model_loaders/mistral_encoder.py: three loaders (Diffusers via AutoModel, single-file via MistralModel, GGUF via MistralModel + llama.cpp key conversion). Includes:
- _convert_for_bare_mistral_model: strips the model. prefix and drops lm_head so a MistralForCausalLM state dict loads cleanly into bare MistralModel.
- _materialize_remaining_meta_tensors: replaces any params/buffers still on the meta device after load_state_dict (norms → ones, others → zeros) with a warning listing what was missing, so the model cache → VRAM move can't fail on partial state dicts.
- llama.cpp converter handles attn_q_norm/attn_k_norm (Mistral 3.x QK-norm), attn_q/k/v/output, attn_norm, ffn_*, plus the root token_embd/output_norm/output keys.
- _load_processor_with_offline_fallback walks a list of sources (black-forest-labs/FLUX.2-dev:tokenizer, then mistralai/Mistral-Small-3.1-…, then …-3.2-…), trying AutoProcessor and AutoTokenizer for each, cache-first then online. Final error spells out three workarounds.
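The prefix handling described for _convert_for_bare_mistral_model can be sketched roughly as follows. This is a minimal illustration on plain dictionaries, not the PR's code: the function name strip_causal_lm_wrapper and the example keys are hypothetical, and the real helper presumably operates on tensor state dicts.

```python
from typing import Any, Dict


def strip_causal_lm_wrapper(state_dict: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical sketch: adapt *ForCausalLM-style keys to a bare base model.

    - keys under "lm_head." are dropped (the bare model has no LM head)
    - a leading "model." prefix is removed from every other key
    """
    converted: Dict[str, Any] = {}
    for key, value in state_dict.items():
        if key.startswith("lm_head."):
            continue  # no matching parameter on the bare model
        if key.startswith("model."):
            key = key[len("model."):]
        converted[key] = value
    return converted


# Toy state dict; the integers stand in for tensors.
example = {
    "model.embed_tokens.weight": 1,
    "model.layers.0.self_attn.q_proj.weight": 2,
    "lm_head.weight": 3,
}
print(sorted(strip_causal_lm_wrapper(example)))
```

Keys that carry neither prefix pass through unchanged, so a state dict that is already in bare-model form is left as it was.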
Klein FLUX.2 transformer loaders are generic enough for dev — Flux2Transformer2DModel diffusers defaults are already dev's (8+48 blocks, joint_attention_dim=15360, mlp_ratio=3.0, 4D RoPE). No new transformer loader code needed.
Backend flux2/denoise.py + sampling_utils.py are model-agnostic — same 32-channel VAE, 4D RoPE, packing, guidance vector, scheduler support — no changes needed.

Invocations

flux2_dev_model_loader, flux2_dev_text_encoder, flux2_dev_lora_loader (+ collection variant). MistralEncoderField added to model.py.
Text encoder runs Mistral's chat template with a fixed system message and stacks hidden states from layers (10, 20, 30) → (B, seq, 15360) matching the transformer's joint_attention_dim. Tries multimodal [{type, text}] content first (PixtralProcessor / Mistral3Processor) and falls back to plain-string content, then to manual [INST]…[/INST] formatting.
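The shape arithmetic behind the layer stacking is 3 layers × 5120 features = 15360, assuming a per-layer hidden size of 5120 (which is what 15360 / 3 implies) and assuming "stacks" means concatenation along the feature axis. The sketch below shows that concatenation on nested Python lists with toy dimensions and no batch axis. The function name concat_layer_features is hypothetical, the real invocation presumably works on torch tensors, and whether index 10 refers to the tenth block or to an entry in a hidden-states tuple that also includes the embeddings is not stated in the PR.

```python
from itertools import chain
from typing import List, Sequence

Vector = List[float]


def concat_layer_features(
    hidden_states: Sequence[Sequence[Vector]],
    layer_ids: Sequence[int] = (10, 20, 30),
) -> List[Vector]:
    """hidden_states[layer][token] is a feature vector.

    Returns one vector per token: the selected layers' vectors joined end to end.
    """
    selected = [hidden_states[i] for i in layer_ids]
    seq_len = len(selected[0])
    return [
        list(chain.from_iterable(layer[t] for layer in selected))
        for t in range(seq_len)
    ]


# Toy dimensions: 31 hidden-state entries, 2 tokens, hidden size 4.
toy = [[[float(layer)] * 4 for _ in range(2)] for layer in range(31)]
stacked = concat_layer_features(toy)
print(len(stacked), len(stacked[0]))  # 2 tokens, 3 * 4 = 12 features each
```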
flux2_denoise / flux2_vae_decode / flux2_vae_encode are reused unchanged — they're already model-agnostic.

Qwen3 probe strictness (bugfix)

_get_qwen3_variant_from_state_dict / _get_variant_from_config now return None / raise NotAMatchError for unknown hidden_size instead of silently defaulting to qwen3_4b. The old fallback meant any llama.cpp GGUF causal LM (Mistral, Llama, …) was misclassified as Qwen3 — caught when a Mistral 3.x GGUF was identified as qwen3_4b.
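The difference between a defaulting probe and a strict probe can be illustrated with a small lookup. This is a sketch only: the function names, the qwen3_8b label, and the hidden-size table (2560 for Qwen3-4B, 4096 for Qwen3-8B) are assumptions for illustration, and NotAMatchError is redefined locally as a stand-in for the real exception.

```python
from typing import Optional


class NotAMatchError(Exception):
    """Local stand-in for the error type named in the PR."""


# Assumed mapping from hidden_size to variant; not taken from the PR.
_QWEN3_VARIANT_BY_HIDDEN_SIZE = {2560: "qwen3_4b", 4096: "qwen3_8b"}


def variant_from_hidden_size(hidden_size: int) -> Optional[str]:
    """Strict lookup: unknown sizes yield None instead of a default variant."""
    return _QWEN3_VARIANT_BY_HIDDEN_SIZE.get(hidden_size)


def require_qwen3_variant(hidden_size: int) -> str:
    """Raise rather than guess when the size matches no known Qwen3 variant."""
    variant = variant_from_hidden_size(hidden_size)
    if variant is None:
        raise NotAMatchError(
            f"hidden_size={hidden_size} does not match a known Qwen3 variant"
        )
    return variant
```

With this shape, a causal LM with an unrelated hidden size (5120, for example) falls through to the next probe instead of being claimed as qwen3_4b.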

Frontend

New MistralEncoderModelConfig type + isMistralEncoderModelConfig, isFlux2DevMainModelConfig, isFlux2DevDiffusersMainModelConfig guards. selectMistralEncoderModels, selectFlux2DevDiffusersModels, useMistralEncoderModels, useFlux2DevDiffusersModels hooks/selectors.
paramsSlice: flux2DevVaeModel / flux2DevMistralEncoderModel / flux2DevSourceModel fields + reducers + selectIsFlux2Dev / selectIsFlux2Klein selectors.
ParamFlux2DevModelSelect component (VAE + Mistral Encoder dropdowns), wired into AdvancedSettingsAccordion (Dev shows the Mistral selector, Klein keeps the Qwen3 selector).
buildFLUXGraph: dev branch for flux2_dev_model_loader / flux2_dev_text_encoder / shared flux2_denoise with full txt2img / img2img / inpaint / outpaint support, plus multi-reference image editing via flux_kontext collect chain (Flux2RefImageExtension is model-agnostic). Dev model loader's vae is wired into both flux2_denoise (required for BN statistics / inpaint) and flux2_vae_decode.
New addFlux2DevLoRAs helper, wires LoRAs through flux2_dev_lora_collection_loader.
readiness.ts: variant-aware FLUX.2 readiness — dev requires flux2DevVaeModel + flux2DevMistralEncoderModel (or a Dev diffusers source), Klein keeps the Qwen3/VAE check. hasFlux2DevDiffusersSource threaded through both generate and canvas tabs.
zModelType / zModelFormat / zFlux2VariantType extended for mistral_encoder / mistral_small_3_1 / dev. New zMistralVariantType in AnyModelVariant union. Display-name maps updated.
New i18n keys: noFlux2DevVaeModelSelected, noFlux2DevMistralEncoderModelSelected.
OpenAPI schema regenerated; TS types up to date.

Starter models (7 entries)

black-forest-labs/FLUX.2-dev Diffusers (~80 GB, Non-Commercial)
diffusers/FLUX.2-dev-bnb-4bit Diffusers (NF4, ~18 GB w/ offload)
city96/FLUX.2-dev-gguf Q4 / Q6 / Q8 transformer-only entries depending on the FLUX.2 VAE + a Mistral encoder
Mistral Small 3.1 encoder (bf16, NF4)

Related Issues / Discussions

None yet. The upstream feature/flux2-noncommercial-license work is unrelated but worth coordinating with — FLUX.2 [dev] inherits the BFL Non-Commercial License and is flagged in isNonCommercialMainModelConfig.

QA Instructions

Backend probing (no model load required, takes seconds):

uv run --extra cuda pytest tests/test_imports.py tests/model_identification/test_identification.py::test_default_settings_main tests/model_identification/test_identification.py::test_controlnet_t2i_default_settings

Manual probe against any local FLUX.2 [dev] artifacts (folder, transformer subfolder, text_encoder subfolder, GGUF, single-file VAE, Mistral GGUF) — all should classify with the matching dev / mistral_encoder configs and existing FLUX.2 Klein fixtures should still classify identically.

Frontend checks (from invokeai/frontend/web):

pnpm lint:tsc
pnpm lint:eslint
pnpm test:no-watch

End-to-end with a real model:

Install via the Model Manager:
- Transformer: any of the city96/FLUX.2-dev-gguf quants (Q4_K_M is a good balance at ~18 GB) or the full black-forest-labs/FLUX.2-dev diffusers folder
- VAE: flux2-vae.safetensors (or extract from a diffusers source)
- Mistral encoder: mistralai/Mistral-Small-3.1-24B-Instruct-2503 GGUF (or the BFL text_encoder subfolder, or the diffusers/FLUX.2-dev-bnb-4bit text_encoder for NF4)
Select the FLUX.2 [dev] transformer in the main-model picker. The Advanced Settings accordion should now show "FLUX.2 [dev] VAE" and "FLUX.2 [dev] Mistral Encoder" dropdowns (not the Klein Qwen3 selector).
For GGUF / single-file transformers: pick the standalone VAE + Mistral encoder. The pre-queue check should now produce the Dev-specific error messages if either is missing.
Defaults from MainModelDefaultSettings.from_base(Flux2, Dev): 28 steps, guidance 3.5, CFG 1.0, 1024×1024.
Generate. Watch the log — first run will warm the model cache; subsequent runs go straight to denoise.
Try a multi-reference image: add a FLUX.2 ref image (the same UI as Klein), generate.
Apply a FLUX.2 LoRA and generate — variant mismatch (Klein LoRA on Dev) should log a warning but not crash.

Tokenizer/processor caveat: the GGUF Mistral encoder has no bundled tokenizer; the loader pulls from a fallback chain (BFL FLUX.2-dev tokenizer/, then mistralai/Mistral-Small-3.1-…, then …-3.2-…). If HF_ENDPOINT is unreachable and nothing is cached, the loader raises a clear error with three documented workarounds (install the Diffusers folder, set a reachable endpoint, or pre-cache with huggingface-cli download …).
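The fallback chain can be pictured as a nested loop over passes, sources, and loader types. The sketch below is hypothetical: load_with_fallback and the Loader signature are invented for illustration, the loaders are injected callables rather than real AutoProcessor / AutoTokenizer calls, and the exact interleaving (one full cache-only pass over all sources, then one online pass) is an assumption about what "cache-first then online" means.

```python
from typing import Any, Callable, List, Sequence

# (source, local_files_only) -> loaded processor or tokenizer
Loader = Callable[[str, bool], Any]


def load_with_fallback(sources: Sequence[str], loaders: Sequence[Loader]) -> Any:
    """Try every (source, loader) pair from cache first, then again online."""
    errors: List[str] = []
    for local_only in (True, False):
        for source in sources:
            for loader in loaders:
                name = getattr(loader, "__name__", repr(loader))
                try:
                    return loader(source, local_only)
                except Exception as exc:  # keep going, remember why it failed
                    errors.append(f"{source} via {name} (local_only={local_only}): {exc}")
    raise RuntimeError(
        "No tokenizer/processor could be loaded. Attempts:\n" + "\n".join(errors)
    )
```

Collecting every failure before raising is what makes a final error message with concrete workarounds possible, since the caller can see which sources were tried and in which mode.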

Merge Plan

No DB migration: the params slice gained new fields, but they default to null and the slice version did not change. Existing user states keep working — Dev-specific fields just stay null until the user picks a Dev model.
Tested locally against a flux2-dev-Q2_K.gguf + Mistral 3.2 Q3_K_S GGUF + standalone FLUX.2 VAE setup; full-precision Diffusers pipeline path is structurally covered but I have not run inference on the 80 GB bf16 weights.
Non-Commercial License: dev inherits the existing FLUX dev / Klein 9B non-commercial flagging in isNonCommercialMainModelConfig.

Checklist

The PR has a short but descriptive title, suitable for a changelog — e.g. feat(flux2): add FLUX.2 [dev] support
Tests added / updated (if applicable) — existing model-identification + readiness test fixtures updated to cover the new fields; Klein fixtures verified still passing
❗Changes to a redux slice have a corresponding migration — N/A, new fields default to null, slice version unchanged
Documentation added / updated (if applicable)
Updated What's New copy (if doing a release after this PR)

Adds end-to-end support for FLUX.2 [dev] alongside the existing Klein implementation. Dev uses Mistral Small 3.1 (24B) as its sole text encoder instead of Klein's Qwen3, with joint_attention_dim=15360 and the guidance-distilled 32B transformer.

Backend
- taxonomy: Flux2VariantType.Dev, ModelType.MistralEncoder, ModelFormat.MistralEncoder, MistralVariantType
- configs: probe dev via context_in_dim=15360 (main + LoRA); new mistral_encoder.py with Diffusers / Checkpoint / GGUF configs; Main_Diffusers_Flux2_Config accepts Flux2Pipeline class name
- loaders: new mistral_encoder.py (AutoModel for Diffusers folder, MistralModel for single-file + GGUF with llama.cpp key conversion). Existing Klein transformer loaders are generic enough for dev
- ModelRecordChanges.variant union extended with MistralVariantType

Invocations
- flux2_dev_model_loader, flux2_dev_text_encoder (Mistral chat-template with FLUX2_DEV_SYSTEM_MESSAGE and layer-stacking 10/20/30), flux2_dev_lora_loader (+ collection variant)
- MistralEncoderField on model.py; flux2_denoise / flux2_vae_decode / flux2_vae_encode reused unchanged (already model-agnostic)

Frontend
- types/hooks/selectors for MistralEncoder, isFlux2DevMainModelConfig, selectFlux2DevDiffusersModels, useMistralEncoderModels
- params slice fields flux2DevVaeModel / flux2DevMistralEncoderModel / flux2DevSourceModel + reducers, selectIsFlux2Dev / selectIsFlux2Klein
- ParamFlux2DevModelSelect component, wired into AdvancedSettingsAccordion
- buildFLUXGraph dev branch with full txt2img / img2img / inpaint / outpaint + multi-reference image editing (same flux_kontext + collect chain as Klein, since Flux2RefImageExtension is model-agnostic)
- addFlux2DevLoRAs helper for dev LoRA wiring
- zModelType / zModelFormat / zFlux2VariantType extended for mistral_encoder / mistral_small_3_1 / dev
- OpenAPI schema regenerated, TS types updated

Starter models
- FLUX.2 [dev] Diffusers (bf16 + NF4), three GGUFs (Q4/Q6/Q8), Mistral encoder (bf16 + NF4)

Follow-up fixes after first end-to-end run with FLUX.2 [dev] GGUF + Mistral 3.x GGUF + standalone FLUX.2 VAE.

Frontend
- buildFLUXGraph: wire dev model loader's vae into both flux2_denoise (required for BN statistics / inpaint) and flux2_vae_decode; missing edge was raising RequiredConnectionException at runtime
- readiness.ts: variant-aware FLUX.2 readiness check: dev requires flux2DevVaeModel + flux2DevMistralEncoderModel (or a Dev diffusers source); Klein keeps Qwen3/VAE check. Threads hasFlux2DevDiffusersSource through generate + canvas tabs and updates buildGenerateTabArg / buildCanvasTabArg test helpers
- en.json: noFlux2DevVaeModelSelected, noFlux2DevMistralEncoderModelSelected

Mistral encoder loader (GGUF / single-file)
- Fix "Cannot copy out of meta tensor": llama.cpp conversion produced `model.*` keys but loader instantiated bare MistralModel (no `model.` prefix). Add _convert_for_bare_mistral_model to strip the prefix and drop lm_head before load_state_dict
- _materialize_remaining_meta_tensors: after load_state_dict, replace any still-meta parameters (norms→ones, others→zeros) and buffers so the cache→VRAM move can't fail on partial state dicts, with a warning listing what was missing
- llama.cpp converter: map attn_q_norm/attn_k_norm (Mistral 3.x qk-norm variants), with ordering before attn_q/attn_k to avoid bad rewrites

Tokenizer / processor fallback
- _load_processor_with_offline_fallback walks a list of sources (black-forest-labs/FLUX.2-dev tokenizer subfolder, then mistralai/Mistral-Small-3.1-… and 3.2-…), trying AutoProcessor then AutoTokenizer for each, cache-first then online. Final error spells out the three workarounds (install Diffusers folder, set HF_ENDPOINT, pre-cache the tokenizer)
- flux2_dev_text_encoder: try multimodal `[{type, text}]` chat template first (PixtralProcessor / Mistral3Processor), fall back to plain string content (AutoTokenizer), then to manual [INST]…[/INST]

Qwen3 encoder probe strictness
- _get_qwen3_variant_from_state_dict and _get_variant_from_config now return None / raise NotAMatchError for unknown hidden_size instead of silently defaulting to qwen3_4b. The old fallback meant any llama.cpp GGUF causal LM (Mistral, Llama, …) was wrongly classified as Qwen3; this was visible when the Mistral 3.x GGUF was identified as a Qwen3-4B encoder
- Checkpoint / GGUF / Diffusers loaders propagate the strictness
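The note about mapping attn_q_norm / attn_k_norm before attn_q / attn_k comes down to rule ordering in a substring-based rewrite: attn_q is a prefix of attn_q_norm, so the shorter rule would fire first and mangle the longer name. A minimal sketch, assuming a first-match-wins rule list; the names rewrite_key and REWRITES and the right-hand target names are illustrative assumptions rather than the converter's actual table.

```python
from typing import List, Sequence, Tuple

# Order matters: the longer, more specific names come before their prefixes.
# Right-hand names are illustrative targets, not the PR's real mapping.
REWRITES: List[Tuple[str, str]] = [
    ("attn_q_norm", "self_attn.q_norm"),
    ("attn_k_norm", "self_attn.k_norm"),
    ("attn_q", "self_attn.q_proj"),
    ("attn_k", "self_attn.k_proj"),
]


def rewrite_key(key: str, rules: Sequence[Tuple[str, str]] = REWRITES) -> str:
    """Apply the first rule whose source name occurs in the key."""
    for old, new in rules:
        if old in key:
            return key.replace(old, new)
    return key


good = rewrite_key("blk.0.attn_q_norm.weight")
# Same rules in the wrong order: "attn_q" matches inside "attn_q_norm".
bad = rewrite_key("blk.0.attn_q_norm.weight", list(reversed(REWRITES)))
print(good)
print(bad)
```

The second call shows the failure mode the ordering avoids: the norm weight ends up under a name derived from the projection rule.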

Pfannkuchensack added 2 commits May 25, 2026 06:11

Pfannkuchensack requested review from JPPhoto, blessedcoolant, dunkeroni and lstein as code owners May 25, 2026 20:15

github-actions Bot added the python, invocations, backend, services, and frontend labels May 25, 2026

Chore Path fix

684d7d5

Pfannkuchensack wants to merge 3 commits into invoke-ai:main from Pfannkuchensack:feature/flux2-dev-support
