
Fix FlashAttention3 optimal use when available #195

Open
3manifold wants to merge 1 commit into Lightricks:main from 3manifold:patch-1

Conversation

@3manifold 3manifold commented Apr 17, 2026

When the model config sets `"attention_type": "default"` (e.g. the config `{"transformer": {"_class_name": "AVTransformer3DModel", ..., "attention_type": "default", ...}}` shipped with https://huggingface.co/Lightricks/LTX-2.3/blob/main/ltx-2.3-22b-dev.safetensors), the `AttentionCallable` is not selected optimally in `packages/ltx-core/src/ltx_core/model/transformer/attention.py`.

In brief, `packages/ltx-core/src/ltx_core/model/transformer/model_configurator.py` passes `attention_type` down to `packages/ltx-core/src/ltx_core/model/transformer/attention.py`. There, even though the imports at the top of the file correctly detect which backends are available (see `memory_efficient_attention` and `flash_attn_interface`), the `AttentionFunction(Enum)` class mishandles the `AttentionCallable` values.
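
For context, the detection at the top of the file follows the usual try/except import pattern. This is a minimal sketch: only `memory_efficient_attention` and `flash_attn_interface` are names from the file; the flag names are assumptions for illustration.

```python
# Sketch of import-time backend detection (flag names are illustrative).
try:
    from xformers.ops import memory_efficient_attention  # XFormers backend
    _XFORMERS_AVAILABLE = True
except ImportError:
    _XFORMERS_AVAILABLE = False

try:
    import flash_attn_interface  # FlashAttention 3 backend
    _FLASH_ATTN_3_AVAILABLE = True
except ImportError:
    _FLASH_ATTN_3_AVAILABLE = False
```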

This PR fixes that behaviour, following the priority FA3 > XFormers > PyTorch. It also adds a fallback from FlashAttention 3 to PyTorch attention when an arbitrary attention mask is supplied, since FlashAttention 3 does not support arbitrary dense masks; without the fallback, such runs error out.
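
A minimal sketch of the intended selection logic, reusing the availability flags from the sketch above. The helper name `select_attention_backend` is hypothetical, not the PR's actual code:

```python
from typing import Optional

import torch

def select_attention_backend(attn_mask: Optional[torch.Tensor]) -> str:
    """Illustrative backend choice with priority FA3 > XFormers > PyTorch."""
    if _FLASH_ATTN_3_AVAILABLE:
        # FlashAttention 3 does not accept arbitrary dense masks, so masked
        # calls fall back to PyTorch attention instead of erroring at runtime.
        return "pytorch" if attn_mask is not None else "flash_attention_3"
    if _XFORMERS_AVAILABLE:
        return "xformers"
    # torch.nn.functional.scaled_dot_product_attention is always available
    # and handles arbitrary masks.
    return "pytorch"
```

With this priority, a run that passes a dense mask routes through `torch.nn.functional.scaled_dot_product_attention` rather than crashing inside the FA3 kernel.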

resolves #196

@3manifold 3manifold marked this pull request as ready for review April 17, 2026 13:48
@3manifold 3manifold changed the title from "Fix FlashAttention3 optimal support use when available" to "Fix FlashAttention3 optimal support when available" Apr 17, 2026
@3manifold 3manifold changed the title from "Fix FlashAttention3 optimal support when available" to "Fix FlashAttention3 optimal use when available" Apr 17, 2026

Development

Successfully merging this pull request may close these issues:

[Bug] FlashAttention3 not employed when available & attention_type is default
