Skip to content

Add BCn encoders/decoders with RDO support#1167

Open
walcht wants to merge 83 commits into
KhronosGroup:mainfrom
walcht:add-BCn-decoder
Open

Add BCn encoders/decoders with RDO support#1167
walcht wants to merge 83 commits into
KhronosGroup:mainfrom
walcht:add-BCn-decoder

Conversation

@walcht

@walcht walcht commented May 7, 2026

Copy link
Copy Markdown
Contributor

As discussed in #1159: ktxTexture2_CompressBCn and ktxTexture2_DecodeBCn are introduced in this PR to allow libktx users/consumers to encode/decode BCn textures from/to raw decompressed formats.

https://github.com/richgel999/bc7enc_rdo does not support BC6HU/BC6HS encoding/decoding and also no BC2 (this format is essentially dead since BC3 replaces it).

Please feel free to give feedback, edit, and nitpick as much as possible.

Some context: I am adding KTX2 support to OIIO (PR: AcademySoftwareFoundation/OpenImageIO#5185) and having libktx encode/decode BCn formats significantly simplifies things (also ETC encoding/decoding which I can also open a PR for - if approved).

I haven't updated the KTX-Software-CTS with the added BCn test files.

Once this is finalized, this will fix #587.

Current TODOs:

  • add CTS test files from a primary platform. As far as I understand, I shouldn't create/transcode KTX2 test files on non-primary platfroms (non arm64). I did do the testing on my local machine (see https://github.com/walcht/KTX-Software-CTS). => added CTS tests + golden files (still not on a primary platform).
  • add RDO post processing step
    • expose RDO parameters with detailed description (some parameters are still missing - I haven't understood them yet)
    • BC1 mode RDO
    • [x] BC2 mode RDO BC2 is rarely used (I don't see any reason why to use it instead of BC3). Postponed or not planned to be implemented at all.
    • BC3 mode RDO
    • BC4 mode RDO
    • BC5 mode RDO
    • BC6H mode RDO (HDR). I don't know if this is possible. If not, then I will not adjust the ert::reduce_entropy function for the moment. postponed (too much work/overload to include in this PR).
    • BC7 mode RDO
    • ultra smooth mode (see: https://richg42.blogspot.com/2021/02/updated-bc7encrdo-with-improved-smooth.html). => trivial to implement, just have to remove dependency on image_u8
  • add multithreading for encoder part (up to 9X decrease in time for my 6-core system)
  • add multithreading for RDO (bc7enc_rdo makes use of OpenMP so we need to do our own MT same as in encoder - this is trivial) => 5.50X to 7.00X time decrease compared to single-threaded mode.
  • add BCn encoding support for ktx encode command
  • add BCn decoding to PNG support for ktx extract command
  • add BCn encoding support for ktx create command
  • agree on which parameters to expose in ktxBCnParams struct (apparently there are many and I am not qualified to know which subset to expose or to expose them all). For the moment I will just expose them all.
    We agreed on using same parameters as UASTC RDO and adding additional ones if they make sense (I haven't benchmarked skip 0 MSE error option so I might remove it if it useless).
  • fix bc7enc_rdo compiler warnings (very simple to do, before merging git diff with original files so that we are 100% that we haven't changed anything).
  • finalize test cases (not CLI tests but rather libktx tests - e.g., texturetests, etc.).
  • enable SIMD acceleration for BC7 encoder (using ISPC - see https://github.com/ispc/ispc) (this should be straightforward and should be enabled by default via a CMake flag; e.g., BC7_SIMD). => since we are planning to use bc7f we will be planning to use bc7g once it is released (this is the SIMD equivalent of bc7f).
  • test LIBKTX_FEATURE_BCN_DECODER CMake flag option
  • use bc7f for BC7 encoding instead of bc7enc_rdo's encoder (significantly faster + maintained).
  • add single image decoders used by GLUpload/VkUpload (should be exposed in the library API but must be independent of the ktxTexture* classes. Software reading a KTX file incrementally will find them useful).
  • add BC6H decoder (copied from MIT Licensed https://github.com/iOrange/bcdec from basisu)
  • add BC6H encoder from basist::astc_6x6_hdr::fast_encode_bc6h
    • fix issues with signed formats (same way that UASTC HDR handles it) => if we are given a signed input which the encoder can't handle; asserts fail, etc. (obviously, since BC6HU is the only one supported). Need to do some pre-processing step clamp signed and above-maximal-encodable value...
  • add BC2 decoder (unpack BC1 followed by unpack sharp alpha).
  • [x] add BC2 encoder postponed/aborted (rarely used).

Note1: no LLMs/AI coding tools were used in any capacity whatsoever in writing or aiding in the writing of this PR.
Note2: I am an individual contributor (main reason I am contributing here is to add support for KTX2 in Blender).

Edit: TODO list edits

walcht and others added 4 commits May 5, 2026 17:01
Signed-off-by: Walid Chtioui <walid.chtioui.main@gmail.com>
Signed-off-by: Walid Chtioui <walid.chtioui.main@gmail.com>
*Add encode/decode tests (that use both CompressBCn/DecodeBCn)
*Add BCn ktx2 test files (transcoded from
tests/resources/ktx2/color_grid_uastc_zstd_5.ktx2)
*Cleanup BCn test fixtures
*Remove `std::cout` statement

Signed-off-by: Walid Chtioui <walid.chtioui.main@gmail.com>
@walcht walcht marked this pull request as draft May 7, 2026 14:11
@walcht

walcht commented May 7, 2026

Copy link
Copy Markdown
Contributor Author

There are also a lot of compiler warnings from bc7enc_rdo dependency. These should be straightforward to address directly in copied files from bc7enc_rdo.

walcht added 2 commits May 8, 2026 09:13
*Add BC1, BC3, BC4, BC5, and BC7 encoding support to "ktx encode"
command.
*Cleanup ktxBCnParams and add BC1/BC3 quality and mode params.
*Add docstrings/documentation to newly added enums/structs in ktx.h.

Signed-off-by: Walid Chtioui <walid.chtioui.main@gmail.com>
@CLAassistant

CLAassistant commented May 8, 2026

Copy link
Copy Markdown

CLA assistant check
All committers have signed the CLA.

@MarkCallow

MarkCallow commented May 9, 2026

Copy link
Copy Markdown
Collaborator

@walcht, Thank you for this. Can you view the build logs?

One issue I notice immediately is that there are no changes to command_create.cpp. This too needs updating to support BCn encoding.

Regarding RDO and multi-threading, the UASTC encoder also has separate multi-threading options for the encoder and for the RDO step. You can follow the same model.

If we need to fix warnings in the encoder I suggest forking the encoder then incorporating the fork here by way of git subrepo. That will make it easier to potentially contribute fixes back upstream. If you are agreeable I can make the fork and provide instructions for how to incorporate it in your workarea.

Re. SIMD and ISPC, be careful how you support this. We need to support building and running on arm64 processors. Also compile flags to enable SSE or other SIMD options are not compatible with straightforward use of universal build tool chains, which is why this project does not do universal builds. Better is use of compiler pre-defined macros and run-time queries to discover what the software is being compiled for and running on. However since we aren't doing universal builds there is no need to obsess over this last detail.

@walcht

walcht commented May 10, 2026

Copy link
Copy Markdown
Contributor Author

@MarkCallow - Concerning the command_create.cpp, I am adding BCn encoding for it currently (somehow forgot it) - will also check if I have missed any other commands.

Concerning the build/CI logs: they are mostly failing because of bc7enc_rdo compiler warnings which should be suppressed. They will also fail because I haven't re-generated the golden files yet for ktx CTS (e.g., ktx encode --help output).

If you are agreeable I can make the fork and provide instructions for how to incorporate it in your workarea

Please do so (as far as I understood, this will be forked under the KhronosGroup and any updates here will be pushed there via the git subrepo command). Until that is done, I will keep edit the bc7enc_rdo sub-folder until the CI passes.

Concerning SIMD and ISPC: I think it makes sense to leave this for another PR, do you agree? (reasoning is this: I have to get the basics working properly, add proper testsuite that covers all supported BCn formats, etc. Once that is done, I can open another PR for SIMD performance improvements). Or I will leave this to very end (last in TODO list above).

walcht added 2 commits May 10, 2026 05:44
*Suppress bc7enc_rdo warnings like unused-variables, memset'ing a
non-trivial class (in this case the class is obviously trivial hence a
void* cast is used to suppress this warning).

Signed-off-by: Walid Chtioui <walid.chtioui.main@gmail.com>
*Some CIs report further unused variables/functions that are not
reported when building locally.

Signed-off-by: Walid Chtioui <walid.chtioui.main@gmail.com>
@walcht

walcht commented May 10, 2026

Copy link
Copy Markdown
Contributor Author

Updates about CI jobs:

  • reuse lint: I don't know why it is not ignoring external/bc7enc_rdo but I have added the SPDX comments in these files.
  • MingW CI: failing because of some linkage error with bc7enc_rdo (undefined reference to rgbx::). I have to test this on my Windows machine to see what I am doing wrong/missing.
  • MacOS (arm64) compiling bc7enc_rdo fails. I have no idea why (I don't have a MacOS machine so I will be testing this CI on my fork).
  • Windows CI: probably some NSIS issue (I don't think I have introduced anything to make it fail)
  • Other failures are still due to unused-variables/functions that I don't get locally (have addressed all that were mentioned in the CI logs).

@MarkCallow

MarkCallow commented May 10, 2026

Copy link
Copy Markdown
Collaborator

@walcht,

You will need to rebase to or merge current main to get the fix for the NSIS issue on Windows.

I have created a fork of https://github.com/richgel999/bc7enc_rdo. To incorporate it in your add-BCn-decoder branch do the following in the repo root directory:

git subrepo clone https://github.com/KhronosGroup/bc7enc_rdo.git external/bc7enc_rdo -b changes_for_ktx

You can make changes in this subdirectory, as you are now, and when everything is working I can push the changes to the fork.

Concerning SIMD and ISPC: I think it makes sense to leave this for another PR

I agree. I wanted to make you aware that arm64 is a build target.

Re. reuse, you have to add an entry to REUSE.toml to get external/bc7enc_rdo ignored. It is better to do that than add SPDX comments to all the files. The entry in REUSE.toml will have to mention a license. Use the MIT license option.

Here are a few high level points.

  • I want to use the bc7f encoder once it is integrated into the bc7enc_rdo repo.
  • Re. decoding, currently the ETC and ASTC decoders have different APIs. The former has a non-public unpack function for single blocks. This is called from the GL and Vulkan upload functions, if the device does not support ETC, to convert the data before it is uploaded. The latter has a call that decodes and converts an entire ktxTexture2 object. I want to have both for all decoders: ASTC, BCn and ETC and the transcoders as both have their uses. It was a mistake not to have provided an unpack function and updated the upload functions when the ASTC decoder was added.
  • I would like BC6H support to be included in this or a later PR.
  • I will be away for 5 days from May 15th and not be able to approve workflow runs during that time. Please plan accordingly.

@MarkCallow

Copy link
Copy Markdown
Collaborator

Re the macOS build failure, because the output from the Xcode build is so voluminous it is run through a script, xcpretty, to prettify it. On a past CI service, without this, the logs exceeded the maximum allowed. Recently, for reasons I have yet to investigate, it has started swallowing compile errors. You can turn it off by editing scripts/install_macos.sh and commenting out the line that installs xcpretty.

walcht added 2 commits May 10, 2026 12:31
*Add BCn encoder support for `ktx create` command.
*Add missing tests for `ktx encode`, `ktx extract`, and `ktx create`.
*Expose BC1/BC3 approximation mode option to ktx CLIs
*Misc cleanups (still early-stage PR)

Signed-off-by: Walid Chtioui <walid.chtioui.main@gmail.com>
Signed-off-by: Walid Chtioui <walid.chtioui.main@gmail.com>
@walcht

walcht commented May 10, 2026

Copy link
Copy Markdown
Contributor Author

@MarkCallow

I want to use the bc7f encoder once it is integrated into the bc7enc_rdo repo.

Agree. I wasn't initially aware of this. Apparently bc7f is significantly faster that the bc7 encoder used here (also the added benefit of being continuously maintained). I will integrate this right now since this seems to be straightforward (bc7enc_rdo also seems a bit not-longer-maintained so l think it's better to just integrate it now rather than waiting for it to be integrated into bc7enc_rdo repo).

Concerning the decoding API: I added BCn decoders for VkUpload/GLUpload as a TODO (will follow same API as in etcunpack). Might also open a PR to add it for ASTC since I have already spent some time getting familiar with this code base.

I would like BC6H support to be included in this or a later PR.

Will add it in this PR. Will add it at the very end though since I have to finalize current formats.

I will be away for 5 days ...

I will try to address CI issues now (It is fine if I incrementally commit here to see if certain jobs pass?).

walcht and others added 6 commits May 10, 2026 14:30
Signed-off-by: Walid Chtioui <walid.chtioui.main@gmail.com>
Signed-off-by: Walid Chtioui <walid.chtioui.main@gmail.com>
…Group/bc7enc_rdo.git external/bc7enc_rdo

subrepo:
  subdir:   "external/bc7enc_rdo"
  merged:   "dbe416d2"
upstream:
  origin:   "https://github.com/KhronosGroup/bc7enc_rdo.git"
  branch:   "changes_for_ktx"
  commit:   "dbe416d2"
git-subrepo:
  version:  "0.4.9"
  origin:   "https://github.com/ingydotnet/git-subrepo"
  commit:   "5e0f401"
*Before this commit, bc7enc_rdo dependency was manually copied to
external/bc7enc_rdo directory (only needed files were copied). This was
not ideal for a lot of reasons (mainly that we are introducing changes
that may be streamed back to the original repo and having a
subrepo/submodule is better suited for that than manually copying
dependency files).
*Add bc7enc_rdo to REUSE.toml with MIT license.
*Git ignore compile_commands.json file (used by clangd LSP)

Signed-off-by: Walid Chtioui <walid.chtioui.main@gmail.com>
@walcht

walcht commented May 10, 2026

Copy link
Copy Markdown
Contributor Author

The integrated basisu_transcoder is a bit outdated (actually, significantly) and doesn't contain bc7f. So for the moment I will just use bc7enc_rdo's BC7 encoder until the fork at (https://github.com/KhronosGroup/basis_universal/tree/fixes_for_ktx_v5_0) is updated to a commit that includes the bc7f namespace.

@MarkCallow

MarkCallow commented May 11, 2026

Copy link
Copy Markdown
Collaborator

I was not aware that bc7f is included in basisu_transcoder. By pure coincidence I have just completed integration of Basis Universal release 2.1.0. See the update_basisu_to_2_1_0 branch. We have the v5.0.0 release in flight. I've made a v5.0.0-rc1 release while we wait for some external (to KTX-Software) pieces to be put in place. My plan was to wait until we made the v5.0.0 release before merging this branch. If you retarget this PR and your branch to update_basisu_to_2_1_0 you can start working on bc7f now.

Neither the KTX-Software code nor our golden files required any updates for our extensive test suite to pass with BU 2.1.0. I am therefore amenable to merging it now but will have to discuss within the Khronos WG and can't make any promises.

The single image decoders used by GLUpload/VkUpload should be exposed in the library API but must be independent of the ktxTexture* classes. Software reading a KTX file incrementally will find them useful.

Regarding the current ETC decoder, please note that it does not have a recognized open source license so might not be suitable for your OIIO work.

*Add initial RDO post processing step for BC1, BC3, and BC7 but without
ultrasmooth blocks support see:
https://richg42.blogspot.com/2021/02/updated-bc7encrdo-with-improved-smooth.html

*Fix encoder in case input texture does not have multiple-of-4
dimensions. This was reading beyond std::vector size before this commit.

*Add initial RDO params to BCnParams with verbose explanation/description
comments.

*Misc refactoring/adjustments.

Signed-off-by: Walid Chtioui <walid.chtioui.main@gmail.com>
@walcht

walcht commented May 14, 2026

Copy link
Copy Markdown
Contributor Author

@MarkCallow - please don't approve the CI workflow yet (it will fail because I haven't updated the golden test files yet).

  • I have added initial RDO postprocessing step support for BC1, BC3, and BC7 (only relying on ert.h/cpp without having to use bc7enc_rdo utils.h/.cpp and rdo_bc_encoder.h/cpp because these used OpenMP and it's just better that we only depend on the actual core RDO function in ert.h/cpp).
  • Also fixed the encoder in case input texture is not multiple of 4 (still have to absolutely verify this in all cases - this was reading beyond std::vector's size before this commit...).
  • No ultrasmooth block support yet (added to TODO list).
  • Added initial RDO params for BCnParams in ktx.h (some detailed description for fields I understand).

Will add bc7f once I finish RDO post processing.

@MarkCallow

Copy link
Copy Markdown
Collaborator

I have just merged PR #1170 so you will now find bc7f in main. Please remove external/bc7enc_rdo and adapt this to bc7f. I hope that not too much of your work to date will have been wasted.

One other thing re. bcn_codec.cpp, we need to be able to build with just the decoder when building libktx_read. astc_codec.cpp is one file because of pieces needed by both encoder and decoder so it has ifdefs to allow building only the decoder parts. Unless there is substantial common code you can consider making separate source files for encode and decode. If you do not, then add similar ifdefs.

walcht added 3 commits June 17, 2026 11:51
*Number of components was set to 4 which is incorrect. Changed to 3
which is what BC1 input (i.e., RGB) expects.

Signed-off-by: Walid Chtioui <walid.chtioui.main@gmail.com>
Signed-off-by: Walid Chtioui <walid.chtioui.main@gmail.com>
Signed-off-by: Walid Chtioui <walid.chtioui.main@gmail.com>
@walcht

walcht commented Jun 18, 2026

Copy link
Copy Markdown
Contributor Author

I have a question:

  • Doesn't it make sense to add a warning to both ktx create and ktx encode with BCn encoder when RDO is used but no supercompression is applied?

I will push some commits that finalize BCn encoder documentation and adds a lot of more tests that ensure:

  • non-RDO MT output is exactly the same as non-RDO ST output
  • RDO MT output is exactly the same as RDO ST output
  • RDO tests are now always used with supercompression (--zstd and --zlib)
  • RDO and non-RDO params are tested separately
  • ST and MT are tested separately

I also added some option to CI scripts to print out compiler options flags. This is extremely important for us since I want to know if libktx + ktxtools are compiled used -fno-strict-aliasing or not.

@walcht

walcht commented Jun 18, 2026

Copy link
Copy Markdown
Contributor Author

I also found out the issue why the runners fail with MT RDO (most likely => I verified this and yes this is 100% the cause):

  • ktx create <...> --bcn-rdo (or encode) with --threads 12 does not generate same output as with --threads 1. Why? The naive approach of dividing RDO work between threads results in non-deterministic output in a way that the parameter --bcn-rdo-d is no longer respected when using MT (and becomes dependent on number of threads!).

This is why when omitting the threads option, the code uses max number of available physical threads (4 for Github runners) which is different from the one on my system (12) and which explains why RDO fails.

Let me guess, your MacOS machine certainly has to have 12 physical threads (lscpu shows 12)? Otherwise this explanation is wrong.

This is also incorrect in the original bc7enc_rdo repo (very surprising since it wasn't mentioned in there).

IMO, the work has to be distributed to RDO runners in a more clever way where:

  1. window size (or dictsize) is respected regardless of the number of chosen threads
  2. may use less that the number of specified threads to respect 1)

I think this is also why UASTC RDO is producing non-deterministic output (minus some details about hashes and std::unordered_map, etc.). Might also investigate that.

@MarkCallow MarkCallow left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still have to review the changes under the tools directory. I want to submit these comments now because they become disassociated from their files when the files are updated. I have had to make the same comments 3 times in some cases while they have been pending.

Lots of nitpicks. Nothing major. Thanks for your perseverance through all this.

Comment thread lib/include/ktx.h Outdated
Comment thread lib/include/ktx.h Outdated
Comment thread tools/ktx/command_create.cpp
Comment thread external/bc7enc_rdo/LICENSES/MIT.txt Outdated
Comment thread scripts/generate_bcn_test_files.sh Outdated
Comment thread lib/src/bcn_encoder.cpp Outdated
Comment thread lib/src/bcn_encoder.cpp Outdated
Comment thread lib/src/bcn_encoder.cpp Outdated
Comment thread lib/src/bcn_encoder.cpp
Comment thread .gitignore
@MarkCallow

Copy link
Copy Markdown
Collaborator

I have a question:

* Doesn't it make sense to add a warning to both `ktx create` and `ktx encode` with BCn encoder when RDO is used but no supercompression is applied?

I don't think we are doing it with UASTC LDR. But it does make sense. Please add it for both.

I also added some option to CI scripts to print out compiler options flags. This is extremely important for us since I want to know if libktx + ktxtools are compiled used -fno-strict-aliasing or not.

I think it is only set for the basisu_encoder target. I didn't see those options in the review I've conducted so far.

Who is "us" in this context?

@MarkCallow

Copy link
Copy Markdown
Collaborator

I also found out the issue why the runners fail with MT RDO (most likely => I verified this and yes this is 100% the cause):

Fantastic. Congratulations.

Let me guess, your MacOS machine certainly has to have 12 physical threads (lscpu shows 12)? Otherwise this explanation is wrong.

Well lscpu is not available in macOS, but sysctl machdep.cpu says "machdep.cpu.core_count: 12".

This is also incorrect in the original bc7enc_rdo repo (very surprising since it wasn't mentioned in there).

IMO, the work has to be distributed to RDO runners in a more clever way where:

1. window size (or dictsize) is respected regardless of the number of chosen threads

2. may use less that the number of specified threads to respect 1)

I think this is also why UASTC RDO is producing non-deterministic output (minus some details about hashes and std::unordered_map, etc.). Might also investigate that.

This is almost certainly the reason why UASTC RDO too produces non-deterministic results.

The std::unordered_map issue led to non-determinism in encoding across platforms due to differing implementations. I think the basisu code has been fixed to use its own map. Certainly my tests show deterministic results across platforms for all but one very edge case and any case using zlib supercompression.

@walcht

walcht commented Jun 18, 2026

Copy link
Copy Markdown
Contributor Author

Thanks for the detailed comments/reviews. I will address this after I push the commit that fixes RDO multithreading.

Who is "us" in this context?

Typo from my part. I meant "important for this PR" so that I know whether to watch out for UBs caused by pointer castings then dereferences. I will figure this out at the very end after I address all other issues/reviews.

This is almost certainly the reason why UASTC RDO too produces non-deterministic results.

If true then a command like this ktx create <invoke-uastc-params> --uastc-rdo --uastc-rdo-d 8192 may actually do something like this (depending on how many threads) ktx create <invoke-uastc-params> --uastc-rdo --uastc-rdo-d <actual-window-size-can-be-smaller-than-8192> which at the very least, if currently not fixable, should notify the user that provided window size may not actually be the one that is used because of MT.

Also, another funny observation, if you do that (ktx create with different --threads) for typical-sized images (e.g., 1920x1080 and definitely not something extremely small like 16x16), on a typical system (e.g., 12 threads) you will probably get deterministic output.

walcht added 4 commits June 18, 2026 16:02
*Previously, RDO MT was non deterministic in the sense that a ktx
create/encode with --threads 4 (or any >1) may not necessarily produce
the same output as --threads 1. Not only that, the higher the --threads
parameter and the lower the size of the image (in number of blocks) the
more likely that the actually used RDO window size will be lower than
the one that is provided! This commit addresses this by potentially
using a lower-than-requested thread count to make sure that RDO MT
produces exact same output as RDO ST.

Signed-off-by: Walid Chtioui <walid.chtioui.main@gmail.com>
Signed-off-by: Walid Chtioui <walid.chtioui.main@gmail.com>
Signed-off-by: Walid Chtioui <walid.chtioui.main@gmail.com>
*Why did I even add that in the first place...

Signed-off-by: Walid Chtioui <walid.chtioui.main@gmail.com>
@walcht

walcht commented Jun 19, 2026

Copy link
Copy Markdown
Contributor Author

There are still some reviews/comments I haven't addressed yet.

Before doing the final (re-)review:

  • There are also still some tests that I want to add (e.g., non-multiple-of-4 input textures, bc1 mode).
  • I haven't done anything yet for premultiplied-alpha input
  • Need to update my guide about using BCn with ktx

walcht added 3 commits June 19, 2026 10:53
Signed-off-by: Walid Chtioui <walid.chtioui.main@gmail.com>
Signed-off-by: Walid Chtioui <walid.chtioui.main@gmail.com>
Signed-off-by: Walid Chtioui <walid.chtioui.main@gmail.com>

@MarkCallow MarkCallow left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes. These comments are from reviewing the changes. I have yet to review the tools changes.

Regarding clang_format, we haven't turned it on it. Before I do I want to create a smudge/clean filter that will apply the formatting on commit. (The smudge filter would be a nop.) I have not had the time. If you are already using clang_format in your workflow I'd love to hear how you do it.

Comment thread .github/workflows/macos.yml Outdated
Comment thread .github/workflows/macos.yml
Comment thread .github/workflows/mingw.yml Outdated
Comment thread .github/workflows/windows.yml Outdated
Comment thread .github/workflows/windows.yml Outdated
Comment thread lib/src/bcn_decoder.cpp Outdated
VkFormat decompressedVkFormat;
KTX_error_code result;

if (This->supercompressionScheme != KTX_SS_NONE)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. You will have to load the data. Since there is no KTX_TEXTURE_CREATE_SKIP_INFLATE flag if the data has been loaded, it will have been inflated and the SS changed. If you see an SS other than NONE, you will need to check for a pending load. If available, do it. If not raise an error. Check the ASTC decoder for an example.

Comment thread lib/src/bcn_decoder.cpp
@walcht

walcht commented Jun 21, 2026

Copy link
Copy Markdown
Contributor Author

I forgot to push my last commit before your view. Will push it shortly and will request some reviews to some changes I made.

@walcht

walcht commented Jun 21, 2026

Copy link
Copy Markdown
Contributor Author

Before I do I want to create a smudge/clean filter that will apply the formatting on commit. (The smudge filter would be a nop.) I have not had the time.

If you want I can open a PR for that (+ build step clang-format formatting check).

If you are already using clang_format in your workflow I'd love to hear how you do it.

I have it set up via Vim/NeoVim. On some custom key press I set, it just invokes the clang-format CLI on currently open buffer following the .clang-format file in current working directory. I also have it set up to do partial formatting (only selected lines) to avoid introducing unnecessary changes to un-formatted files (e.g., tools/ subfolder).

walcht added 2 commits June 21, 2026 11:58
Signed-off-by: Walid Chtioui <walid.chtioui.main@gmail.com>
Signed-off-by: Walid Chtioui <walid.chtioui.main@gmail.com>
@MarkCallow

Copy link
Copy Markdown
Collaborator

Before I do I want to create a smudge/clean filter that will apply the formatting on commit. (The smudge filter would be a nop.) I have not had the time.

If you want I can open a PR for that (+ build step clang-format formatting check).

That would be wonderful but let's wait until we get the BCn related stuff finished. There is already a workflow to run clang-format but the calls to it in other workflows are commented out. This is because several changes are needed in the formatting for some minor tweaks and to avoid breaking some files. We had a contractor who put in place these format-related items but the contract came to an end before the various issues could be addressed. I will have to find the closed PR where I listed the issues and refresh my memory.

If you are already using clang_format in your workflow I'd love to hear how you do it.

I have it set up via Vim/NeoVim. On some custom key press I set, it just invokes the clang-format CLI on currently open buffer following the .clang-format file in current working directory. I also have it set up to do partial formatting (only selected lines) to avoid introducing unnecessary changes to un-formatted files (e.g., tools/ subfolder).

Thanks for the this. I am a Vim user too. Is this a custom thing you did or is it a standard Vim extension?

@walcht

walcht commented Jun 21, 2026

Copy link
Copy Markdown
Contributor Author

That would be wonderful but let's wait until we get the BCn related stuff finished.

Agree. Either way, adding clang-format checks to CI should be fairly simple.

Is this a custom thing you did or is it a standard Vim extension?

Only using built-in commands/cli-invocations. I'm mostly using NeoVim but this should also work on Vim:

To format a selection of lines:

  1. I optionally select lines I want to format with
  2. I type :'<,'>%!clang-format which pipes the selected lines into clang-format CLI and overwrites them with its output (very handy with other CLIs. E.g., json formatting a selection using Python).

Or for formatting the whole file (this is what I usually do, then I revert hunks that I didn't change):

  1. I type :%!clang-format

@walcht

walcht commented Jun 21, 2026

Copy link
Copy Markdown
Contributor Author

While adding actual robust testing to non-CTS tests (i.e., don't just check that decoding worked fine but compare the underlying data with original input within 0.08 tolerance), I found 2 more issues:

  1. RDO with non-SRGB input yields erroneous results (it assumes SRGB input).
  2. BC5 decoder is doing something incorrect => fairly confident that my decoder is correct. There is some issue with UASTC and transcoding to BC5. I opened another issue. Details are there (I know, I'm spamming a lot of issues now...).

walcht added 4 commits June 22, 2026 08:05
*Without this, ktxdiff fails with "Mismatching header" on RGB16 SFLOAT
input.

Signed-off-by: Walid Chtioui <walid.chtioui.main@gmail.com>
*So that tests can compare decoded BCn KTX2 files with original files
without failing because of different KVD.

Signed-off-by: Walid Chtioui <walid.chtioui.main@gmail.com>
*Previously these unorm PNG BCn inputs had sRGB colorspace and were
converted via --assign-tf in ktx create calls.

*Remove (i.e., set to 0) unused channels (e.g., for rg8_unorm_bc5, set B
to 0 using oiiotool). See genktx2 for exact command.

Signed-off-by: Walid Chtioui <walid.chtioui.main@gmail.com>
*Add option to write failed ktxdiff BCn decode tests to temporary
location (in build/Testing/Temporary/GeneratedFiles/).

*Update CTS submodule to always use supercompression in tests when RDO
is used.

*Significantly improve BCn decoder and encoder testing via differential
testing with UASTC -> BCn transcode files.

Signed-off-by: Walid Chtioui <walid.chtioui.main@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add BC7 encoder with RDO

3 participants