Use gzip compression level 6 for better speed tradeoff #4444
sdodson wants to merge 1 commit into coreos:main
Conversation
Testing on rhcos-418.94.202602191946-0-qemu.ppc64le.qcow2 shows only a 0.3% improvement but a 2.7x increase in duration. When we're compressing 6-10 multi-gig files per arch per build this adds up quickly.

2462 MiB input qcow:

gzip -6     => 1216 MiB, 71 sec,  ~ 2 MiB peak RSS
gzip -9     => 1212 MiB, 188 sec, ~ 2 MiB peak RSS
pigz -6 -p4 => 1224 MiB, 5.4 sec, ~ 6 MiB peak RSS
pigz -9 -p4 => 1212 MiB, 28 sec,  ~ 6 MiB peak RSS
zstd -3 -T1 => 1197 MiB, 3.3 sec, ~ 55 MiB peak RSS
zstd -3 -T4 => 1197 MiB, 2.5 sec, ~ 112 MiB peak RSS
zstd -9 -T4 => 1153 MiB, 9.9 sec, ~ 256 MiB peak RSS

Measured on an Intel(R) Core(TM) Ultra 7 165U, limited to 4 cores when using parallel compression tools, the assumption being we're not giving our build environments > 4 cores.

Ultimately, we should consider switching to zstd, which compresses better, faster, and can leverage multiple cores. The only downside is significantly higher memory usage, which can be 100x worse than gzip and scales with thread count. However, it seems manageable at four threads in a build environment where I assume 256 MiB is readily available. On the decompression side it peaks at about 10 MiB with this sample.
Hi @sdodson. Thanks for your PR. I'm waiting for a coreos member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Code Review
This pull request changes the default gzip compression level from 9 to 6 to improve compression speed, based on benchmarks showing a significant speedup for a minimal loss in compression ratio. The changes in src/cmd-compress and src/cosalib/qemuvariants.py are consistent and correctly implement this. I've added a couple of suggestions to improve maintainability by using constants for the compression level instead of magic numbers.
src/cmd-compress:

  # CLI args or from image.json
  image_json = builds.get_build_image_json(build)
- gzip_level = 9
+ gzip_level = 6
src/cosalib/qemuvariants.py:

  match self.compression:
      case "gzip":
-         rc = ['gzip', '-9c', uncompressed_path]
+         rc = ['gzip', '-6c', uncompressed_path]
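The reviewer's suggestion about magic numbers could look roughly like the sketch below: one named constant shared by both call sites so src/cmd-compress and qemuvariants.py cannot drift apart. The names DEFAULT_GZIP_LEVEL and gzip_cmd are illustrative, not the actual cosalib API.

```python
# Hypothetical sketch, not the real cosalib code: hoist the gzip level
# into a single named constant instead of repeating the magic number.
DEFAULT_GZIP_LEVEL = 6


def gzip_cmd(uncompressed_path):
    # Builds the same command line as the diff above, but sourced from
    # one constant shared by every caller.
    return ['gzip', f'-{DEFAULT_GZIP_LEVEL}c', uncompressed_path]
```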
Oh, I should mention the data above was collected using these packages, some of which are newer than those present in RHEL9.
@dustymabe PTAL, you seem to be the person who most recently touched this, when you normalized the two points I'm changing. I think there are other places that use gzip in this codebase, but they didn't seem to be parts that deal with compressing large files (things like Ignition assets, etc.), so I don't think it matters nearly as much in those places.
If you think that's a real possibility for OCP/RHCOS, we can push for it upstream again and maybe consider switching it on for RHEL10-based RHCOS? WDYT?
Yeah, I'll summarize some more analysis and advocate for a move to zstd in that thread. I'd like for us to consider moving forward with this change independent of that outcome, given that we're only saving 3.76 MiB but adding 110 seconds to each image we compress, assuming that the one sample I have is representative. I could try other files if we suspect that gzip -9 has a larger impact on non-qcow files.
That would be great!
That's reasonable. It's worth noting that upstream in FCOS we already use
It would be nice to get the same benchmark for just a
/ok-to-test
Testing on rhcos-418.94.202602191946-0-qemu.ppc64le.qcow2 shows only a
0.3% improvement but a 2.7x increase in duration. When we're compressing 6-10
multi-gig files per arch per build this adds up quickly.
2462 MiB input qcow:
gzip -6 => 1216 MiB, 71 sec, ~ 2 MiB peak RSS
gzip -9 => 1212 MiB, 188 sec, ~ 2 MiB peak RSS
pigz -6 -p4 => 1224 MiB, 5.4 sec, ~ 6 MiB peak RSS
pigz -9 -p4 => 1212 MiB, 28 sec, ~ 6 MiB peak RSS
zstd -3 -T1 => 1197 MiB, 3.3 sec, ~ 55 MiB peak RSS
zstd -3 -T4 => 1197 MiB, 2.5 sec, ~ 112 MiB peak RSS
zstd -9 -T4 => 1153 MiB, 9.9 sec, ~ 256 MiB peak RSS
Measured on an Intel(R) Core(TM) Ultra 7 165U, limited to 4 cores when using
parallel compression tools, the assumption being we're not giving our build
environments > 4 cores.
Ultimately, we should consider switching to zstd, which compresses better,
faster (even single threaded), and can leverage multiple cores. The only
downside is significantly higher memory usage, which can be 100x worse than
gzip and scales with thread count. However, it seems manageable at four
threads in a build environment where I assume 256 MiB is readily available.
On the decompression side it peaks at about 10 MiB with this sample.
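The level-6 vs level-9 tradeoff above can be sanity-checked in miniature with Python's stdlib gzip module. This is a toy sketch on a small synthetic payload, not a multi-GiB qcow image, so the absolute numbers won't match the benchmark; it only demonstrates the shape of the measurement (size vs wall time per level).

```python
import gzip
import time

# Synthetic, highly compressible payload (~1.2 MB). Real qcow2 images
# compress very differently, so treat any ratio here as illustrative.
payload = b"coreos-assembler " * 70_000

for level in (6, 9):
    start = time.perf_counter()
    out = gzip.compress(payload, compresslevel=level)
    elapsed = time.perf_counter() - start
    print(f"gzip -{level}: {len(payload)} -> {len(out)} bytes in {elapsed:.4f}s")

# Round-trip check: both levels must decompress back to the original bytes.
for level in (6, 9):
    assert gzip.decompress(gzip.compress(payload, compresslevel=level)) == payload
```

For real files one would instead time the external gzip/pigz/zstd binaries and record peak RSS (e.g. via GNU time), as the PR description does.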