Use gzip compression level 6 for better speed tradeoff #4444
sdodson wants to merge 1 commit into coreos:main
Conversation
Testing on rhcos-418.94.202602191946-0-qemu.ppc64le.qcow2 shows only a 0.3% improvement but a 2.7x increase in duration. When we're compressing 6-10 multi-gig files per arch per build this adds up quickly.

2462 MiB input qcow:

gzip -6     => 1216 MiB, 71 sec,  ~ 2 MiB peak RSS
gzip -9     => 1212 MiB, 188 sec, ~ 2 MiB peak RSS
pigz -6 -p4 => 1224 MiB, 5.4 sec, ~ 6 MiB peak RSS
pigz -9 -p4 => 1212 MiB, 28 sec,  ~ 6 MiB peak RSS
zstd -3 -T1 => 1197 MiB, 3.3 sec, ~ 55 MiB peak RSS
zstd -3 -T4 => 1197 MiB, 2.5 sec, ~ 112 MiB peak RSS
zstd -9 -T4 => 1153 MiB, 9.9 sec, ~ 256 MiB peak RSS

Measured on an Intel(R) Core(TM) Ultra 7 165U, limited to 4 cores when using parallel compression tools, the assumption being we're not giving our build environments > 4 cores.

Ultimately, we should consider switching to zstd, which compresses better, faster, and can leverage multiple cores. The only downside is significantly higher memory usage, which can be 100x worse than gzip and scales with thread count. However, it seems manageable at four threads in a build environment where I assume 256 MiB is readily available. On the decompression side it peaks at about 10 MiB with this sample.
Hi @sdodson. Thanks for your PR. I'm waiting for a coreos member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Code Review
This pull request changes the default gzip compression level from 9 to 6 to improve compression speed, based on benchmarks showing a significant speedup for a minimal loss in compression ratio. The changes in src/cmd-compress and src/cosalib/qemuvariants.py are consistent and correctly implement this. I've added a couple of suggestions to improve maintainability by using constants for the compression level instead of magic numbers.
src/cmd-compress:

  # CLI args or from image.json
  image_json = builds.get_build_image_json(build)
- gzip_level = 9
+ gzip_level = 6
src/cosalib/qemuvariants.py:

  match self.compression:
      case "gzip":
-         rc = ['gzip', '-9c', uncompressed_path]
+         rc = ['gzip', '-6c', uncompressed_path]
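The reviewer's suggestion about magic numbers could look roughly like the sketch below: one named constant shared by both call sites so src/cmd-compress and qemuvariants.py cannot drift apart. The names DEFAULT_GZIP_LEVEL and gzip_cmd are illustrative, not the actual cosalib API.

```python
# Hypothetical sketch, not the real cosalib code: hoist the gzip level
# into a single named constant instead of repeating the magic number.
DEFAULT_GZIP_LEVEL = 6


def gzip_cmd(uncompressed_path):
    # Builds the same command line as the diff above, but sourced from
    # one constant shared by every caller.
    return ['gzip', f'-{DEFAULT_GZIP_LEVEL}c', uncompressed_path]
```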
Oh, I should mention the data above was collected using these packages, some of which are newer than those present in RHEL9.
@dustymabe PTAL, you seem to be the person who most recently touched this, when you normalized the two points I'm changing. I think there are other places that use gzip in this codebase, but they didn't seem to be parts that deal with compressing large files (things like Ignition assets, etc.), so I don't think it matters nearly as much in those places.
If you think that's a real possibility for OCP/RHCOS, we can push for it upstream again and maybe consider switching it on for RHEL10-based RHCOS? WDYT?
Yeah, I'll summarize some more analysis and advocate for a move to zstd in that thread. I'd like for us to consider moving forward with this change independent of that outcome, given that we're only saving 3.76 MiB but adding 110 seconds to each image we compress, assuming that the one sample I have is representative. I could try other files if we suspect that gzip -9 has a larger impact on non-qcow files.
That would be great!
That's reasonable. It's worth noting that upstream in FCOS we already use
It would be nice to get the same benchmark for just a
/ok-to-test
Testing on rhcos-418.94.202602191946-0-qemu.ppc64le.qcow2 shows only a
0.3% improvement but a 2.7x increase in duration. When we're compressing 6-10
multi-gig files per arch per build this adds up quickly.
2462 MiB input qcow:
gzip -6 => 1216 MiB, 71 sec, ~ 2 MiB peak RSS
gzip -9 => 1212 MiB, 188 sec, ~ 2 MiB peak RSS
pigz -6 -p4 => 1224 MiB, 5.4 sec, ~ 6 MiB peak RSS
pigz -9 -p4 => 1212 MiB, 28 sec, ~ 6 MiB peak RSS
zstd -3 -T1 => 1197 MiB, 3.3 sec, ~ 55 MiB peak RSS
zstd -3 -T4 => 1197 MiB, 2.5 sec, ~ 112 MiB peak RSS
zstd -9 -T4 => 1153 MiB, 9.9 sec, ~ 256 MiB peak RSS
Measured on an Intel(R) Core(TM) Ultra 7 165U, limited to 4 cores when using
parallel compression tools, the assumption being we're not giving our build
environments > 4 cores.
Ultimately, we should consider switching to zstd, which compresses better,
faster (even single threaded), and can leverage multiple cores. The only
downside is significantly higher memory usage, which can be 100x worse than
gzip and scales with thread count. However, it seems manageable at four
threads in a build environment where I assume 256 MiB is readily available.
On the decompression side it peaks at about 10 MiB with this sample.
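The level-6 vs level-9 tradeoff above can be sanity-checked in miniature with Python's stdlib gzip module. This is a toy sketch on a small synthetic payload, not a multi-GiB qcow image, so the absolute numbers won't match the benchmark; it only demonstrates the shape of the measurement (size vs wall time per level).

```python
import gzip
import time

# Synthetic, highly compressible payload (~1.2 MB). Real qcow2 images
# compress very differently, so treat any ratio here as illustrative.
payload = b"coreos-assembler " * 70_000

for level in (6, 9):
    start = time.perf_counter()
    out = gzip.compress(payload, compresslevel=level)
    elapsed = time.perf_counter() - start
    print(f"gzip -{level}: {len(payload)} -> {len(out)} bytes in {elapsed:.4f}s")

# Round-trip check: both levels must decompress back to the original bytes.
for level in (6, 9):
    assert gzip.decompress(gzip.compress(payload, compresslevel=level)) == payload
```

For real files one would instead time the external gzip/pigz/zstd binaries and record peak RSS (e.g. via GNU time), as the PR description does.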