Remove `verify_gz` by eileencodes · Pull Request #9093 · ruby/rubygems

eileencodes · 2025-11-17T22:12:29Z

This PR removes the #verify_gz method because it is redundant and unnecessary.

Previously the data.tar.gz would get read twice for every file - once in verify_gz and once in extract_files. The extract_files method verifies the data.tar.gz when it reads it, and raises an error if unzipping it fails.
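As a minimal sketch of why one pass can be enough: a consumer that reads a gzip stream until `read` returns nil gets the footer checksum check as part of that same pass. This uses only Ruby's standard `zlib`; the helper name `gunzip_in_one_pass` and the in-memory payload are hypothetical and are not taken from `Gem::Package`.

```ruby
require "zlib"
require "stringio"

# Hypothetical helper: decompress a gzip stream in a single pass, reading
# until `read` returns nil. That final read past EOF is where
# Zlib::GzipReader compares the footer checksum with the decompressed data.
def gunzip_in_one_pass(gz_bytes)
  out = String.new(encoding: Encoding::BINARY)
  Zlib::GzipReader.wrap(StringIO.new(gz_bytes)) do |gz|
    while (chunk = gz.read(16_384))
      out << chunk
    end
  end
  out
end

payload = ("file contents\n" * 100).b
good = Zlib.gzip(payload)

# Simulate corruption by flipping one byte of the CRC32 stored in the
# 8-byte gzip footer.
bad = good.dup
bad.setbyte(-8, bad.getbyte(-8) ^ 0xFF)

# The intact stream round-trips.
puts gunzip_in_one_pass(good) == payload

begin
  gunzip_in_one_pass(bad)
rescue Zlib::GzipFile::Error => e
  # The same pass that produced the data also reported the corruption.
  warn "corruption detected while extracting: #{e.class}"
end
```

The guarantee only holds if the consumer actually reads through to the end of the stream.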

The verify_gz code shows up in some profiles as a minor hotspot, accounting for between 9% and 17% of time, but only when the installation thread doesn't have native extensions or plugins.

Note: to create this profile I actually split the fetching and installing into two steps so I could profile just installing all the gems. I then profiled just the install code.

What was the end-user or developer problem that led to this PR?

I was looking at profiles for bundler and noticed we spend time in verify_gz but that method is redundant.

What is your fix for the problem, implemented in this PR?

See commit message

Make sure the following tasks are checked

Describe the problem / feature
Write tests for features and bug fixes
Write code to solve the problem
Make sure you follow the current code style and write meaningful commit messages without tags

cc/ @tenderlove @Edouard-chin

Edouard-chin · 2025-11-18T00:04:58Z

lib/rubygems/package.rb

-  def verify_gz(entry) # :nodoc:
-    Zlib::GzipReader.wrap entry do |gzio|
-      # TODO: read into a buffer once zlib supports it
-      gzio.read 16_384 until gzio.eof? # gzip checksum verification


It seems that we read until EOF because that's when the GzipReader verifies the checksum in the gzip footer against the decompressed data. I read that in the docs: https://docs.ruby-lang.org/en/3.4/Zlib/GzipReader.html

When an reading request is received beyond the end of file (the end of compressed data). That is, when Zlib::GzipReader#read [...] reading returns nil.

Are we still reading at eof anywhere else so that the checksum verification is still performed?
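To make the concern concrete, here is a small sketch (standard `zlib` only, hypothetical helper names `read_prefix` and `read_to_eof`) contrasting a reader that stops early with one that reads until `read` returns nil. It assumes a payload large enough that the first chunk does not reach the end of the compressed stream.

```ruby
require "zlib"
require "stringio"

# Random bytes barely compress, so the stream is far longer than one chunk.
payload = Random.new(42).bytes(1_000_000)
corrupt = Zlib.gzip(payload)
# Flip one byte of the CRC32 stored in the 8-byte gzip footer.
corrupt.setbyte(-8, corrupt.getbyte(-8) ^ 0xFF)

# Stops after the first n bytes: the reader never reaches end of file,
# so the footer is never compared against the data.
def read_prefix(gz_bytes, n)
  Zlib::GzipReader.wrap(StringIO.new(gz_bytes)) { |gz| gz.read(n) }
end

# Reads until `read` returns nil, which is the point where the footer
# checksum is verified.
def read_to_eof(gz_bytes)
  total = 0
  Zlib::GzipReader.wrap(StringIO.new(gz_bytes)) do |gz|
    while (chunk = gz.read(16_384))
      total += chunk.bytesize
    end
  end
  total
end

prefix = read_prefix(corrupt, 16) # returns data; the corruption goes unnoticed
puts prefix.bytesize

begin
  read_to_eof(corrupt)
rescue Zlib::GzipFile::Error => e
  warn "detected only when reading to EOF: #{e.class}"
end
```

So whether the remaining code path still verifies the checksum depends on whether it keeps reading until the reader reports end of file.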

colby-swandale · 2025-11-18

rubygems.org currently depends on this code to validate gems being pushed. Talking with Aaron this morning, it should be fairly trivial 🤞🏻 to move, but I wanted to highlight this so any changes can be coordinated between the two projects.

Edouard-chin · 2025-11-18T15:35:32Z

Ah, just as a FYI since I'm seeing this (trying to fix something in the same areas), when we swap the "spec" object (from an EndpointSpecification to the actual Specification read from the gemspec), we perform the verification and if there is an issue we remove the .gem file since it's corrupted.

rubygems/bundler/lib/bundler/source/rubygems.rb

Lines 193 to 205 in 9711b0c

    
    if spec.remote
      s = begin
        installer.spec
      rescue Gem::Package::FormatError
        Bundler.rm_rf(path)
        raise
      rescue Gem::Security::Exception => e
        raise SecurityError,
         "The gem #{File.basename(path, ".gem")} can't be installed because " \
         "the security policy didn't allow it, with the message: #{e.message}"
      end
      spec.__swap__(s)

I think now we'll no longer remove the .gem if it's corrupted.
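The cleanup behaviour at stake follows a "verify, and delete the file on failure" pattern. A rough standalone sketch of that pattern, using plain `zlib` and `fileutils` rather than Bundler's or RubyGems' own classes (the helper `read_or_discard` and the file name are hypothetical):

```ruby
require "zlib"
require "fileutils"
require "tmpdir"

# Hypothetical helper: read a gzip file to EOF so the footer checksum is
# verified. If verification fails, remove the file so a later attempt can
# download it again, then re-raise for the caller.
def read_or_discard(path)
  Zlib::GzipReader.open(path) do |gz|
    data = String.new(encoding: Encoding::BINARY)
    while (chunk = gz.read(16_384))
      data << chunk
    end
    data
  end
rescue Zlib::GzipFile::Error
  FileUtils.rm_f(path)
  raise
end

Dir.mktmpdir do |dir|
  path = File.join(dir, "example.gz")
  bytes = Zlib.gzip("contents " * 100)
  bytes.setbyte(-8, bytes.getbyte(-8) ^ 0xFF) # corrupt the footer CRC32
  File.binwrite(path, bytes)

  begin
    read_or_discard(path)
  rescue Zlib::GzipFile::Error
    puts "corrupted file still on disk? #{File.exist?(path)}"
  end
end
```

If no step raises on a corrupted archive any more, the `rescue` branch never runs and the file stays in place, which is the regression described above.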


tenderlove

I think this is good and we should merge it.

We merged this PR which ensures that gems are written atomically when they're downloaded. Since they're written atomically, we'll never try to extract a corrupted gem so there's no need to check. As @colby-swandale mentioned, this code was being used by RubyGems.org on upload to verify integrity of the gem, but we've removed that in this PR.
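For readers unfamiliar with the idea, an atomic download write usually means writing to a temporary file and renaming it into place only once it is complete. The sketch below illustrates that general technique; it is not the code from the referenced PR, the helper name `atomic_write` is hypothetical, and it assumes the temporary file and the destination are on the same filesystem (where `File.rename` replaces the target in one step on POSIX systems).

```ruby
require "tmpdir"
require "securerandom"

# Hypothetical helper: write bytes to a temporary sibling file, flush them
# to disk, then rename over the destination. Readers see either the old
# file or the complete new one, never a partially written file.
def atomic_write(path, bytes)
  tmp = "#{path}.#{Process.pid}.#{SecureRandom.hex(4)}.tmp"
  File.open(tmp, "wb") do |f|
    f.write(bytes)
    f.fsync
  end
  File.rename(tmp, path)
ensure
  # If anything failed before the rename, do not leave the partial file behind.
  File.delete(tmp) if tmp && File.exist?(tmp)
end

Dir.mktmpdir do |dir|
  target = File.join(dir, "example.gem")
  atomic_write(target, "gem bytes".b)
  puts File.binread(target) == "gem bytes".b
end
```

Note that this protects against interrupted or partial writes; it does not by itself detect bytes that were already wrong when they arrived.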

@hsbt you tagged this as a "breaking change". I don't think it breaks anything, but I could be wrong. Do you know what it breaks?

hsbt · 2026-02-09T00:30:53Z

@tenderlove

I've labeled it this way because it means we're no longer doing what we were doing before.

The label is only used for backporting and documentation; it doesn't mean the merge has to wait until version 5. Shall we change it to the performance label?

colby-swandale · 2026-02-09T00:34:31Z

I agree with changing it to the performance label 👍🏻

Edouard-chin reviewed Nov 18, 2025


hsbt added the rubygems: performance label Nov 20, 2025

colby-swandale self-requested a review December 15, 2025 11:50

eileencodes force-pushed the remove-verify_gz branch from 9e0e706 to 737c829 January 5, 2026 18:47

hsbt added the rubygems: breaking change label Jan 7, 2026

tenderlove approved these changes Feb 7, 2026


colby-swandale approved these changes Feb 9, 2026


colby-swandale merged commit 3810177 into ruby:master Feb 9, 2026
78 checks passed
