Skip to content

fix(bigquery): close GAPIC storage transport and auth sessions to prevent socket leaks#17508

Merged
chalmerlowe merged 2 commits into
mainfrom
fix/bigquery-socket-leak
Jun 19, 2026
Merged

fix(bigquery): close GAPIC storage transport and auth sessions to prevent socket leaks#17508
chalmerlowe merged 2 commits into
mainfrom
fix/bigquery-socket-leak

Conversation

@chalmerlowe

Copy link
Copy Markdown
Contributor

This PR resolves resource leaks (specifically open sockets left in the ESTABLISHED state) that occur during client lifecycle operations and credential refreshing in system/unit tests.

The Problem

  1. Transport Lifecycle: When closing the BigQuery Storage client, calling _transport.grpc_channel.close() was insufficient for releasing all network resources. The full _transport.close() method needs to be invoked to tear down the underlying transport channel correctly.

  2. Dynamic Auth Sessions: Under certain authentication environments (like Workload Identity/GCE Metadata server inside Kokoro CI), the google-auth library dynamically instantiates helper HTTP sessions to fetch access tokens. These sessions are not owned by the BigQuery client and are not closed automatically, leading to leaked sockets.

  3. Flaky Test Assertions: Socket count assertions in system tests were flaky because Python's garbage collection is non-deterministic, meaning sockets remained open in the operating system even after client close calls until a garbage collection cycle swept them.

Changes

  • Client & Transport Lifecycle:

    • Updated connection.py, magics.py, and table.py to close the BigQuery Storage transport using _transport.close() instead of _transport.grpc_channel.close().
  • Testing Improvements:

    • Added patch_tracked_requests interceptor to system/unit tests to track and explicitly close all dynamically spawned credential-refreshing HTTP sessions when the test context exits.
    • Added explicit gc.collect() calls to socket leak verification tests to force synchronous sweeping of unreferenced socket objects before asserting final socket counts.
  • Code Coverage:

    • Appended # pragma: NO COVER to Python version checks in __init__.py for code paths that render a deprecation warning code if attempted to run on Python runtimes <3.10 test matrix (these paths do not run in our CI/CD since we never execute code with the older runtimes).

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request updates the BigQuery client to close the transport directly via _transport.close() instead of _transport.grpc_channel.close(), updates corresponding unit and system tests, and adds a new TestPropertyGraphReference test class. The review feedback suggests improving the patch_tracked_requests context manager in system tests to avoid closing pre-existing sessions, and refactoring manual debug prints to use idiomatic assertion messages instead.

Comment thread packages/google-cloud-bigquery/tests/system/test_client.py Outdated
Comment thread packages/google-cloud-bigquery/tests/system/test_client.py Outdated
Comment thread packages/google-cloud-bigquery/tests/system/test_client.py Outdated
Comment thread packages/google-cloud-bigquery/tests/system/test_magics.py Outdated
Comment thread packages/google-cloud-bigquery/tests/system/test_magics.py Outdated
@chalmerlowe chalmerlowe force-pushed the fix/bigquery-socket-leak branch from 77adf99 to f2ec799 Compare June 18, 2026 15:24
@chalmerlowe chalmerlowe force-pushed the fix/bigquery-socket-leak branch from f2ec799 to 43e6eec Compare June 18, 2026 15:43
@chalmerlowe chalmerlowe marked this pull request as ready for review June 18, 2026 16:10
@chalmerlowe chalmerlowe requested review from a team as code owners June 18, 2026 16:10
@chalmerlowe chalmerlowe requested review from tswast and removed request for a team June 18, 2026 16:10

@tswast tswast left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! A few nits.

bigquery_magics = None

if sys.version_info < (3, 10):
if sys.version_info < (3, 10): # pragma: NO COVER

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer we add a comment explaining why this is safe. In this case, it's to protect from the user somehow installing this despite us dropping support in #17187

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the comment on the version check: I looked at the gapic-generator-python templates and how this is handled across the rest of the monorepo. The generator templates do not include an explanatory comment here, and the vast majority of our libraries don't use one. The code is essentially self-documenting since the warnings.warn string states why the check is firing (to warn users who are on an unsupported Python version). To maintain parity with our generated code and the rest of the libraries, I'd prefer to leave the comment off.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the comment on the version check: I looked at the gapic-generator-python templates and how this is handled across the rest of the monorepo. The generator templates do not include an explanatory comment here, and the vast majority of our libraries don't use one. The code is essentially self-documenting since the warnings.warn string states why the check is firing (to warn users who are on an unsupported Python version). To maintain parity with our generated code and the rest of the libraries, I'd prefer to leave the comment off.

client.close()

client.close()
import gc

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've recently experienced pytest still hanging onto resources even after explicit del calls (pola-rs/polars-bigquery-client#3). Might make sense to add a layer of indirection like I did in https://github.com/pola-rs/polars-bigquery-client/pull/3/changes#diff-d9ee89704040c8f986c8e0a4ab3757806eebe9caa54073a72a3672b5626df3cf

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've reviewed the polars-bigquery PR. While pytest's reference-holding is a concern, I believe this PR is less susceptible because we’ve moved toward explicit closure of the transport and auth sessions (via the tracker) rather than relying solely on object destruction. Since we aren't dealing with Rust-backed objects that have opaque lifecycles, the explicit .close() calls should release the underlying sockets even if pytest lingers on the Python wrapper. I'll monitor for flakiness, but the current gc.collect() approach seems to resolve the leaks in my testing.

self.assertEqual(len(rows), 100000)

connection.close()
import gc

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same for this function re: splitting the gc logic tests into a separate module.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above comment.

@chalmerlowe chalmerlowe merged commit 0258405 into main Jun 19, 2026
31 checks passed
@chalmerlowe chalmerlowe deleted the fix/bigquery-socket-leak branch June 19, 2026 10:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants