Skip to content

fix(file_transfer): strip query params from pre-signed URLs before logging (CWE-532)#14

Closed
sebastiondev wants to merge 1 commit into
agbcloud:masterfrom
sebastiondev:fix/cwe532-file-transfer-pre-116c
Closed

fix(file_transfer): strip query params from pre-signed URLs before logging (CWE-532)#14
sebastiondev wants to merge 1 commit into
agbcloud:masterfrom
sebastiondev:fix/cwe532-file-transfer-pre-116c

Conversation

@sebastiondev
Copy link
Copy Markdown

Summary

Pre-signed OSS URLs are logged in plaintext at INFO level by FileTransfer.upload() and FileTransfer.download() in python/agb/modules/file_transfer.py. These URLs contain bearer-token query parameters (Signature, OSSAccessKeyId, Expires) — anyone who reads the log line within the URL's expiry window can replay it to read or write the corresponding object, no API key required.

  • CWE: CWE-532 (Insertion of Sensitive Information into Log File)
  • File: python/agb/modules/file_transfer.py
  • Affected functions: FileTransfer.upload() (line ~225), FileTransfer.download() (line ~486)
  • Severity: Medium — credentials-equivalent material written to logs at default verbosity

Data flow

  1. agb.context.get_file_upload_url(...) / get_file_download_url(...) returns a pre-signed OSS URL whose query string carries the OSS signature.
  2. The URL is passed straight into logger.info(f"... URL: {upload_url}") / logger.info(f"... URL: {download_url}").
  3. The default loguru sink writes INFO-level records to stderr and (in typical deployments) to file/aggregator sinks. Anyone with read access to those logs gets the full pre-signed URL.

Fix

Add a small _sanitize_url() helper that uses urllib.parse.urlsplit to strip the query string and fragment, and apply it at the two log sites:

def _sanitize_url(url: str) -> str:
    """Return URL with query string removed to avoid leaking pre-signed tokens."""
    parts = urlsplit(url)
    return parts._replace(query="", fragment="").geturl()

The host and path are still logged (useful for debugging which bucket/object was touched); only the signing parameters are removed. The URL value used for the actual HTTP PUT/GET is unchanged — only the logged representation is sanitized.

I checked python/agb/modules/context.py as well, since it also handles pre-signed URLs; it never logs the URL string (only context_id, file_path, request_id), so no changes are needed there. These two log statements in file_transfer.py are the only places URLs were emitted in this module.

Tests

Added python/tests/unit/test_cwe532_presigned_url_leak.py with two cases:

  • test_upload_does_not_log_presigned_url_query_params — exercises upload() with a fake pre-signed URL containing OSSAccessKeyId, Expires, Signature, captures loguru output, asserts none of those tokens appear.
  • test_download_does_not_log_presigned_url_query_params — same for download().

Both tests fail against the current code on master (the URL with all query params is logged verbatim) and pass with the fix applied. The existing file_transfer test suite continues to pass — the change is confined to log formatting.

Security analysis

Why this is exploitable in practice:

  • Pre-signed OSS URLs are valid for the entire Expires window (typically minutes to hours). During that window, possession of the URL = possession of the object.
  • INFO is the default log level; these messages are emitted during normal SDK use, not just in debug mode.
  • Common log destinations (centralized logging, container stdout collected by the platform, on-disk files, error reports, support bundles) routinely have a broader read audience than the API key itself.

Preconditions for exploitation:

  1. Read access to application logs (often a different and broader trust boundary than the API key).
  2. The log entry is observed before the pre-signed URL expires.

Adversarial review

Before submitting I tried to disprove this. The main "is it really a vuln?" angles:

  • "Maybe logs are already considered as sensitive as API keys." In typical deployments they aren't — log aggregators, ops dashboards, and crash reports are normally accessible to people who don't hold the SDK's API key. Anyone in that group gains the ability to read/write the specific OSS object directly, which the API key alone wouldn't grant them without also going through the SDK.
  • "Maybe the URL doesn't actually contain credentials." Verified by inspecting the OSS pre-signed URL format and the call sites — Signature, OSSAccessKeyId, Expires are present in the query string and are exactly the bearer material OSS validates.
  • "Maybe a framework-level redactor already strips it." There's no logging filter/processor in this SDK that masks query strings; loguru is configured with the default formatter.

The preconditions (log read access) do not already grant equivalent OSS object access, so the fix provides a real reduction in blast radius.

cc @lewiswigmore

FileTransfer.upload() and download() logged the full pre-signed OSS URLs
including query-string parameters that contain authentication tokens
(OSSAccessKeyId, Signature, Expires).  Anyone with access to logs could
use these bearer-token URLs to upload/download files without credentials.

Add a _sanitize_url() helper that strips query and fragment from URLs
before they are interpolated into log messages.  The URL path is still
logged for debugging purposes.
@lewiswigmore
Copy link
Copy Markdown

Closing this to reduce the open-PR pile-up — we have multiple outstanding security contributions to this repo and that volume is not fair on your review queue. Keeping #11 (fix: reject symlinks in extension upload to prevent arbitrary file exfiltration ) as the primary one to focus attention on.

Happy to revisit this finding separately later if it is still relevant. Apologies for the noise.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants