[GLUTEN-11673][CORE] Fix URL-encoded paths in ResourceUtil causing JAR loading failures #11672

Open
clee704 wants to merge 1 commit into apache:main from clee704:fix-resourceutil-url-encoded-paths
clee704 (Contributor) commented Feb 27, 2026

What changes were proposed in this pull request?

Fix URL-encoded path handling in ResourceUtil.getResources() by using URL.toURI() and URI for proper path decoding instead of URL.getPath().

Changes:

  • Added imports for java.net.URI and java.net.URISyntaxException
  • Fixed the file: protocol handler to use new File(containerUrl.toURI()) instead of new File(containerUrl.getPath())
  • Fixed the jar:file: protocol handler to use new File(new URI("file://" + jarPath)) for proper decoding
  • Added unit test verifying URI decoding behavior
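The two handler changes listed above can be sketched roughly as follows. This is a minimal illustration and not the actual ResourceUtil code: the class name `ContainerFileSketch`, the helper `toContainerFile`, and the sample paths are hypothetical, and the `jar:` branch parses the nested `file:` URI directly rather than concatenating `"file://" + jarPath` as the PR does.

```java
import java.io.File;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

// Hypothetical sketch; names and structure are assumptions, not ResourceUtil itself.
class ContainerFileSketch {

  // Maps a container URL (file: or jar:file:) to the File that backs it.
  static File toContainerFile(URL containerUrl) throws URISyntaxException {
    String protocol = containerUrl.getProtocol();
    if ("file".equals(protocol)) {
      // new File(URI) decodes percent-escapes; getPath() would keep "%40" as-is.
      return new File(containerUrl.toURI());
    }
    if ("jar".equals(protocol)) {
      // For jar:file:/dir/lib.jar!/entry, getPath() yields "file:/dir/lib.jar!/entry".
      String nested = containerUrl.getPath();
      int separator = nested.indexOf("!/");
      String jarUri = separator >= 0 ? nested.substring(0, separator) : nested;
      return new File(new URI(jarUri));
    }
    throw new IllegalArgumentException("Unsupported protocol: " + protocol);
  }

  // Convenience overload for string input; wraps checked exceptions.
  static File toContainerFile(String url) {
    try {
      return toContainerFile(URI.create(url).toURL());
    } catch (Exception e) {
      throw new IllegalArgumentException("Cannot resolve " + url, e);
    }
  }
}
```

With this sketch, both `file:/opt/user%40host/gluten.jar` and `jar:file:/opt/user%40host/gluten.jar!/META-INF` resolve to a file whose parent directory is named `user@host`.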

Why are the changes needed?

When Gluten is loaded from a JAR file located in a directory with special characters in its path (such as @ or spaces), ResourceUtil.getResources() fails because ClassLoader.getResources() returns URLs where special characters are percent-encoded (e.g., @ becomes %40). ResourceUtil was passing these URL-encoded paths directly to java.io.File(), which expects filesystem paths, not URL-encoded strings.
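The mismatch can be reproduced without Gluten. The sketch below uses a made-up JAR location and hypothetical helper names (`rawPath`, `decodedFile`) to compare what `URL.getPath()` returns with what `new File(URI)` produces for the same URL.

```java
import java.io.File;
import java.net.URI;
import java.net.URL;

// Hypothetical demo; the URL below is an invented example location.
class EncodedPathDemo {

  // Returns the path exactly as the URL carries it, percent-escapes included.
  static String rawPath(String url) {
    try {
      return URI.create(url).toURL().getPath();
    } catch (Exception e) {
      throw new IllegalArgumentException(e);
    }
  }

  // Converts the URL to a URI first, so File receives a decoded filesystem path.
  static File decodedFile(String url) {
    try {
      URL parsed = URI.create(url).toURL();
      return new File(parsed.toURI());
    } catch (Exception e) {
      throw new IllegalArgumentException(e);
    }
  }

  public static void main(String[] args) {
    String url = "file:/opt/user%40host/my%20libs/gluten.jar";
    // Still contains %40 and %20, so new File(rawPath) names a nonexistent directory.
    System.out.println(rawPath(url));
    // Parent directory name is decoded to "my libs".
    System.out.println(decodedFile(url).getParentFile().getName());
  }
}
```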

The fix converts each URL with URL.toURI() and builds the file with new File(URI), which decodes percent-encoded characters in the path. This also preserves + characters, unlike URLDecoder.decode(), which treats + as a space.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Added unit test testUriDecodesPercentEncodedPaths that verifies:

  • URI correctly decodes %40 back to @
  • URI preserves + characters (unlike URLDecoder)
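The two properties checked above can be illustrated with plain JDK calls. This is not the testUriDecodesPercentEncodedPaths test from the PR, only a sketch of the behavior it relies on; the class name, helper names, and sample path are assumptions.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URLDecoder;

// Hypothetical sketch contrasting URI path decoding with URLDecoder.
class PlusDecodingSketch {

  // URI decodes percent-escapes in the path and leaves '+' untouched.
  static String viaUri(String encodedPath) {
    return URI.create("file://" + encodedPath).getPath();
  }

  // URLDecoder implements form decoding, which also turns '+' into a space.
  static String viaUrlDecoder(String encodedPath) {
    try {
      return URLDecoder.decode(encodedPath, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      throw new IllegalStateException(e); // UTF-8 is always available
    }
  }

  public static void main(String[] args) {
    String encoded = "/opt/a+b%40c/lib.jar";
    System.out.println(viaUri(encoded));        // /opt/a+b@c/lib.jar
    System.out.println(viaUrlDecoder(encoded)); // /opt/a b@c/lib.jar
  }
}
```

Only the URI-based result matches the directory name on disk when that name contains a literal +.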

Existing unit tests continue to pass.

Related issue: #11673

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions github-actions bot added the CORE works for Gluten Core label Feb 27, 2026
@github-actions commented

Run Gluten Clickhouse CI on x86

@clee704 clee704 changed the title from "[GLUTEN-XXXX][CORE] Fix URL-encoded paths in ResourceUtil causing JAR loading failures" to "[GLUTEN-11673][CORE] Fix URL-encoded paths in ResourceUtil causing JAR loading failures" Feb 27, 2026