[GLUTEN-11673][CORE] Fix URL-encoded paths in ResourceUtil causing JAR loading failures #11672
Open
clee704 wants to merge 1 commit into apache:main from
## What changes were proposed in this pull request?
Fix URL-encoded path handling in ResourceUtil.getResources() by using
URL.toURI() and URI for proper path decoding instead of URL.getPath().
Changes:
- Added imports for java.net.URI and java.net.URISyntaxException
- Fixed the file: protocol handler to use new File(containerUrl.toURI())
instead of new File(containerUrl.getPath())
- Fixed the jar:file: protocol handler to use
new File(new URI("file://" + jarPath)) for proper decoding
- Added unit test verifying URI decoding behavior
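
The two protocol branches listed above can be sketched roughly as follows. This is a minimal illustration, not the actual ResourceUtil code: the class name `ContainerFileSketch`, the method `toContainerFile`, the way `jarPath` is cut out of the URL, and the `/opt/user%40host/...` paths are all assumptions made for the example.

```java
import java.io.File;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

// Hypothetical helper for illustration only; not the real ResourceUtil.
class ContainerFileSketch {

  /** Maps a container URL (file: or jar:file:) to the File that backs it. */
  static File toContainerFile(URL containerUrl) throws URISyntaxException {
    String protocol = containerUrl.getProtocol();
    if ("file".equals(protocol)) {
      // toURI() keeps the percent escapes; new File(URI) decodes them.
      return new File(containerUrl.toURI());
    }
    if ("jar".equals(protocol)) {
      // For jar:file:/dir/lib.jar!/entry, getPath() is "file:/dir/lib.jar!/entry".
      String inner = containerUrl.getPath();
      if (!inner.startsWith("file:")) {
        throw new IllegalArgumentException("Unsupported nested URL: " + inner);
      }
      int bang = inner.indexOf("!/");
      // Assumed derivation of jarPath: drop the "file:" prefix and the "!/entry" suffix.
      String jarPath = inner.substring("file:".length(), bang < 0 ? inner.length() : bang);
      return new File(new URI("file://" + jarPath));
    }
    throw new IllegalArgumentException("Unsupported protocol: " + protocol);
  }

  /** Convenience overload so callers can pass a URL string. */
  static File toContainerFile(String spec) {
    try {
      return toContainerFile(URI.create(spec).toURL());
    } catch (Exception e) {
      throw new IllegalArgumentException("Cannot resolve " + spec, e);
    }
  }

  public static void main(String[] args) {
    // Made-up locations containing an encoded '@'.
    System.out.println(toContainerFile("file:/opt/user%40host/classes/"));
    System.out.println(toContainerFile("jar:file:/opt/user%40host/gluten.jar!/META-INF/"));
  }
}
```

On a POSIX filesystem both calls should yield paths containing a literal `@` rather than `%40`; the exact string form is platform-dependent.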
## Why are the changes needed?
When Gluten is loaded from a JAR file located in a directory with special
characters in its path (such as @ or spaces), ResourceUtil.getResources()
fails because ClassLoader.getResources() returns URLs where special
characters are percent-encoded (e.g., @ becomes %40). ResourceUtil was
passing these URL-encoded paths directly to java.io.File(), which expects
filesystem paths, not URL-encoded strings.
The fix uses URL.toURI() and new File(URI), which decode percent-encoded
characters. It also preserves literal + characters, unlike
URLDecoder.decode(), which treats + as a space.
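
The difference between the three decoding strategies can be seen in a small standalone sketch. The class name `PathDecodingSketch`, its method names, and the sample path are hypothetical, and the `URLDecoder.decode(String, Charset)` overload assumes Java 10 or later.

```java
import java.io.File;
import java.net.URI;
import java.net.URL;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical comparison of three ways to turn a file: URL into a path.
class PathDecodingSketch {

  static URL url(String spec) {
    try {
      return URI.create(spec).toURL();
    } catch (Exception e) {
      throw new IllegalArgumentException("Bad URL: " + spec, e);
    }
  }

  /** Old behavior: URL.getPath() returns the raw path, escapes included. */
  static String viaGetPath(String spec) {
    return new File(url(spec).getPath()).getPath();
  }

  /** Approach described in this PR: URL.toURI() plus new File(URI). */
  static String viaUri(String spec) {
    try {
      return new File(url(spec).toURI()).getPath();
    } catch (Exception e) {
      throw new IllegalArgumentException("Bad URI: " + spec, e);
    }
  }

  /** Form decoding: decodes %40 but also turns '+' into a space. */
  static String viaUrlDecoder(String spec) {
    return URLDecoder.decode(url(spec).getPath(), StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    String spec = "file:/opt/user%40host/a+b/gluten.jar"; // made-up location
    System.out.println(viaGetPath(spec));    // still contains %40
    System.out.println(viaUri(spec));        // '@' restored, '+' kept
    System.out.println(viaUrlDecoder(spec)); // '@' restored, '+' became a space
  }
}
```

Only the URI-based variant both restores `@` and leaves `+` untouched, which is the property the description relies on.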
## Does this PR introduce _any_ user-facing change?
No
## How was this patch tested?
Added unit test testUriDecodesPercentEncodedPaths that verifies:
- URI correctly decodes %40 back to @
- URI preserves + characters (unlike URLDecoder)
Existing unit tests continue to pass.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Run Gluten Clickhouse CI on x86
jinchengchenghh approved these changes on Mar 5, 2026
Related issue: #11673