Conversation
Force-pushed from aff0507 to 405a363.
…tion for `DataSource.checkAndGlobPathIfNecessary`

On Databricks Runtime (both Spark 3.5 and 4.0), `DataSource.checkAndGlobPathIfNecessary` has all required parameters with no defaults, unlike OSS Apache Spark where some parameters have default values. The previous code used named parameters, causing the Scala compiler to generate `$default$N()` synthetic method calls in bytecode. On Databricks these accessors don't exist, resulting in `NoSuchMethodError` at runtime. Using reflection resolves the method at runtime, avoiding any dependency on compiler-generated default parameter accessors.
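The `$default$N` mechanism described above can be observed in a tiny self-contained sketch (the `Greeter` object below is a made-up stand-in, not Spark code): for each defaulted parameter, scalac emits a synthetic accessor such as `greet$default$2`, and a call site that omits that argument references the accessor in bytecode. If the class on the runtime classpath lacks the accessor, the call fails with `NoSuchMethodError`.

```scala
object Greeter {
  // `punctuation` has a default value, so scalac generates a synthetic
  // method named greet$default$2 on this object's class. A call site like
  // Greeter.greet("hi") compiles to Greeter.greet("hi", Greeter.greet$default$2()).
  def greet(name: String, punctuation: String = "!"): String =
    name + punctuation
}

object ShowDefaults {
  def main(args: Array[String]): Unit = {
    // The synthetic accessor is visible via plain Java reflection.
    val syntheticAccessors = Greeter.getClass.getMethods
      .map(_.getName)
      .filter(_.contains("$default$"))
      .toList
    println(syntheticAccessors) // the list contains "greet$default$2"
  }
}
```

This is why the failure only appears at runtime: compilation against an OSS Spark build that has the defaults succeeds, but the Databricks build of the class does not carry the accessor methods.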
Force-pushed from 405a363 to 3dae185.
Did you read the Contributor Guide?
Is this PR related to a ticket?

Closes #2659 (Shapefile table metadata reader does not work for Databricks).

What changes were proposed in this PR?
On Databricks Runtime (both Spark 3.5 and 4.0), the method `DataSource.checkAndGlobPathIfNecessary` has all required parameters with no default values. This differs from OSS Apache Spark, where some parameters have defaults.

The previous code used Scala named parameters when calling this method, which causes the Scala compiler to emit synthetic `$default$N()` accessor method calls in the bytecode. On Databricks these synthetic accessors do not exist, resulting in a `NoSuchMethodError` at runtime when reading Shapefiles, GeoPackages, or GeoParquet metadata files.

This PR changes `SedonaFileIndexHelper` to invoke `DataSource.checkAndGlobPathIfNecessary` via reflection, resolving the method at runtime and passing all arguments explicitly based on the detected parameter count (5 or 6). This avoids any dependency on compiler-generated default parameter accessors and works across OSS Spark 3.x/4.x and Databricks Runtime.

Affected readers (all go through the fixed `SedonaFileIndexHelper.createFileIndex`):

- Shapefile (`ShapefileTable`)
- GeoPackage (`GeoPackageTable`)
- GeoParquet metadata (`GeoParquetMetadataTable`)

How was this patch tested?
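As a rough illustration of the reflection dispatch described above (not the actual `SedonaFileIndexHelper` code: `FakeDataSource` and its 3-parameter signature are invented stand-ins for `DataSource.checkAndGlobPathIfNecessary` and its real 5- or 6-parameter variants), the pattern is to look the method up by name at runtime, branch on `getParameterCount`, and pass every argument explicitly so that no `$default$N` accessor is ever referenced:

```scala
object FakeDataSource {
  // Stand-in for DataSource.checkAndGlobPathIfNecessary: as on Databricks,
  // every parameter is required and none has a default value.
  def checkAndGlobPathIfNecessary(
      paths: Seq[String],
      checkEmptyGlobPath: Boolean,
      numThreads: Int): Seq[String] = paths
}

object ReflectiveDispatch {
  def globPaths(paths: Seq[String]): Seq[String] = {
    val target = FakeDataSource
    // Resolve the method at runtime instead of binding to it at compile time.
    val method = target.getClass.getMethods
      .find(_.getName == "checkAndGlobPathIfNecessary")
      .getOrElse(throw new NoSuchMethodError("checkAndGlobPathIfNecessary"))

    // Branch on the detected arity (the PR handles 5 vs. 6 parameters);
    // all arguments are supplied explicitly, so the compiled bytecode never
    // calls a synthetic $default$N accessor.
    val args: Array[AnyRef] = method.getParameterCount match {
      case 3 => Array(paths, java.lang.Boolean.TRUE, Integer.valueOf(1))
      case n => throw new IllegalStateException(s"unexpected parameter count: $n")
    }
    method.invoke(target, args: _*).asInstanceOf[Seq[String]]
  }

  def main(args: Array[String]): Unit =
    println(globPaths(Seq("s3://bucket/a", "s3://bucket/b")))
}
```

Primitive parameters must be boxed (`java.lang.Boolean`, `Integer`) for `Method.invoke`, which is one reason the arity branch spells out each argument rather than forwarding a shared tuple.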
Verified that Scala compilation succeeds across all Spark versions (3.4, 3.5, 4.0, 4.1). The runtime fix is a reflection-based dispatch that will be validated on Databricks.
Did this PR include necessary documentation updates?