
fix(spark): gss initiate failed on hms executors; spark.sql.catalog read options not applied#476

Open
xiaguanglei wants to merge 1 commit into lance-format:main from xiaguanglei:fix/executor-credential-refresh

Conversation

@xiaguanglei xiaguanglei commented Apr 23, 2026

Summary

Fixes org.apache.thrift.transport.TTransportException: GSS initiate failed thrown on Spark executors when reading Lance tables registered in a Kerberized Hive Metastore (both plain SELECT and SQL DML).

This PR does two things:

  1. Stops executors from rebuilding the namespace client unconditionally. Adds a new read option executor_credential_refresh (default true, preserving current behavior). When set to false, executors skip the eager namespace.describeTable() RPC and open the dataset directly by URI using the storage options the driver already obtained.
  2. Makes catalog-level read options actually reach the typed fields. Catalog-level conf (--conf spark.sql.catalog.<name>.executor_credential_refresh=false) is now parsed in withCatalogDefaults(), so spark.sql(...) queries (including SELECT and SQL DML) — which have no spark.read.option(...) attach point — pick up the flag the same way as DataFrameReader-based reads.

Root Cause

Since #353 removed LanceDatasetCache, LanceFragmentScanner.create() unconditionally rebuilds the LanceNamespace client on each Spark executor and binds it back onto LanceSparkReadOptions. This forces the dataset open through Utils.OpenDatasetBuilder's namespaceClient branch, which in turn calls OpenDatasetBuilder.buildFromNamespaceClient() in the Lance Java SDK — and that path issues an eager namespace.describeTable() RPC before handing off to Rust.

For catalogs where the backing service authenticates per-call (HMS over Kerberos, some REST catalogs), Spark executors typically do not have a Kerberos TGT — the --keytab / --principal credentials only reach the driver / ApplicationMaster, while executors run with Hadoop delegation tokens that cannot be used for HMS Thrift SASL. The describeTable RPC therefore fails with:

org.apache.thrift.transport.TTransportException: GSS initiate failed
  at org.lance.namespace.hive2.Hive2ClientPool.newClient(Hive2ClientPool.java:42)
  at org.lance.namespace.hive2.Hive2Namespace.describeTable(Hive2Namespace.java:285)
  at org.lance.OpenDatasetBuilder.buildFromNamespaceClient(OpenDatasetBuilder.java:205)
  at org.lance.OpenDatasetBuilder.build(OpenDatasetBuilder.java:191)
  at org.lance.spark.utils.Utils$OpenDatasetBuilder.build(Utils.java:140)
  at org.lance.spark.internal.LanceFragmentScanner.create(LanceFragmentScanner.java:67)

Driver-side operations (metadata-only queries, count-via-manifest) succeed because the driver has the TGT. The failure only manifests during fragment scans.

Why the Existing Behavior Exists

Rebuilding the namespace client on the executor is not dead code — it keeps the Rust LanceNamespaceStorageOptionsProvider attached so that short-lived vended credentials (STS tokens for S3 / GCS / Azure) returned by describeTable() can be refreshed when they expire mid-scan. Simply removing the rebuild would break long-running scans against object stores that use credential vending.
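The refresh mechanism can be illustrated with a minimal, self-contained sketch. The class and method names below are hypothetical stand-ins, not the actual Lance `LanceNamespaceStorageOptionsProvider` API: a provider re-fetches vended storage options once their expiry has passed, and caches them otherwise.

```java
import java.time.Instant;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for the Rust-side provider described above:
// it re-fetches vended storage options (e.g. STS tokens) once they expire.
class RefreshingStorageOptionsProvider {
    private final Supplier<Map<String, String>> describeTable; // stand-in for the namespace RPC
    private Map<String, String> cached;
    private Instant expiresAt = Instant.MIN;

    RefreshingStorageOptionsProvider(Supplier<Map<String, String>> describeTable) {
        this.describeTable = describeTable;
    }

    synchronized Map<String, String> storageOptions() {
        if (Instant.now().isAfter(expiresAt)) {
            cached = describeTable.get(); // requires live namespace credentials on the caller
            expiresAt = Instant.parse(cached.getOrDefault(
                "expires_at", Instant.now().plusSeconds(900).toString()));
        }
        return cached;
    }
}
```

Note that each refresh re-issues the `describeTable()` call, which is exactly the RPC that fails on executors without a TGT; hence the fix gates the rebuild rather than removing it.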

Fix

1. Gate the executor-side rebuild behind a new option

Add a boolean read option executor_credential_refresh, defaulting to true:

  • true (default): unchanged — executor rebuilds the namespace client and routes through the namespaceClient branch, preserving credential refresh. Safe for all existing users.
  • false: executor skips the rebuild and opens the dataset directly via URI, using the initialStorageOptions the driver already obtained from describeTable() at scan-plan time.
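The gate can be sketched as follows. This is a simplified, self-contained illustration; the real check lives in LanceFragmentScanner.create(), and the class names here are stand-ins for the actual Lance types:

```java
// Illustrative sketch of the gate added in LanceFragmentScanner.create().
// "ReadOptions" is a simplified stand-in for LanceSparkReadOptions.
class ReadOptions {
    final boolean executorCredentialRefresh;
    final String tableUri;
    ReadOptions(boolean refresh, String uri) {
        this.executorCredentialRefresh = refresh;
        this.tableUri = uri;
    }
}

class FragmentScannerSketch {
    static String openPath(ReadOptions opts, boolean namespaceConfigured) {
        // Only rebuild the namespace client (and pay the eager describeTable() RPC)
        // when the user has not opted out of executor-side credential refresh.
        if (namespaceConfigured && opts.executorCredentialRefresh) {
            return "namespaceClient"; // needs live Kerberos creds on the executor
        }
        return "uri"; // open directly with the driver-obtained storage options
    }
}
```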

2. Make catalog-level conf actually reach the typed field

Before this PR, Builder.withCatalogDefaults(catalogConfig) only merged the storage-options map and never parsed typed flags. As a result, the catalog-conf syntax looked like it should work but silently ignored the flag. This PR extracts a parseTypedFlags(Map<String, String>) helper and calls it from both fromOptions() and withCatalogDefaults(), so every recognized read option (not just executor_credential_refresh) now flows from catalog conf into the typed field.

This is what makes the fix usable from SQL DML. Without the withCatalogDefaults parse, a user running DELETE FROM kerberized_hms_lance_table WHERE id = 1 has no way to disable the rebuild — SQL DML has no per-statement .option(...) attach point.
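The intended precedence can be shown with a miniature of the builder. This is a hypothetical simplification of LanceSparkReadOptions.Builder, assuming catalog defaults are applied before per-read options so that the per-read value wins:

```java
import java.util.Map;

// Miniature builder showing the precedence rule:
// catalog conf sets the default, a per-read option overrides it.
class MiniReadOptionsBuilder {
    static final String KEY = "executor_credential_refresh";
    private boolean executorCredentialRefresh = true; // library default

    private void parseTypedFlags(Map<String, String> opts) {
        if (opts.containsKey(KEY)) {
            executorCredentialRefresh = Boolean.parseBoolean(opts.get(KEY));
        }
    }

    MiniReadOptionsBuilder withCatalogDefaults(Map<String, String> catalogConf) {
        parseTypedFlags(catalogConf); // previously missing: catalog conf never reached the typed field
        return this;
    }

    MiniReadOptionsBuilder fromOptions(Map<String, String> perReadOptions) {
        parseTypedFlags(perReadOptions); // applied last, so per-read .option(...) wins
        return this;
    }

    boolean build() { return executorCredentialRefresh; }
}
```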

Configuration surfaces after this PR

| Surface | Example | Works? |
|---|---|---|
| Per-read option | spark.read.option("executor_credential_refresh", "false").table(...) (DataFrameReader) | Yes (already worked before this PR; not available for spark.sql("SELECT ...")) |
| Catalog conf + plain SELECT | --conf spark.sql.catalog.lance.executor_credential_refresh=false + spark.sql("SELECT * FROM lance.db.t") | Yes (fixed by this PR) |
| Catalog conf + SQL DML | --conf spark.sql.catalog.lance.executor_credential_refresh=false + spark.sql("DELETE FROM lance.db.t WHERE id=1") | Yes (fixed by this PR) |

Intended usage for HMS + Kerberos deployments:

spark-submit ... \
  --keytab /etc/keytabs/my.keytab \
  --principal my/principal@REALM \
  --conf spark.sql.catalog.lance.executor_credential_refresh=false

Per-Namespace Trade-off Analysis

The refresh callback is meaningful only for namespaces that actually return storage_options from describeTable(). Survey of the impls in lance-namespace-impls:

| Namespace | describeTable() populates storage_options? | Cost of executor_credential_refresh=false |
|---|---|---|
| Hive2Namespace | No — setLocation only | None. The refresh callback is a no-op for HMS regardless of underlying storage. |
| Hive3Namespace | No — setLocation only | None. Same as Hive2. |
| GlueNamespace | Static config.getStorageOptions() | Effectively none for plain Glue. Use the default if you rely on LakeFormation vended creds. |
| IcebergNamespace (REST) | Yes — vended creds typical | Long scans against vended creds will fail when the credential expires. |
| PolarisNamespace | Yes — vended creds typical | Same as Iceberg REST. |
| UnityNamespace | Yes — Databricks-vended temp creds | Same as Iceberg REST. |

Concretely, for the HMS + S3 case: HMS does not vend S3 credentials (describeTable() only sets location), so the executor's S3 access is governed entirely by the AWS SDK credential chain (instance profile / hive-site.xml / env vars / ~/.aws/credentials) and the AWS SDK handles all STS rotation independently. The Lance refresh callback would have nothing to refresh, so disabling it costs nothing in practice.

Scope of Change

  • LanceSparkReadOptions.java:
    • New constant CONFIG_EXECUTOR_CREDENTIAL_REFRESH, new field executorCredentialRefresh (default true), builder / getter / withVersion propagation / equals / hashCode, Javadoc covering per-namespace trade-off.
    • Extracted Builder.parseTypedFlags(Map<String, String>) helper from the previously duplicated fromOptions body, now called from both fromOptions() and withCatalogDefaults(). This incidentally also fixes silent ignores of push_down_filters, batch_size, topN_push_down, etc. when set at the catalog level — a pre-existing latent issue uncovered while fixing the primary bug.
  • LanceFragmentScanner.java: add && readOptions.isExecutorCredentialRefresh() to the existing rebuild if, inline comment explaining the trade-off.
  • LanceSparkReadOptionsSerializationTest.java: six new tests covering default value, map parsing, serialization round-trip, withVersion propagation, catalog-defaults path, and per-read override precedence.

No public API signature is changed; no existing behavior is altered for users who do not set the new option.

Test Plan

New unit tests (all 305 tests in lance-spark-base_2.12 pass locally):

  • testExecutorCredentialRefreshDefaultsToTrue — default value preserved.
  • testExecutorCredentialRefreshParsedFromOptions — flag honored from both "true" and "false" map entries.
  • testExecutorCredentialRefreshSurvivesSerialization — flag survives Java serialization (critical: it must reach the executor).
  • testExecutorCredentialRefreshPreservedByWithVersion — flag propagated by withVersion() used during scan-plan version pinning.
  • testExecutorCredentialRefreshFromCatalogDefaults — new; guards the catalog-conf path used by SQL DML.
  • testPerReadOptionOverridesCatalogDefaults — new; pins the precedence rule "per-read .option(...) wins over catalog default".
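The serialization round-trip test matters because Spark ships read options from driver to executor via Java serialization. A self-contained sketch of that check, using a miniature Serializable options class (the name MiniOptions is hypothetical, standing in for LanceSparkReadOptions):

```java
import java.io.*;

// Miniature Serializable options class standing in for LanceSparkReadOptions;
// the round-trip mirrors what Spark does when shipping read options to executors.
class MiniOptions implements Serializable {
    private static final long serialVersionUID = 1L;
    final boolean executorCredentialRefresh;

    MiniOptions(boolean refresh) { this.executorCredentialRefresh = refresh; }

    static MiniOptions roundTrip(MiniOptions in) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(in);
        }
        try (ObjectInputStream inStream =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (MiniOptions) inStream.readObject();
        }
    }
}
```

If the flag were transient or omitted from the serialized form, the executor would silently fall back to the default and rebuild the namespace client anyway.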

Integration test (out-of-band, on internal YARN + HMS + Kerberos + HDFS cluster):

  1. spark-submit with --keytab / --principal, Kerberized HMS, SELECT * FROM lance_hms_table via lance-namespace-hive2.
  2. Before fix: executor task fails with GSS initiate failed on describeTable. Reproducible across multiple partitions / runs.
  3. After fix with --conf spark.sql.catalog.lance.executor_credential_refresh=false: scan completes, returns expected row count and sample rows.
  4. Default path (true) with the same fix jar: behavior unchanged.

Backward Compatibility

Default is true, so every existing job behaves identically without touching configs. Only users who explicitly set the new option to false opt into the new path.

@github-actions github-actions Bot added the bug Something isn't working label Apr 23, 2026
@xiaguanglei
Author

Related: #353. cc @LuciferYang @hamersaw, could you please take a look? Thank you.

@LuciferYang
Contributor

LuciferYang commented Apr 24, 2026

This PR documents three ways to set executor_credential_refresh:

| Route | Status |
|---|---|
| --conf spark.sql.catalog.&lt;name&gt;.executor_credential_refresh=false | Broken — catalog conf never reaches the typed field |
| spark.read.option("executor_credential_refresh", "false").table(...) | Works for plain SELECT |
| SQL DML (DELETE / UPDATE / MERGE INTO) | No escape exists — SQL DML has no .option(...) attach point, and the catalog conf is broken |

Why the catalog conf is broken.
In lance-spark-base_2.12/src/main/java/org/lance/spark/LanceSparkReadOptions.java, Builder.withCatalogDefaults() (lines 489-495) only merges storageOptions — it never parses typed flags. So executor_credential_refresh set at the catalog level lands in the raw-options map but never makes it into the typed executorCredentialRefresh field.

Net effect. A user running DELETE FROM kerberized_hms_lance_table WHERE id = 1 cannot disable the rebuild on any Spark version, and GSS initiate failed still fires. The PR fixes the GSS bug for SELECT but not for DML.

Required changes (one file: LanceSparkReadOptions.java)

  1. Extract a private helper in Builder:

    private void parseTypedFlags(Map<String, String> opts) {
        if (opts.containsKey(CONFIG_PUSH_DOWN_FILTERS)) {
            this.pushDownFilters = Boolean.parseBoolean(opts.get(CONFIG_PUSH_DOWN_FILTERS));
        }
        if (opts.containsKey(CONFIG_BLOCK_SIZE)) {
            this.blockSize = Integer.parseInt(opts.get(CONFIG_BLOCK_SIZE));
        }
        if (opts.containsKey(CONFIG_VERSION)) {
            this.version = Integer.parseInt(opts.get(CONFIG_VERSION));
        }
        if (opts.containsKey(CONFIG_INDEX_CACHE_SIZE)) {
            this.indexCacheSize = Integer.parseInt(opts.get(CONFIG_INDEX_CACHE_SIZE));
        }
        if (opts.containsKey(CONFIG_METADATA_CACHE_SIZE)) {
            this.metadataCacheSize = Integer.parseInt(opts.get(CONFIG_METADATA_CACHE_SIZE));
        }
        if (opts.containsKey(CONFIG_BATCH_SIZE)) {
            int parsedBatchSize = Integer.parseInt(opts.get(CONFIG_BATCH_SIZE));
            Preconditions.checkArgument(parsedBatchSize > 0, "batch_size must be positive");
            this.batchSize = parsedBatchSize;
        }
        if (opts.containsKey(CONFIG_TOP_N_PUSH_DOWN)) {
            this.topNPushDown = Boolean.parseBoolean(opts.get(CONFIG_TOP_N_PUSH_DOWN));
        }
        if (opts.containsKey(CONFIG_NEAREST)) {
            nearest(opts.get(CONFIG_NEAREST));
        }
        if (opts.containsKey(CONFIG_EXECUTOR_CREDENTIAL_REFRESH)) {
            this.executorCredentialRefresh =
                Boolean.parseBoolean(opts.get(CONFIG_EXECUTOR_CREDENTIAL_REFRESH));
        }
    }
  2. Replace the inline typed-flag parses in fromOptions() (lines 453-479) with a single call:

    public Builder fromOptions(Map<String, String> options) {
        this.storageOptions = new HashMap<>(options);
        parseTypedFlags(options);
        return this;
    }
  3. Add the same call to withCatalogDefaults() after the merge:

    public Builder withCatalogDefaults(LanceSparkCatalogConfig catalogConfig) {
        Map<String, String> merged = new HashMap<>(catalogConfig.getStorageOptions());
        merged.putAll(this.storageOptions);
        this.storageOptions = merged;
        parseTypedFlags(catalogConfig.getStorageOptions());  // NEW
        return this;
    }

After these edits, --conf spark.sql.catalog.<name>.executor_credential_refresh=false works uniformly for plain SELECT, per-read .option(...), and SQL DML. No changes to LancePositionDeltaOperation.java are required.

Please also update the PR description's "Intended usage" example — the catalog-conf syntax will finally work after this edit, so the example is correct once the edit lands.

@xiaguanglei xiaguanglei force-pushed the fix/executor-credential-refresh branch from 290f2f9 to b832b6f on April 24, 2026 at 10:21
…ptions not taking effect

Add executor_credential_refresh (default true). When false, skip executor-side namespace rebuild to avoid describeTable() on Kerberized HMS without a TGT (GSS initiate failed on fragment scans).

Parse typed read flags in Builder.withCatalogDefaults() so keys under spark.sql.catalog.<name> (e.g. executor_credential_refresh, batch_size, push_down_filters) apply to plain SQL and DML, not only DataFrameReader.option paths.
@xiaguanglei xiaguanglei force-pushed the fix/executor-credential-refresh branch from b832b6f to fb46a06 on April 24, 2026 at 13:36
@xiaguanglei
Author


@LuciferYang Thanks for the review and for catching the withCatalogDefaults issue — that was a great find, and it really helped tighten the fix.
I've now verified this on our YARN cluster: with --conf spark.sql.catalog.&lt;name&gt;.executor_credential_refresh=false, the option is picked up correctly and fragment scans complete successfully. So the flag works as intended and disables the executor-side namespace rebuild (the path that was hitting HMS and failing with GSS initiate failed on executors).
Could you please take another look at the updated PR when you have a moment?

@xiaguanglei xiaguanglei changed the title fix(spark): fix GSS initiate failed on executors when reading Lance tables via Hive Metastore fix(spark): fix GSS initiate failed on executors when reading Lance tables via Hive Metastore and catalog read options not taking effect Apr 24, 2026
@github-actions
Contributor

ACTION NEEDED
Lance follows the Conventional Commits specification for release automation.

The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.

For details on the error please inspect the "PR Title Check" action.

@xiaguanglei xiaguanglei changed the title fix(spark): fix GSS initiate failed on executors when reading Lance tables via Hive Metastore and catalog read options not taking effect fix(spark): GSS initiate failed on HMS executors; catalog read config not applied Apr 24, 2026
@xiaguanglei xiaguanglei changed the title fix(spark): GSS initiate failed on HMS executors; catalog read config not applied fix(spark): gss initiate failed on hms executors; spark.sql.catalog read options not applied Apr 24, 2026
