Add BigQuery Metastore federation support #4050
Conversation
dimas-b
left a comment
Thanks for your contribution, @joyhaldar ! The PR LGTM 👍 Still, I believe we need to open a dev discussion for it before merging as it includes REST API changes.
```yaml
ICEBERG_REST: "#/components/schemas/IcebergRestConnectionConfigInfo"
HADOOP: "#/components/schemas/HadoopConnectionConfigInfo"
HIVE: "#/components/schemas/HiveConnectionConfigInfo"
BIGQUERY: "#/components/schemas/BigQueryMetastoreConnectionConfigInfo"
```

Please open a dev ML discussion for this (it's customary for all REST API changes).
Thank you for your review Dmitri.
I have an existing dev ML thread for this. Replied with the PR link: https://lists.apache.org/thread/n3hh5s1zxn6yn7cbowfgf8p029z6m7g1
Would you like me to start a new discussion?
I opened https://lists.apache.org/thread/m05xm7szd7znrm9yos1rnld5ljx04y0h specifically for community awareness of REST API changes.
```java
connectionConfigInfoDpo.asIcebergCatalogProperties(polarisCredentialManager));

BigQueryMetastoreCatalog bigQueryMetastoreCatalog = new BigQueryMetastoreCatalog();
bigQueryMetastoreCatalog.initialize(warehouse, mergedProperties);
```
How / where does it get BigQuery connection credentials?
Thank you for your review Dmitri.
BigQuery Metastore uses Application Default Credentials via Iceberg's BigQueryProperties.
Credentials are resolved automatically from the environment, GOOGLE_APPLICATION_CREDENTIALS, gcloud auth, or attached Service Account.
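The lookup order described above can be sketched as follows. This is a hypothetical helper for illustration only — the real resolution chain lives inside Google's auth library (google-auth-library-java), not in Polaris; `resolveCredentialSource` and its parameters are assumptions, not real APIs.

```java
import java.util.Map;

public class AdcOrderSketch {
  // Hypothetical illustration of the Application Default Credentials (ADC)
  // lookup order; the real logic is implemented by google-auth-library.
  public static String resolveCredentialSource(
      Map<String, String> env, boolean gcloudLoggedIn, boolean onGce) {
    if (env.containsKey("GOOGLE_APPLICATION_CREDENTIALS")) {
      // 1. Explicit service-account key file pointed to by the env var
      return "service-account key file: " + env.get("GOOGLE_APPLICATION_CREDENTIALS");
    }
    if (gcloudLoggedIn) {
      // 2. User credentials from `gcloud auth application-default login`
      return "gcloud user credentials";
    }
    if (onGce) {
      // 3. Service account attached to the VM/pod, via the metadata server
      return "attached service account (metadata server)";
    }
    return "no credentials found";
  }

  public static void main(String[] args) {
    System.out.println(
        resolveCredentialSource(
            Map.of("GOOGLE_APPLICATION_CREDENTIALS", "/secrets/sa.json"), true, true));
  }
}
```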
So, if storage is also in GCP, the BigQuery connection and storage connections will have to use the same credentials, effectively (in current Polaris code).
It's not a big deal ATM, I think... just trying to understand the full picture :)
Yes, that's my understanding also. Both BigQuery Metastore and GCS storage will use the same ADC credentials, or impersonated SA if configured. Thank you for clarifying the full picture. Sorry for the late reply.
Thanks for the explanation, @joyhaldar. Can we comment out this "hidden" behavior somewhere?
Done. Added a comment explaining the ADC credential behavior in this commit.
Force-pushed 8f471ad to a3db29b.
CI failure is unrelated to this PR; it's a MongoDB testcontainer timeout if I understand correctly.
Force-pushed c6980e7 to 07b00dc.
jbonofre
left a comment
The PR is solid overall, but I have a concern about a potential NPE. Can you please clarify?
```java
.add("gcpProjectId", gcpProjectId)
.add("gcpLocation", gcpLocation)
.add("listAllTables", listAllTables)
.add("impersonateServiceAccount", impersonateServiceAccount)
```
Is it OK to include impersonateServiceAccount in toString()?
It could contain sensitive data, I guess, so it could potentially land in log messages.
Maybe we should remove the account details or obfuscate them?
Thank you for your review JB.
Done. I have removed all impersonation fields from toString().
```java
// IMPLICIT authentication.
AuthenticationParametersDpo authenticationParametersDpo =
    connectionConfigInfoDpo.getAuthenticationParameters();
if (authenticationParametersDpo.getAuthenticationTypeCode()
```
Should we include a null check here ?
If connectionConfigInfoDpo.getAuthenticationParameters() returns null, then authenticationParametersDpo.getAuthenticationTypeCode() will throw NPE.
Since the constructor marks authenticationParameters as required = false I guess it can happen.
Thank you for your review JB.
I copied this pattern from HiveFederatedCatalogFactory and HadoopFederatedCatalogFactory which also don't have null checks. Should I add the null check here and open a separate PR to fix Hive and Hadoop as well? Or keep it consistent with existing code for now?
I'd say let's make new code robust immediately and fix old code later.
Note: IIRC, Hive federation is not even bundled by default (see polaris/runtime/server/build.gradle.kts, lines 42 to 43 in b67e1a5).
IIUC, This path seems to rely on authenticationParameters being present, while the public BIGQUERY contract still appears to allow it to be omitted. I wonder if that mismatch should be resolved at validation / schema time rather than here in the runtime path.
Done. Added null check here. If authenticationParameters is null, BigQuery uses ADC by default.
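The guard described here can be sketched as below. Note this is a simplified illustration, not the Polaris code: `AuthParams` and `AuthType` are hypothetical stand-ins for the real `AuthenticationParametersDpo` types.

```java
public class AuthGuardSketch {
  // Simplified stand-ins for the Polaris DPO types, for illustration only.
  enum AuthType { IMPLICIT, OAUTH }
  record AuthParams(AuthType type) {}

  // Mirrors the fix discussed above: treat a missing auth block as IMPLICIT
  // (Application Default Credentials) instead of dereferencing it and risking an NPE.
  static AuthType effectiveAuthType(AuthParams params) {
    if (params == null) {
      return AuthType.IMPLICIT; // BigQuery falls back to ADC
    }
    return params.type();
  }

  public static void main(String[] args) {
    System.out.println(effectiveAuthType(null)); // missing auth block -> IMPLICIT
  }
}
```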
```java
AuthenticationParametersDpo.fromAuthenticationParametersModelWithSecrets(
    bigqueryConfigModel.getAuthenticationParameters(), secretReferences);
String bigqueryUri =
    bigqueryConfigModel.getUri() != null
```
nit: If URI is null, we default to https://bigquery.googleapis.com. Maybe this default should live in DPO for visibility (or maybe OpenAPI as you started a discussion to update it 😄 ).
Done. I have moved the default URI to a public constant in BigQueryMetastoreConnectionConfigInfoDpo.
flyingImer
left a comment
I'm aligned with the direction here overall. I think there are two high-priority issues worth tightening before this moves forward:

1. The current BIGQUERY auth contract does not look aligned yet. IIUC, the API/model path still allows `authenticationParameters` to be omitted, while the runtime only supports `IMPLICIT` auth and later dereferences that field as if it were always present. That feels like more than a local null-safety issue; the boundary contract is still looser than the implementation.
2. Since this adds a new federation connector across OpenAPI, model conversion, and runtime wiring, I think we should have at least one round-trip/wiring test for `BIGQUERY` before the contract settles.

2 cents: this PR also seems like a useful point to clarify whether federation here is aiming for a common Polaris contract over native Iceberg catalogs, or for backend-specific adapters behind one REST surface. I think that answer affects how strict/explicit the connector contract should be.
```yaml
items:
  type: string
description: Delegation chain for impersonation
required:
```
IIUC, BIGQUERY can still omit authenticationParameters at the API boundary, but the current implementation only supports IMPLICIT auth and later code paths assume the field is present. Would it make sense to tighten the BIGQUERY contract here instead, so the boundary reflects what the runtime actually supports today?
Thank you for the feedback. I have added null checks in both BigQueryMetastoreFederatedCatalogFactory and asConnectionConfigInfoModel. If authenticationParameters is null, BigQuery uses ADC by default. Please let me know if you'd like me to change the approach. Sorry for the late reply.
```java
.setImpersonateLifetimeSeconds(impersonateLifetimeSeconds)
.setImpersonateScopes(impersonateScopes)
.setImpersonateDelegates(impersonateDelegates)
.setAuthenticationParameters(
```
Related contract gap: the readback/model-conversion path also seems to assume authenticationParameters is non-null here. So even if the create/update boundary accepts a BIGQUERY payload without an auth block, we can still fail later when converting the stored config back to the API model.
I have added null check in asConnectionConfigInfoModel.
Thank you for the detailed review. I have tried to address both points.

Please let me know if you'd like any changes.
```java
// IMPLICIT authentication.
AuthenticationParametersDpo authenticationParametersDpo =
    connectionConfigInfoDpo.getAuthenticationParameters();
if (authenticationParametersDpo != null
```
What's the use case for permitting federated catalogs without an AuthenticationParametersDpo object?
Same concern here. Should we fail fast when authenticationParametersDpo is null?
BigQuery always uses Application Default Credentials or the GOOGLE_APPLICATION_CREDENTIALS environment variable to automatically pick up the user's credentials or a JSON file pointing to their Service Account. Making authenticationParameters required would just force users to type IMPLICIT for no reason IMO, but please let me know what you think, or please let me know if you think I am incorrect.
I think we will need an AuthenticationType regardless of how BigQuery picks up credentials. In this case, it should be IMPLICIT (I agreed with line 58 here). However, from a syntax perspective, we may not want users to specify it explicitly. We could default it when the AuthenticationType is missing. That said, I would not recommend handling the defaulting logic here; defaulting to IMPLICIT should happen elsewhere. WDYT?
I believe declaring the IMPLICIT authentication explicitly 😉 is actually good. It increases clarity about the intended catalog behaviour. It is also more robust in case of future API and backend code changes.
Agreed. The point is that even if we want a default value for better UX, this is not the place to handle the defaulting logic.
@joyhaldar, sorry to miss this one, I think we will actually need this to be resolved. I'd suggest adding it to the method fromConnectionConfigInfoModelWithSecrets(), something like this:

```java
case BIGQUERY:
  BigQueryMetastoreConnectionConfigInfo bigqueryConfigModel =
      (BigQueryMetastoreConnectionConfigInfo) connectionConfigurationModel;
  // Default to IMPLICIT authentication if not provided
  authenticationParameters =
      bigqueryConfigModel.getAuthenticationParameters() != null
          ? AuthenticationParametersDpo.fromAuthenticationParametersModelWithSecrets(
              bigqueryConfigModel.getAuthenticationParameters(), secretReferences)
          : new ImplicitAuthenticationParametersDpo();
```
No, I am sorry, I missed it.
I tried adding the defaulting in fromConnectionConfigInfoModelWithSecrets(), but it doesn't work: PolarisServiceImpl.validateAuthenticationParameters() runs before conversion and throws an NPE when authenticationParameters is null, IIUC.
Fixing that would mean changing PolarisServiceImpl too, which seems out of scope for this PR.
For now I'm going with @dimas-b's suggestion, users must pass authenticationParameters: {authenticationType: IMPLICIT} explicitly (commit).
Let me know if my understanding is correct. Would appreciate your guidance.
```yaml
gcpLocation:
  type: string
  description: The GCP location (default = us)
listAllTables:
  type: boolean
  description: Whether to list all tables (default = true)
impersonateServiceAccount:
  type: string
  description: Service account email to impersonate
impersonateLifetimeSeconds:
  type: integer
  description: Token lifetime in seconds for impersonation (default = 3600)
impersonateScopes:
  type: array
  items:
    type: string
  description: OAuth scopes for impersonation (default = cloud-platform)
impersonateDelegates:
  type: array
  items:
    type: string
  description: Delegation chain for impersonation
```
How often does the BQMS config surface change? For example, when new fields are introduced or existing fields are updated. Locking every field into the spec creates a maintenance burden, since each upstream change requires a Polaris spec PR, client regeneration, and a new release.
Instead, we could consider using a bag of open properties to keep the model more flexible:

```yaml
properties:
  type: object
  additionalProperties:
    type: string
```
You are right, I will work on addressing this.
Done. Refactored to use a properties bag. Only warehouse and gcpProjectId remain as required fields, everything else goes into the map. Thank you for your help.
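For illustration, a BIGQUERY connection config under the properties-bag shape might look roughly like this. The field names are assumptions pieced together from the discussion (not the final spec), and the project/warehouse values are placeholders:

```yaml
# Hypothetical example payload; exact field names may differ in the merged spec.
connectionConfigInfo:
  connectionType: BIGQUERY
  uri: https://bigquery.googleapis.com
  authenticationParameters:
    authenticationType: IMPLICIT   # actual credentials resolved via ADC
  properties:
    gcp.bigquery.project-id: my-gcp-project
    warehouse: gs://my-bucket/warehouse
    gcp.bigquery.location: us
```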
flyrain
left a comment
Thanks @joyhaldar for adding this. I think it would be really useful. The PR is in the right direction. Left some comments.
```java
private static final String IMPERSONATE_LIFETIME_SECONDS =
    "gcp.bigquery.impersonate.lifetime-seconds";
private static final String IMPERSONATE_SCOPES = "gcp.bigquery.impersonate.scopes";
private static final String IMPERSONATE_DELEGATES = "gcp.bigquery.impersonate.delegates";
```
For these property keys, should we just use the iceberg sdk's properties?
The property keys in Iceberg's BigQueryProperties are package-private, so I had to duplicate them here. I can add a comment referencing the Iceberg source. Let me know what you think.
```java
private final String impersonateServiceAccount;
private final Integer impersonateLifetimeSeconds;
private final List<String> impersonateScopes;
private final List<String> impersonateDelegates;
```
These are authentication-related properties; wondering if we need to create a new authentication type for GCP?
Thank you for the review. These fields are now part of the properties bag.
XJDKC
left a comment
Side Question: if we can federate to BigLake, we should be able to federate to BigQuery through BigLake right?
jbonofre
left a comment
I have two major questions/concerns:

- As we introduce new dependencies (for BigQuery and its transitive dependencies), the LICENSE/NOTICE should be updated, subject to the second point.
- Do we ship BigQuery Metastore federation by default in the server? If yes, the LICENSE/NOTICE of the server should be updated. If no, there is no need to update it.
```java
.add("gcpProjectId", gcpProjectId)
.add("gcpLocation", gcpLocation)
.add("listAllTables", listAllTables)
.add("authenticationParameters", getAuthenticationParameters())
```
I know I had a comment in this toString() method, and you addressed it. Thanks a lot.
Sorry to be a pain, but I have a follow up question 😄
Here we directly call getAuthenticationParameters(). authenticationParameters is @Nullable.
It's similar to HiveConnectionConfigInfoDpo which has the same nullable pattern.
If other DPOs call toString() on the auth params, this could throw NPE.
The existing Hadoop DPO has authenticationParameters as @Nonnull and calls toString() explicitly.
Maybe we should do something similar here to prevent NPE ?
We can do something simple:

```java
// before
.add("authenticationParameters", getAuthenticationParameters())
// after
.add("authenticationParameters",
    getAuthenticationParameters() != null ? getAuthenticationParameters().toString() : "null")
```
Done. Added the null check as suggested. Thank you again JB!
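The null-safe pattern agreed on above can be sketched with plain string concatenation. This is an illustration only: `describe()` is a hypothetical stand-in, and the real DPO builds its toString() with a MoreObjects-style helper rather than manual concatenation.

```java
public class ToStringSketch {
  // Illustration of the null-safe toString() pattern suggested above.
  // The explicit null check avoids an NPE when the @Nullable field is absent.
  static String describe(Object authenticationParameters) {
    return "BigQueryMetastoreConnectionConfigInfoDpo{"
        + "authenticationParameters="
        + (authenticationParameters != null ? authenticationParameters.toString() : "null")
        + "}";
  }

  public static void main(String[] args) {
    System.out.println(describe(null)); // safe even when the field is null
  }
}
```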
flyrain
left a comment
Thanks for continuing to work on it, @joyhaldar! I think we are getting closer. Left some comments. Could you fix the CI failures as well?
A few follow-up items (not blockers):
- Document this new feature
- CLI changes to support BQMS.
Made it optional for now to unblock CI. Will follow up on LICENSE/NOTICE updates separately if we decide to ship it by default.
```java
connectionConfigInfoDpo.asIcebergCatalogProperties(polarisCredentialManager));

// Credentials are resolved via Google Application Default Credentials (ADC).
// GCS storage operations use the same ADC credentials.
```
The second statement is not 100% accurate. In principle a user can override GCS storage credentials:
I'd propose adding ... by default.
Updated the comment to say by default.
flyingImer
left a comment
This is directionally better now, but I still think there are two blocker-level contract issues here.

1. BIGQUERY can still accept auth modes that the connector itself does not actually support. The connector is still IMPLICIT-only, but the create/update path validates auth too generically, so an invalid BIGQUERY catalog can still be accepted and persisted, then fail later at catalog instantiation time.
2. The `authenticationParameters` contract is still inconsistent. The schema/DPO/factory path still treats it as effectively optional, but the create/admin validation path dereferences it as if it were required. So right now a missing auth block can still turn into a null-deref path instead of a deterministic validation/defaulting decision.

My preference would be to make this narrower and explicit before merge:
- validate BIGQUERY auth per connection type at create/update time, not only later in the connector, and
- pick one contract for missing `authenticationParameters` (required + IMPLICIT, or missing means IMPLICIT) and make the spec/validation/DPO path all match.
```yaml
properties:
  type: object
  additionalProperties:
    type: string
  description: Additional catalog properties
```
Thanks for adding this. Thinking more broadly, this properties would be useful for other federated catalog type like HMS. I'd proposed to move it up to its parent ConnectionConfigInfo, so that all subtypes can benefit from it. WDYT? cc @XJDKC @HonahX @singhpk234 @dimas-b
Yes, the more I think about it the more I like the idea of supporting general properties inside ConnectionConfigInfo. That REST API change should be able to support both BigQuery Catalog federation and #3729 (CC: @PhillHenry).
Here is another use case that justifies it. A client for the remote catalog may support more properties than those defined in the spec. Without a freeform properties field, we would need to keep updating the spec to accommodate properties already supported by existing clients (for example HMS or BQMS), which is both unnecessary and heavy. With properties, users can simply update their federated catalog in the metastore instead.
I have moved properties to parent ConnectionConfigInfo so all federation types can use it.
Co-authored-by: Joy Haldar <Joy.Haldar@target.com>
Co-authored-by: Joy Haldar <joy.haldar@target.com>
Force-pushed 1074346 to 65ab8be.
…am change Co-authored-by: Joy Haldar <joy.haldar@target.com>
```java
properties.put(GCP_BIGQUERY_PROJECT_ID, gcpProjectId);
properties.put(CatalogProperties.WAREHOUSE_LOCATION, warehouse);
```
Why not use the generic getProperties() map (line 108) for these parameter too?
These are required fields in the spec. projectId is validated as required in BigQueryProperties.java, and warehouse is validated in BigQueryMetastoreCatalog.java. The properties bag is for optional settings like gcp.bigquery.location. But let me know if this is not appropriate, would love your opinion.
From my POV this creates a skew in the Polaris API, which becomes dependent on particular Iceberg java classes.
Conceptually, this class merely passes some user-defined properties to the Iceberg catalog impl. This is fine. However, do we have to elevate a sub-set of those properties to explicit Polaris OpenAPI properties?.. I'm not sure.
I believe keeping the Polaris API generic and uniform for all federated catalog properties might be preferable.
Implementations (both Iceberg java classes and Polaris classes) can and should perform config validation as appropriate, of course.
Can I do this in a follow-up PR?
I'm trying to figure out the right pattern here: Hadoop and Hive have warehouse as an explicit field, but BigQueryMetastoreCatalog needs both warehouse and projectId. Should I:

1. Keep `warehouse` explicit (consistent with Hadoop/Hive) and move `projectId` to properties?
2. Move both to properties (fully generic, but diverging from the Hadoop/Hive pattern)?
3. Something else?
Would appreciate your guidance.
Retracting API changes is not ideal (in a follow-up).
I'd propose to remove gcpProjectId and warehouse from OpenAPI explicit properties and process corresponding Iceberg properties from the generic properties bag. Also, add description to OpenAPI YAML to list required properties.
My rationale is that it is fine to accept pass-through properties destined to become Iceberg (federated) catalog properties. Yet, specific processing / validation is a feature of the implementation and can change in new Polaris versions without requiring API changes.
I wonder what other reviewers think about this.
@dimas-b's proposal works for me, as long as it's clearly documented. That said, I'm also fine keeping them, as they're "bigquery" specific.
Re: Hadoop and Hive: from my POV the same argument applies to them, but I missed out on the review of that code when it was introduced 😅
This is why my request in this thread is non-blocking.
However, I do not really see a reason to promote an old pattern if we're in agreement about the rationale for generic properties in this case (hence my request for more reviewer opinions).
I have moved warehouse and gcpProjectId to the properties map with validation in the factory and documented required properties in the spec in this commit.
flyrain
left a comment
LGTM. Thanks @joyhaldar !
dimas-b
left a comment
The PR LGTM overall 👍
Feel free to consider my point about properties as non-blocking.
flyrain
left a comment
The defaulting logic needs to be moved.
…y validation Co-authored-by: Joy Haldar <joy.haldar@target.com>
Co-authored-by: Joy Haldar <joy.haldar@target.com>
```diff
   @JsonProperty(value = "warehouse", required = false) @Nullable String remoteCatalogName) {
-    super(ConnectionType.HADOOP.getCode(), uri, authenticationParameters, serviceIdentityInfo);
+    super(
+        ConnectionType.HADOOP.getCode(), uri, authenticationParameters, serviceIdentityInfo, null);
```
nit: I think we can spare adding these ugly 'null' parameters in the super calls. We can overload the public constructor of ConnectionConfigInfoDpo like this:

```java
public ConnectionConfigInfoDpo(
    @JsonProperty(value = "connectionTypeCode", required = true) int connectionTypeCode,
    @JsonProperty(value = "uri", required = true) @Nonnull String uri,
    @JsonProperty(value = "authenticationParameters", required = true) @Nullable
        AuthenticationParametersDpo authenticationParameters,
    @JsonProperty(value = "serviceIdentity", required = false) @Nullable
        ServiceIdentityInfoDpo serviceIdentity,
    @JsonProperty(value = "properties", required = false) @Nullable
        Map<String, String> properties) {
  this(connectionTypeCode, uri, authenticationParameters, serviceIdentity, true, properties);
}

public ConnectionConfigInfoDpo(
    @JsonProperty(value = "connectionTypeCode", required = true) int connectionTypeCode,
    @JsonProperty(value = "uri", required = true) @Nonnull String uri,
    @JsonProperty(value = "authenticationParameters", required = true) @Nullable
        AuthenticationParametersDpo authenticationParameters,
    @JsonProperty(value = "serviceIdentity", required = false) @Nullable
        ServiceIdentityInfoDpo serviceIdentity) {
  this(connectionTypeCode, uri, authenticationParameters, serviceIdentity, true, null);
}
```
I have added the overloaded constructor and updated the subclasses.
```java
if (warehouse == null || warehouse.isEmpty()) {
  throw new IllegalArgumentException("warehouse is required for BigQuery Metastore federation");
}
if (properties.get("gcp.bigquery.project-id") == null) {
```
Is empty string accepted for gcp.bigquery.project-id?
I have added empty string validation for gcp.bigquery.project-id.
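The combined null/empty validation can be sketched as below. `requireNonEmpty` is a hypothetical helper for illustration, not the actual Polaris factory code; only the property key mirrors the PR.

```java
import java.util.Map;

public class PropertyValidationSketch {
  // Sketch of the validation discussed above: reject both missing and
  // empty values for required federation properties.
  static String requireNonEmpty(Map<String, String> properties, String key) {
    String value = properties.get(key);
    if (value == null || value.isEmpty()) {
      throw new IllegalArgumentException(
          key + " is required for BigQuery Metastore federation");
    }
    return value;
  }

  public static void main(String[] args) {
    System.out.println(
        requireNonEmpty(Map.of("gcp.bigquery.project-id", "demo"), "gcp.bigquery.project-id"));
  }
}
```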
```java
 */
public class BigQueryMetastoreConnectionConfigInfoDpo extends ConnectionConfigInfoDpo {

  public static final String DEFAULT_URI = "https://bigquery.googleapis.com";
```
nit: I think we can keep this private.
Actually it's public because it's used in ConnectionConfigInfoDpo.fromConnectionConfigInfoModelWithSecrets() for URI defaulting. I kept it in the BQMS DPO since it's specific to BQMS. But let me know if you disagree, would appreciate your inputs.
Oh okay, I missed that it is used there. Makes sense, thanks!
Co-authored-by: Joy Haldar <joy.haldar@target.com>
nandorKollar
left a comment
I'm not an expert of BigQuery, but overall LGTM.
dimas-b
left a comment
LGTM 👍 Thanks for bearing with me 🙂
```yaml
BigQueryMetastoreConnectionConfigInfo:
  type: object
  description: |
    Configuration necessary for connecting to a BigQuery Metastore Catalog.
```
nit: maybe make a note that properties are effectively Iceberg's BigQuery catalog properties (for clarity)?
```java
if (warehouse == null || warehouse.isEmpty()) {
  throw new IllegalArgumentException("warehouse is required for BigQuery Metastore federation");
}
String projectId = properties.get("gcp.bigquery.project-id");
```
nit: make a constant, or refer to a constant from the Iceberg jars?
No, thank you for taking the time to thoroughly review my PR and giving me so much constructive feedback!

Merge is blocked by GitHub CI. Re-triggered. I will merge it today if there is no more feedback.

I believe there is an option to re-run failed CI jobs without closing/reopening the PR 🤔 +1 to merging today.

Thanks @joyhaldar for the change! It's a very useful feature. Thanks everyone for the review.

@dimas-b, got it. Mind sharing the option?
Adds federation support for BigQuery Metastore catalogs, enabling Polaris to serve as a unified REST catalog interface for Iceberg tables managed in BigQuery Metastore.
Changes
- BigQueryMetastoreFederatedCatalogFactory - factory for creating BigQuery Metastore catalog handles
- BigQueryMetastoreConnectionConfigInfoDpo - connection configuration with all BigQueryProperties
- BIGQUERY connection type in ConnectionType
- BigQueryMetastoreConnectionConfigInfo

Checklist

- CHANGELOG.md (if needed)
- site/content/in-dev/unreleased (if needed)