fix(fs-bq-import-collection): so it respects existing firestore-bigquery-export BigQuery configuration during backfill #2785

Open
IzaakGough wants to merge 6 commits into next from @invertase/fix-import-script
@IzaakGough IzaakGough commented Apr 21, 2026

Summary

Fix fs-bq-import-collection so it respects existing firestore-bigquery-export BigQuery configuration during backfill.

The import script was not passing view/clustering settings into FirestoreBigQueryEventHistoryTracker, which caused problems when importing into an extension deployment using materialized latest views and clustering. In that state, the import path could treat *_raw_latest as a standard view and could also trigger unnecessary changelog metadata updates.

This PR:

  • adds import CLI support for viewType and clustering
  • threads those settings through both non-interactive and interactive config flows
  • passes them into FirestoreBigQueryEventHistoryTracker
  • fixes changelog update detection to use fetched table metadata and proper boolean field checks

Problem

For an extension configured with:

  • VIEW_TYPE=materialized_incremental
  • CLUSTERING=timestamp

the import script did not initialize the change tracker with equivalent settings. That meant the backfill path could diverge from the deployed extension’s BigQuery resource configuration.
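A minimal sketch of how "equivalent settings" might be derived from the extension parameters. It is illustrative only and not the PR's actual code: `toTrackerViewSettings` and the body of `parseClustering` are hypothetical, and the view type values other than `materialized_incremental` are assumptions, since only that value appears in this PR.

```typescript
// Assumed set of view types; only "materialized_incremental" is named in this PR.
type ViewType =
  | "view"
  | "materialized_incremental"
  | "materialized_non_incremental";

// The three settings this PR passes into FirestoreBigQueryEventHistoryTracker.
interface TrackerViewSettings {
  clustering: string[] | null;
  useMaterializedView: boolean;
  useIncrementalMaterializedView: boolean;
}

// Hypothetical parser: turns a comma-separated string such as "a, b" into
// ["a", "b"], and returns null when no usable field name is present.
function parseClustering(raw: string | undefined): string[] | null {
  if (!raw) return null;
  const fields = raw
    .split(",")
    .map((field) => field.trim())
    .filter((field) => field.length > 0);
  return fields.length > 0 ? fields : null;
}

// Hypothetical mapping from the CLI/extension values to tracker settings.
function toTrackerViewSettings(
  viewType: ViewType,
  rawClustering?: string
): TrackerViewSettings {
  const incremental = viewType === "materialized_incremental";
  const materialized =
    incremental || viewType === "materialized_non_incremental";
  return {
    clustering: parseClustering(rawClustering),
    useMaterializedView: materialized,
    useIncrementalMaterializedView: incremental,
  };
}
```

Under these assumptions, `toTrackerViewSettings("materialized_incremental", "timestamp")` yields `clustering: ["timestamp"]` with both materialized view flags set to `true`, which matches the configuration described above.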

In practice this caused two issues:

  • *_raw_latest could be handled as a normal view instead of a materialized view
  • existing *_raw_changelog metadata could be treated as needing update even when it already matched

Changes

  • Add --view-type and --clustering options to the import CLI

  • Parse and return viewType and clustering from both:
    - non-interactive mode
    - interactive prompt mode

  • Pass the following into FirestoreBigQueryEventHistoryTracker:
    - clustering
    - useMaterializedView
    - useIncrementalMaterializedView

  • Fix changelog update detection by:
    - comparing against fresh table.getMetadata() results rather than table.metadata
    - using boolean field existence checks instead of find() return values
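The update detection fix can be sketched as follows. This is a simplified illustration under stated assumptions rather than the PR's `tableRequiresUpdate` implementation: the metadata shape is reduced to the two parts used here, all function names are hypothetical, and `fetchMetadata` stands in for `table.getMetadata()`, which in the `@google-cloud/bigquery` client resolves to a tuple whose first element is the metadata.

```typescript
// Minimal shapes for illustration; real BigQuery table metadata has more fields.
interface FieldLike {
  name: string;
}

interface TableMetadataLike {
  schema?: { fields?: FieldLike[] };
  clustering?: { fields?: string[] };
}

// some() always yields a strict boolean, whereas find() yields the matching
// element or undefined, which is easy to misuse in boolean comparisons.
function hasField(metadata: TableMetadataLike, name: string): boolean {
  return (metadata.schema?.fields ?? []).some((field) => field.name === name);
}

// Compare the clustering reported by the table against the desired setting.
function clusteringMatches(
  metadata: TableMetadataLike,
  desired: string[] | null
): boolean {
  const current = metadata.clustering?.fields ?? [];
  const wanted = desired ?? [];
  return (
    current.length === wanted.length &&
    current.every((field, index) => field === wanted[index])
  );
}

// Hypothetical decision helper: an update is needed only when a required
// column is missing or the clustering differs from the desired one.
function changelogNeedsUpdate(
  metadata: TableMetadataLike,
  requiredFields: string[],
  desiredClustering: string[] | null
): boolean {
  const missingField = requiredFields.some((name) => !hasField(metadata, name));
  return missingField || !clusteringMatches(metadata, desiredClustering);
}

// Fetch first, then decide, so the comparison never relies on a possibly
// stale or unpopulated locally cached copy of the metadata.
async function needsUpdateFromFreshMetadata(
  fetchMetadata: () => Promise<[TableMetadataLike, unknown]>,
  requiredFields: string[],
  desiredClustering: string[] | null
): Promise<boolean> {
  const [metadata] = await fetchMetadata();
  return changelogNeedsUpdate(metadata, requiredFields, desiredClustering);
}
```

With a table that already has a `timestamp` column and is clustered on `timestamp`, `changelogNeedsUpdate(metadata, ["timestamp"], ["timestamp"])` returns `false`, so no metadata write would be issued in that case.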

Validation

Validated locally against an existing extension-managed dataset configured with:

  • VIEW_TYPE=materialized_incremental
  • CLUSTERING=timestamp

Observed behavior after this change:

  • existing *_raw_latest materialized view is recognized and skipped when config matches
  • import no longer updates changelog metadata unnecessarily in the verified case
  • import completes successfully in both:
    - non-interactive mode
    - interactive mode

Notes

There is still a log line from clustering initialization that can read as if clustering was updated even when no metadata write occurs. In the verified flow, that appears to be a logging artifact rather than an actual BigQuery table update.

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces support for clustering and materialized views in the BigQuery export import script. Key changes include updating the CLI to accept --view-type and --clustering options, refining the tableRequiresUpdate logic to handle metadata explicitly, and switching to .some() for column existence checks. Feedback focuses on ensuring these new options are supported in the interactive setup path, cleaning up commented-out imports, improving type safety in the parseClustering function, and correcting a typo in the CLI help text.

Comment thread firestore-bigquery-export/scripts/import/src/config.ts
Comment thread firestore-bigquery-export/scripts/import/src/config.ts Outdated
Comment thread firestore-bigquery-export/scripts/import/src/config.ts Outdated
Comment thread firestore-bigquery-export/scripts/import/src/program.ts Outdated
Comment thread firestore-bigquery-export/scripts/import/src/program.ts Outdated
IzaakGough and others added 4 commits April 21, 2026 14:23
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@IzaakGough IzaakGough marked this pull request as ready for review April 21, 2026 14:10
@IzaakGough IzaakGough requested a review from cabljac April 21, 2026 14:11
@cabljac cabljac changed the title from "@invertase/fix import script" to "fix(fs-bq-import-collection): so it respects existing firestore-bigquery-export BigQuery configuration during backfill" Apr 21, 2026