Skip to content

Eliraz/sso#14

Open
stavp-star wants to merge 12 commits into
mainfrom
eliraz/sso
Open

Eliraz/sso#14
stavp-star wants to merge 12 commits into
mainfrom
eliraz/sso

Conversation

@stavp-star

@stavp-star stavp-star commented Jun 15, 2026

Copy link
Copy Markdown
Collaborator

Summary by CodeRabbit

  • New Features

    • Session-based authentication with SSO providers (Google, Keycloak) and dedicated login page.
    • Organization support for secure multi-tenant access.
    • Automated table profiling for Airlines data integration.
    • Protected routes requiring authentication to access app features.
  • Improvements

    • Enhanced form validation and error messaging.
    • Admin approval workflow authentication updates.

@coderabbitai

coderabbitai Bot commented Jun 15, 2026

Copy link
Copy Markdown

Review Change Stack

Important

Review skipped

Too many files!

This PR contains 212 files, which is 62 over the limit of 150.

To get a review, narrow the scope:
• coderabbit review --type committed # exclude uncommitted changes
• coderabbit review --dir # limit to a subdirectory
• coderabbit review --base # compare against a closer base

Upgrade to a paid plan to raise the limit.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 40b5258b-ad6f-438a-8025-bd759c88a44a

📥 Commits

Reviewing files that changed from the base of the PR and between b6a0453 and d9f8eea.

⛔ Files ignored due to path filters (2)
  • backend/uv.lock is excluded by !**/*.lock
  • frontend/pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (212)
  • agent/scripts/upload_all_prompts.py
  • backend/Dockerfile
  • backend/alembic/env.py
  • backend/alembic/versions/4f7c2b9a8e1d_remove_profile_version.py
  • backend/alembic/versions/781457ead87d_add_httpextractor.py
  • backend/alembic/versions/aeb0e2695517_add_status_to_httpextractor.py
  • backend/alembic/versions/e738b4f2c9c2_add_table_embedding.py
  • backend/alembic/versions/f4d69a17d156_add_organization_and_sso_fields.py
  • backend/app/api/auth.py
  • backend/app/config.py
  • backend/app/core/auth.py
  • backend/app/infra_init.py
  • backend/app/main.py
  • backend/app/routers/admin_approval.py
  • backend/app/routers/admin_auth.py
  • backend/app/routers/agent.py
  • backend/app/routers/audit.py
  • backend/app/routers/evaluation.py
  • backend/app/routers/feedback.py
  • backend/app/routers/orchestration.py
  • backend/app/routers/publish.py
  • backend/app/routers/questions.py
  • backend/app/routers/scopes.py
  • backend/app/routers/tables.py
  • backend/app/seed.py
  • backend/app/services/auth.py
  • backend/app/services/evaluator.py
  • backend/app/services/join_detection.py
  • backend/app/services/profiling_engine.py
  • backend/app/services/trino_client.py
  • backend/e2e_test.py
  • backend/pyproject.toml
  • backend/tests/conftest.py
  • backend/tests/test_api.py
  • core/pyproject.toml
  • core/src/core/config.py
  • core/src/core/models/models.py
  • docker-compose.yml
  • frontend/nginx.conf
  • frontend/package.json
  • frontend/src/App.tsx
  • frontend/src/api/auth.ts
  • frontend/src/api/client.ts
  • frontend/src/api/orchestration.ts
  • frontend/src/api/schema.d.ts
  • frontend/src/components/layout/AuthProvider.tsx
  • frontend/src/components/layout/Sidebar.tsx
  • frontend/src/components/tables/TableList.tsx
  • frontend/src/components/wizard/OnboardingWizard.tsx
  • frontend/src/pages/LandingPage.tsx
  • frontend/src/pages/LoginPage.tsx
  • frontend/src/store/authStore.ts
  • frontend/src/styles/globals.css
  • infra/trino/etc/catalog/admin_db.properties
  • infra/trino/etc/catalog/adventureworks.properties
  • infra/trino/etc/catalog/airlines.properties
  • infra/trino/etc/catalog/amazon_vendor_analytics__sample_dataset.properties
  • infra/trino/etc/catalog/austin.properties
  • infra/trino/etc/catalog/bank_sales_trading.properties
  • infra/trino/etc/catalog/baseball.properties
  • infra/trino/etc/catalog/bbc.properties
  • infra/trino/etc/catalog/bls.properties
  • infra/trino/etc/catalog/bowlingleague.properties
  • infra/trino/etc/catalog/braze_user_event_demo_dataset.properties
  • infra/trino/etc/catalog/brazilian_e_commerce.properties
  • infra/trino/etc/catalog/california_traffic_collision.properties
  • infra/trino/etc/catalog/census_bureau_acs_1.properties
  • infra/trino/etc/catalog/census_bureau_acs_2.properties
  • infra/trino/etc/catalog/census_bureau_international.properties
  • infra/trino/etc/catalog/census_bureau_usa.properties
  • infra/trino/etc/catalog/census_galaxy__aiml_model_data_enrichment_sample.properties
  • infra/trino/etc/catalog/census_galaxy__zip_code_to_block_group_sample.properties
  • infra/trino/etc/catalog/chicago.properties
  • infra/trino/etc/catalog/chinook.properties
  • infra/trino/etc/catalog/city_legislation.properties
  • infra/trino/etc/catalog/cms_data.properties
  • infra/trino/etc/catalog/complex_oracle.properties
  • infra/trino/etc/catalog/covid19_jhu_world_bank.properties
  • infra/trino/etc/catalog/covid19_nyt.properties
  • infra/trino/etc/catalog/covid19_open_data.properties
  • infra/trino/etc/catalog/covid19_open_world_bank.properties
  • infra/trino/etc/catalog/covid19_symptom_search.properties
  • infra/trino/etc/catalog/covid19_usa.properties
  • infra/trino/etc/catalog/cptac_pdc.properties
  • infra/trino/etc/catalog/crypto.properties
  • infra/trino/etc/catalog/cymbal_investments.properties
  • infra/trino/etc/catalog/db_imdb.properties
  • infra/trino/etc/catalog/death.properties
  • infra/trino/etc/catalog/delivery_center.properties
  • infra/trino/etc/catalog/deps_dev_v1.properties
  • infra/trino/etc/catalog/dimensions_ai_covid19.properties
  • infra/trino/etc/catalog/e_commerce.properties
  • infra/trino/etc/catalog/ebi_chembl.properties
  • infra/trino/etc/catalog/eclipse_megamovie.properties
  • infra/trino/etc/catalog/ecommerce.properties
  • infra/trino/etc/catalog/education_business.properties
  • infra/trino/etc/catalog/electronic_sales.properties
  • infra/trino/etc/catalog/entertainmentagency.properties
  • infra/trino/etc/catalog/epa_historical_air_quality.properties
  • infra/trino/etc/catalog/ethereum_blockchain.properties
  • infra/trino/etc/catalog/eu_soccer.properties
  • infra/trino/etc/catalog/f1.properties
  • infra/trino/etc/catalog/fda.properties
  • infra/trino/etc/catalog/fec.properties
  • infra/trino/etc/catalog/fhir_synthea.properties
  • infra/trino/etc/catalog/finance__economics.properties
  • infra/trino/etc/catalog/firebase.properties
  • infra/trino/etc/catalog/ga360.properties
  • infra/trino/etc/catalog/ga4.properties
  • infra/trino/etc/catalog/gbif.properties
  • infra/trino/etc/catalog/genomics_cannabis.properties
  • infra/trino/etc/catalog/geo_openstreetmap.properties
  • infra/trino/etc/catalog/geo_openstreetmap_boundaries.properties
  • infra/trino/etc/catalog/geo_openstreetmap_census_places.properties
  • infra/trino/etc/catalog/geo_openstreetmap_worldpop.properties
  • infra/trino/etc/catalog/ghcn_d.properties
  • infra/trino/etc/catalog/github_repos.properties
  • infra/trino/etc/catalog/github_repos_date.properties
  • infra/trino/etc/catalog/global_government.properties
  • infra/trino/etc/catalog/global_weather__climate_data_for_bi.properties
  • infra/trino/etc/catalog/gnomad.properties
  • infra/trino/etc/catalog/goog_blockchain.properties
  • infra/trino/etc/catalog/google_ads.properties
  • infra/trino/etc/catalog/google_dei.properties
  • infra/trino/etc/catalog/google_trends.properties
  • infra/trino/etc/catalog/hacker_news.properties
  • infra/trino/etc/catalog/htan_1.properties
  • infra/trino/etc/catalog/htan_2.properties
  • infra/trino/etc/catalog/human_genome_variants.properties
  • infra/trino/etc/catalog/idc.properties
  • infra/trino/etc/catalog/imdb_movies.properties
  • infra/trino/etc/catalog/iowa_liquor_sales.properties
  • infra/trino/etc/catalog/iowa_liquor_sales_plus.properties
  • infra/trino/etc/catalog/ipl.properties
  • infra/trino/etc/catalog/irs_990.properties
  • infra/trino/etc/catalog/libraries_io.properties
  • infra/trino/etc/catalog/london.properties
  • infra/trino/etc/catalog/meta_kaggle.properties
  • infra/trino/etc/catalog/mitelman.properties
  • infra/trino/etc/catalog/mlb.properties
  • infra/trino/etc/catalog/modern_data.properties
  • infra/trino/etc/catalog/music.properties
  • infra/trino/etc/catalog/ncaa_basketball.properties
  • infra/trino/etc/catalog/ncaa_insights.properties
  • infra/trino/etc/catalog/netherlands_open_map_data.properties
  • infra/trino/etc/catalog/new_york.properties
  • infra/trino/etc/catalog/new_york_citibike_1.properties
  • infra/trino/etc/catalog/new_york_citibike_2.properties
  • infra/trino/etc/catalog/new_york_geo.properties
  • infra/trino/etc/catalog/new_york_ghcn.properties
  • infra/trino/etc/catalog/new_york_noaa.properties
  • infra/trino/etc/catalog/new_york_plus.properties
  • infra/trino/etc/catalog/nhtsa_traffic_fatalities.properties
  • infra/trino/etc/catalog/nhtsa_traffic_fatalities_plus.properties
  • infra/trino/etc/catalog/noaa_data.properties
  • infra/trino/etc/catalog/noaa_data_plus.properties
  • infra/trino/etc/catalog/noaa_global_forecast_system.properties
  • infra/trino/etc/catalog/noaa_gsod.properties
  • infra/trino/etc/catalog/noaa_ports.properties
  • infra/trino/etc/catalog/northwind.properties
  • infra/trino/etc/catalog/nppes.properties
  • infra/trino/etc/catalog/open_images.properties
  • infra/trino/etc/catalog/open_targets_genetics_1.properties
  • infra/trino/etc/catalog/open_targets_genetics_2.properties
  • infra/trino/etc/catalog/open_targets_platform_1.properties
  • infra/trino/etc/catalog/open_targets_platform_2.properties
  • infra/trino/etc/catalog/openaq.properties
  • infra/trino/etc/catalog/oracle_sql.properties
  • infra/trino/etc/catalog/pagila.properties
  • infra/trino/etc/catalog/pancancer_atlas_1.properties
  • infra/trino/etc/catalog/pancancer_atlas_2.properties
  • infra/trino/etc/catalog/patents.properties
  • infra/trino/etc/catalog/patents_google.properties
  • infra/trino/etc/catalog/patents_uspto.properties
  • infra/trino/etc/catalog/patentsview.properties
  • infra/trino/etc/catalog/pypi.properties
  • infra/trino/etc/catalog/san_francisco.properties
  • infra/trino/etc/catalog/san_francisco_plus.properties
  • infra/trino/etc/catalog/school_scheduling.properties
  • infra/trino/etc/catalog/sdoh.properties
  • infra/trino/etc/catalog/sec_filings.properties
  • infra/trino/etc/catalog/sec_quarterly_financials.properties
  • infra/trino/etc/catalog/sf__1000_genomes.properties
  • infra/trino/etc/catalog/sqlite_sakila.properties
  • infra/trino/etc/catalog/stacking.properties
  • infra/trino/etc/catalog/stackoverflow.properties
  • infra/trino/etc/catalog/stackoverflow_plus.properties
  • infra/trino/etc/catalog/sunroof_solar.properties
  • infra/trino/etc/catalog/targetome_reactome.properties
  • infra/trino/etc/catalog/tcga.properties
  • infra/trino/etc/catalog/tcga_bioclin_v0.properties
  • infra/trino/etc/catalog/tcga_hg19_data_v0.properties
  • infra/trino/etc/catalog/tcga_hg38_data_v0.properties
  • infra/trino/etc/catalog/tcga_mitelman.properties
  • infra/trino/etc/catalog/the_met.properties
  • infra/trino/etc/catalog/thelook_ecommerce.properties
  • infra/trino/etc/catalog/us_addresses__poi.properties
  • infra/trino/etc/catalog/us_real_estate.properties
  • infra/trino/etc/catalog/usa_names.properties
  • infra/trino/etc/catalog/usda_nass_agriculture.properties
  • infra/trino/etc/catalog/user_barvaz_hamood.properties
  • infra/trino/etc/catalog/usfs_fia.properties
  • infra/trino/etc/catalog/weather__environment.properties
  • infra/trino/etc/catalog/wide_world_importers.properties
  • infra/trino/etc/catalog/word_vectors_us.properties
  • infra/trino/etc/catalog/world_bank.properties
  • infra/trino/etc/catalog/wwe.properties
  • infra/trino/etc/catalog/yes_energy__sample_data.properties
  • infra/trino/etc/jvm.config
  • scripts/download_images.sh
  • scripts/generate_trino_catalogs.py
  • scripts/trigger_profiling.py

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

This PR adds end-to-end SSO authentication (Google/Keycloak) with organization membership, including DB migration, backend core helpers, FastAPI routing reorganization with session middleware, and a full frontend auth flow (Zustand store, LoginPage, protected routes, sidebar user profile). It also migrates TableList and OnboardingWizard to react-hook-form with Zod, integrates a Snowflake Airlines catalog into Trino with automated profiling at startup, and adds an OpenMetadata cached-token mechanism.

Changes

SSO Authentication End-to-End

Layer / File(s) Summary
Security data models and DB migration
core/src/core/models/models.py, backend/alembic/versions/f4d69a17d156_add_organization_and_sso_fields.py
Adds OrganizationMember join table, Organization model, sso_id/provider fields on SecurityUser, and AuthConfigRead; Alembic migration creates/drops these objects with conditional inspection logic.
Backend auth config and core helpers
backend/app/config.py, backend/app/core/auth.py, backend/pyproject.toml, core/pyproject.toml
Adds AuthSettings(SSOSettings) and auth_settings instance, expands CORS origins, adds python-core-utils[keycloak] dependency; implements sync_user_from_payload, get_current_user, and check_admin.
Auth API endpoints, main.py routing, and admin auth migration
backend/app/api/auth.py, backend/app/main.py, backend/app/routers/admin_approval.py
Adds SSO callback, /config, and /me endpoints; rewires main.py with SessionMiddleware, public/private router split under /api; migrates admin approval endpoints from X-Admin-Email header to check_admin dependency.
Backend test fixture and URL updates
backend/tests/conftest.py, backend/tests/test_api.py
Overrides get_current_user with a fixed admin SecurityUser in the test client fixture; updates all test request paths to /api/... prefix.
Frontend auth store and API client
frontend/src/store/authStore.ts, frontend/src/api/auth.ts, frontend/src/api/client.ts, frontend/src/api/orchestration.ts
Adds useAuthStore Zustand store, typed authApi helpers (getMe, getConfig, logout), enables withCredentials on shared Axios client, consolidates orchestration client.
AuthProvider, LoginPage, and protected routing
frontend/src/components/layout/AuthProvider.tsx, frontend/src/pages/LoginPage.tsx, frontend/src/pages/LandingPage.tsx, frontend/src/App.tsx, frontend/package.json
Adds AuthProvider bootstrap component, LoginPage with provider-gated sign-in buttons, ProtectedRoute/ProtectedAdminRoute guards using useAuthStore, and /login route; LandingPage redirects authenticated users.
Sidebar auth integration and UI styles
frontend/src/components/layout/Sidebar.tsx, frontend/src/styles/globals.css
Sidebar hides Administration nav for non-admin users, renders user profile card and Sign Out button; adds .btn:disabled and --error form control CSS rules.

Frontend Form Validation Migration

Layer / File(s) Summary
TableList react-hook-form migration
frontend/src/components/tables/TableList.tsx
Replaces useState form with useForm+zodResolver for oasis_source_id; adds inline validation errors, mutation error banner, and post-creation form reset.
OnboardingWizard react-hook-form migration
frontend/src/components/wizard/OnboardingWizard.tsx
Replaces local enrichment/questions state with useForm, useFieldArray for columns and questions, Zod schemas per step, step-specific trigger validation in handleNext, and form-driven UI with error banners.

Snowflake Airlines Trino Integration

Layer / File(s) Summary
Trino/Snowflake catalog, JVM, and compose wiring
infra/trino/etc/catalog/airlines.properties, infra/trino/etc/jvm.config, docker-compose.yml, .gitignore
Adds Snowflake connector catalog for Airlines, swaps JVM flag for --add-opens java.nio, updates docker-compose Trino image/volumes/depends_on, activates backend startup command, and parameterizes SSO/Langfuse env vars.
infra_init Trino hardening and Airlines registration
backend/app/infra_init.py
Tightens Trino readiness check; adds MinIO prefix cleanup and DDL retry on non-empty-location errors; adds _verify_custom_catalogs and _ensure_airlines_registered (idempotent DB registration + background profiling threads) wired into init_infrastructure as non-fatal steps.

OpenMetadata Cached Token and Build Tooling

Layer / File(s) Summary
OpenMetadata cached token and request wrapper
backend/app/routers/tables.py
Adds _cached_om_token, get_om_token (login with base64 credentials, fallback to static token), and om_get_request (authenticated GET with 401 retry); updates create_table and sync_table_schema with specific HTTPStatusError handling.
OpenShift image export script and infra tweaks
scripts/download_images.sh, frontend/nginx.conf, backend/Dockerfile
Adds download_images.sh for buildx multi-arch image export to tar files; fixes nginx proxy Host header to $http_host; cleans apt lists in Dockerfile RUN layer.

Sequence Diagram

sequenceDiagram
  participant Browser
  participant LoginPage
  participant SSOProvider as SSO Provider
  participant FastAPI
  participant on_login_success
  participant DB

  rect rgba(30, 80, 160, 0.5)
    note over Browser,LoginPage: Login flow
    Browser->>LoginPage: visit /login
    LoginPage->>FastAPI: GET /v1/auth/config
    FastAPI-->>LoginPage: {ENABLE_GOOGLE, ENABLE_KEYCLOAK}
    Browser->>FastAPI: GET /v1/auth/login/{provider}?next_url=...
    FastAPI-->>Browser: redirect to SSOProvider
    Browser->>SSOProvider: authenticate
    SSOProvider-->>Browser: redirect callback
    Browser->>FastAPI: callback with userinfo
    FastAPI->>on_login_success: userinfo, provider
    on_login_success->>DB: upsert SecurityUser + sync orgs
    DB-->>on_login_success: user
    on_login_success-->>FastAPI: session["user_id"] = user.id
    FastAPI-->>Browser: redirect to /control-center
  end

  rect rgba(20, 120, 60, 0.5)
    note over Browser,FastAPI: Authenticated request flow
    Browser->>FastAPI: GET /api/... (with session cookie)
    FastAPI->>DB: load SecurityUser by session user_id
    DB-->>FastAPI: SecurityUser
    FastAPI-->>Browser: response
  end
Loading

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Poem

🐇 Hoppin' through auth gates, new keys in paw,
SSO tokens flow without a flaw.
Zod schemas guard each form field tight,
Airlines tables profiled overnight.
React routes now check your admin cred —
This bunny ships features, not just bread!

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive The title "Eliraz/sso" is a branch name reference rather than a descriptive summary of the actual changes, which include authentication, SSO integration, organizations, and infrastructure updates. Use a descriptive title that summarizes the main changes, such as "Add SSO authentication with organization support" or "Implement Keycloak/Google SSO and organization memberships".
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch eliraz/sso

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 22

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
backend/app/routers/admin_approval.py (1)

4-4: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Unused imports from auth migration cleanup. After migrating from header-based to session-based authentication via check_admin, two imports are no longer needed.

  • backend/app/routers/admin_approval.py#L4-L4: Remove Header from the fastapi import.
  • backend/app/routers/admin_approval.py#L17-L17: Remove the entire require_admin import line.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/app/routers/admin_approval.py` at line 4, Clean up unused imports
from the authentication migration in backend/app/routers/admin_approval.py. At
line 4-4, remove `Header` from the fastapi import statement since it is no
longer used after switching to session-based authentication. At line 17-17,
remove the entire `require_admin` import line which is also no longer needed
with the `check_admin` session-based approach.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.gitignore:
- Line 40: The `*.conf` pattern in the .gitignore file is overly broad and will
unintentionally ignore all `.conf` files repository-wide, including legitimate
configuration files that should be tracked. Replace the generic `*.conf` pattern
with a more specific pattern that targets only the configuration files you
actually want to ignore (e.g., `snowflake*.conf`), or move this pattern to a
subdirectory-specific `.gitignore` file if the `.conf` files you want to ignore
are contained within a specific directory.

In `@backend/alembic/versions/f4d69a17d156_add_organization_and_sso_fields.py`:
- Line 4: Update the docstring revision statement to match the actual
down_revision value in the migration file. The Revises line in the docstring
(currently stating d35412e44b34) should be changed to match the actual
down_revision variable (aeb0e2695517) to ensure consistency and prevent
confusion during migration operations.
- Around line 78-83: In the downgrade function, before dropping the sso_id
column from the users table, explicitly drop the ix_security_users_sso_id index
that was created during the upgrade. Add an op.drop_index call with the index
name and schema specification before the op.drop_column call for sso_id to
improve migration clarity and ensure cross-database portability, rather than
relying on PostgreSQL's automatic index cleanup.

In `@backend/app/api/auth.py`:
- Around line 29-31: The user parameter in the get_me function lacks a type
annotation. Add the type annotation to the user parameter by importing
SecurityUser from core.models.models and updating the function signature from
def get_me(user = Depends(get_current_user)): to def get_me(user: SecurityUser =
Depends(get_current_user)): to provide proper type information for the
parameter.

In `@backend/app/config.py`:
- Line 15: The CORS_ORIGINS list in the config.py file is missing the default
Vite development server origin. Add "http://localhost:5173" to the CORS_ORIGINS
list (the list currently contains localhost:3000, localhost:8080, and
host.docker.internal:5173) to ensure that credentialed authentication requests
from local frontend development are not blocked by CORS policy.

In `@backend/app/core/auth.py`:
- Around line 24-30: The SecurityUser instantiation is setting the email
parameter to None when no email is provided alongside the sso_id, but the
SecurityUser model defines email as a non-nullable field (email: str =
Field(unique=True, index=True)), causing a database constraint violation. Fix
this by either requiring the email parameter to be provided before creating the
SecurityUser object, or by generating a fallback email value (such as deriving
it from the sso_id and provider) when email is not supplied. Update the logic
before the SecurityUser instantiation to ensure email is never None when passed
to the constructor.
- Around line 38-53: The group sync logic only adds organizations based on the
SSO payload but never removes memberships for groups the user is no longer part
of. After the loop that adds organizations from cleaned_groups, add logic to
remove any organizations from user.organizations that are not in the
cleaned_groups list. This ensures bidirectional synchronization where
organizations are both added when the user joins a group and removed when
they're no longer part of that group in SSO.

In `@backend/app/infra_init.py`:
- Line 675: The import re statement is currently located inside a function at
line 675 instead of at the module level. Move the import re statement to the top
of the file with the other standard library imports to follow Python conventions
and improve code clarity. Remove the import statement from its current location
inside the function and place it at the beginning of the file with other
module-level imports.
- Around line 1069-1146: The _run_profile function has a critical issue where if
the initial TableProfile insert and session.commit() in the first session block
fails, the profile_id variable will never be assigned, causing an
UnboundLocalError when it is referenced at session.get(TableProfile, profile_id)
in the second session block. Wrap the initial profile creation block (from
TableProfile instantiation through session.refresh and profile_id assignment) in
a separate try-except handler to catch and log profile creation failures
distinctly. Additionally, either initialize profile_id to None before the first
session block and check it is set before using it in the second session, or
ensure that profile creation errors are caught early and the function returns
before attempting to use an unset profile_id. This will prevent
UnboundLocalError and allow the error logging to distinguish between profile
creation failures and profiling execution failures.
- Around line 1046-1054: The Table object being created with
owner_id=_SYSTEM_OWNER_ID (where _SYSTEM_OWNER_ID equals "system") references a
user id that does not have a corresponding SecurityUser record in the database.
To fix this, either create a SecurityUser record with id="system" during the
initialization process, or change the owner_id parameter in the Table
constructor to reference an id that corresponds to an existing SecurityUser
record that is already created during initialization.

In `@backend/app/main.py`:
- Around line 69-74: The SessionMiddleware configuration in the
app.add_middleware call is missing the https_only parameter, which defaults to
False and allows session cookies to be transmitted over unencrypted HTTP. Add
https_only=True as a parameter to the SessionMiddleware configuration to ensure
session cookies are only transmitted over HTTPS in production, enhancing the
security of authentication sessions.

In `@backend/app/routers/tables.py`:
- Around line 38-44: The OpenMetadata admin credentials (email
"admin@open-metadata.org" and password "admin") are hardcoded in the httpx.post
call in the try block. Move these hardcoded values to the Settings class in
config.py as two new fields: OPENMETADATA_ADMIN_EMAIL and
OPENMETADATA_ADMIN_PASSWORD with their respective default values. Then, in the
try block of tables.py, replace the hardcoded email string with
settings.OPENMETADATA_ADMIN_EMAIL and replace the hardcoded password "admin"
with settings.OPENMETADATA_ADMIN_PASSWORD before the base64 encoding step.
- Around line 56-68: The om_get_request function has hardcoded verify=False in
both httpx.get calls, which disables TLS certificate verification and creates a
security vulnerability. Replace the hardcoded False values with a configurable
setting: create a new configuration option (e.g., OPENMETADATA_VERIFY_SSL) in
your config module, import it into the routers/tables.py file, and use this
configuration value instead of False in both httpx.get calls within
om_get_request. Additionally, ensure consistency by applying the same
configurable verification setting to the httpx.post call in the get_om_token
function (referenced at line 40-44) so both functions handle SSL verification
uniformly.

In `@backend/Dockerfile`:
- Line 5: The apt-get install command in backend/Dockerfile is installing
recommended packages unnecessarily, which bloats the image size and increases
the attack surface. Add the --no-install-recommends flag to the apt-get install
command to prevent installation of recommended packages that are not explicitly
required, keeping only the specified packages git and openssh-client. This
should be added as a flag to apt-get install alongside the existing -y flag.

In `@backend/pyproject.toml`:
- Line 27: The python-core-utils dependency currently uses a mutable branch
reference `@feat/google-sso` that can change over time, introducing
unpredictability into builds. In backend/pyproject.toml at line 27, replace the
branch reference `@feat/google-sso` in the python-core-utils[keycloak] dependency
entry with a specific commit SHA or signed release tag. In core/pyproject.toml
at line 15, apply the same immutable SHA or tag replacement to the
python-core-utils dependency to ensure consistent resolution across both files.
Use the exact same commit SHA or tag in both locations to maintain alignment
between backend and core dependencies.

In `@frontend/nginx.conf`:
- Around line 17-20: The proxy_set_header directives for both Host and
X-Forwarded-Host are currently using $http_host, which forwards raw
client-supplied host values and increases the trust surface for potential header
injection attacks. Replace $http_host with $host in both the Host header
proxy_set_header directive and the X-Forwarded-Host header proxy_set_header
directive to use the normalized server-defined hostname instead of the raw
client-supplied value.

In `@frontend/src/components/layout/Sidebar.tsx`:
- Around line 185-211: The icon-only Sign Out button relies solely on the title
attribute for accessibility. Add an aria-label attribute to the button element
that contains the LogOut icon to provide an explicit accessible name that
assistive technologies will consistently announce, ensuring proper accessibility
support alongside the existing title attribute.

In `@frontend/src/components/tables/TableList.tsx`:
- Around line 78-81: The useEffect hook in the TableList component calls
createMutation.reset but omits it from the dependency array. Add
createMutation.reset to the dependency array of the useEffect so it becomes
[watchOasisSourceId, createMutation.reset]. This satisfies ESLint's
react-hooks/exhaustive-deps rule which requires that all values from the
component scope that change over time and are used inside the effect must be
included as dependencies.

In `@frontend/src/components/wizard/OnboardingWizard.tsx`:
- Around line 145-154: The sequential question creation loop in
OnboardingWizard.tsx uses await inside a for loop, meaning if one
createQuestionMutation.mutateAsync call fails, earlier questions are already
persisted, leaving partial data. Replace the for loop that iterates through the
questions array with Promise.all to execute all
createQuestionMutation.mutateAsync calls in parallel, ensuring atomicity where
either all questions are created successfully or the entire operation fails,
preventing partial-success scenarios.
- Around line 91-98: The useEffect dependency array in the OnboardingWizard
component includes watch('columns') and watch('questions'), which return new
array references on each form state change, causing the effect to run
unnecessarily. Replace watching the full array objects with watching their
lengths instead by using watch('columns.length') and watch('questions.length'),
or alternatively implement a form subscription approach with a comparison
callback that only triggers when actual array content changes rather than when
reference identity changes. This will prevent inefficient re-execution of
setSubmitError(null) while maintaining the intended behavior of clearing errors
when relevant form fields change.
- Around line 405-468: The OnboardingWizard component is missing a select input
for the `question_type` field in the questions step UI, even though
`questionSchema` defines it as a required enum field with values 'join',
'simple', 'complex', 'geo', 'aggregate', 'time_series'. Add a new form-group
after the difficulty select in the questions step rendering section (within the
card where difficulty is rendered) that includes a select input for
`question_type` using the same pattern as the existing difficulty select.
Register it with the form using register(`questions.${i}.question_type`) and
provide all enum option values so users can explicitly choose the question type
instead of defaulting to 'simple'.

In `@scripts/download_images.sh`:
- Around line 2-3: The header comment on line 2 references the incorrect script
name "build_openshift.sh" when the actual file is "scripts/download_images.sh".
Update the comment on line 2 to correctly reference "download_images.sh" instead
of "build_openshift.sh" to prevent operator confusion when following runbooks.

---

Outside diff comments:
In `@backend/app/routers/admin_approval.py`:
- Line 4: Clean up unused imports from the authentication migration in
backend/app/routers/admin_approval.py. At line 4-4, remove `Header` from the
fastapi import statement since it is no longer used after switching to
session-based authentication. At line 17-17, remove the entire `require_admin`
import line which is also no longer needed with the `check_admin` session-based
approach.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 8f0bbbda-0c26-42e2-8182-a85eb3e8cc40

📥 Commits

Reviewing files that changed from the base of the PR and between 3a80177 and b6a0453.

⛔ Files ignored due to path filters (2)
  • backend/uv.lock is excluded by !**/*.lock
  • frontend/pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (34)
  • .gitignore
  • backend/Dockerfile
  • backend/alembic/versions/f4d69a17d156_add_organization_and_sso_fields.py
  • backend/app/api/auth.py
  • backend/app/config.py
  • backend/app/core/auth.py
  • backend/app/infra_init.py
  • backend/app/main.py
  • backend/app/routers/admin_approval.py
  • backend/app/routers/tables.py
  • backend/pyproject.toml
  • backend/tests/conftest.py
  • backend/tests/test_api.py
  • core/pyproject.toml
  • core/src/core/models/models.py
  • docker-compose.yml
  • frontend/nginx.conf
  • frontend/package.json
  • frontend/src/App.tsx
  • frontend/src/api/auth.ts
  • frontend/src/api/client.ts
  • frontend/src/api/orchestration.ts
  • frontend/src/api/schema.d.ts
  • frontend/src/components/layout/AuthProvider.tsx
  • frontend/src/components/layout/Sidebar.tsx
  • frontend/src/components/tables/TableList.tsx
  • frontend/src/components/wizard/OnboardingWizard.tsx
  • frontend/src/pages/LandingPage.tsx
  • frontend/src/pages/LoginPage.tsx
  • frontend/src/store/authStore.ts
  • frontend/src/styles/globals.css
  • infra/trino/etc/catalog/airlines.properties
  • infra/trino/etc/jvm.config
  • scripts/download_images.sh

Comment thread .gitignore Outdated
Comment thread backend/alembic/versions/f4d69a17d156_add_organization_and_sso_fields.py Outdated
Comment on lines +78 to +83
if 'users' in security_tables:
columns_users = [c['name'] for c in inspector.get_columns('users', schema='security')]
if 'provider' in columns_users:
op.drop_column('users', 'provider', schema='security')
if 'sso_id' in columns_users:
op.drop_column('users', 'sso_id', schema='security')

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial | 💤 Low value

Consider explicitly dropping ix_security_users_sso_id index before the column.

The upgrade creates an index ix_security_users_sso_id on line 34, but the downgrade drops the sso_id column without explicitly dropping the index first. While PostgreSQL auto-drops indexes when the column is removed, explicit handling improves migration clarity and portability.

♻️ Proposed fix
     if 'users' in security_tables:
+        indexes_users = [i['name'] for i in inspector.get_indexes('users', schema='security')]
+        if 'ix_security_users_sso_id' in indexes_users:
+            op.drop_index(op.f('ix_security_users_sso_id'), table_name='users', schema='security')
         columns_users = [c['name'] for c in inspector.get_columns('users', schema='security')]
         if 'provider' in columns_users:
             op.drop_column('users', 'provider', schema='security')
         if 'sso_id' in columns_users:
             op.drop_column('users', 'sso_id', schema='security')
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if 'users' in security_tables:
columns_users = [c['name'] for c in inspector.get_columns('users', schema='security')]
if 'provider' in columns_users:
op.drop_column('users', 'provider', schema='security')
if 'sso_id' in columns_users:
op.drop_column('users', 'sso_id', schema='security')
if 'users' in security_tables:
indexes_users = [i['name'] for i in inspector.get_indexes('users', schema='security')]
if 'ix_security_users_sso_id' in indexes_users:
op.drop_index(op.f('ix_security_users_sso_id'), table_name='users', schema='security')
columns_users = [c['name'] for c in inspector.get_columns('users', schema='security')]
if 'provider' in columns_users:
op.drop_column('users', 'provider', schema='security')
if 'sso_id' in columns_users:
op.drop_column('users', 'sso_id', schema='security')
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/alembic/versions/f4d69a17d156_add_organization_and_sso_fields.py`
around lines 78 - 83, In the downgrade function, before dropping the sso_id
column from the users table, explicitly drop the ix_security_users_sso_id index
that was created during the upgrade. Add an op.drop_index call with the index
name and schema specification before the op.drop_column call for sso_id to
improve migration clarity and ensure cross-database portability, rather than
relying on PostgreSQL's automatic index cleanup.

Comment thread backend/app/api/auth.py
Comment thread backend/app/config.py Outdated
Comment on lines +78 to +81
const watchOasisSourceId = watch('oasis_source_id');
useEffect(() => {
createMutation.reset();
}, [watchOasisSourceId]);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial | 💤 Low value

Missing createMutation.reset in useEffect dependency array.

ESLint's react-hooks/exhaustive-deps rule would flag this. While React Query's reset function is referentially stable in practice, the dependency array should be complete.

Suggested fix
   useEffect(() => {
     createMutation.reset();
-  }, [watchOasisSourceId]);
+  }, [watchOasisSourceId, createMutation]);

Alternatively, if you want to avoid the lint warning while keeping the current behavior:

+  // eslint-disable-next-line react-hooks/exhaustive-deps
   useEffect(() => {
     createMutation.reset();
   }, [watchOasisSourceId]);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const watchOasisSourceId = watch('oasis_source_id');
useEffect(() => {
createMutation.reset();
}, [watchOasisSourceId]);
const watchOasisSourceId = watch('oasis_source_id');
useEffect(() => {
createMutation.reset();
}, [watchOasisSourceId, createMutation]);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@frontend/src/components/tables/TableList.tsx` around lines 78 - 81, The
useEffect hook in the TableList component calls createMutation.reset but omits
it from the dependency array. Add createMutation.reset to the dependency array
of the useEffect so it becomes [watchOasisSourceId, createMutation.reset]. This
satisfies ESLint's react-hooks/exhaustive-deps rule which requires that all
values from the component scope that change over time and are used inside the
effect must be included as dependencies.

Comment on lines +91 to +98
const watchOasisSourceId = watch('oasis_source_id');
const watchTableDescription = watch('table_description');
const watchColumns = watch('columns') || [];
const watchQuestions = watch('questions') || [];

useEffect(() => {
setSubmitError(null);
}, [watchOasisSourceId, watchTableDescription, watchColumns, watchQuestions]);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

useEffect watching arrays may run on every render.

watch('columns') and watch('questions') return new array references when form state changes, potentially triggering this effect more frequently than intended. While setSubmitError(null) is idempotent (no infinite loop), it's inefficient.

Suggested fix: watch scalar field count or use JSON comparison
-  const watchColumns = watch('columns') || [];
-  const watchQuestions = watch('questions') || [];
+  const watchColumns = watch('columns');
+  const watchQuestions = watch('questions');
+  const columnsLength = watchColumns?.length ?? 0;
+  const questionsLength = watchQuestions?.length ?? 0;

   useEffect(() => {
     setSubmitError(null);
-  }, [watchOasisSourceId, watchTableDescription, watchColumns, watchQuestions]);
+  }, [watchOasisSourceId, watchTableDescription, columnsLength, questionsLength]);

Alternatively, if you need to detect content changes within arrays, consider using a form subscription or useWatch with a comparison callback.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const watchOasisSourceId = watch('oasis_source_id');
const watchTableDescription = watch('table_description');
const watchColumns = watch('columns') || [];
const watchQuestions = watch('questions') || [];
useEffect(() => {
setSubmitError(null);
}, [watchOasisSourceId, watchTableDescription, watchColumns, watchQuestions]);
const watchOasisSourceId = watch('oasis_source_id');
const watchTableDescription = watch('table_description');
const watchColumns = watch('columns');
const watchQuestions = watch('questions');
const columnsLength = watchColumns?.length ?? 0;
const questionsLength = watchQuestions?.length ?? 0;
useEffect(() => {
setSubmitError(null);
}, [watchOasisSourceId, watchTableDescription, columnsLength, questionsLength]);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@frontend/src/components/wizard/OnboardingWizard.tsx` around lines 91 - 98,
The useEffect dependency array in the OnboardingWizard component includes
watch('columns') and watch('questions'), which return new array references on
each form state change, causing the effect to run unnecessarily. Replace
watching the full array objects with watching their lengths instead by using
watch('columns.length') and watch('questions.length'), or alternatively
implement a form subscription approach with a comparison callback that only
triggers when actual array content changes rather than when reference identity
changes. This will prevent inefficient re-execution of setSubmitError(null)
while maintaining the intended behavior of clearing errors when relevant form
fields change.

Comment thread frontend/src/components/wizard/OnboardingWizard.tsx
Comment on lines 405 to +468
{step === 'questions' && (
<div>
<h3 style={{ fontWeight: 700, marginBottom: 16 }}>{t('wizard.steps.questions')}</h3>
{questions.map((q, i) => (
<div
key={i}
className="card card--elevated"
style={{ padding: 12, marginBottom: 10 }}
>
<div className="form-group">
<label className="form-label">Question</label>
<input
className="form-input"
value={q.question}
onChange={(e) => updateQuestion(i, { question: e.target.value })}
placeholder="How many orders were placed last month?"
/>
</div>
<div className="form-group">
<label className="form-label">Expected SQL</label>
<textarea
className="form-textarea"
rows={2}
value={q.expected_sql}
onChange={(e) => updateQuestion(i, { expected_sql: e.target.value })}
placeholder="SELECT COUNT(*) FROM ..."
/>
</div>
<select
className="form-select"
value={q.difficulty}
onChange={(e) => updateQuestion(i, { difficulty: e.target.value as any })}
{questionFields.map((field, i) => {
const questionError = errors.questions?.[i]?.question;
const expectedSqlError = errors.questions?.[i]?.expected_sql;

return (
<div
key={field.id}
className="card card--elevated"
style={{ padding: 12, marginBottom: 16 }}
>
<option value="simple">Simple</option>
<option value="medium">Medium</option>
<option value="complex">Complex</option>
</select>
</div>
))}
<button className="btn btn--ghost btn--sm" onClick={addQuestion}>
<div style={{ display: 'flex', justifyContent: 'space-between', alignItems: 'center', marginBottom: 8 }}>
<span className="text-sm fw-600 text-muted">Question #{i + 1}</span>
<button
type="button"
className="btn btn--danger btn--sm"
onClick={() => removeQuestion(i)}
style={{ padding: '2px 6px', fontSize: '11px' }}
>
Remove
</button>
</div>

<div className="form-group">
<label className="form-label">Question</label>
<input
className={`form-input${questionError ? ' form-input--error' : ''}`}
placeholder="How many orders were placed last month?"
{...register(`questions.${i}.question`)}
/>
{questionError && (
<div className="form-error">{questionError.message}</div>
)}
</div>

<div className="form-group">
<label className="form-label">Expected SQL</label>
<textarea
className={`form-textarea${expectedSqlError ? ' form-textarea--error' : ''}`}
rows={2}
placeholder="SELECT COUNT(*) FROM ..."
{...register(`questions.${i}.expected_sql`)}
/>
{expectedSqlError && (
<div className="form-error">{expectedSqlError.message}</div>
)}
</div>

<div className="form-group" style={{ marginBottom: 0 }}>
<label className="form-label">Difficulty</label>
<select
className="form-select"
{...register(`questions.${i}.difficulty`)}
>
<option value="simple">Simple</option>
<option value="medium">Medium</option>
<option value="complex">Complex</option>
</select>
</div>
</div>
);
})}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Missing question_type select field in questions step UI.

The questionSchema (line 26) defines question_type as a required enum with values ['join', 'simple', 'complex', 'geo', 'aggregate', 'time_series'], and the API contract (GoldenQuestionCreate in schema.d.ts) requires this field. However, the UI only renders a difficulty select—there's no input for question_type. All questions will be submitted with the hardcoded default 'simple' from addQuestion() (line 164).

Suggested fix: add question_type select
                   <div className="form-group" style={{ marginBottom: 0 }}>
                     <label className="form-label">Difficulty</label>
                     <select
                       className="form-select"
                       {...register(`questions.${i}.difficulty`)}
                     >
                       <option value="simple">Simple</option>
                       <option value="medium">Medium</option>
                       <option value="complex">Complex</option>
                     </select>
                   </div>
+                  <div className="form-group" style={{ marginBottom: 0, marginTop: 12 }}>
+                    <label className="form-label">Question Type</label>
+                    <select
+                      className="form-select"
+                      {...register(`questions.${i}.question_type`)}
+                    >
+                      <option value="simple">Simple</option>
+                      <option value="join">Join</option>
+                      <option value="complex">Complex</option>
+                      <option value="geo">Geo</option>
+                      <option value="aggregate">Aggregate</option>
+                      <option value="time_series">Time Series</option>
+                    </select>
+                  </div>
                 </div>
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
{step === 'questions' && (
<div>
<h3 style={{ fontWeight: 700, marginBottom: 16 }}>{t('wizard.steps.questions')}</h3>
{questions.map((q, i) => (
<div
key={i}
className="card card--elevated"
style={{ padding: 12, marginBottom: 10 }}
>
<div className="form-group">
<label className="form-label">Question</label>
<input
className="form-input"
value={q.question}
onChange={(e) => updateQuestion(i, { question: e.target.value })}
placeholder="How many orders were placed last month?"
/>
</div>
<div className="form-group">
<label className="form-label">Expected SQL</label>
<textarea
className="form-textarea"
rows={2}
value={q.expected_sql}
onChange={(e) => updateQuestion(i, { expected_sql: e.target.value })}
placeholder="SELECT COUNT(*) FROM ..."
/>
</div>
<select
className="form-select"
value={q.difficulty}
onChange={(e) => updateQuestion(i, { difficulty: e.target.value as any })}
{questionFields.map((field, i) => {
const questionError = errors.questions?.[i]?.question;
const expectedSqlError = errors.questions?.[i]?.expected_sql;
return (
<div
key={field.id}
className="card card--elevated"
style={{ padding: 12, marginBottom: 16 }}
>
<option value="simple">Simple</option>
<option value="medium">Medium</option>
<option value="complex">Complex</option>
</select>
</div>
))}
<button className="btn btn--ghost btn--sm" onClick={addQuestion}>
<div style={{ display: 'flex', justifyContent: 'space-between', alignItems: 'center', marginBottom: 8 }}>
<span className="text-sm fw-600 text-muted">Question #{i + 1}</span>
<button
type="button"
className="btn btn--danger btn--sm"
onClick={() => removeQuestion(i)}
style={{ padding: '2px 6px', fontSize: '11px' }}
>
Remove
</button>
</div>
<div className="form-group">
<label className="form-label">Question</label>
<input
className={`form-input${questionError ? ' form-input--error' : ''}`}
placeholder="How many orders were placed last month?"
{...register(`questions.${i}.question`)}
/>
{questionError && (
<div className="form-error">{questionError.message}</div>
)}
</div>
<div className="form-group">
<label className="form-label">Expected SQL</label>
<textarea
className={`form-textarea${expectedSqlError ? ' form-textarea--error' : ''}`}
rows={2}
placeholder="SELECT COUNT(*) FROM ..."
{...register(`questions.${i}.expected_sql`)}
/>
{expectedSqlError && (
<div className="form-error">{expectedSqlError.message}</div>
)}
</div>
<div className="form-group" style={{ marginBottom: 0 }}>
<label className="form-label">Difficulty</label>
<select
className="form-select"
{...register(`questions.${i}.difficulty`)}
>
<option value="simple">Simple</option>
<option value="medium">Medium</option>
<option value="complex">Complex</option>
</select>
</div>
</div>
);
})}
{step === 'questions' && (
<div>
<h3 style={{ fontWeight: 700, marginBottom: 16 }}>{t('wizard.steps.questions')}</h3>
{questionFields.map((field, i) => {
const questionError = errors.questions?.[i]?.question;
const expectedSqlError = errors.questions?.[i]?.expected_sql;
return (
<div
key={field.id}
className="card card--elevated"
style={{ padding: 12, marginBottom: 16 }}
>
<div style={{ display: 'flex', justifyContent: 'space-between', alignItems: 'center', marginBottom: 8 }}>
<span className="text-sm fw-600 text-muted">Question #{i + 1}</span>
<button
type="button"
className="btn btn--danger btn--sm"
onClick={() => removeQuestion(i)}
style={{ padding: '2px 6px', fontSize: '11px' }}
>
Remove
</button>
</div>
<div className="form-group">
<label className="form-label">Question</label>
<input
className={`form-input${questionError ? ' form-input--error' : ''}`}
placeholder="How many orders were placed last month?"
{...register(`questions.${i}.question`)}
/>
{questionError && (
<div className="form-error">{questionError.message}</div>
)}
</div>
<div className="form-group">
<label className="form-label">Expected SQL</label>
<textarea
className={`form-textarea${expectedSqlError ? ' form-textarea--error' : ''}`}
rows={2}
placeholder="SELECT COUNT(*) FROM ..."
{...register(`questions.${i}.expected_sql`)}
/>
{expectedSqlError && (
<div className="form-error">{expectedSqlError.message}</div>
)}
</div>
<div className="form-group" style={{ marginBottom: 0 }}>
<label className="form-label">Difficulty</label>
<select
className="form-select"
{...register(`questions.${i}.difficulty`)}
>
<option value="simple">Simple</option>
<option value="medium">Medium</option>
<option value="complex">Complex</option>
</select>
</div>
<div className="form-group" style={{ marginBottom: 0, marginTop: 12 }}>
<label className="form-label">Question Type</label>
<select
className="form-select"
{...register(`questions.${i}.question_type`)}
>
<option value="simple">Simple</option>
<option value="join">Join</option>
<option value="complex">Complex</option>
<option value="geo">Geo</option>
<option value="aggregate">Aggregate</option>
<option value="time_series">Time Series</option>
</select>
</div>
</div>
);
})}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@frontend/src/components/wizard/OnboardingWizard.tsx` around lines 405 - 468,
The OnboardingWizard component is missing a select input for the `question_type`
field in the questions step UI, even though `questionSchema` defines it as a
required enum field with values 'join', 'simple', 'complex', 'geo', 'aggregate',
'time_series'. Add a new form-group after the difficulty select in the questions
step rendering section (within the card where difficulty is rendered) that
includes a select input for `question_type` using the same pattern as the
existing difficulty select. Register it with the form using
register(`questions.${i}.question_type`) and provide all enum option values so
users can explicitly choose the question type instead of defaulting to 'simple'.

Comment thread scripts/download_images.sh Outdated
from sqlmodel import Session, desc, select

from app.config import settings
from app.services.evaluator import TextToSQLEvaluator

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why add this import without any use?

_cached_om_token: str | None = None


def get_om_token(force_refresh: bool = False) -> str:

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we just need to use OPENMETADATA_TOKEN? without any log in?

Comment thread backend/app/config.py
APP_ENV: str = "development"
OPENMETADATA_URL: str = "http://localhost:8585"
OPENMETADATA_SERVICE_NAME: str = "local_trino"
RUN_SEED: bool = True

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe run sett and infra not always True? Because we don't want every time we run our backend it will seed automaticly

Comment thread backend/app/seed.py

if __name__ == "__main__":
seed()
if getattr(settings, "RUN_SEED", True):

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why like this and not

Suggested change
if getattr(settings, "RUN_SEED", True):
if settings.RUN_SEED:

Comment thread backend/pyproject.toml
"minio>=7.2.0",
"core",
"mcp>=1.2.0",
"python-core-utils[keycloak] @ git+ssh://git@github.com/matzpen-agency/python-core-utils.git@feat/google-sso#egg=python-core-utils",

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remember before merge or after we ned to change this to be from main and not feat/google-sso

Comment thread core/pyproject.toml
"pgvector>=0.2.0",
"psycopg2-binary==2.9.9",
"python-core-utils[postgres,redis] @ git+https://github.com/matzpen-agency/python-core-utils.git@main",
"python-core-utils[postgres,redis] @ git+ssh://git@github.com/matzpen-agency/python-core-utils.git@feat/google-sso",

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also here remember to revert into main

Comment thread frontend/nginx.conf Outdated

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

notice that this file already exists in file build_openshift.sh
so decide if use it here or in the top level of the project

Comment thread docker-compose.yml
- minio_data:/data
trino:
image: trinodb/trino:435
image: trinodb/trino:443

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why increase the trino? any new feature that we need?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, snowflake connector

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants