[RFC] feat(tables): Add Storage Location and Storage Location Updates#516

Draft
mkuchenbecker wants to merge 8 commits into linkedin:main from mkuchenbecker:mkuchenb/sl

@mkuchenbecker (Collaborator) commented Mar 27, 2026

Tables occasionally need to be rebalanced across storage paths, and when that happens the table's location must be updated safely. For example, in S3 you may need to rebalance across prefixes over time depending on the data layout of a bucket, and in HDFS you may need to rebalance across physical instances.

The PR contains a 450-line markdown demo, as well as Docker and shell script changes to run the demo.

Main changes:

  1. A StorageLocation entity that identifies a particular path. A table may contain N storage locations. Note: operations such as Orphaned File Deletion and Stats Collection need updates so that they cover all of these directories.
  2. A table may update its storage location.

Change 2 is handled by allocating the new storage location and then making a commit at the destination.
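
As a rough illustration of the two changes above, here is a minimal in-memory sketch in Python. It is not the code in this PR: the class and function names (`StorageLocation`, `Table`, `allocate_location`, `commit_at`, `directories_to_maintain`) and the URI layout are assumptions made for the example. It shows a table accumulating N locations, an update done as "allocate, then commit at the destination", and why maintenance jobs have to look at every location rather than only the active one.

```python
import uuid
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class StorageLocation:
    """Hypothetical stand-in for the StorageLocation entity: an id plus a path."""
    location_id: str
    uri: str


@dataclass
class Table:
    """Hypothetical table record: every location ever allocated, plus the active one."""
    name: str
    locations: List[StorageLocation] = field(default_factory=list)
    active: Optional[StorageLocation] = None


def allocate_location(table: Table, base_uri: str) -> StorageLocation:
    """Allocate a new location for the table under base_uri (assumed layout)."""
    loc = StorageLocation(str(uuid.uuid4()), f"{base_uri.rstrip('/')}/{table.name}")
    table.locations.append(loc)
    return loc


def commit_at(table: Table, loc: StorageLocation) -> None:
    """Make loc the active location; only previously allocated locations are allowed."""
    if loc not in table.locations:
        raise ValueError("location must be allocated before committing to it")
    table.active = loc


def directories_to_maintain(table: Table) -> List[str]:
    """Every directory that orphan-file deletion or stats collection would need to visit."""
    return [loc.uri for loc in table.locations]


# Create the table at prefix-a, then move it to prefix-b.
t = Table("db.events")
first = allocate_location(t, "s3://bucket/prefix-a")
commit_at(t, first)
second = allocate_location(t, "s3://bucket/prefix-b")
commit_at(t, second)
```

After the move, `t.active` is the second location, but `directories_to_maintain(t)` still returns both paths, which is the reason the note in item 1 calls out Orphaned File Deletion and Stats Collection.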

Summary

[Issue] Briefly discuss the summary of the changes made in this pull request in 2-3 lines.

Changes

  • Client-facing API Changes
  • Internal API Changes
  • Bug Fixes
  • New Features
  • Performance Improvements
  • Code Style
  • Refactoring
  • Documentation
  • Tests

For all the boxes checked, please include additional details of the changes made in this pull request.

Testing Done

  • Manually Tested on local docker setup. Please include commands ran, and their output.
  • Added new tests for the changes made.
  • Updated existing tests to reflect the changes made.
  • No tests added or updated. Please explain why. If unsure, please feel free to ask for help.
  • Some other form of testing like staging or soak time in production. Please explain.

For all the boxes checked, include a detailed description of the testing done for the changes made in this pull request.

Additional Information

  • Breaking Changes
  • Deprecations
  • Large PR broken into smaller PRs, and PR plan linked in the description.

For all the boxes checked, include additional details of the changes made in this pull request.

mkuchenbecker and others added 2 commits March 27, 2026 11:13
Add PATCH /v1/databases/{db}/tables/{tbl}/storageLocation endpoint
to swap the active Iceberg metadata.location for a table. Includes
schema for storage_location and table_storage_location HTS tables,
StorageLocationRepository integration, auto-registration of initial
storage location on table create/update, and mock controller tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
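
For readers who want to see the shape of a call to this endpoint, the following Python sketch builds (but does not send) a PATCH request against the path named in the commit message. Only the path comes from the commit; the base URL, the database and table names, and the `storageLocationId` body field are assumptions for illustration, since the request schema is not shown here.

```python
import json
from urllib.parse import quote
from urllib.request import Request


def build_storage_location_patch(base_url: str, db: str, tbl: str,
                                 storage_location_id: str) -> Request:
    """Build a PATCH request for the storageLocation endpoint without sending it."""
    # The path template is taken from the commit message.
    path = (
        f"/v1/databases/{quote(db, safe='')}"
        f"/tables/{quote(tbl, safe='')}/storageLocation"
    )
    # Hypothetical payload: the real request body may use different fields.
    body = json.dumps({"storageLocationId": storage_location_id}).encode("utf-8")
    return Request(
        base_url.rstrip("/") + path,
        data=body,
        method="PATCH",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical host, database, table, and location id.
req = build_storage_location_patch("http://localhost:8000", "demo_db", "events", "loc-123")
```

Passing `req` to `urllib.request.urlopen` would send it; that step is left out because the server address and authentication depend on the local setup.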

Reorder updateStorageLocation to persist the table-storage-location
association before committing the Iceberg metadata location swap.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
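
The commit message does not state the motivation for the reorder. One plausible reading is that persisting the association first means a failed commit leaves only an extra tracked location, never an active location that nothing tracks. The Python sketch below simulates that ordering; the dictionaries, the `update_storage_location` signature, and the `commit_ok` switch are assumptions for the example, not the PR's code.

```python
class CommitFailed(Exception):
    """Raised by the simulated metadata commit."""


def update_storage_location(associations, active, table, new_location, commit_ok=True):
    """Persist the association first, then swap the active location."""
    # Step 1: record that the table owns new_location.
    associations.setdefault(table, set()).add(new_location)
    # Step 2: simulated commit of the metadata location swap.
    if not commit_ok:
        raise CommitFailed(f"commit to {new_location} failed")
    active[table] = new_location


old = "s3://bucket/prefix-a/db.events"
new = "s3://bucket/prefix-b/db.events"
associations = {"db.events": {old}}
active = {"db.events": old}

# A failed commit leaves the table on its old location; the new location is
# tracked but unused.
try:
    update_storage_location(associations, active, "db.events", new, commit_ok=False)
except CommitFailed:
    pass
after_failure = active["db.events"]
tracked_after_failure = after_failure in associations["db.events"]

# Retrying is safe: the association already exists and the commit goes through.
update_storage_location(associations, active, "db.events", new)
```

With the opposite order, a failure between the two steps could leave the table pointing at a location that has no association row, which maintenance jobs would then miss.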

@mkuchenbecker changed the title from "[RFC] feat(tables): add storage location swap endpoint" to "[RFC] feat(tables): Add Storage Location and Storage Location Updates" on Mar 27, 2026
mkuchenbecker and others added 6 commits March 27, 2026 12:15
Add HouseTables StorageLocation CRUD controller, JPA entities, and
service layer. Add StorageLocationRepository in internalcatalog for
cross-service HTTP calls. Wire auto-allocation of storage locations
on table create/update in OpenHouseTablesApiHandler, and include
e2e tests for the StorageLocation controller.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
### Step 5: Migrate [locked]

Comment to link to. The proposal is that we manage storage locations with pure SQL, based on properties of the table and the storage.
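
To make "pure SQL" concrete, here is a small sketch using Python's built-in `sqlite3`. The table names `storage_location` and `table_storage_location` come from the commit messages in this PR; every column name and all of the sample rows are assumptions, since the actual schema is not shown here. The query counts tables per location, which is one kind of property a rebalancing decision could be based on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical columns; only the table names appear in the PR.
    CREATE TABLE storage_location (
        storage_location_id TEXT PRIMARY KEY,
        uri                 TEXT NOT NULL
    );
    CREATE TABLE table_storage_location (
        database_id         TEXT NOT NULL,
        table_id            TEXT NOT NULL,
        storage_location_id TEXT NOT NULL
            REFERENCES storage_location (storage_location_id)
    );
    INSERT INTO storage_location VALUES
        ('loc-a', 's3://bucket/prefix-a'),
        ('loc-b', 's3://bucket/prefix-b');
    INSERT INTO table_storage_location VALUES
        ('db', 'events',   'loc-a'),
        ('db', 'clicks',   'loc-a'),
        ('db', 'sessions', 'loc-b');
    """
)

# Tables per location, most loaded first: candidates for rebalancing.
rows = conn.execute(
    """
    SELECT sl.uri, COUNT(*) AS table_count
    FROM table_storage_location AS tsl
    JOIN storage_location AS sl
      ON sl.storage_location_id = tsl.storage_location_id
    GROUP BY sl.uri
    ORDER BY table_count DESC, sl.uri
    """
).fetchall()
```

A migration step could then pick tables from the most loaded location and issue storage location updates for them; how candidates are actually chosen is left to the proposal.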

@@ -0,0 +1,434 @@
# Openhouse Table Location Update and Rollback

Comment to link to.
