AWS: Handle S3 Table Bucket purge gracefully in GlueCatalog (#14449) #16073

Open
yadavay-amzn wants to merge 1 commit into apache:main from yadavay-amzn:fix/14449-glue-s3-table-purge

@yadavay-amzn
Contributor

When calling GlueCatalog.dropTable() with purge=true on a table in an S3 Table Bucket, the purge fails because S3 Table Buckets do not allow manual file deletion.

This change wraps CatalogUtil.dropTableData() in a try-catch so that purge failures are logged as warnings instead of propagating and failing the entire drop operation. The table is still successfully dropped from the Glue catalog.
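
For readers who want the pattern in isolation, here is a minimal, self-contained sketch of a best-effort purge. It does not use the Iceberg or Glue APIs: `BestEffortDrop`, its `dropTable` signature, and the `warnings` list (standing in for `LOG.warn`) are all hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a catalog; this is NOT the GlueCatalog API.
final class BestEffortDrop {
  // Stand-in for a logger: collects what a real implementation would log at WARN.
  final List<String> warnings = new ArrayList<>();

  /**
   * Removes the table from the catalog first, then attempts the purge.
   * A failure in removeFromCatalog still propagates; a failure in purgeData
   * is recorded as a warning and does not fail the drop.
   */
  boolean dropTable(
      String identifier, boolean purge, Runnable removeFromCatalog, Runnable purgeData) {
    removeFromCatalog.run();
    if (purge) {
      try {
        purgeData.run();
      } catch (Exception e) {
        warnings.add(
            "Failed to purge data for table: " + identifier
                + ", continuing drop without purge: " + e.getMessage());
      }
    }
    return true;
  }
}
```

With this shape, a purge step that throws (for example because the storage layer rejects file deletion) leaves one warning behind while `dropTable` still returns `true`.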

Closes #14449

@github-actions github-actions Bot added the AWS label Apr 21, 2026
…4449)

When calling GlueCatalog.dropTable() with purge=true on a table in an
S3 Table Bucket, the purge fails because S3 Table Buckets do not allow
manual file deletion. This change wraps CatalogUtil.dropTableData() in
a try-catch so that purge failures are logged as warnings instead of
propagating and failing the entire drop operation.

Closes apache#14449
@yadavay-amzn yadavay-amzn force-pushed the fix/14449-glue-s3-table-purge branch from 840236c to e996c80 Compare April 21, 2026 22:02
LOG.info("Glue table {} data purged", identifier);
try {
CatalogUtil.dropTableData(ops.io(), lastMetadata);
LOG.info("Glue table {} data purged", identifier);
Member

Is there any way to check whether the target table is stored in an S3 Table Bucket?

Contributor Author

The location could be checked for S3 Table Bucket ARN patterns, but catching the exception is more robust as it handles any case where purge fails (permissions, bucket policies, etc.) without needing to enumerate all possible URI formats.
Looks like this also aligns with the Trino approach you linked!

Happy to add a URI check if you'd prefer a more targeted approach though.
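
For reference, a location check might look roughly like the sketch below. Both patterns are assumptions on my part rather than a verified or exhaustive list: the `arn:aws:s3tables:` ARN prefix and the `--table-s3` bucket-name suffix. The class and method names are hypothetical, and the need to guess at patterns like these is exactly the fragility mentioned above.

```java
import java.util.Locale;

// Hypothetical helper; not part of Iceberg or the AWS SDK.
final class S3TableLocationCheck {
  // Assumption: ARNs of S3 Tables resources start with this prefix.
  private static final String ARN_PREFIX = "arn:aws:s3tables:";
  // Assumption: storage locations of tables in a table bucket use a bucket
  // name ending with this suffix.
  private static final String BUCKET_SUFFIX = "--table-s3";

  private S3TableLocationCheck() {}

  /** Heuristic only: returns true if the location matches one of the assumed patterns. */
  static boolean looksLikeS3TableLocation(String location) {
    if (location == null) {
      return false;
    }
    String lower = location.toLowerCase(Locale.ROOT);
    if (lower.startsWith(ARN_PREFIX)) {
      return true;
    }
    int schemeEnd = lower.indexOf("://");
    if (schemeEnd < 0) {
      return false;
    }
    String scheme = lower.substring(0, schemeEnd);
    if (!scheme.equals("s3") && !scheme.equals("s3a") && !scheme.equals("s3n")) {
      return false;
    }
    // The bucket is everything between "://" and the next "/".
    String rest = lower.substring(schemeEnd + 3);
    int slash = rest.indexOf('/');
    String bucket = slash < 0 ? rest : rest.substring(0, slash);
    return bucket.endsWith(BUCKET_SUFFIX);
  }
}
```

The bucket name is parsed by hand instead of with `java.net.URI`, because `URI.getHost()` can return `null` for bucket names that are not valid hostnames.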

Member

I was wondering if we could do both (S3 Table check + try-catch) to avoid redundant S3 requests and warning logs. I think we should keep the try-catch regardless of S3 Table because it may fail for other reasons.

The "Enumerate all possible URI formats" approach doesn't look straightforward. Only adding try-catch looks good to me.

LOG.info("Glue table {} data purged", identifier);
} catch (Exception e) {
LOG.warn(
"Failed to purge data for table: {}, continuing drop without purge", identifier, e);
Member

The table has already been dropped by the time we reach this line, so this change makes sense to me.

The Trino Iceberg connector also suppresses failures when it cannot delete data using the Glue catalog:

https://github.com/trinodb/trino/blob/5a116341b53f9f3a3b29b8b405773010e307e40b/plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/catalog/glue/TrinoGlueCatalog.java#L676-L696

try {
CatalogUtil.dropTableData(ops.io(), lastMetadata);
LOG.info("Glue table {} data purged", identifier);
} catch (Exception e) {
Contributor

Is it possible to catch a specific exception rather than catching all of `Exception`?
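
To make the trade-off concrete, here is a minimal sketch of the narrower variant. This thread does not say which exception type the purge throws for an S3 Table Bucket, so `PurgeNotPermittedException` is a hypothetical placeholder, as are `NarrowCatchPurge` and `tryPurge`.

```java
// Hypothetical exception type; the real type thrown by the purge is not
// identified in this discussion.
final class PurgeNotPermittedException extends RuntimeException {
  PurgeNotPermittedException(String message) {
    super(message);
  }
}

// Hypothetical helper showing a narrow catch instead of catch (Exception e).
final class NarrowCatchPurge {
  private NarrowCatchPurge() {}

  /**
   * Returns true if the purge ran to completion, false if it was rejected with
   * the one expected exception type. Any other exception propagates to the caller.
   */
  static boolean tryPurge(Runnable purgeData) {
    try {
      purgeData.run();
      return true;
    } catch (PurgeNotPermittedException e) {
      return false;
    }
  }
}
```

The narrow catch surfaces unexpected failures instead of hiding them behind a warning, but it also means a purge failure of any other type would still fail the drop after the table has already been removed from the catalog.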

Development

Successfully merging this pull request may close these issues.

AWS Glue catalog dropTable purge fails when target is an S3 Table bucket

3 participants