Core: Fix JdbcCatalog to prevent dropping parent namespaces with children#16061
Core: Fix JdbcCatalog to prevent dropping parent namespaces with children#16061sagib1 wants to merge 3 commits intoapache:mainfrom
Conversation
| .isInstanceOf(NamespaceNotEmptyException.class) | ||
| .hasMessageStartingWith("Namespace parent_ns is not empty. Contains 1 child namespace(s)."); | ||
|
|
||
| // 3. Verify the namespaces still safely exist in the catalog |
There was a problem hiding this comment.
nit: I don't think we need to add numeric prefixes. We'd have to renumber them if we ever insert a new operation in the middle of the code. Comments might not even be necessary, since the current test code is already simple enough.
| List<Namespace> childNamespaces = listNamespaces(namespace); | ||
| if (childNamespaces != null && !childNamespaces.isEmpty()) { | ||
| throw new NamespaceNotEmptyException( | ||
| "Namespace %s is not empty. Contains %d child namespace(s).", | ||
| namespace, childNamespaces.size()); | ||
| } |
There was a problem hiding this comment.
Have you confirmed whether other catalogs that support nested namespaces are experiencing the same issue?
There was a problem hiding this comment.
Yes, I reviewed the other SupportsNamespaces implementations. This issue is unique to JdbcCatalog because it must manually emulate hierarchical path validation over a flat SQL table.
Other catalogs avoid this because they defer the drop operation to backends that enforce these constraints natively, or they structurally cannot support nested namespaces:
HadoopCatalog: In its dropNamespace implementation, it calls fs.delete(dir, false). By passing false for the recursive flag, it relies on the native Hadoop FileSystem to throw an IOException if any child directories (which represent sub-namespaces or tables) exist.
HiveCatalog: Its dropNamespace method delegates to HMS via client.dropDatabase(name, false, false, false) (where the final flag is cascade=false). The Hive Metastore natively throws an exception if the database is not empty. Furthermore, HMS structurally does not support nested namespaces.
GlueCatalog: AWS Glue only supports top-level databases. Iceberg's IcebergToGlueConverter.toDatabaseName enforces this by throwing a ValidationException if the namespace has more than 1 level, making this specific nested-drop edge case impossible.
NessieCatalog: Defers entirely to the Nessie API via api.deleteNamespace(). the logic simply sends a delete() request to the remote Nessie Server. Because Nessie is essentially "Git for Data," the Nessie server itself maintains the entire hierarchical tree of commits and references. When the server receives the delete request, its internal engine traverses the commit tree. If it finds any active keys (whether they are tables, views, or child namespaces) under that path, the server rejects the request and throws a NessieNamespaceNotEmptyException back over the wire.
| } | ||
|
|
||
| @Test | ||
| public void testDropNamespaceWithChildNamespace() { |
There was a problem hiding this comment.
this test should go into CatalogTests next to testDropNonEmptyNamespace. You'll have to add an assumption so that it is only executes for catalogs where supportsNestedNamespaces() returns true. Also instead of childNamespaces let's rename the method to say nestedNamespaces
Summary
Solution for #16060
This PR fixes a bug in JdbcCatalog where a parent namespace could be dropped even if it contained child namespaces, provided the parent had no direct tables or views. This behavior violates the Iceberg REST OpenAPI specification, which requires that a namespace must be empty of both tables and child namespaces before deletion.
Without this fix, dropping a parent namespace results in "orphaned" metadata: the child namespaces and their associated tables remain in the database but become unreachable through standard catalog traversal, leading to metadata inconsistency.
Changes
Modified
JdbcCatalog#dropNamespaceto include a validation check usinglistNamespaces(namespace).Ensured
NamespaceNotEmptyExceptionis thrown if the list of child namespaces is not empty.Verifying this change
This change is a code fix and includes a new unit test in
TestJdbcCatalog.Added Tests
TestJdbcCatalog#testDropNamespaceWithChildNamespace: Verifies thatdropNamespacethrowsNamespaceNotEmptyExceptionwhen a child namespace exists and that the catalog state remains unchanged.Manual Verification
Reproduced the issue using a local REST Catalog fixture and verified that the fix correctly blocks the operation:
Documentation
Does this pull request introduce a new feature? No
Does this pull request introduce a breaking change? No (It enforces an existing specification contract that was being bypassed).
Is any documentation update required? No