CI: Add runtime-deps.txt files for all runtimes and bundles #16081

Open
kevinjqliu wants to merge 1 commit into apache:main from kevinjqliu:kevinjqliu/add-runtime-deps

@kevinjqliu
Contributor

@kevinjqliu kevinjqliu commented Apr 22, 2026

Context: #16080

Add the runtime-deps.txt files for all modules so that CI starts enforcing dependency changes.

I checked out main, ran git pull, and then ran:

./gradlew generateRuntimeDeps -DallModules --rerun-tasks

Committed the resulting files in this PR.

Note that spark/v4.1/spark-runtime/runtime-deps.txt was added in #16080
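
For readers unfamiliar with this kind of check, here is a minimal sketch of the comparison it could perform between a committed runtime-deps.txt and a freshly generated one. This is not the actual CI implementation: `parse_deps` and `diff_deps` are hypothetical names, and the sketch assumes one `group:artifact:version` coordinate per line.

```python
def parse_deps(text):
    """Return the set of non-blank lines, one dependency coordinate per line (assumed format)."""
    return {line.strip() for line in text.splitlines() if line.strip()}


def diff_deps(committed_text, generated_text):
    """Return (added, removed) coordinates as sorted lists.

    added:   present in the generated list but not in the committed file
    removed: present in the committed file but not in the generated list
    """
    committed = parse_deps(committed_text)
    generated = parse_deps(generated_text)
    return sorted(generated - committed), sorted(committed - generated)
```

A version bump shows up as one removed and one added coordinate, which is the kind of change a reviewer would then check against LICENSE and NOTICE.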

@manuzhang
Member

no open-api module?

@rdblue
Contributor

rdblue commented Apr 23, 2026

Should we be publishing the iceberg-open-api module? It is for testing so it would make sense to me if we didn't publish one.

I also didn't know about that runtime module and we will need to evaluate its LICENSE and NOTICE files before including it in any more releases.

@rdblue rdblue changed the title from "ci: add all runtime-deps.txt files" to "CI: Add runtime-deps.txt files for all runtimes and bundles" on Apr 23, 2026
org.apache.httpcomponents:httpcore:4.4.16
org.apache.logging.log4j:log4j-api:2.20.0
org.apache.logging.log4j:log4j-core:2.20.0
org.apache.logging.log4j:log4j-slf4j-impl:2.20.0
Contributor

Log4J is not included in the LICENSE file.

software.amazon.awssdk:utils:2.42.33
software.amazon.eventstream:eventstream:1.0.1
software.amazon.s3.accessgrants:aws-s3-accessgrants-java-plugin:2.4.1
software.amazon.s3.analyticsaccelerator:analyticsaccelerator-s3:1.3.1
Contributor

I think that these are fine. I didn't check every one against the latest update (fb2c8ac3faf) because they are now grouped into high level SDK modules, but they are all ALv2 and should be okay.

@@ -0,0 +1,70 @@
com.github.ben-manes.caffeine:caffeine:2.9.3
Contributor

Note to other reviewers: The aws-bundle/LICENSE file and azure-bundle/LICENSE file both include "JCTools (via Netty)". This is correct: Netty shades org.jctools under io/netty/util/shaded, which is then shaded in org/apache/iceberg/aws/shaded.
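
One way to confirm this kind of nested shading is to list the entries of the bundle jar under a given path prefix. The sketch below uses only the standard-library `zipfile` module; `entries_under` is a hypothetical helper, and the jar path and prefix are supplied by the caller.

```python
import zipfile


def entries_under(jar_path, prefix):
    """List the entries of a jar (a zip archive) whose path starts with prefix.

    jar_path may be a filesystem path or an open binary file object.
    """
    with zipfile.ZipFile(jar_path) as jar:
        return sorted(name for name in jar.namelist() if name.startswith(prefix))
```

For example, calling it with the path to a locally built bundle jar and the prefix `org/apache/iceberg/aws/shaded/` would list everything relocated there. The exact relocated location of the JCTools classes is not spelled out in this comment, so the prefix is left as a parameter.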

org.apache.avro:avro:1.12.1
org.apache.datasketches:datasketches-java:6.2.0
org.apache.datasketches:datasketches-memory:3.0.2
org.apache.flink:flink-metrics-dropwizard:2.1.0
Contributor

This is not listed in the LICENSE file.

I also wonder if this should be excluded and not added to LICENSE because it seems like something that should be included in the Flink runtime. I suspect that we need to add it as a compileOnly dependency or suppress it in the runtime config.
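
A crude way to surface candidates like this one is to look for coordinates whose group and artifact ID never appear in the LICENSE text. The sketch below is a heuristic only, with a hypothetical function name: LICENSE files usually list human-readable project names rather than Maven coordinates, so false positives are expected and every hit still needs manual review.

```python
def unmentioned_artifacts(deps_text, license_text):
    """Return coordinates whose group ID and artifact ID are both absent from the LICENSE text.

    Matching is a case-insensitive substring search, which is an assumption
    of this sketch, not how LICENSE files are actually validated.
    """
    haystack = license_text.lower()
    missing = []
    for line in deps_text.splitlines():
        coord = line.strip()
        parts = coord.split(":")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        group, artifact = parts[0].lower(), parts[1].lower()
        if group not in haystack and artifact not in haystack:
            missing.append(coord)
    return missing
```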

Contributor

Opened a PR for this fix: #16093

@@ -0,0 +1,33 @@
com.fasterxml.jackson.core:jackson-annotations:2.21
Contributor

I'm not reviewing older versions of Flink yet. I think we should determine what needs to change for the current version and then verify the same changes on the older ones.

com.github.ben-manes.caffeine:caffeine:2.9.3
com.github.luben:zstd-jni:1.5.7-3
com.google.errorprone:error_prone_annotations:2.10.0
dev.failsafe:failsafe:3.3.2
Contributor

This is leaked by iceberg-aws and should not be bundled.

Contributor

@rdblue rdblue Apr 23, 2026

This is used directly by S3InputStream, which means it needs to be included when iceberg-aws is included because it is not provided by the AWS dependencies. I don't think this is a good reason to keep using it; we should replace it with Tasks, unless it is doing something special.

Since this is in the license docs, I think this isn't a blocker for 1.11.0 or 1.10.2, but we should remove it to keep dependencies to a minimum.

@@ -0,0 +1,33 @@
com.fasterxml.jackson.core:jackson-annotations:2.21
Contributor

@rdblue rdblue Apr 23, 2026

The LICENSE file states that this contains many missing libraries:

The reason for most of the extras is that there were no LICENSE updates after a few recent PRs:

Also, LICENSE contains Google Guava, which is present because this shades iceberg-bundled-guava. But shading in that module means we don't have it listed here (FYI).

Action items:

  • Find out why some libraries were there but are no longer:
    • Arrow, Netty, Apache Commons, OpenTelemetry, javax.annotations
  • Fix the Hive entry in LICENSE. Before #15449 ("chore: several fixes on the LICENSE/NOTICE"), it was clear that this was shaded by ORC. Now the only Hive reference I see is in META-INF files, so I think this is probably incorrect.
  • Remove all of the fixed dependency leaks from LICENSE and NOTICE

Contributor

Apache Commons is from this commit: 760a20b

#2102 copied array methods into ArrayUtil. This isn't a big problem, but it doesn't seem worth the hassle of tracking it down in LICENSE to have array copy methods. The implementations don't match project style or provide value. A good first issue is to remove them.

com.google.cloud:google-cloud-kms:2.91.0
com.google.cloud:google-cloud-monitoring:3.89.0
com.google.cloud:google-cloud-storage:2.64.1
com.google.code.findbugs:jsr305:3.0.2
Contributor

Findbugs is excluded throughout the codebase because it was originally LGPL and cannot be bundled. The license issues weren't clarified, and a clean implementation was created: https://github.com/stephenc/findbugs-annotations

Although the Maven metadata reports ALv2, we need to exclude it. If we need the annotations (which are not required to function), then we should use the stephenc version.
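
To keep an excluded library like this from slipping back in, a runtime-deps.txt file could be scanned against a denylist of group IDs. The following is a minimal sketch with hypothetical names; the denylist contains only the FindBugs group from this comment as an illustration and is not an existing project policy file.

```python
# Assumption: illustrative denylist, seeded with the group ID discussed above.
BANNED_GROUPS = frozenset({"com.google.code.findbugs"})


def banned_coordinates(deps_text, banned_groups=BANNED_GROUPS):
    """Return the coordinates in deps_text whose group ID is in banned_groups."""
    hits = []
    for line in deps_text.splitlines():
        coord = line.strip()
        # The group ID is everything before the first colon.
        if coord and coord.split(":", 1)[0] in banned_groups:
            hits.append(coord)
    return hits
```

Run against the four lines quoted above, this would flag only `com.google.code.findbugs:jsr305:3.0.2`.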

org.jspecify:jspecify:1.0.0
org.slf4j:slf4j-api:2.0.17
org.threeten:threeten-extra:1.8.0
org.threeten:threetenbp:1.7.0
Contributor

The rest of these appear to be real dependencies from GCP and correctly included in the LICENSE file.

@manuzhang
Member

@rdblue This is the PR to fix open-api LICENSE and NOTICE issues. Please take a look and we can discuss whether the module needs to be published there.
