Triage summary for 2026-05-04
Triaged 29 open issues that carried requires-triage (31 total in the queue; 2 skipped, see below). Labels have already been applied; please spot-check below and close this issue when satisfied. Any individual relabel can be done directly on the affected issue.
Triage criteria come from docs/source/contributor-guide/bug_triage.md. Most issues in this batch trace back to the Spark 4.1.1 enablement work (#4098), so a number of correctness regressions are flagged as priority:critical even though the affected version is not yet a supported target.
Counts by priority applied
priority:critical: 5
priority:high: 0
priority:medium: 14
priority:low: 10
Triaged
priority:critical
- Spark 4.1: bloom filter result mismatch (might_contain returns wrong answers) (#4193)
- Area labels: area:expressions, spark 4
- Rationale: silent wrong might_contain results for the same input; correctness over crashes per the guide.
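For context on the severity: a Bloom filter may report false positives but must never report a false negative for an inserted value. A minimal sketch (illustrative only, not Comet's implementation) of the contract might_contain has to uphold:

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: false positives allowed, false negatives never."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bitmask

    def _positions(self, item: str):
        # Derive num_hashes independent bit positions from the item.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def put(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item: str) -> bool:
        # Must return True for every item previously put(); returning
        # False for an inserted value is the silent-wrong-result class
        # of bug described in #4193.
        return all(self.bits & (1 << pos) for pos in self._positions(item))
```

Because every bit set by put() stays set, might_contain is True for all inserted items by construction; a mismatch between the hashing on the write and read paths is the kind of defect that breaks this invariant.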
- Spark 4.1: native parquet reader returns wrong rows for user-defined struct schema (#4192)
- Area labels: area:scan, spark 4, native_datafusion, native_iceberg_compat
- Rationale: native reader returns different rows than Spark for c0: struct<y:int,x:string>; silent wrong results.
- Native scan path doesn't honour Parquet field-ID matching when spark.sql.parquet.fieldId.read.enabled=true (#4189)
- Area labels: area:scan, native_datafusion
- Rationale: by-name resolution silently returns wrong data when names and IDs disagree (Delta column-mapping id mode).
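To illustrate the failure mode: under Delta column mapping's id mode, the physical Parquet column names and the logical names deliberately disagree, and only the field IDs are authoritative. A toy sketch with hypothetical column metadata (not Comet's data structures):

```python
# Physical Parquet columns whose names and field IDs disagree, as under
# Delta column mapping "id" mode. The column physically named "b"
# (field_id=1) is the logical column "a", and vice versa.
parquet_columns = [
    {"name": "b", "field_id": 1, "values": [10, 20]},    # logical "a"
    {"name": "a", "field_id": 2, "values": ["x", "y"]},  # logical "b"
]


def resolve(columns, *, name=None, field_id=None):
    """Return a column's values, matching by field ID when one is given,
    otherwise falling back to by-name matching."""
    for col in columns:
        if field_id is not None and col["field_id"] == field_id:
            return col["values"]
        if field_id is None and col["name"] == name:
            return col["values"]
    return None
```

Requesting logical column "a" (field_id=1) by ID returns [10, 20]; resolving the same logical column by its name instead returns ["x", "y"] — a different column's data, with no error raised. That is the silent wrong result the gate is meant to prevent.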
- SPARK-53968 SQLViewSuite: decimal arithmetic returns ~10x smaller values through view CTE on Spark 4.1.1 (#4124)
- Area labels: area:expressions, spark 4
- Rationale: unit_price + COALESCE(shipping_price, 0) returns 10x-too-small decimals; silent wrong results in arithmetic.
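An exactly-10x error is characteristic of an off-by-one in the decimal scale, since a DECIMAL value is stored as an unscaled integer plus a scale. This is only a plausible mechanism consistent with the symptom, not a confirmed diagnosis:

```python
from decimal import Decimal


def decimal_from_unscaled(unscaled: int, scale: int) -> Decimal:
    # A fixed-point decimal is value = unscaled * 10**(-scale).
    return Decimal(unscaled).scaleb(-scale)


# Correct: unscaled 2499 with scale 2 reads as 24.99.
assert decimal_from_unscaled(2499, 2) == Decimal("24.99")

# A scale off by one reads the same unscaled integer as a value exactly
# 10x too small, matching the ~10x-smaller results reported in #4124.
assert decimal_from_unscaled(2499, 3) == Decimal("2.499")
```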
- EXCEPT ALL / INTERSECT ALL with GROUP BY return incorrect results on Spark 4.1.1 (#4122)
- Area labels: area:aggregation, spark 4
- Rationale: extra/missing rows for both EXCEPT ALL and INTERSECT ALL queries; silent wrong results.
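For reference, the multiset semantics these two operators must implement can be sketched with Python's Counter: EXCEPT ALL keeps max(m - n, 0) copies of each row and INTERSECT ALL keeps min(m, n), where m and n are the row's multiplicities on the left and right side.

```python
from collections import Counter


def except_all(left, right):
    # EXCEPT ALL: max(m - n, 0) copies of each row.
    # Counter subtraction already clamps counts at zero.
    return list((Counter(left) - Counter(right)).elements())


def intersect_all(left, right):
    # INTERSECT ALL: min(m, n) copies of each row.
    return list((Counter(left) & Counter(right)).elements())
```

Extra or missing rows in either operator, as reported in #4122, mean these multiplicity rules are being violated somewhere in the aggregation path.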
priority:medium
- Spark 4.1 NullType parquet: parquet-rs rejects BOOLEAN + Unknown logical type (#4199)
- Area labels: area:scan, spark 4
- Rationale: hard read error on NullType parquet columns; functional gap with workaround (test currently gated).
- Audit Spark SQL configs that affect query semantics and ensure Comet honors them (or falls back) (#4180)
- Area labels: area:expressions, area:scan
- Rationale: broad correctness audit (rebase modes, time parser policy, etc.); high-leverage but no specific incident.
- Add support for scalar UDFs that operate on Arrow data (#4177)
- Area labels: area:expressions
- Rationale: feature gap; users have row-based UDFs as a workaround.
- Make JVM-scalar-UDF dispatch responsive to task cancellation (#4175)
- Area labels: area:expressions
- Rationale: cancellation responsiveness gap; functional bug with a behavioural workaround (tasks still complete).
- Register CometArrowAllocator as a Spark MemoryConsumer for JVM-UDF dispatch (#4174)
- Area labels: area:expressions, area:ffi
- Rationale: off-heap memory invisible to Spark accounting; production-safety concern, no immediate breakage.
- Tighten CometUDF API with input/return type validation at registration (#4173)
- Area labels: area:expressions
- Rationale: hard JVM crash on type mismatch; medium because surface is currently dev-only and crash is loud.
- JNI local references accumulate across executor JVM lifetime in native call sites (#4172)
- Area labels: area:ffi
- Rationale: slow leak that may eventually OOM under heavy load; functional bug with no immediate user impact.
- Track parse_url Spark compatibility work (#4156)
- Area labels: area:expressions
- Rationale: incompatibility tracker; expression marked Incompatible and falls back by default, so workaround exists.
- url_decode: try_url_decode (Spark 4.0) errors instead of returning NULL on malformed input (#4155)
- Area labels: area:expressions, spark 4
- Rationale: visible error divergence (not silent wrong results); functional bug with a known fix path.
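The expected divergence can be sketched as follows; the function names mirror the Spark builtins, but the validation logic below is illustrative, not Spark's actual implementation:

```python
import re
from urllib.parse import unquote

# A percent sign not followed by exactly two hex digits is malformed.
_INVALID_ESCAPE = re.compile(r"%(?![0-9A-Fa-f]{2})")


def url_decode(s: str) -> str:
    # Plain url_decode: error out on malformed percent-escapes.
    if _INVALID_ESCAPE.search(s):
        raise ValueError(f"invalid URL escape in {s!r}")
    return unquote(s)


def try_url_decode(s: str):
    # try_ variant (Spark 4.0): swallow the error and return NULL (None)
    # instead of failing the query. #4155 reports the native path
    # erroring here where Spark returns NULL.
    try:
        return url_decode(s)
    except ValueError:
        return None
```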
- Support expressions already implemented in datafusion-spark crate (#4150)
- Area labels: area:expressions
- Rationale: 29 expressions are already native but not wired up; functional gap with Spark fallback as workaround.
- Track Spark 4.2 test failures (#4142)
- Area labels: area:expressions, spark 4
- Rationale: tracking issue for several 4.2 SQL/test failures (ANSI overflow text, INT96 path, Jetty class verifier); functional gaps for the upcoming version.
- Comet native scan returns wrong schema for missing struct fields in Parquet (#4136)
- Area labels: area:scan, spark 4, native_datafusion
- Rationale: schema divergence (full struct + nulls vs struct<>) on new SPARK-53535 / SPARK-54220 4.1 tests; functional gap with workaround.
- Comet native sort lacks row-format support for Struct(Map(...)) sort keys (#4123)
- Area labels: area:expressions, spark 4
- Rationale: native sort errors on Struct(Map) sort keys; should fall back. Workaround: file disabled in 4.1.1 diff.
- Comet native scan rejects invalid UTF-8 byte sequences in STRING column (hll.sql on Spark 4.1) (#4121)
- Area labels: area:scan, spark 4
- Rationale: visible error on invalid-UTF-8 STRING columns; functional gap with workaround (file disabled).
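A minimal illustration of the input class involved: byte sequences that a strict UTF-8 validator rejects, even though Spark STRING columns can carry them through Parquet. The helper below is a sketch, not the native scan's actual check:

```python
def is_valid_utf8(data: bytes) -> bool:
    # Strict UTF-8 validation of the kind a Rust-side String conversion
    # would presumably enforce.
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# 0xFF and 0xFE can never appear anywhere in well-formed UTF-8, yet such
# bytes can legally sit in a Parquet BYTE_ARRAY read as a Spark STRING.
raw = b"ab\xff\xfecd"
```

Spark tolerates these values (it treats strings as binary-safe in many paths), so a reader that hard-errors instead of passing the bytes through diverges visibly, which is what #4121 reports for hll.sql.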
priority:low
- macOS aarch64 flake: SIGBUS in _pthread_tsd_cleanup after ParquetReadFromFakeHadoopFsSuite (#4200)
- Area labels: area:ci, area:scan
- Rationale: macOS-only CI flake after a single test, suspected stale TSD destructor in hdfs-opendal; matches the guide's "CI flakes" tier.
- Spark 4.1: native_datafusion bytesRead task metric off by 6-14x vs Spark (#4194)
- Area labels: area:scan, spark 4
- Rationale: metric reporting divergence in three CometTaskMetricsSuite tests; no functional or correctness impact.
- Set up process for auditing all Spark commits to assess impact on Comet (#4188)
- Area labels: none
- Rationale: process / tooling enhancement; fits the guide's "tooling" tier.
- Add metrics and logging to JVM-scalar-UDF dispatch path (#4176)
- Area labels: area:expressions
- Rationale: observability enhancement on a prototype branch; no user-visible bug.
- Pending PRs badge showing failed PRs (#4160)
- Area labels: area:ci
- Rationale: dashboard / cosmetic; no user-visible runtime impact.
- Improve serde framework handling of StaticInvoke (#4151)
- Area labels: area:expressions
- Rationale: internal refactor; no user-visible bug.
- AQE DPP SAB wrapping skipped when V2 scan is wrapped in CometSparkToColumnarExec (#4145)
- Area labels: area:scan
- Rationale: real bug in the rule, but no current user path hits it (V2 Parquet AQE DPP not yet supported in Spark).
- Upgrade to Spark 4.2.0-preview5 (#4143)
- Area labels: spark 4
- Rationale: routine version bump; tracking/process work.
- CachedBatchSerializerNoUnwrapSuite: Comet replaces WholeStageCodegenExec (#4137)
- Area labels: spark sql tests
- Rationale: 4.1.1 SQL test failure caused by Comet plan replacement; test-only.
- Add support for Spark 4.2.0-preview4 (#4113)
- Area labels: spark 4
Escalations to consider
- All five priority:critical issues surface only on Spark 4.1.1, which is not yet a supported target (Tracking: remaining Spark 4.1 CI failures #4098 is the umbrella). Once 4.1 enablement lands, these need to be fixed before the matrix flips green.
- The field-ID matching gap (#4189) is also relevant to the delta-kernel-phase-1 branch (PR feat: Native Delta Lake scan via delta-kernel-rs #3932), where nativeDeltaScan is being added. The existing nativeDataFusionScan gate already declines field-ID requests, but Delta strips field IDs from requiredSchema, so the gate misses and produces silent wrong results when the user opts into the experimental Delta path. Worth flagging to the delta-kernel work.
- The JNI local-reference accumulation issue (#4172) is kept at priority:medium. If a long-running production executor is observed hitting JNI OOM tied to local-ref accumulation, escalate to priority:high.
Skipped — needs more info
- Iceberg reflection failure (#4125)
- The report has only the IcebergReflection ERROR line and a note that "any query" hits it on Comet 0.15 + Iceberg + Spark 3.5.6. Steps to reproduce, expected behavior, AWS Glue / Iceberg versions, and whether the query actually fails or only logs are all missing. requires-triage left in place.
- Bug triage results: 2026-04-27 (#4110)
- This is the previous triage summary issue that was auto-tagged with requires-triage when it was opened. It is not a bug — the human reviewer should close it when satisfied with that batch (or remove requires-triage directly). Not classifying it.
IcebergReflectionERROR line and a note that "any query" hits it on Comet 0.15 + Iceberg + Spark 3.5.6. Steps to reproduce, expected behavior, AWS Glue / Iceberg versions, and whether the query actually fails or only logs are all missing.requires-triageleft in place.requires-triagewhen it was opened. It is not a bug — the human reviewer should close it when satisfied with that batch (or removerequires-triagedirectly). Not classifying it.