[BUG] Timestamp range pushdown emits unparseable date format when AND'd with IN/OR clause #5481

@penghuo

Description

Query Information

PPL Command/Query:

source=logs-* 
| where `@timestamp` > DATE_SUB(NOW(), INTERVAL 5 MINUTE)
| where severityText in ('ERROR', 'WARN') 
| head 10

Expected Result:
The query should return up to 10 documents from the last 5 minutes whose severityText is ERROR or WARN.

Actual Result:
HTTP 500. The shard fails to parse the date in the pushed-down range clause.

SearchPhaseExecutionException: Failed to execute phase [query], all shards failed
  shardFailures {[...][logs-000001][0]:
    OpenSearchParseException[failed to parse date field [2026-05-28 16:18:43]
      with format [strict_date_optional_time||epoch_millis]];
    IllegalArgumentException[failed to parse date field [2026-05-28 16:18:43]
      with format [strict_date_optional_time||epoch_millis]];
    DateTimeParseException[Failed to parse with all enclosed parsers]; }

The same query without the IN clause succeeds:

source=logs-* | where `@timestamp` > DATE_SUB(NOW(), INTERVAL 5 MINUTE) | head 10

Likewise, replacing IN (...) with a single comparison (severityText != 'INFO') succeeds. The bug only triggers when the timestamp range is AND'd with an IN or a chained OR of equalities.

Dataset Information

Dataset/Schema Type

  • OpenTelemetry (OTEL)

Index Mapping

{
  "index_patterns": ["logs-*"],
  "template": {
    "mappings": {
      "properties": {
        "@timestamp":     { "type": "date" },
        "body":           { "type": "text", "norms": false },
        "severityText":   { "type": "keyword" },
        "severityNumber": { "type": "integer" },
        "traceId":        { "type": "keyword" },
        "spanId":         { "type": "keyword" }
      }
    }
  }
}

The @timestamp field uses the default strict_date_optional_time||epoch_millis parser.
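
To illustrate why the shard rejects the pushed-down value, here is a minimal sketch that uses java.time's Instant.parse as a stand-in for an ISO-8601 parser. It is not OpenSearch's strict_date_optional_time implementation (which also accepts date-only and other partial forms); the class and method names are hypothetical. It only shows that the space-separated form from the error is not valid ISO-8601, while the "T...Z" form from the working DSL is.

```java
import java.time.Instant;
import java.time.format.DateTimeParseException;

public class DateParseDemo {
    /**
     * Returns true if the text is an ISO-8601 instant such as
     * 2026-05-28T16:18:43.000Z. Stand-in only: this is java.time's
     * parser, not the OpenSearch date parser.
     */
    public static boolean parsesAsIsoInstant(String text) {
        try {
            Instant.parse(text);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Value shipped by the failing query (space separator, no zone).
        System.out.println(parsesAsIsoInstant("2026-05-28 16:18:43"));
        // Value shipped by the working query.
        System.out.println(parsesAsIsoInstant("2026-05-28T16:18:43.000Z"));
    }
}
```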

Sample Data

{ "@timestamp": "2026-05-28T16:18:40.000Z", "severityText": "ERROR", "body": "Error finding unassigned IPs for ENI xyz" }
{ "@timestamp": "2026-05-28T16:18:41.000Z", "severityText": "WARN",  "body": "high latency observed during HTTP request" }
{ "@timestamp": "2026-05-28T16:18:42.000Z", "severityText": "INFO",  "body": "request completed HTTP/1.1 200 in 24ms" }

Bug Description

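Issue Summary:
When a timestamp range predicate is AND'd with an IN clause (or a chained OR of equalities), the pushed-down range query carries a non-ISO timestamp string and no format field, so all shards fail to parse it.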

DSL emitted for the failing case (note the timestamp value and missing format field):

{"bool":{"must":[
  {"range":{"@timestamp":{"from":"2026-05-28 16:18:43","include_lower":false,"include_upper":true,"boost":1.0}}},
  {"terms":{"severityText":["ERROR","WARN"]}}
]}}

DSL emitted for the working case (no IN):

{"range":{"@timestamp":{"from":"2026-05-28T16:18:43.000Z","include_lower":false,"include_upper":true,"format":"date_time","boost":1.0}}}

Root cause:
With Calcite pushdown enabled, severityText IN ('ERROR','WARN') is folded into a Sarg literal during RexSimplify. Because the merged AND is simplified as a whole, RexSimplify re-types sibling literals: the constant-folded result of DATE_SUB(NOW(), INTERVAL 5 MINUTE) loses its EXPR_TIMESTAMP UDT and is emitted as :VARCHAR (visible in PushDownContext as '2026-05-28 16:18:43':VARCHAR).

Downstream, PredicateAnalyzer.LiteralExpression.isDateTime() checks for ExprUDT.EXPR_TIMESTAMP on the literal, sees VARCHAR, and returns false. As a result, SimpleQueryExpression.gt() skips both addFormatIfNecessary (no .format("date_time")) and timestampValueForPushDown (no ISO-8601 normalization). The raw 2026-05-28 16:18:43 ships to the shard, where the default date parser rejects it.

A defensive fix is to key the format/encoding decision on the field's type (already known to be EXPR_TIMESTAMP) rather than the literal's surviving UDT.
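
The idea behind the defensive fix can be sketched roughly as follows. This is not the plugin's code: FieldType, rangeValueForPushDown and needsDateTimeFormat are hypothetical names invented for illustration, and the sketch assumes the constant-folded literal is a UTC value in the second-precision "yyyy-MM-dd HH:mm:ss" shape seen in the error. The point is only that the decision is driven by the field's type, so a literal that has been re-typed to VARCHAR is still normalized.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class FieldTypeKeyedPushdown {
    /** Hypothetical stand-in for the field type known from the index mapping. */
    public enum FieldType { TIMESTAMP, KEYWORD }

    // Shape of the constant-folded literal in the issue: 2026-05-28 16:18:43
    private static final DateTimeFormatter FOLDED =
        DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss");
    // Shape seen in the working DSL: 2026-05-28T16:18:43.000Z
    private static final DateTimeFormatter ISO_MILLIS =
        DateTimeFormatter.ofPattern("uuuu-MM-dd'T'HH:mm:ss.SSS'Z'");

    /** Decide on the FIELD type, not on whatever type the literal still carries. */
    public static boolean needsDateTimeFormat(FieldType fieldType) {
        return fieldType == FieldType.TIMESTAMP;
    }

    /**
     * Normalizes the literal for a timestamp field; other fields pass through.
     * Assumption: the literal is already in UTC, so a literal 'Z' is appended.
     */
    public static String rangeValueForPushDown(FieldType fieldType, String literal) {
        if (!needsDateTimeFormat(fieldType)) {
            return literal;
        }
        LocalDateTime utc = LocalDateTime.parse(literal, FOLDED);
        return utc.format(ISO_MILLIS);
    }

    public static void main(String[] args) {
        System.out.println(rangeValueForPushDown(FieldType.TIMESTAMP, "2026-05-28 16:18:43"));
        System.out.println(rangeValueForPushDown(FieldType.KEYWORD, "ERROR"));
    }
}
```

A real fix would also need to handle literals with fractional seconds or other shapes, which this sketch does not attempt.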

Steps to Reproduce:

  1. Apply the index template above; index a few sample documents into logs-pr172502-000001.
  2. Set plugins.calcite.enabled: true and plugins.calcite.pushdown.enabled: true.
  3. Run the failing PPL query above.
  4. Observe HTTP 500 with SearchPhaseExecutionException / DateTimeParseException.
  5. Run the same query with the IN clause removed → succeeds. Run it with the IN clause replaced by severityText != 'INFO' → succeeds.
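
For anyone scripting step 3, a minimal sketch that builds the PPL request against the plugin's _plugins/_ppl endpoint is shown below. The base URL http://localhost:9200, the class name, and the helper names are assumptions for a local unsecured node; a cluster with the security plugin enabled would additionally need HTTPS and credentials. The request is only sent when an argument is passed, so the program can be run without a cluster.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class PplRepro {
    // The failing query from this report, joined onto one line.
    public static final String FAILING_QUERY =
        "source=logs-* | where `@timestamp` > DATE_SUB(NOW(), INTERVAL 5 MINUTE)"
            + " | where severityText in ('ERROR', 'WARN') | head 10";

    /** Wraps a PPL string in the {"query": "..."} body, escaping \ and ". */
    public static String toJsonBody(String ppl) {
        String escaped = ppl.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{\"query\":\"" + escaped + "\"}";
    }

    /** Builds (but does not send) the POST request for the PPL endpoint. */
    public static HttpRequest buildRequest(String baseUrl, String ppl) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/_plugins/_ppl"))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(toJsonBody(ppl)))
            .build();
    }

    public static void main(String[] args) throws Exception {
        // Assumed local node; adjust the base URL for your environment.
        HttpRequest request = buildRequest("http://localhost:9200", FAILING_QUERY);
        System.out.println(request.method() + " " + request.uri());
        System.out.println(toJsonBody(FAILING_QUERY));
        if (args.length > 0) { // pass any argument to actually send the request
            HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.statusCode());
            System.out.println(response.body());
        }
    }
}
```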

Environment Information

OpenSearch Version: 3.7.0-SNAPSHOT (reproduced locally on main, commit ec433f4ff).

Labels: PPL, bug, untriaged
