Meta: GFQL Testing & Verification Roadmap

# GFQL Testing & Verification Roadmap

This meta-issue tracks follow-up work to achieve production-grade and publication-quality testing/verification for GFQL.

## Context

PRs #846 (cudf same-path executor) and #852 (Alloy verification) add significant GFQL functionality. This roadmap extends testing to meet:
- **Production standards**: Conformance suites, metamorphic testing, fuzz testing
- **Publication standards**: Formal verification coverage for SIGMOD/VLDB-tier claims

---

## Bugs Found in PR #846 (Lessons for Verification)

During development of the native vectorized executor in PR #846, several algorithmic bugs were discovered that formal verification could have caught earlier. These inform what the Alloy model should cover.

### Bug 1: Backward Traversal Join Direction (commit `71cda418`)

**Location**: `_find_multihop_start_nodes` in `df_executor.py:754`

**Bug**: When doing backward BFS to find which start nodes can reach end nodes, the join was on the wrong column:
```python
# WRONG: joined on __to__ (predecessor) instead of __from__ (target)
new_frontier = edge_pairs.merge(frontier, left_on='__to__', ...)

# CORRECT: join on __from__ to find predecessors
new_frontier = edge_pairs.merge(frontier, left_on='__from__', ...)
```

**Symptom**: Multi-hop + WHERE queries returned wrong nodes or empty results.

**Alloy gap**: The model's `AlgoBackward` doesn't explicitly verify join direction semantics.

**Recommendation**: Add assertion that backward reachability computes `{ n | ∃ path n →* target }` not `{ n | ∃ path target →* n }`.

### Bug 2: Empty Set Short-Circuit Missing (commit `71cda418`)

**Location**: `_materialize_filtered` in `df_executor.py:1520`

**Bug**: If any step in the chain had an empty allowed node set (e.g., WHERE filtered out all nodes), the executor didn't short-circuit to empty result.

```python
# FIX: Early return when any node step has empty set
if path_state.allowed_nodes:
    for node_set in path_state.allowed_nodes.values():
        if node_set is not None and len(node_set) == 0:
            return empty_result
```

**Symptom**: 4-step chains with impossible WHERE returned nodes instead of empty.

**Alloy gap**: Model implicitly handles via `no` quantifier but doesn't assert materialization behavior.

**Recommendation**: Add assertion: `(some i: NodeStep | AlgoAllowed[c, i] = none) implies AlgoOutN[c] = none`

### Bug 3: Wrong Node Source for Non-Adjacent WHERE (commit `71cda418`)

**Location**: `_apply_non_adjacent_where_post_prune` in `df_executor.py:410`

**Bug**: Used `alias_frames` (populated during forward pass) instead of original graph nodes for value lookups. Forward pass may not populate all aliases.

```python
# WRONG: alias_frames can be incomplete
left_values = alias_frames.get(left_alias)

# CORRECT: Use original graph's node DataFrame
nodes_df = self.inputs.graph._nodes
```

**Symptom**: Non-adjacent WHERE comparisons failed silently or used stale values.

**Alloy gap**: `lowerWhere` doesn't model where attribute values come from.

**Recommendation**: Clarify that WHERE value lookups use the full graph's node attributes, filtered to allowed nodes.

### Bug 4: Multi-hop Path Tracing Through Intermediates (commit `cd579363`)

**Location**: `_filter_multihop_edges_by_endpoints` in `df_executor.py:700`

**Bug**: Backward prune filtered edges by `dst in allowed_end_nodes`, but for multi-hop, START connects to INTERMEDIATE nodes, not END directly.

**Symptom**: Start nodes showed as empty for multi-hop chains.

**Alloy gap**: Hop ranges not modeled at all.

**Recommendation**: **High priority** - Add hop range modeling to Alloy (see Sub-Issue #2).

### Bug 5: Reverse/Undirected Edge Direction Handling (commit `cd579363`)

**Location**: Multiple functions in backward prune phase

**Bug**: `is_reverse` and `is_undirected` flags weren't correctly propagating through path tracing. For reverse edges, src/dst semantics are swapped.

**Symptom**: Reverse edge chains returned wrong or empty results.

**Alloy gap**: Model assumes forward edges only (implicit `e.src → e.dst`).

**Recommendation**: Add `EdgeStep.direction` attribute (forward/reverse/undirected) and verify directional semantics.

### Tests Added

10 new tests in `TestImpossibleConstraints` class (`test_df_executor_inputs.py`):
- `test_contradictory_lt_gt_same_column`
- `test_contradictory_eq_neq_same_column`
- `test_contradictory_lte_gt_same_column`
- `test_no_paths_satisfy_predicate`
- `test_multihop_no_valid_endpoints`
- `test_contradictory_on_different_columns`
- `test_chain_with_impossible_intermediate`
- `test_non_adjacent_impossible_constraint`
- `test_empty_graph_with_constraints`
- `test_no_edges_with_constraints`

---

## Sub-Issues

### 1. Metamorphic Testing for GFQL
**Priority: P2**

Add query equivalence testing under graph/query transformations:
- Graph isomorphism: `query(G) ≡ query(G')` where `G'` is isomorphic to `G`
- Query partitioning: `WHERE x > 5` ≡ `(WHERE x > 5 AND x > 3)`
- Subgraph preservation: Results on subgraph ⊆ results on full graph

Reference: [Gamera (VLDB 2024)](https://dl.acm.org/doi/10.14778/3636218.3636236) - Graph-aware metamorphic relations

### 2. Extend Alloy Model to Cover Hop Ranges
**Priority: P1** ⬆️ (upgraded based on bugs found)

Current Alloy model (`alloy/gfql_fbf_where.als`) does NOT formally verify:
- `min_hops` / `max_hops` traversal bounds
- `output_min_hops` / `output_max_hops` slicing
- Hop labeling (`label_node_hops`, `label_edge_hops`, `label_seeds`)
- **NEW**: Backward traversal join direction for multi-hop
- **NEW**: Intermediate node reachability (start → intermediate → end)

Options:
- **A) Explicit modeling**: Add hop counter to path semantics, verify bounds
- **B) Unrolling approach**: Generate fixed-length chain variants, verify each
- **C) Abstraction**: Prove hop filtering is a sound post-filter on verified core

**Rationale for P1**: Bugs #1 and #4 above were in multi-hop logic that formal verification should catch.

### 3. Add Backward Reachability Assertions to Alloy
**Priority: P1** (new)

Verify that `AlgoBackward` correctly computes predecessors:
```alloy
assert BackwardReachabilityCorrect {
  all c: Chain, n: Node, target: Node |
    n in AlgoBackward[c, target] iff
    (some path: seq Edge | pathConnects[path, n, target])
}
```

This would have caught Bug #1 (join direction).

### 4. Add Empty Set Propagation Assertion
**Priority: P2** (new)

Verify that if any step has an empty allowed set, the output is empty:
```alloy
assert EmptyPropagation {
  all c: Chain |
    (some i: seq/inds[c.seqSteps] | AlgoAllowed[c, i] = none)
    implies (AlgoOutN[c] = none and AlgoOutE[c] = none)
}
```

This would have caught Bug #2.

### 5. Add Contradictory WHERE Scenarios
**Priority: P2** (new)

Verify that contradictory WHERE clauses produce empty results:
```alloy
pred ContradictoryWhere[c: Chain] {
  some w1, w2: c.wheres |
    w1.lhs.a = w2.lhs.a and w1.rhs.a = w2.rhs.a and
    ((w1.op = Lt and w2.op = Gt) or (w1.op = Eq and w2.op = Neq))
}

assert ContradictoryWhereEmpty {
  all c: Chain | ContradictoryWhere[c] implies AlgoOutN[c] = none
}
```

### 6. Fuzz Testing for Chain + WHERE Combinations
**Priority: P3**

Random generation of:
- Graph structures (nodes, edges, types, values)
- GFQL chains (varying length, directions, hop params)
- WHERE clauses (random alias pairs, operators, column refs)

Compare cuDF executor vs oracle on generated inputs.

Tools: Hypothesis (Python), custom generators

### 7. Cypher/GQL Conformance Suite
**Priority: P3**

Work toward [openCypher TCK](https://github.com/opencypher/openCypher) compatibility:
- Map GFQL operations to Cypher equivalents
- Adapt TCK scenarios for GFQL surface syntax
- Track coverage percentage

Note: GFQL currently lacks Cypher surface syntax; this is aspirational.

Reference: [Neo4j GQL Conformance](https://neo4j.com/docs/cypher-manual/current/appendix/gql-conformance/)

---

## Current Coverage

| Category | Status | Location |
|----------|--------|----------|
| Hop ranges (parity) | ✅ 11+ tests | `test_enumerator_parity.py` |
| WHERE (parity) | ✅ 50+ tests | `test_df_executor_inputs.py` |
| Hop + WHERE composition | ✅ 20+ tests | `test_df_executor_inputs.py` |
| Impossible constraints | ✅ 10 tests | `test_df_executor_inputs.py` |
| Alloy (WHERE lowering) | ✅ Verified | `alloy/gfql_fbf_where.als` |
| Alloy (hop ranges) | ❌ Not modeled | - |
| Alloy (backward reachability) | ❌ Not explicit | - |
| Alloy (empty propagation) | ❌ Implicit only | - |

---

## References

- [Gamera: Graph-aware metamorphic testing (VLDB 2024)](https://dl.acm.org/doi/10.14778/3636218.3636236)
- [GDBMeter: Query partitioning oracle (ISSTA 2023)](https://dl.acm.org/doi/10.1145/3597926.3598044)
- [openCypher TCK](https://github.com/opencypher/openCypher)
- [Neo4j GQL Conformance](https://neo4j.com/docs/cypher-manual/current/appendix/gql-conformance/)
- [TLA+ for database verification (VLDB)](https://dl.acm.org/doi/10.5555/579617)
- PR #846: [cudf same-path executor](https://github.com/graphistry/pygraphistry/pull/846) - Bug fixes in commits `71cda418`, `cd579363`

## Labels
testing, verification, gfql, roadmap


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Meta: GFQL Testing & Verification Roadmap #871

GFQL Testing & Verification Roadmap

Context

Bugs Found in PR #846 (Lessons for Verification)

Bug 1: Backward Traversal Join Direction (commit `71cda418`)

Bug 2: Empty Set Short-Circuit Missing (commit `71cda418`)

Bug 3: Wrong Node Source for Non-Adjacent WHERE (commit `71cda418`)

Bug 4: Multi-hop Path Tracing Through Intermediates (commit `cd579363`)

Bug 5: Reverse/Undirected Edge Direction Handling (commit `cd579363`)

Tests Added

Sub-Issues

1. Metamorphic Testing for GFQL

2. Extend Alloy Model to Cover Hop Ranges

3. Add Backward Reachability Assertions to Alloy

4. Add Empty Set Propagation Assertion

5. Add Contradictory WHERE Scenarios

6. Fuzz Testing for Chain + WHERE Combinations

7. Cypher/GQL Conformance Suite

Current Coverage

References

Labels

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Category	Status	Location
Hop ranges (parity)	✅ 11+ tests	`test_enumerator_parity.py`
WHERE (parity)	✅ 50+ tests	`test_df_executor_inputs.py`
Hop + WHERE composition	✅ 20+ tests	`test_df_executor_inputs.py`
Impossible constraints	✅ 10 tests	`test_df_executor_inputs.py`
Alloy (WHERE lowering)	✅ Verified	`alloy/gfql_fbf_where.als`
Alloy (hop ranges)	❌ Not modeled	-
Alloy (backward reachability)	❌ Not explicit	-
Alloy (empty propagation)	❌ Implicit only	-

Meta: GFQL Testing & Verification Roadmap #871

Description

GFQL Testing & Verification Roadmap

Context

Bugs Found in PR #846 (Lessons for Verification)

Bug 1: Backward Traversal Join Direction (commit 71cda418)

Bug 2: Empty Set Short-Circuit Missing (commit 71cda418)

Bug 3: Wrong Node Source for Non-Adjacent WHERE (commit 71cda418)

Bug 4: Multi-hop Path Tracing Through Intermediates (commit cd579363)

Bug 5: Reverse/Undirected Edge Direction Handling (commit cd579363)

Tests Added

Sub-Issues

1. Metamorphic Testing for GFQL

2. Extend Alloy Model to Cover Hop Ranges

3. Add Backward Reachability Assertions to Alloy

4. Add Empty Set Propagation Assertion

5. Add Contradictory WHERE Scenarios

6. Fuzz Testing for Chain + WHERE Combinations

7. Cypher/GQL Conformance Suite

Current Coverage

References

Labels

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

Bug 1: Backward Traversal Join Direction (commit `71cda418`)

Bug 2: Empty Set Short-Circuit Missing (commit `71cda418`)

Bug 3: Wrong Node Source for Non-Adjacent WHERE (commit `71cda418`)

Bug 4: Multi-hop Path Tracing Through Intermediates (commit `cd579363`)

Bug 5: Reverse/Undirected Edge Direction Handling (commit `cd579363`)