
Introduce a shared payload bridge for Rust-backed plugins #32

@lucarlig

Description


Summary

Several Rust-backed plugins receive nearly identical gateway payload objects, then each plugin manually:

  • reads the same payload attributes (args, result, content, prompt_id, name, uri)
  • traverses nested Python containers
  • applies plugin-specific logic
  • mutates the payload back into framework result objects

We already have a tiny shared crate for framework object construction in crates/framework_bridge/src/lib.rs, but the payload access/traversal layer is still reimplemented per plugin.

This is now large enough to justify a deeper abstraction, but the reuse is uneven: the scanning/redaction plugins share much more structure than the policy/enforcement plugins.

Problem

Today, each plugin owns its own version of "read payload -> walk Python object tree -> transform -> write back".

This creates a few concrete costs:

  • duplicated traversal logic and path formatting
  • inconsistent supported container types across plugins
  • repeated hook-stage boilerplate
  • higher porting cost for future Rust migrations
  • edge cases that are harder to fix, because the same payload-shape bug can exist in more than one plugin

What looks shared vs. what does not

Clearly shared

  • framework result construction
  • hook-stage payload field access
  • common payload subjects:
    • prompt args
    • prompt result/messages
    • tool args
    • tool result
    • resource content
  • nested traversal of Python objects into transformed Python objects
  • path tracking for findings/metadata
  • "changed vs unchanged" result handling

Not uniformly shared

  • plugin-specific detection logic and findings schemas
  • policy-only plugins like rate_limiter
  • some hook stages have special shapes:
    • pii_filter.prompt_post_fetch walks result.messages[*].content.text, not just one top-level field
    • secrets_detection.resource_post_fetch only scans content.text
    • encoded_exfil_detection scans richer mixed payloads and also parses JSON-like strings

So the right target is not "one crate that does everything for all plugins." The better target is "one shared payload bridge with pluggable traversal/scanner callbacks."
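To make the split concrete, here is a minimal sketch of what "bridge owns mechanics, plugin owns detection" could look like. All names (`Scanner`, `apply`, `Redactor`) are hypothetical, invented for illustration; they are not existing APIs in this repo.

```rust
/// What a scanner-style plugin would supply to the shared bridge:
/// one leaf-level decision, no traversal or payload plumbing.
trait Scanner {
    /// Inspect one string leaf at a given path; return a replacement
    /// only if it should change.
    fn scan_str(&self, path: &str, text: &str) -> Option<String>;
}

/// What the bridge would own: invoking the callback and tracking
/// whether anything changed (traversal elided here for brevity).
fn apply<S: Scanner>(scanner: &S, path: &str, text: &str) -> (String, bool) {
    match scanner.scan_str(path, text) {
        Some(new) => (new, true),
        None => (text.to_string(), false),
    }
}

/// A toy plugin-side implementation.
struct Redactor;

impl Scanner for Redactor {
    fn scan_str(&self, _path: &str, text: &str) -> Option<String> {
        text.contains("secret")
            .then(|| text.replace("secret", "[REDACTED]"))
    }
}

fn main() {
    let (out, changed) = apply(&Redactor, "args[0]", "my secret value");
    assert!(changed);
    assert_eq!(out, "my [REDACTED] value");
}
```

The point of the shape is that the trait boundary is the scope boundary: everything above `Scanner` stays plugin-specific, everything below it becomes shared.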

Options

Option 1: Expand cpex_framework_bridge into a real payload bridge

Add modules to the existing crate for:

  • hook payload readers/writers
  • stage result builders
  • nested Python traversal helpers
  • common path formatting helpers

Pros:

  • least crate sprawl
  • plugins already depend on cpex_framework_bridge
  • easy incremental rollout

Cons:

  • current crate is tiny and narrowly scoped; this would broaden its responsibility a lot
  • risk of mixing two concerns:
    • framework object construction
    • payload normalization/traversal

Option 2: Add a new crate dedicated to payload access/traversal

Create something like cpex_payload_bridge or cpex_plugin_payload with:

  • typed stage adapters
  • tree walker utilities
  • mutation result types
  • optional helpers for finding aggregation/path bookkeeping

Keep cpex_framework_bridge focused on framework object creation.

Pros:

  • cleaner separation of concerns
  • easier to test independently
  • clearer scope boundary for future contributors

Cons:

  • one more internal crate
  • slightly more wiring for each plugin

Option 3: Share only stage adapters, not tree traversal

Extract just:

  • read/write of args / result / content
  • common result/violation helpers

Leave recursive traversal inside each plugin.

Pros:

  • lowest risk
  • easiest short-term refactor

Cons:

  • leaves the biggest duplication in place
  • does not help much with future scanner-style plugins

Option 4: Normalize payloads into a canonical Rust value model

Convert incoming Python objects into an owned intermediate form, likely serde_json::Value plus a few escape hatches for non-JSON Python values, then run plugins on that model.

Pros:

  • strongest consistency
  • easiest scanner implementations once converted
  • opens door to non-Python-specific testing

Cons:

  • high risk of semantic mismatch with arbitrary Python objects
  • awkward for tuples, sets, custom objects, and rich framework payload objects
  • likely overkill for current repo
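The mismatch risk is easiest to see in a sketch of what such a canonical model would have to look like. `Canonical` below is hypothetical, not an existing type; it mirrors the serde_json::Value shape with an explicit escape hatch for Python values that have no JSON analogue.

```rust
/// Hypothetical intermediate value model for Option 4.
#[derive(Debug, Clone, PartialEq)]
enum Canonical {
    Null,
    Bool(bool),
    Num(f64),
    Str(String),
    List(Vec<Canonical>),
    Map(Vec<(String, Canonical)>),
    /// Escape hatch: tuples, sets, and custom objects have no JSON
    /// analogue, so they can only be carried opaquely.
    Opaque(String),
}

/// Toy classifier standing in for a real Python-to-Rust conversion:
/// anything outside the JSON-friendly types falls into `Opaque`,
/// which is exactly where round-trip fidelity gets lost.
fn classify(py_type: &str, value: &str) -> Canonical {
    match py_type {
        "str" => Canonical::Str(value.to_string()),
        "list" => Canonical::List(vec![]),
        other => Canonical::Opaque(format!("<{other}>")),
    }
}

fn main() {
    assert_eq!(classify("str", "hi"), Canonical::Str("hi".into()));
    // A tuple survives only as an opaque marker, not a usable value.
    assert!(matches!(classify("tuple", "(1, 2)"), Canonical::Opaque(_)));
}
```

Note also that a Python tuple and list would both want to land in `List`, so the conversion back to Python has to remember which one it was; that bookkeeping is most of the "semantic mismatch" cost.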

Recommendation

Take Option 2 in stages.

Start with a new internal crate focused on the scanning/redaction family:

  • pii_filter
  • secrets_detection
  • encoded_exfil_detection

Do not force rate_limiter, retry_with_backoff, or url_reputation onto the same abstraction immediately. They share some framework plumbing, but not enough deep-tree behavior to justify coupling them to a scanner-oriented bridge yet.

Suggested scope for phase 1

Build a crate that provides:

  1. Stage adapters

    • read common source values from payloads
    • write back modified values
    • build unchanged / modified / blocked results
  2. Generic recursive walker

    • strings
    • dicts
    • lists
    • optionally tuples / sets / __dict__ objects
    • configurable depth and collection limits
    • stable path formatting
  3. Callback-based transform API

    • plugin supplies string or node transform
    • bridge owns recursion, cloning, and mutation rebuild
  4. Shared mutation result type

    • changed flag
    • rebuilt value
    • findings payload
  5. Test fixtures for framework payload shims

    • so each plugin does not re-prove the same payload mechanics
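Items 2 through 4 above can be sketched together. This uses a toy `Node` tree in place of real Python objects, and every name here (`Node`, `Mutation`, `walk`, `scan`) is illustrative rather than an existing crate API; the real crate would walk PyO3 objects instead.

```rust
/// Toy stand-in for a traversable Python payload.
#[derive(Debug, Clone, PartialEq)]
enum Node {
    Str(String),
    List(Vec<Node>),
    Map(Vec<(String, Node)>),
}

/// Shared mutation result type (phase-1 item 4).
#[derive(Debug)]
struct Mutation {
    changed: bool,          // "changed vs unchanged" handling
    value: Node,            // rebuilt value
    findings: Vec<String>,  // stable paths where the callback fired
}

const MAX_DEPTH: usize = 32; // configurable depth limit

/// Generic recursive walker (item 2) driving a plugin callback (item 3).
fn walk<F>(node: &Node, path: &str, depth: usize, f: &F, found: &mut Vec<String>) -> (Node, bool)
where
    F: Fn(&str) -> Option<String>,
{
    if depth > MAX_DEPTH {
        return (node.clone(), false); // refuse to recurse past the limit
    }
    match node {
        Node::Str(s) => match f(s) {
            Some(new) => {
                found.push(path.to_string());
                (Node::Str(new), true)
            }
            None => (node.clone(), false),
        },
        Node::List(items) => {
            let mut changed = false;
            let rebuilt = items
                .iter()
                .enumerate()
                .map(|(i, n)| {
                    let (new, c) = walk(n, &format!("{path}[{i}]"), depth + 1, f, found);
                    changed |= c;
                    new
                })
                .collect();
            (Node::List(rebuilt), changed)
        }
        Node::Map(entries) => {
            let mut changed = false;
            let rebuilt = entries
                .iter()
                .map(|(k, v)| {
                    let (new, c) = walk(v, &format!("{path}.{k}"), depth + 1, f, found);
                    changed |= c;
                    (k.clone(), new)
                })
                .collect();
            (Node::Map(rebuilt), changed)
        }
    }
}

fn scan<F: Fn(&str) -> Option<String>>(root: &Node, f: &F) -> Mutation {
    let mut findings = Vec::new();
    let (value, changed) = walk(root, "payload", 0, f, &mut findings);
    Mutation { changed, value, findings }
}

fn main() {
    // Shape like pii_filter.prompt_post_fetch: result.messages[*].content.text
    let payload = Node::Map(vec![(
        "messages".into(),
        Node::List(vec![Node::Map(vec![(
            "content".into(),
            Node::Map(vec![("text".into(), Node::Str("call 555-0100".into()))]),
        )])]),
    )]);
    let m = scan(&payload, &|s| {
        s.contains("555-0100").then(|| s.replace("555-0100", "[PHONE]"))
    });
    assert!(m.changed);
    assert_eq!(m.findings, vec!["payload.messages[0].content.text"]);
}
```

The walker produces exactly the path strings that findings/metadata need, which is why standardizing path formatting falls out of this design almost for free.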

pii_filter already contains the most complete traversal logic, so it is the strongest donor implementation for the walker. encoded_exfil_detection adds a useful extra requirement: optional JSON-string parsing during traversal.

Design constraints

  • Preserve Python object shapes on output. Do not silently coerce everything to JSON.
  • Keep plugin-specific findings/metadata schemas outside the shared crate.
  • Support incremental adoption plugin by plugin.
  • Avoid a macro-heavy API at first. Prefer plain Rust traits/functions until the common shape stabilizes.
  • Keep the crate internal to the workspace until at least two plugin migrations prove the boundary is correct.

Proposed rollout

  1. Extract common stage/payload helpers.
  2. Migrate secrets_detection first.
    • smallest useful target
    • should validate stage adapter ergonomics quickly
  3. Extract recursive walker from pii_filter into the new crate.
  4. Migrate pii_filter.
  5. Evaluate whether encoded_exfil_detection can reuse the walker directly or needs hook points for JSON-string parsing.
  6. Reassess whether rate_limiter benefits from only the stage/result helpers.

Acceptance criteria

  • At least secrets_detection and pii_filter use the shared crate for payload access and mutation.
  • The shared crate has direct unit tests for traversal behavior and path generation.
  • Plugin tests still cover end-to-end hook behavior.
  • encoded_exfil_detection either migrates or documents the gap that still blocks migration.
  • cpex_framework_bridge stays small, or its expanded scope is explicitly documented if we choose not to add a new crate.

Open questions

  • Should tuples/sets/custom Python objects be part of the phase-1 contract, or copied later from pii_filter if another plugin truly needs them?
  • Should path formatting be standardized across plugins now, even if that slightly changes existing finding output?
  • Is JSON-string parsing a plugin-specific extension point, or should the shared walker support optional secondary parses natively?
  • Do we want a typed enum for hook stages, or is a small stringly-typed adapter enough for now?
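For the last question, the typed-enum option could be as small as this. The stage names and the stage-to-field mapping below are assumptions for illustration (only prompt_post_fetch and resource_post_fetch appear earlier in this issue; the rest are guessed), not a committed design.

```rust
/// Hypothetical typed hook-stage enum; variant names are assumed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum HookStage {
    PromptPreFetch,
    PromptPostFetch,
    ToolPreInvoke,
    ToolPostInvoke,
    ResourcePostFetch,
}

impl HookStage {
    /// Which common payload field this stage reads (args / result /
    /// content, per the "clearly shared" list). The mapping is an
    /// assumption for illustration.
    fn payload_field(self) -> &'static str {
        match self {
            HookStage::PromptPreFetch | HookStage::ToolPreInvoke => "args",
            HookStage::PromptPostFetch | HookStage::ToolPostInvoke => "result",
            HookStage::ResourcePostFetch => "content",
        }
    }
}

fn main() {
    assert_eq!(HookStage::ResourcePostFetch.payload_field(), "content");
    assert_eq!(HookStage::PromptPostFetch.payload_field(), "result");
}
```

Even at this size, the enum buys exhaustive matching: adding a stage forces every adapter to handle it, which a stringly-typed adapter cannot do.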

Why this matters now

The repo has already crossed the point where each new Rust port re-discovers the same payload mechanics. A shared payload bridge will reduce porting cost, reduce subtle payload-shape bugs, and let future plugins focus on domain logic instead of PyO3 tree surgery.
