Summary
Several Rust-backed plugins receive nearly identical gateway payload objects, then each plugin manually:
- reads the same payload attributes (args, result, content, prompt_id, name, uri)
- traverses nested Python containers
- applies plugin-specific logic
- mutates the payload back into framework result objects
We already have a tiny shared crate for framework object construction in crates/framework_bridge/src/lib.rs, but the payload access/traversal layer is still reimplemented per plugin.
This is now large enough to justify a deeper abstraction, but the reuse is uneven: the scanning/redaction plugins share much more structure than the policy/enforcement plugins.
Problem
Today, each plugin owns its own version of "read payload -> walk Python object tree -> transform -> write back":
- pii_filter has a generic nested traversal/mutation engine over strings, dicts, lists, tuples, sets, and objects with __dict__ in plugins/rust/python-package/pii_filter/src/detector.rs.
- encoded_exfil_detection has a separate recursive walker over strings, dicts, and lists in plugins/rust/python-package/encoded_exfil_detection/src/lib.rs.
- secrets_detection repeats stage-level payload extraction and write-back for prompt/tool/resource hooks in plugins/rust/python-package/secrets_detection/src/plugin.rs.
- pii_filter repeats similar stage wiring in plugins/rust/python-package/pii_filter/src/plugin.rs.
- rate_limiter also manually extracts payload/context fields and constructs result objects in plugins/rust/python-package/rate_limiter/src/plugin.rs, but its core logic is request-scoped policy evaluation rather than deep tree traversal.
This creates a few concrete costs:
- duplicated traversal logic and path formatting
- inconsistent supported container types across plugins
- repeated hook-stage boilerplate
- higher porting cost for future Rust migrations
- harder-to-fix edge cases when payload shape handling is wrong in more than one plugin
What looks shared vs. what does not
Clearly shared
- framework result construction
- hook-stage payload field access
- common payload subjects:
  - prompt args
  - prompt result/messages
  - tool args
  - tool result
  - resource content
- nested traversal of Python objects into transformed Python objects
- path tracking for findings/metadata
- "changed vs unchanged" result handling
Not uniformly shared
- plugin-specific detection logic and findings schemas
- policy-only plugins like rate_limiter
- some hook stages have special shapes:
  - pii_filter.prompt_post_fetch walks result.messages[*].content.text, not just one top-level field
  - secrets_detection.resource_post_fetch only scans content.text
  - encoded_exfil_detection scans richer mixed payloads and also parses JSON-like strings
So the right target is not "one crate that does everything for all plugins." The better target is "one shared payload bridge with pluggable traversal/scanner callbacks."
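To make "pluggable traversal/scanner callbacks" concrete, a hedged sketch of the plugin-facing contract (trait and type names are hypothetical): the bridge owns recursion, and a plugin implements only a per-string hook.

```rust
/// Hypothetical callback contract: the shared bridge owns traversal,
/// each plugin supplies only its string-level transform.
pub trait StringScanner {
    /// Return Some(replacement) to rewrite the string, None to leave it.
    fn scan(&mut self, path: &str, text: &str) -> Option<String>;
}

/// A toy redactor showing the plugin side of the contract.
pub struct MaskDigits;

impl StringScanner for MaskDigits {
    fn scan(&mut self, _path: &str, text: &str) -> Option<String> {
        if text.chars().any(|c| c.is_ascii_digit()) {
            let masked = text
                .chars()
                .map(|c| if c.is_ascii_digit() { '*' } else { c })
                .collect();
            Some(masked)
        } else {
            None
        }
    }
}
```

The `Option<String>` return doubles as the "changed vs unchanged" signal, so the bridge can skip rebuilding untouched subtrees.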
Options
Option 1: Expand cpex_framework_bridge into a real payload bridge
Add modules to the existing crate for:
- hook payload readers/writers
- stage result builders
- nested Python traversal helpers
- common path formatting helpers
Pros:
- least crate sprawl
- plugins already depend on cpex_framework_bridge
- easy incremental rollout
Cons:
- current crate is tiny and narrowly scoped; this would broaden its responsibility a lot
- risk of mixing two concerns:
- framework object construction
- payload normalization/traversal
Option 2: Add a new crate dedicated to payload access/traversal
Create something like cpex_payload_bridge or cpex_plugin_payload with:
- typed stage adapters
- tree walker utilities
- mutation result types
- optional helpers for finding aggregation/path bookkeeping
Keep cpex_framework_bridge focused on framework object creation.
Pros:
- cleaner separation of concerns
- easier to test independently
- clearer scope boundary for future contributors
Cons:
- one more internal crate
- slightly more wiring for each plugin
Option 3: Share only stage adapters, not tree traversal
Extract just:
- read/write of args / result / content
- common result/violation helpers
Leave recursive traversal inside each plugin.
Pros:
- lowest risk
- easiest short-term refactor
Cons:
- leaves the biggest duplication in place
- does not help much with future scanner-style plugins
Option 4: Normalize payloads into a canonical Rust value model
Convert incoming Python objects into an owned intermediate form, likely serde_json::Value plus a few escape hatches for non-JSON Python values, then run plugins on that model.
Pros:
- strongest consistency
- easiest scanner implementations once converted
- opens door to non-Python-specific testing
Cons:
- high risk of semantic mismatch with arbitrary Python objects
- awkward for tuples, sets, custom objects, and rich framework payload objects
- likely overkill for current repo
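The "escape hatches for non-JSON Python values" point is the crux of Option 4. A std-only sketch of what that shape could look like (the `Canon` type is a hypothetical stand-in; a real version would likely wrap serde_json::Value): values that have no faithful JSON form are preserved as opaque leaves rather than silently coerced.

```rust
use std::collections::BTreeMap;

/// Hypothetical canonical model for Option 4. Std-only stand-in;
/// a real crate would likely use serde_json::Value for the JSON part.
#[derive(Debug, PartialEq)]
pub enum Canon {
    Str(String),
    List(Vec<Canon>),
    Map(BTreeMap<String, Canon>),
    /// Escape hatch for tuples, sets, and custom Python objects that
    /// have no faithful JSON shape: keep a repr and skip scanning.
    Opaque(String),
}

impl Canon {
    /// Count scannable string leaves, ignoring opaque values.
    pub fn string_leaves(&self) -> usize {
        match self {
            Canon::Str(_) => 1,
            Canon::List(items) => items.iter().map(Canon::string_leaves).sum(),
            Canon::Map(m) => m.values().map(Canon::string_leaves).sum(),
            Canon::Opaque(_) => 0,
        }
    }
}
```

The cons above follow directly from the `Opaque` variant: anything that lands there is invisible to scanners, which is exactly the semantic-mismatch risk.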
Recommendation
Take Option 2 in stages.
Start with a new internal crate focused on the scanning/redaction family:
- pii_filter
- secrets_detection
- encoded_exfil_detection
Do not force rate_limiter, retry_with_backoff, or url_reputation onto the same abstraction immediately. They share some framework plumbing, but not enough deep-tree behavior to justify coupling them to a scanner-oriented bridge yet.
Suggested scope for phase 1
Build a crate that provides:
- Stage adapters
  - read common source values from payloads
  - write back modified values
  - build unchanged / modified / blocked results
- Generic recursive walker
  - strings
  - dicts
  - lists
  - optionally tuples / sets / __dict__ objects
  - configurable depth and collection limits
  - stable path formatting
- Callback-based transform API
  - plugin supplies string or node transform
  - bridge owns recursion, cloning, and mutation rebuild
- Shared mutation result type
  - changed flag
  - rebuilt value
  - findings payload
- Test fixtures for framework payload shims
  - so each plugin does not re-prove the same payload mechanics
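Putting the walker, callback API, and mutation result together, here is a minimal std-only sketch under stated assumptions: `Node` stands in for the Python tree (the real crate would walk PyO3 objects), and all names (`Node`, `Mutation`, `walk`) are hypothetical.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for a payload tree; the real crate would walk
/// Python objects through PyO3.
#[derive(Clone, Debug, PartialEq)]
pub enum Node {
    Str(String),
    List(Vec<Node>),
    Map(BTreeMap<String, Node>),
}

/// Shared mutation result: changed flag, rebuilt value, finding paths.
pub struct Mutation {
    pub changed: bool,
    pub value: Node,
    pub findings: Vec<String>,
}

/// Bridge-owned recursion: the plugin supplies only `transform`,
/// which maps (path, string) to Some(replacement) when it fires.
pub fn walk(
    node: &Node,
    path: &str,
    transform: &mut dyn FnMut(&str, &str) -> Option<String>,
) -> Mutation {
    match node {
        Node::Str(s) => match transform(path, s) {
            Some(new) => Mutation {
                changed: true,
                value: Node::Str(new),
                findings: vec![path.to_string()],
            },
            None => Mutation { changed: false, value: node.clone(), findings: vec![] },
        },
        Node::List(items) => {
            let (mut changed, mut findings, mut rebuilt) = (false, vec![], vec![]);
            for (i, item) in items.iter().enumerate() {
                let m = walk(item, &format!("{path}[{i}]"), transform);
                changed |= m.changed;
                findings.extend(m.findings);
                rebuilt.push(m.value);
            }
            Mutation { changed, value: Node::List(rebuilt), findings }
        }
        Node::Map(map) => {
            let (mut changed, mut findings) = (false, vec![]);
            let mut rebuilt = BTreeMap::new();
            for (k, v) in map {
                let m = walk(v, &format!("{path}.{k}"), transform);
                changed |= m.changed;
                findings.extend(m.findings);
                rebuilt.insert(k.clone(), m.value);
            }
            Mutation { changed, value: Node::Map(rebuilt), findings }
        }
    }
}
```

Depth/collection limits and tuple/set/__dict__ support are omitted here but would hang off the same recursion points.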
pii_filter already contains the most complete traversal logic, so it is the strongest donor implementation for the walker. encoded_exfil_detection adds a useful extra requirement: optional JSON-string parsing during traversal.
Design constraints
- Preserve Python object shapes on output. Do not silently coerce everything to JSON.
- Keep plugin-specific findings/metadata schemas outside the shared crate.
- Support incremental adoption plugin by plugin.
- Avoid a macro-heavy API at first. Prefer plain Rust traits/functions until the common shape stabilizes.
- Keep the crate internal to the workspace until at least two plugin migrations prove the boundary is correct.
Proposed rollout
- Extract common stage/payload helpers.
- Migrate secrets_detection first.
  - smallest useful target
  - should validate stage adapter ergonomics quickly
- Extract the recursive walker from pii_filter into the new crate.
- Migrate pii_filter.
- Evaluate whether encoded_exfil_detection can reuse the walker directly or needs hook points for JSON-string parsing.
- Reassess whether rate_limiter benefits from only the stage/result helpers.
Acceptance criteria
- At least secrets_detection and pii_filter use the shared crate for payload access and mutation.
- The shared crate has direct unit tests for traversal behavior and path generation.
- Plugin tests still cover end-to-end hook behavior.
- encoded_exfil_detection either migrates or documents the gap that still blocks migration.
- cpex_framework_bridge stays small, or its expanded scope is explicitly documented if we choose not to add a new crate.
Open questions
- Should tuples/sets/custom Python objects be part of the phase-1 contract, or copied later from pii_filter if another plugin truly needs them?
- Should path formatting be standardized across plugins now, even if that slightly changes existing finding output?
- Is JSON-string parsing a plugin-specific extension point, or should the shared walker support optional secondary parses natively?
- Do we want a typed enum for hook stages, or is a small stringly-typed adapter enough for now?
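On the last question, a small sketch of what the typed side would look like. Only prompt_post_fetch and resource_post_fetch appear in this doc; the other stage names and the field mapping are assumptions for illustration.

```rust
/// Hypothetical typed hook-stage enum with a stringly adapter.
/// Stage names beyond prompt_post_fetch / resource_post_fetch are assumed.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum HookStage {
    PromptPreFetch,
    PromptPostFetch,
    ToolPreInvoke,
    ToolPostInvoke,
    ResourcePostFetch,
}

impl HookStage {
    pub fn parse(s: &str) -> Option<Self> {
        match s {
            "prompt_pre_fetch" => Some(Self::PromptPreFetch),
            "prompt_post_fetch" => Some(Self::PromptPostFetch),
            "tool_pre_invoke" => Some(Self::ToolPreInvoke),
            "tool_post_invoke" => Some(Self::ToolPostInvoke),
            "resource_post_fetch" => Some(Self::ResourcePostFetch),
            _ => None,
        }
    }

    /// The payload field each stage primarily reads, following the
    /// "clearly shared" list above (args / result / content).
    pub fn primary_field(self) -> &'static str {
        match self {
            Self::PromptPreFetch | Self::ToolPreInvoke => "args",
            Self::PromptPostFetch | Self::ToolPostInvoke => "result",
            Self::ResourcePostFetch => "content",
        }
    }
}
```

The enum buys exhaustive matching in stage adapters; the stringly version avoids a breaking change every time the framework adds a hook. Either fits behind the same adapter surface.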
Why this matters now
The repo has already crossed the point where each new Rust port re-discovers the same payload mechanics. A shared payload bridge will reduce porting cost, reduce subtle payload-shape bugs, and let future plugins focus on domain logic instead of PyO3 tree surgery.