Summary
Migrate the retry_with_backoff plugin from its current hybrid architecture (Python integration + Rust fast path) to a Rust-majority architecture (thin Python shim + complete Rust implementation), following the url_reputation pattern.
Current State
Architecture: Hybrid (60% Python + 40% Rust)
- Python integration layer: 280 lines (config, text parsing, metadata, path selection)
- Rust fast path: 433 lines (retry algorithm only)
- Performance: 2-4x improvement over Python-only
- Test coverage: 78% effective (47/60 tests)
Proposed Target State
Architecture: Rust-majority (10% Python + 90% Rust)
- Python thin shim: ~50 lines (plugin interface, delegation, error handling)
- Rust core: ~800 lines (complete business logic)
- Performance: 3-5x improvement over Python-only (roughly 25-50% better than the current 2-4x)
- Test coverage: 100% (60/60 tests)
Architecture Comparison
Current Architecture (Hybrid)
┌─────────────────────────────────────────────────────────────┐
│ Python Integration Layer (280 lines) │
│ ✅ Plugin framework interface (tool_post_invoke, etc.) │
│ ✅ Configuration management (Pydantic, clamping) │
│ ✅ Text content parsing (check_text_content=True) │
│ ✅ Metadata attachment (retry_policy) │
│ ✅ Path selection (Rust vs Python) │
│ ✅ Fallback state management │
└─────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ RetryStateManager (Rust - 433 lines) │
│ ✅ Exponential backoff calculation │
│ ✅ Failure detection (structured content only) │
│ ✅ State tracking (consecutive_failures, TTL) │
│ ✅ check_and_update() - atomic retry decision │
└─────────────────────────────────────────────────────────────┘
Split: 40% Rust (retry algorithm) + 60% Python (integration, config, parsing)
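As a rough illustration of the retry decision described in the RetryStateManager box, the following std-only Rust sketch combines a capped exponential backoff with a consecutive-failure counter. The names `RetryState`, `Decision`, and `backoff_delay`, the parameter list of `check_and_update`, and the doubling formula are assumptions made for illustration; the source names only `check_and_update()` and `consecutive_failures`. The real implementation presumably guards the state with a lock so the decision is atomic, which this sketch leaves to the caller.

```rust
use std::time::Duration;

/// Hypothetical per-tool retry state; the field mirrors the
/// `consecutive_failures` counter named in the diagram.
#[derive(Debug, Default)]
pub struct RetryState {
    pub consecutive_failures: u32,
}

/// Outcome of one retry decision (illustrative type, not from the source).
#[derive(Debug, PartialEq)]
pub enum Decision {
    /// The call succeeded, or the retry budget is exhausted.
    Stop,
    /// Retry after waiting for the given delay.
    RetryAfter(Duration),
}

/// Exponential backoff without jitter: base * 2^(attempt - 1), capped at `max`.
pub fn backoff_delay(base: Duration, max: Duration, attempt: u32) -> Duration {
    // Cap the exponent so the shift below cannot overflow a u32.
    let exp = attempt.saturating_sub(1).min(31);
    // checked_mul returns None on overflow; treat that as "use the cap".
    base.checked_mul(1u32 << exp).map_or(max, |d| d.min(max))
}

/// Updates the counter and decides whether to retry, in a single call.
pub fn check_and_update(
    state: &mut RetryState,
    failed: bool,
    max_retries: u32,
    base: Duration,
    max: Duration,
) -> Decision {
    if !failed {
        // A success resets the failure streak.
        state.consecutive_failures = 0;
        return Decision::Stop;
    }
    state.consecutive_failures = state.consecutive_failures.saturating_add(1);
    if state.consecutive_failures > max_retries {
        // Budget exhausted: reset so the next call starts a fresh streak.
        state.consecutive_failures = 0;
        return Decision::Stop;
    }
    Decision::RetryAfter(backoff_delay(base, max, state.consecutive_failures))
}
```

Keeping the counter update and the delay calculation in one function is what lets a caller treat the decision as a single step under one lock.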
Target Architecture (Rust-Majority)
┌─────────────────────────────────────────────────────────────┐
│ Python Thin Shim (~50 lines) │
│ - Plugin framework interface (tool_post_invoke) │
│ - Delegates to RetryPluginCore │
│ - Error handling wrapper │
└─────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ RetryPluginCore (Rust - ~800 lines) │
│ ✅ Configuration management (validation, clamping) │
│ ✅ Text content parsing (JSON extraction) │
│ ✅ Failure detection (structured + text) │
│ ✅ Exponential backoff calculation │
│ ✅ State management (TTL, eviction) │
│ ✅ Metadata generation (retry_policy dict) │
│ ✅ Complete business logic │
└─────────────────────────────────────────────────────────────┘
Split: 90% Rust (complete logic) + 10% Python (thin shim)
Benefits
Performance
- 3-5x improvement over Python-only (vs current 2-4x)
- Roughly 1.25-1.5x improvement over the current hybrid implementation (3-5x vs 2-4x over Python-only)
- Faster JSON parsing (serde_json vs Python json)
- Faster state management (Rust HashMap vs Python dict)
- No Python/Rust boundary crossing for hot path
Maintainability
- Single source of truth in Rust (vs split logic)
- 82% reduction in Python code (280 → 50 lines)
- Type-safe implementation with compile-time guarantees
- Easier to test (pure Rust unit tests)
- No logic duplication between Python and Rust
Consistency
- Follows url_reputation pattern (standard plugin architecture)
- Predictable behavior across all plugins
- Easier onboarding for new developers
Scalability
- Better memory efficiency
- Lower CPU overhead
- Handles high-throughput scenarios
Migration Strategy
Phase 1: Move Configuration to Rust (Week 1)
Goal: Migrate configuration management from Python (Pydantic) to Rust
Tasks:
- Create RetryConfig struct with validation and clamping
- Implement from_dict() constructor for Python interop
- Add to_metadata_dict() for metadata generation
- Write unit tests for configuration
Benefits:
- Type-safe configuration in Rust
- Validation and clamping in Rust
- No Pydantic dependency for configuration handling (Pydantic validation remains optional in the shim)
- Faster configuration parsing
Deliverables: Rust config module, 10 unit tests
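To make the Phase 1 tasks more concrete, here is a minimal, std-only sketch of a config struct with validation and clamping. The field names, defaults, and ceilings are hypothetical, since the source does not list them. `from_map` and `to_metadata` stand in for the `from_dict()` and `to_metadata_dict()` methods named above, which would take and return Python dicts through PyO3 in the real module.

```rust
use std::collections::BTreeMap;

// Assumed upper bounds, chosen only for illustration.
const MAX_RETRIES_CEILING: u64 = 10;
const MAX_BACKOFF_CEILING_MS: u64 = 60_000;

/// Hypothetical retry configuration; field names are not from the source.
#[derive(Debug, Clone, PartialEq)]
pub struct RetryConfig {
    pub max_retries: u32,
    pub backoff_base_ms: u64,
    pub max_backoff_ms: u64,
}

impl Default for RetryConfig {
    fn default() -> Self {
        Self { max_retries: 2, backoff_base_ms: 200, max_backoff_ms: 5_000 }
    }
}

/// Reads one optional key, rejecting negative values (validation step).
fn read(raw: &BTreeMap<String, i64>, key: &str, default: u64) -> Result<u64, String> {
    match raw.get(key) {
        None => Ok(default),
        Some(&v) if v < 0 => Err(format!("{} must be non-negative, got {}", key, v)),
        Some(&v) => Ok(v as u64),
    }
}

impl RetryConfig {
    /// Stand-in for `from_dict()`: validates, then clamps into the allowed ranges.
    pub fn from_map(raw: &BTreeMap<String, i64>) -> Result<Self, String> {
        let d = Self::default();
        let max_retries = read(raw, "max_retries", u64::from(d.max_retries))?;
        let base = read(raw, "backoff_base_ms", d.backoff_base_ms)?;
        let max = read(raw, "max_backoff_ms", d.max_backoff_ms)?;

        // Clamp the cap first so the base delay can be clamped against it.
        let max_backoff_ms = max.clamp(1, MAX_BACKOFF_CEILING_MS);
        Ok(Self {
            max_retries: max_retries.min(MAX_RETRIES_CEILING) as u32,
            backoff_base_ms: base.clamp(1, max_backoff_ms),
            max_backoff_ms,
        })
    }

    /// Stand-in for `to_metadata_dict()`: exposes the effective (clamped) values.
    pub fn to_metadata(&self) -> BTreeMap<&'static str, u64> {
        BTreeMap::from([
            ("max_retries", u64::from(self.max_retries)),
            ("backoff_base_ms", self.backoff_base_ms),
            ("max_backoff_ms", self.max_backoff_ms),
        ])
    }
}
```

In this sketch an out-of-range value is clamped silently while a negative one is rejected with an error. Whether the existing Pydantic model draws the line in the same place would need to be checked against the current Python code.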
Phase 2: Move Text Content Parsing to Rust (Week 2)
Goal: Migrate JSON parsing and content extraction from Python to Rust
Tasks:
- Define ContentItem enum (Text, Image, Resource)
- Implement is_failure() function with JSON parsing
- Add serde_json for fast JSON extraction
- Handle all content types
Benefits:
- Fast JSON parsing with serde_json
- No Python JSON overhead
- Type-safe content handling
- Unified failure detection logic
Deliverables: Content parsing module, 15 unit tests
Phase 3: Move Metadata Generation to Rust (Week 3)
Goal: Migrate metadata dictionary construction from Python to Rust
Tasks:
- Implement to_metadata_dict() in Rust
- Add PyDict construction helpers
- Verify Python compatibility
Benefits:
- Metadata generation in Rust
- No Python dict construction overhead
- Type-safe metadata
Deliverables: Metadata generation, 5 unit tests
Phase 4: Create Complete Rust Core (Week 4)
Goal: Implement complete business logic in Rust
Tasks:
- Create RetryPluginCore struct
- Implement tool_post_invoke() method
- Add state management (HashMap with TTL)
- Implement eviction logic
- Add exponential backoff calculation with jitter
Benefits:
- Complete business logic in Rust
- Single source of truth
- Type-safe state management
- Fast payload extraction
- Integrated eviction logic
Deliverables: Complete Rust core, 20 integration tests
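The state-management, eviction, and jitter tasks in this phase could look roughly like the std-only sketch below. The `StateTable` and `Entry` types, the per-tool string key, and the "full jitter" strategy are assumptions for illustration. The random fraction is passed in as a parameter so the sketch needs no external crate; the real core would presumably draw it from `rand`.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical per-key entry; `last_seen` drives TTL expiry.
struct Entry {
    consecutive_failures: u32,
    last_seen: Instant,
}

/// Sketch of a TTL-bounded state table keyed by tool name.
pub struct StateTable {
    entries: HashMap<String, Entry>,
    ttl: Duration,
}

impl StateTable {
    pub fn new(ttl: Duration) -> Self {
        Self { entries: HashMap::new(), ttl }
    }

    /// Records a failure at time `now` and returns the updated streak length.
    /// An entry older than the TTL is treated as if it had already been evicted.
    pub fn record_failure(&mut self, key: &str, now: Instant) -> u32 {
        let ttl = self.ttl;
        let entry = self.entries.entry(key.to_string()).or_insert(Entry {
            consecutive_failures: 0,
            last_seen: now,
        });
        if now.saturating_duration_since(entry.last_seen) > ttl {
            entry.consecutive_failures = 0;
        }
        entry.consecutive_failures = entry.consecutive_failures.saturating_add(1);
        entry.last_seen = now;
        entry.consecutive_failures
    }

    /// A success clears the streak by dropping the entry.
    pub fn record_success(&mut self, key: &str) {
        self.entries.remove(key);
    }

    /// Drops every entry not touched within the TTL; returns how many were removed.
    pub fn evict_expired(&mut self, now: Instant) -> usize {
        let before = self.entries.len();
        let ttl = self.ttl;
        self.entries
            .retain(|_, e| now.saturating_duration_since(e.last_seen) <= ttl);
        before - self.entries.len()
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }
}

/// "Full jitter": scales the capped exponential delay by a fraction in [0, 1].
/// `unit_random` is supplied by the caller; non-finite input falls back to 0.
pub fn jittered_delay(base: Duration, max: Duration, attempt: u32, unit_random: f64) -> Duration {
    let exp = attempt.saturating_sub(1).min(31);
    let capped = base.checked_mul(1u32 << exp).map_or(max, |d| d.min(max));
    let fraction = if unit_random.is_finite() { unit_random.clamp(0.0, 1.0) } else { 0.0 };
    capped.mul_f64(fraction)
}
```

Passing `now` explicitly keeps the table deterministic in unit tests. A production version might instead call `Instant::now()` internally and run `evict_expired` periodically or once the table grows past a size threshold.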
Phase 5: Create Python Thin Shim (Week 5)
Goal: Create minimal Python wrapper for plugin framework
Tasks:
- Create new retry_with_backoff.py (~50 lines)
- Implement RetryWithBackoffPlugin class
- Add error handling wrapper
- Optional Pydantic validation
Benefits:
- Minimal Python code
- Simple delegation to Rust
- Optional Pydantic validation
- Error handling wrapper
Deliverables: Python shim (~50 lines), package exports
Phase 6: Testing & Validation (Week 6)
Goal: Ensure migration maintains functionality and improves performance
Test Strategy:
- Unit Tests (Rust): Configuration, parsing, failure detection, backoff, state management, metadata
- Integration Tests (Python): Plugin loading, hook behavior, error handling, performance
- Migration Tests: Compare Rust-majority vs hybrid behavior, verify identical results
Validation Checklist:
- All 60 original tests pass
- Performance improvement: 3-5x (vs 2-4x current)
- Memory usage: similar or better
- Error handling: equivalent behavior
- Configuration: identical validation
- Text parsing: identical results
Deliverables: Test suite, benchmarks, migration guide
Implementation Checklist
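Week 1: Configuration Migration
- RetryConfig struct in Rust
- from_dict() constructor
- to_metadata_dict() method
Week 2: Text Content Parsing
- ContentItem enum in Rust
- is_failure() function
Week 3: Metadata Generation
- to_metadata_dict() in Rust
Week 4: Complete Rust Core
- RetryPluginCore struct
- tool_post_invoke() method
Week 5: Python Thin Shim
- retry_with_backoff.py (~50 lines)
- RetryWithBackoffPlugin class
Week 6: Testing & Validation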
Success Criteria
Functional Requirements
Performance Requirements
Code Quality Requirements
Documentation Requirements
Timeline Options
Option 1: 6-Week Standard Track (Recommended)
Risk: Low (comfortable timeline)
| Week | Focus | Deliverables |
|------|-------|--------------|
| 1 | Configuration | Rust config, validation, 10 tests |
| 2 | Text Parsing | Content parsing, JSON, 15 tests |
| 3 | Metadata | Metadata generation, 5 tests |
| 4 | Rust Core | Complete core, 20 tests |
| 5 | Python Shim | Thin shim, integration, 10 tests |
| 6 | Testing | All tests, benchmarks, docs |
Option 2: 4-Week Fast Track (Aggressive)
Risk: High (tight timeline)
| Week | Focus | Deliverables |
|------|-------|--------------|
| 1 | Config + Parsing | Rust config, text parsing, 20 tests |
| 2 | Rust Core | Complete RetryPluginCore, 30 tests |
| 3 | Python Shim | Thin shim, integration tests |
| 4 | Testing | All 60 tests, benchmarks, docs |
Option 3: 8-Week Conservative Track (Safe)
Risk: Very Low (plenty of buffer)
| Week | Focus | Deliverables |
|------|-------|--------------|
| 1-2 | Configuration | Rust config, validation, tests |
| 3-4 | Text Parsing | Content parsing, JSON, tests |
| 5 | Metadata | Metadata generation, tests |
| 6 | Rust Core | Complete core, tests |
| 7 | Python Shim | Thin shim, integration |
| 8 | Testing | All tests, benchmarks, docs |
Trade-offs Analysis
Advantages of Rust-Majority Architecture
✅ Performance:
- 3-5x improvement (vs 2-4x current)
- Faster JSON parsing (serde_json vs Python json)
- Faster state management (HashMap vs Python dict)
- No Python/Rust boundary crossing for hot path
✅ Maintainability:
- Single source of truth (Rust)
- No logic duplication
- Type-safe implementation
- Easier to test (pure Rust tests)
✅ Consistency:
- Follows url_reputation pattern
- Standard plugin architecture
- Predictable behavior
✅ Scalability:
- Better memory efficiency
- Lower CPU overhead
- Handles high-throughput scenarios
Disadvantages of Rust-Majority Architecture
❌ Development Effort:
- 4-6 weeks migration time
- Requires Rust expertise
- More complex PyO3 bindings
❌ Flexibility:
- Harder to modify logic (Rust vs Python)
- Longer iteration cycles (compile time)
- Less dynamic behavior
❌ Dependencies:
- Requires serde_json (JSON parsing)
- Requires rand (jitter)
- Larger binary size
❌ Debugging:
- Harder to debug Rust code
- Less visibility into state
- Requires Rust tooling
Comparison: Hybrid vs Rust-Majority
| Aspect | Hybrid (Current) | Rust-Majority (Target) |
|--------|------------------|------------------------|
| Performance | 2-4x improvement | 3-5x improvement |
| Python Code | 280 lines | 50 lines |
| Rust Code | 433 lines | 800 lines |
| Maintainability | Medium (split logic) | High (single source) |
| Flexibility | High (Python) | Medium (Rust) |
| Development Time | N/A (existing) | 4-6 weeks |
| Testing | 47/60 tests | 60/60 tests |
| Pattern | Unique hybrid | Standard (url_reputation) |
Alternatives Considered
Alternative 1: Keep Hybrid Architecture (Current)
Effort: 0 weeks (no change)
Performance: 2-4x improvement (current)
Maintainability: Medium (split logic)
When to choose:
- Current performance is sufficient (2-4x)
- Need maximum flexibility (Python)
- Don't have time for migration
- Prefer Python for business logic
- Hybrid pattern works for your use case
Alternative 2: Hybrid with Improvements (Middle Ground)
Effort: 2-3 weeks
Performance: 2.5-3.5x improvement
Changes:
- Move text content parsing to Rust (Week 1-2)
- Move metadata generation to Rust (Week 3)
- Keep Python integration layer (no change)
Outcome:
- Python integration: ~200 lines (vs 280)
- Rust fast path: ~600 lines (vs 433)
- Performance: 2.5-3.5x improvement
- Maintainability: Medium-High
When to choose:
- Want better performance (2.5-3.5x)
- Want to keep flexibility
- Have 2-3 weeks for improvements
- Want incremental migration
Recommendation
Migrate to Rust-majority architecture (Option 1) using the 6-week standard track timeline.
Rationale:
- Significant performance improvement (3-5x vs 2-4x over Python-only, roughly 25-50% better than current)
- Follows established pattern (url_reputation)
- Better maintainability (single source of truth)
- Reasonable timeline (6 weeks with low risk)
- Complete test coverage (60/60 tests)
- Standard architecture across all plugins
Dependencies
- Rust toolchain (stable)
- PyO3 (Python bindings)
- serde_json (JSON parsing)
- rand (jitter generation)
Related Issues
- Original implementation: [Link to original PR]
- Test coverage analysis: [Link to test analysis]
- Performance benchmarks: [Link to benchmarks]