
feat: Media Generation Plugin Suite + GitHub Integration (Issue #20) #198

Open
om952 wants to merge 5 commits into OpenScanAI:master from om952:issue-20-media-suite


om952 commented Jun 18, 2026

  • media-core — Shared infrastructure package with storage wrapper, job queue, cost tracker, retry logic, and shared types
  • media-image — Image generation plugin with Stable Diffusion (self-hosted) and DALL-E (OpenAI API) backends, plus generate_image tool
  • media-video — Video generation plugin with ComfyUI (self-hosted SVD), FFmpeg (local slideshows), and Runway (API) backends, plus generate_video tool
  • media-audio — Audio/TTS generation plugin with ElevenLabs (API) and Edge TTS (free system) backends, plus generate_audio tool
  • media-dashboard — UI plugin with GalleryWidget and GenerationStatus components for browsing generated media
  • Workspace integration — All packages wired into pnpm workspace with TypeScript configs, builds passing
  • Integration guide — packages/plugins/media-suite-INTEGRATION.md with setup, configuration, and usage instructions
  • Implementation plan — doc/plans/2026-06-18-media-generation-plugin-suite.md documenting the 6-phase approach
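
As a rough illustration of the job queue in media-core, the sketch below models the five statuses named in the media-core commit message (queued/running/succeeded/failed/cancelled). The transition table, the `Job` shape, and the function names are assumptions made for this example, not the package's actual types.

```typescript
// Status values come from the PR's commit message; everything else is hypothetical.
type JobStatus = "queued" | "running" | "succeeded" | "failed" | "cancelled";

// Assumed transition table: which statuses a job may move to from each status.
const ALLOWED_TRANSITIONS: Record<JobStatus, JobStatus[]> = {
  queued: ["running", "cancelled"],
  running: ["succeeded", "failed", "cancelled"],
  succeeded: [],
  failed: [],
  cancelled: [],
};

interface Job {
  id: string;
  status: JobStatus;
}

// Return a copy of the job in the new status, or throw if the move is not allowed.
function transition(job: Job, next: JobStatus): Job {
  if (ALLOWED_TRANSITIONS[job.status].indexOf(next) === -1) {
    throw new Error(`Invalid transition: ${job.status} -> ${next}`);
  }
  return { ...job, status: next };
}

// A status is terminal when no further transitions are allowed from it.
function isTerminal(status: JobStatus): boolean {
  return ALLOWED_TRANSITIONS[status].length === 0;
}
```

Keeping the transitions in one table makes it easy for a status widget to decide whether a job still needs polling.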

Also includes: GitHub integration plugin (Issue #15).

Verification

# Build all media packages
pnpm --filter @paperclipai/media-core build
pnpm --filter @paperclipai/media-image build
pnpm --filter @paperclipai/media-video build
pnpm --filter @paperclipai/media-audio build
pnpm --filter @paperclipai/media-dashboard build

# Full workspace typecheck
pnpm -r typecheck

# Run tests
pnpm test

Manual verification steps:

  1. Install media-image plugin via API: POST /api/plugins/install with {"pluginId": "paperclip.media-image"}
  2. Configure DALL-E API key or Stable Diffusion URL in plugin settings
  3. Create agent with tools: ["generate_image"]
  4. Assign issue with prompt like "Generate a logo for our company"
  5. Verify job appears in queue, asset stored, cost logged
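
Step 1 could be scripted roughly as follows. Only the path `/api/plugins/install` and the JSON body come from this PR; the base URL, the absence of auth headers, and the helper names are assumptions for illustration.

```typescript
// Hypothetical base URL; the PR only specifies the path and the JSON body.
const BASE_URL = "http://localhost:3100";

interface InstallRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Build the request for step 1 without sending it, so it can be inspected or logged.
function buildInstallRequest(pluginId: string, baseUrl: string = BASE_URL): InstallRequest {
  return {
    url: `${baseUrl}/api/plugins/install`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ pluginId }),
    },
  };
}

// Send the request and return the HTTP status code.
// Assumes a runtime with a global fetch (for example Node 18 or later)
// and a server that needs no extra authentication, which may not hold.
async function installPlugin(pluginId: string): Promise<number> {
  const { url, init } = buildInstallRequest(pluginId);
  const res = await fetch(url, init);
  return res.status;
}
```

Calling `installPlugin("paperclip.media-image")` would then cover step 1; the remaining steps are performed in the plugin settings and agent configuration.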

Risks

  • Low to medium. All plugins are additive and do not modify core Paperclip logic.
  • Self-hosted backends (Stable Diffusion, ComfyUI, FFmpeg, Edge TTS) require external binaries or servers; the plugins degrade gracefully if these are unavailable.
  • API-based backends (DALL-E, Runway, ElevenLabs) require valid API keys; misconfigured keys fail with logged errors.
  • Asset metadata is stored in plugin state (key-value). With no DB access, search and list operations are limited to what state keys can retrieve.
  • Cost estimates are approximate and may not match actual vendor billing.

Closes #20

om952 added 5 commits June 18, 2026 16:09
- Auto-create GitHub issues from Paperclip issues
- Sync status changes bidirectionally (Paperclip ↔ GitHub)
- Mirror comments with user attribution [GitHub @username]
- Auto-create PRs when issues marked done (empty branch)
- Webhook signature verification (HMAC-SHA256)
- Timestamp-based conflict resolution (last-write-wins)
- Rate limit handling with exponential backoff
- State persistence for issue mappings across restarts

Closes OpenScanAI#15
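
For reference, HMAC-SHA256 webhook verification generally looks like the sketch below. GitHub sends the signature in the `X-Hub-Signature-256` header as `sha256=<hex digest>` computed over the raw request body. The function names here are illustrative and are not taken from the plugin's source.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Compute the signature string in the format GitHub uses: "sha256=" + hex HMAC.
function signPayload(secret: string, payload: string): string {
  return "sha256=" + createHmac("sha256", secret).update(payload, "utf8").digest("hex");
}

// Compare the received header against the expected signature in constant time.
// `header` is the value of X-Hub-Signature-256; `payload` must be the raw body.
function verifySignature(secret: string, payload: string, header: string | undefined): boolean {
  if (!header) return false;
  const expected = Buffer.from(signPayload(secret, payload), "utf8");
  const received = Buffer.from(header, "utf8");
  // timingSafeEqual throws on length mismatch, so check lengths first.
  if (expected.length !== received.length) return false;
  return timingSafeEqual(expected, received);
}
```

The comparison must run against the unparsed body, since re-serializing parsed JSON can change the bytes and break the signature.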
- Add README.md with setup, config, and manual testing instructions
- Add comprehensive test suite (plugin.test.ts) covering:
  - Configuration validation
  - Event handling (issue.created, issue.updated, issue.comment.created)
  - Status mapping and reverse mapping
  - Conflict resolution (timestamp-based)
  - Webhook signature verification
  - State management
  - PR creation (branch naming)
  - Rate limiting
  - Error handling
  - Manifest capabilities
- Add vitest config and test scripts to package.json
- Add plugins/* to pnpm-workspace.yaml
- Fix @paperclipai/plugin-sdk dependency to use workspace:*
- Fix rate limiting test to use future timestamp
- All 30 tests passing
- Typecheck and build passing
- Add integration tests with mocked Octokit (15 tests)
  - GitHub issue CRUD operations
  - Branch and PR creation
  - Webhook signature verification
  - Sync job batch processing
  - Rate limit handling
  - State persistence
  - Error handling (401, 404, network)
- Add ARCHITECTURE.md with:
  - System overview diagram
  - Data flow diagrams (push/pull)
  - State storage schema
  - Conflict resolution logic
  - Security model
  - Rate limiting strategy
  - Component diagram
  - Troubleshooting guide
- All 45 tests passing
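
The timestamp-based, last-write-wins conflict resolution mentioned in these commits can be pictured with a minimal sketch like this one. The `Snapshot` shape, the ISO 8601 timestamps, and the tie-breaking rule are assumptions for illustration; ARCHITECTURE.md in the PR is the authoritative description.

```typescript
// Hypothetical view of one side of a synced issue.
interface Snapshot {
  status: string;
  updatedAt: string; // assumed to be an ISO 8601 timestamp
}

type Winner = "paperclip" | "github";

// Last-write-wins: the side with the later updatedAt overrides the other.
function resolveConflict(paperclip: Snapshot, github: Snapshot): Winner {
  const paperclipTime = Date.parse(paperclip.updatedAt);
  const githubTime = Date.parse(github.updatedAt);
  // On an exact tie this sketch keeps the Paperclip side; that choice is arbitrary.
  return githubTime > paperclipTime ? "github" : "paperclip";
}
```

Last-write-wins is simple but silently discards the older change, so it depends on both systems reporting reasonably accurate timestamps.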
…nScanAI#20)

Add comprehensive media generation capabilities to Levi/Paperclip:

Core Infrastructure (media-core):
- Asset storage with actual file persistence (local_disk, S3-ready)
- Job queue with status tracking (queued/running/succeeded/failed/cancelled)
- Cost tracking integration with Levi metrics/activity log
- Retry logic with exponential backoff and jitter for API resilience
- Download support for stored assets
- Cleanup support for old assets
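
Retry with exponential backoff and jitter is commonly implemented along the lines below. This is a generic sketch using the "full jitter" variant; the option names, defaults, and function names are assumptions and may differ from what media-core actually exposes.

```typescript
// Hypothetical options; media-core's real configuration may differ.
interface RetryOptions {
  maxAttempts: number;
  baseDelayMs: number;
  maxDelayMs: number;
}

// Full jitter: pick a random delay in [0, min(maxDelay, base * 2^attempt)).
// `random` is injectable so the delay can be made deterministic in tests.
function backoffDelay(
  attempt: number,
  opts: RetryOptions,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(opts.maxDelayMs, opts.baseDelayMs * Math.pow(2, attempt));
  return Math.floor(random() * ceiling);
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Call fn until it succeeds or maxAttempts is reached, sleeping between attempts.
async function withRetry<T>(
  fn: () => Promise<T>,
  opts: RetryOptions,
  sleep: (ms: number) => Promise<void> = defaultSleep,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < opts.maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Do not sleep after the final failed attempt.
      if (attempt < opts.maxAttempts - 1) {
        await sleep(backoffDelay(attempt, opts));
      }
    }
  }
  throw lastError;
}
```

Jitter spreads retries from concurrent jobs over time, which helps avoid many clients hitting a rate-limited API at the same instant.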

Image Generation (media-image):
- Stable Diffusion backend (self-hosted, free)
- DALL-E backend (OpenAI API, paid)
- Auto backend selection
- Input validation (prompt length, dimensions, steps)
- Retry on network errors
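
One plausible reading of "auto backend selection" is sketched below: use whichever backend is configured, preferring the free self-hosted one. The config field names and the preference order are assumptions, not the plugin's documented behavior.

```typescript
type ImageBackend = "stable-diffusion" | "dall-e";

// Hypothetical config shape; the real plugin settings may use other keys.
interface ImageConfig {
  stableDiffusionUrl?: string;
  openaiApiKey?: string;
  preferred?: ImageBackend | "auto";
}

// Return the backend to use, or null when nothing usable is configured.
function selectBackend(config: ImageConfig): ImageBackend | null {
  const available: ImageBackend[] = [];
  // Assumed order: the free self-hosted backend is tried before the paid API.
  if (config.stableDiffusionUrl) available.push("stable-diffusion");
  if (config.openaiApiKey) available.push("dall-e");

  // An explicit preference is honored only if that backend is configured.
  if (config.preferred && config.preferred !== "auto") {
    return available.indexOf(config.preferred) !== -1 ? config.preferred : null;
  }
  return available.length > 0 ? available[0] : null;
}
```

Returning null rather than throwing lets the caller report a configuration problem without crashing the plugin.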

Video Generation (media-video):
- ComfyUI backend (SVD workflows)
- FFmpeg backend (slideshows/placeholders)
- Runway ML backend (high-quality API)
- Input validation (duration, fps, frames)

Audio/TTS Generation (media-audio):
- ElevenLabs backend (high-quality voices)
- Edge TTS backend (free, system-based)
- Input validation (text length, voice format, rate/pitch)

Dashboard UI (media-dashboard):
- GalleryWidget with filtering and search
- GenerationStatus with job monitoring
- Conditional UI registration for compatibility

Testing:
- 6 end-to-end tests covering storage, queue, cost, retry
- All tests pass
- TypeScript compilation clean across all packages

Closes OpenScanAI#20
Development

Successfully merging this pull request may close these issues.

feat: Media Generation Plugin Suite — Video, Image, Audio creation for agents
