diff --git a/Backend.md b/Backend.md index 4f7ff501..cfa2964b 100644 --- a/Backend.md +++ b/Backend.md @@ -9,14 +9,10 @@ * Reliability (retries, idempotency, job state machine, observability) * Cost + scalability thinking (MVP → v1) ---- - ## **Problem 1: Video-to-Notes Platform (Architecture + Schema)** **Goal:** Upload video → async processing → outputs: transcript, **Summary.md**, highlights (timestamps), screenshot/clip references. [READ MORE ABOUT THE PROJECT](./Video-summary-platform.md) -**Your solution must include** - * **System design:** API + worker(s) + storage + external AI/transcription boundaries * **DB choice:** Postgres vs other (short justification) * **Schema:** tables only (User, VideoAsset, Job, JobEvent, Artifact, Highlight) @@ -26,9 +22,116 @@ **Your Solution for problem 1:** -You need to put your solution here. - ---- +System Design +The Video-to-Notes platform allows users to upload long videos and automatically generate structured outputs such as transcripts, summaries, highlights with timestamps, and screenshot references. +To ensure scalability and performance, the system uses asynchronous processing with worker services. + +Architecture flow: +User → API Server → Object Storage → Job Queue → Worker → AI/Transcription Service → Database + +Components: +1. API Server +Handles user requests such as video upload, job status checking, and retrieving generated notes. It validates the upload and creates processing jobs. + +2. Object Storage +Video files and generated artifacts are stored in object storage such as AWS S3 or Google Cloud Storage. This avoids storing large files in the database. + +3. Job Queue +A queue system (Redis Queue / RabbitMQ) is used to handle background processing tasks and ensure reliable asynchronous execution. + +4. Worker Services +Workers consume jobs from the queue and perform processing tasks such as: +extracting audio +generating transcripts +summarizing text +detecting highlights +capturing screenshots + +5. 
External AI / Transcription Services +Workers call external services (such as Whisper or OpenAI APIs) to perform speech-to-text transcription and summarization. + +Database Choice +PostgreSQL is used because it provides strong relational integrity, indexing support, and reliable transaction handling for structured metadata. +Large video files are stored in object storage, while only metadata and artifact references are stored in the database. + +Database Schema +User +- id (UUID, Primary Key) +- email (Unique) +- name +- created_at + +VideoAsset +- id (UUID, Primary Key) +- user_id (Foreign Key → User.id) +- file_url +- duration +- status (uploaded, processing, completed) +- created_at + +Job +- id (UUID, Primary Key) +- video_id (Foreign Key → VideoAsset.id) +- status (queued, processing, success, failed) +- retry_count +- created_at +- completed_at + +JobEvent +- id (UUID) +- job_id (Foreign Key → Job.id) +- event_type +- message +- created_at + +Artifact +- id (UUID) +- video_id (Foreign Key → VideoAsset.id) +- artifact_type (transcript, summary, screenshot) +- file_url +- created_at + +Highlight +- id (UUID) +- video_id (Foreign Key → VideoAsset.id) +- start_timestamp +- end_timestamp +- description + +Indexes are added on frequently queried fields such as user_id, video_id, and status. + +Job Lifecycle +1. User uploads a video. +2. API stores the file in object storage. +3. Video metadata is saved in the VideoAsset table. +4. A processing job is created with status = queued. +5. The job is pushed to the queue. +6. A worker picks the job and updates status to processing. +7. Worker performs transcription, summary generation, highlight detection, and screenshot extraction. +8. Generated artifacts are stored in object storage. +9. Artifact references are saved in the database. +10. Job status is updated to success. +If processing fails, the system retries up to 3 times before marking the job as failed. 
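The retry policy in the lifecycle above can be sketched as follows. This is a minimal illustration, not the actual worker code: `process_video`, the plain-dict job row, and `MAX_RETRIES` are hypothetical stand-ins for the real queue consumer and the Job/JobEvent tables.

```python
MAX_RETRIES = 3  # mirrors the "retries up to 3 times" policy above


def run_job(job, process_video):
    """Drive one job through queued -> processing -> success/failed with retries."""
    job["status"] = "processing"
    while job["retry_count"] < MAX_RETRIES:
        try:
            # transcription, summary, highlight detection, screenshots, etc.
            process_video(job["video_id"])
            job["status"] = "success"
            return job
        except Exception as exc:
            job["retry_count"] += 1
            # record each failed attempt, analogous to a JobEvent row
            job.setdefault("events", []).append(
                f"attempt {job['retry_count']} failed: {exc}"
            )
    job["status"] = "failed"
    return job
```

Recording every failed attempt rather than only the final state is what makes the JobEvent table useful for debugging stuck or flapping jobs.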
+ +Storage Layout +Example structure in object storage: + +videos/{user_id}/{video_id}.mp4 +artifacts/{video_id}/transcript.txt +artifacts/{video_id}/summary.md +artifacts/{video_id}/screenshots/ + +Artifacts are delivered to users using signed URLs for secure downloads. + +Security +- JWT authentication for API access +- Signed URLs for secure file downloads +- File type and size validation during upload +- Rate limiting to prevent abuse + +Reliability and Scalability +The platform is designed to scale horizontally by adding more worker instances. Queue-based processing ensures reliable job execution, and retry mechanisms prevent job loss. Object storage allows efficient handling of large video files while keeping the database lightweight. + ## **Problem 2: LinkedIn Automation Platform (Backend Architecture)** @@ -44,9 +147,91 @@ You need to put your solution here. **Your Solution for problem 2:** -You need to put your solution here. - ---- +System Design +The LinkedIn Automation Platform allows users to connect their LinkedIn account, generate AI-assisted post drafts based on a persona, review the drafts, schedule posts, and automatically publish them while maintaining audit logs. + +Architecture flow: +User → API Server → OAuth Service → Database → Job Scheduler → Worker → LinkedIn API + +Components: +1. API Server +Handles user actions such as connecting LinkedIn, managing personas, generating drafts, approving posts, and scheduling publishing tasks. + +2. OAuth Integration +Users connect their LinkedIn account using OAuth. Access tokens and refresh tokens are securely stored to allow scheduled posting. + +3. Draft Generation Service +When a user provides a topic, the backend calls the GenAI service which generates multiple post drafts based on the user's persona configuration. + +4. Scheduler / Queue +A scheduler service manages future publishing times. Jobs are pushed into a queue so workers can publish posts automatically. + +5. 
Worker Services +Workers process scheduled jobs, publish posts via LinkedIn API, and store results in audit logs. + +Database Schema +User +- id (UUID, Primary Key) +- email +- created_at + +LinkedInAccount +- id (UUID) +- user_id (Foreign Key → User.id) +- access_token (encrypted) +- refresh_token (encrypted) +- expires_at + +Persona +- id (UUID) +- user_id (Foreign Key → User.id) +- tone +- audience +- writing_style + +Draft +- id (UUID) +- user_id +- persona_id +- topic +- content +- status (generated, approved, scheduled) + +Schedule +- id (UUID) +- draft_id +- publish_time +- status (pending, posted, failed) + +PostLog / PostAttempt +- id (UUID) +- draft_id +- status +- linkedin_post_id +- created_at + +Indexes are added on user_id, publish_time, and status for efficient scheduling queries. +Security +- OAuth tokens are encrypted at rest. +- Access scopes follow the least-privilege principle. +- Only authorized users can access their LinkedIn accounts and drafts. + +Reliability +- Scheduled posts are processed through a queue-based worker system. +- Retry logic prevents failed posts due to temporary API errors. +- Deduplication checks ensure the same post is not published twice. +- Rate limiting is applied to avoid LinkedIn API limits. + +Prompt (Config Storage) +Prompt templates or configuration packs used by the GenAI team are stored in the database with version control. + +Fields include: +- version +- prompt_template +- created_at +- status (active / deprecated) + +This enables safe rollback if a prompt change introduces issues. ## **Problem 3: DOCX Template → Bulk DOCX/PDF Generator (Backend + Storage)** @@ -62,9 +247,93 @@ You need to put your solution here. **Your Solution for problem 3:** -You need to put your solution here. +System Design +The DOCX Template → Bulk DOCX/PDF Generator allows users to upload a DOCX template, automatically detect editable fields, and generate documents either individually or in bulk using spreadsheet data. 
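The field-detection step mentioned above could begin with a plain regex pass before any GenAI assistance. This is a hedged sketch that operates on already-extracted document text (real DOCX parsing, e.g. reading `word/document.xml` from the zip container, is out of scope here); `extract_fields` is an illustrative name, not an existing API.

```python
import re

# matches {{name}}-style placeholders, tolerating inner whitespace
PLACEHOLDER = re.compile(r"\{\{\s*([a-zA-Z_][a-zA-Z0-9_]*)\s*\}\}")


def extract_fields(template_text):
    """Return unique placeholder names in order of first appearance."""
    seen, fields = set(), []
    for name in PLACEHOLDER.findall(template_text):
        if name not in seen:
            seen.add(name)
            fields.append(name)
    return fields
```

Deduplicating while preserving order matters because the same field (e.g. a customer name) often appears several times in one template but should map to a single form input.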
+ +Architecture flow: +User → API Server → Template Processing Service → Database → Job Queue → Worker → Storage + +Components: +1. Template Ingestion Service +When a DOCX file is uploaded, the system parses the document and extracts placeholders (e.g., {{name}}, {{date}}, {{amount}}). These fields are converted into a reusable template schema. + +2. Field Detection (GenAI) +GenAI services can assist in identifying possible editable fields and generating a structured field schema from the uploaded document. + +3. Single Generation +Users fill fields via a form interface. The system generates a single DOCX or PDF output. + +4. Bulk Generation +Users upload an Excel / Google Sheet. Each row represents one document instance. A background job generates documents for all rows. + +5. Worker Services +Workers process bulk generation jobs asynchronously to avoid blocking the API server. + +Database Schema +Template +- id (UUID, Primary Key) +- user_id +- template_name +- created_at + +TemplateVersion +- id (UUID) +- template_id (Foreign Key → Template.id) +- version_number +- file_url +- created_at + +TemplateField +- id (UUID) +- template_version_id +- field_name +- field_type + +BulkRun +- id (UUID) +- template_version_id +- status (queued, processing, completed, failed) +- created_at + +BulkRow +- id (UUID) +- bulk_run_id +- row_data (JSON) +- status (pending, success, failed) + +Artifact +- id (UUID) +- bulk_row_id +- file_url +- created_at + +JobEvent +- id (UUID) +- job_id +- event_type +- created_at + +Indexes are added on template_id, bulk_run_id, and status. + +Storage Strategy +Templates and generated documents are stored in object storage (AWS S3 / Cloud Storage). + +Example layout: +templates/{template_id}/version_{n}.docx +outputs/{bulk_run_id}/{row_id}.pdf + +Bulk outputs can optionally be packaged into a ZIP file for easy download. + +Reliability +- Bulk generation runs as background jobs using queue workers. 
+- Each row has its own status so partial failures can be retried. +- Failed rows can be regenerated without restarting the entire job. + +Security +- Templates and generated files are isolated per user or tenant. +- Signed URLs are used for secure downloads. +- Input validation prevents path traversal or malicious file uploads. ---- ## **Problem 4: Character-Based Video Series Generator (Backend Architecture)** @@ -80,7 +349,92 @@ You need to put your solution here. **Your Solution for problem 4:** -You need to put your solution here. +System Design +The Character-Based Video Series Generator allows users to define characters once (including image, personality traits, and relationships) and then generate multiple short video episodes based on story prompts. Each episode produces a package containing script, scene breakdown, asset plan, and optionally a rendered video. + +Architecture flow: +User → API Server → Story/Character Service → Job Queue → Worker → Media/GenAI Services → Storage → Database + +Components: +1. Character Management +Users define characters with attributes such as name, personality, appearance, and relationships. This data is reused across multiple episodes. + +2. Episode Generation +The user submits a story idea or scenario. The backend generates an episode plan including script, scene structure, and required assets using AI services. + +3. Asset Generation +Images, voice lines, or video clips are generated using external AI tools. These assets are stored and referenced for the episode. + +4. Render Pipeline +A render job combines scenes and assets to produce the final video (optional depending on workflow). + +5. Worker Services +Workers process generation and rendering tasks asynchronously to handle large workloads. 
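The reference-not-copy idea behind the Character Management component above can be illustrated with a small sketch. The class and field names below are hypothetical and only approximate the entities described in this section.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Character:
    """Defined once; every episode references it to keep traits consistent."""
    character_id: str
    name: str
    personality_traits: List[str]


@dataclass
class Episode:
    episode_id: str
    story_prompt: str
    character_ids: List[str] = field(default_factory=list)  # references, not copies


def characters_for(episode: Episode, registry: Dict[str, Character]) -> List[Character]:
    """Resolve an episode's character references against the shared registry."""
    return [registry[cid] for cid in episode.character_ids]
```

Because episodes store only IDs, updating a character's traits in the registry immediately affects every future episode, which is exactly the cross-episode consistency the platform needs.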
+ +Database Schema +Character +- id (UUID, Primary Key) +- user_id +- name +- description +- personality_traits + +Relationship +- id (UUID) +- character_id +- related_character_id +- relationship_type + +Episode +- id (UUID) +- user_id +- title +- story_prompt +- status (queued, processing, completed) + +Scene +- id (UUID) +- episode_id +- scene_number +- description +- dialogue + +Asset +- id (UUID) +- episode_id +- asset_type (image, audio, video) +- file_url + +RenderJob +- id (UUID) +- episode_id +- status (queued, rendering, completed, failed) + +Artifact +- id (UUID) +- render_job_id +- file_url + +Indexes are added on episode_id, character_id, and status fields. + +Consistency Strategy +Character definitions and relationships are stored persistently so that personality traits and relationships remain consistent across episodes. Each episode references the same character records. + +Storage +Media assets such as images, audio, and generated video clips are stored in object storage (AWS S3 / Cloud Storage). + +Example layout: +characters/{character_id}/images +episodes/{episode_id}/assets +renders/{episode_id}/final_video.mp4 + +Deduplication strategies can be used for reused assets. + +Security and Cost Control +- API authentication ensures only authorized users generate episodes. +- Rate limits and quotas prevent excessive generation. +- File size limits and asset quotas help control infrastructure cost. + ## Problem 5: Cross-Cutting @@ -94,4 +448,67 @@ Answer briefly for the whole platform: **Your Answer for problem 5:** -You need to put your solution here. +System Design +The Character-Based Video Series Generator allows users to define characters once (image, personality traits, relationships) and reuse them across multiple episodes. For each episode, the user provides a story prompt and the system generates a structured episode package including script, scene plan, assets, and optionally a rendered video. 
+ +Architecture flow: +User → API Server → Character & Episode Service → Job Queue → Worker → AI/Media Services → Storage → Database + +Workers process generation tasks asynchronously to handle heavy media workloads. + +Database Schema +Character +- id (UUID) +- user_id +- name +- traits + +Relationship +- id (UUID) +- character_id +- related_character_id +- relation_type + +Episode +- id (UUID) +- user_id +- story_prompt +- status + +Scene +- id (UUID) +- episode_id +- scene_number +- description + +Asset +- id (UUID) +- episode_id +- asset_type (image, audio, video) +- file_url + +RenderJob +- id (UUID) +- episode_id +- status + +Artifact +- id (UUID) +- render_job_id +- file_url + +Consistency Strategy +Character data and relationships are stored persistently so that all episodes reference the same character definitions, ensuring continuity across the video series. + +Storage +Media assets and generated videos are stored in object storage. + +Example layout: + +episodes/{episode_id}/assets +renders/{episode_id}/final_video.mp4 + +Security and Cost Control +- Authentication for API access +- Rate limiting and user quotas +- File size limits for media assets diff --git a/Frontend.md b/Frontend.md index 748188aa..0cc58f3a 100644 --- a/Frontend.md +++ b/Frontend.md @@ -26,7 +26,51 @@ **Your Solution for problem 1:** -You need to put your solution here. +Frontend System Design +Tech Stack +- Framework: React with Next.js for fast routing and SSR support. +- State Management: React Query for server state (job status, results) and local state with React hooks. +- UI Library: TailwindCSS + reusable component library for consistent design. + +Screens +1. Upload Page + - Video upload with progress indicator. + - Shows validation errors (size/type). + +2. Job List + - Displays all processing jobs. + - Shows status badges (Queued, Processing, Success, Failed). + +3. Job Detail + - Live status updates with logs and progress. + +4. 
Results Page + - Displays generated Summary.md + - Highlights with clickable timestamps + - Screenshot assets. + +UI States +- Loading → spinner +- Queued → waiting indicator +- Processing → progress bar +- Success → results view +- Failed → retry button +- Partial results → show completed artifacts first. + +API Strategy +- Upload via POST /videos +- Poll /jobs/{id} every few seconds for progress. +- Abort polling when user leaves page. + +Browser Caching +- Cache job list and job details using React Query. +- TTL caching for results data. +- Invalidate cache after job completion. +Debugging Plan +- Display job ID and correlation ID in UI. +- Log network requests in browser console. +- If job stuck in processing, inspect job status API responses. + --- @@ -44,7 +88,41 @@ You need to put your solution here. **Your Solution for problem 2:** -You need to put your solution here. +LinkedIn Automation Platform – Frontend Design + +Screens +1. Connect LinkedIn + - OAuth connection flow. +2. Persona Editor + - Inputs for tone, audience, writing style. +3. Draft Generator + - Shows 3 generated drafts. +4. Approval Screen + - User selects or edits draft. +5. Scheduler + - Select date/time for posting. +6. Post History + - Shows published posts and status. + +Form UX +- Validate persona inputs before generation. +- Show character limit for LinkedIn posts. +- Prevent scheduling posts in the past. + +API Calling Strategy +- POST /generate-draft +- POST /approve +- POST /schedule +- Use optimistic UI updates when approving drafts. + +Caching +- Cache drafts to avoid regeneration. +- Cache schedule list and post history. +- Refetch data after publishing. + +Debugging +- Display error messages for failed LinkedIn posting. +- Show request ID to help backend debugging. --- @@ -62,7 +140,34 @@ You need to put your solution here. **Your Solution for problem 3:** -You need to put your solution here. +DOCX Template → Bulk Generator (Frontend Design) + +Screens +1. Template Upload +2. 
Field Detection Review +3. Single Fill Form +4. Bulk CSV Upload +5. Bulk Run Status +6. Download Results + +Field UI +- Editable fields detected from template. +- Field types supported: text, number, date. +- Validation rules (required fields, format validation). + +Bulk Upload UX +- CSV validation before upload. +- Show mapping preview between CSV columns and template fields. +- Progress bar for generation status. + +Browser Caching +- Cache template metadata and field schema. +- Cache bulk run results for pagination. + +Downloads +- Use signed URLs for secure downloads. +- Show download progress indicator. + --- @@ -79,7 +184,27 @@ You need to put your solution here. **Your Solution for problem 4:** -You need to put your solution here. +Character-Based Video Series Generator (Frontend Design) + +Screens +1. Character Library +2. Relationship Editor +3. Episode Creator +4. Episode Detail (Scenes) +5. Asset Gallery + +Consistency UX +- Character profiles locked per episode to maintain story consistency. +- Version badges show character updates. + +API Calling +- Episode generation runs as async job. +- UI shows job progress and status updates. + +Caching +- Cache character library data. +- Cache episode packages and asset thumbnails. + --- @@ -89,30 +214,39 @@ Answer these in **bullet points** (max 1 page total): 1. **Frontend stack choice** -* EDIT YOUR ANSWER HERE: Framework (Next.js/Vue/etc), state management, router, UI kit, why. - `` + - Framework: Next.js (React) + - State: React Query + - Router: Next.js routing + - UI Kit: TailwindCSS + - Chosen for performance, modular components, and good developer experience. 2. **API layer design** * Fetch/Axios choice, typed client generation (OpenAPI), error normalization, retries, request dedupe, abort controllers. - ` - ` + - Use Axios for API requests. + - Central API client with error normalization. + - Implement retry logic and abort controllers for cancelled requests. 3. 
**Browser caching plan** * What you cache (GET responses, derived state), where (memory, IndexedDB, localStorage), TTL/invalidation rules. * How you handle “job status updates” without stale UI. ` - ` + - Cache GET responses with React Query. + - Use memory cache for job status. + - Use IndexedDB or localStorage for lightweight persistence. 4. **Debugging & observability** * Error boundaries, client-side logging approach, correlation id propagation, “report a problem” payload. * How you would debug: slow uploads, failed downloads, intermittent 500s. - ` - ` + - Implement error boundaries in React. + - Log API failures to monitoring service. + - Display correlation IDs for debugging support issues. 5. **Security basics** * Token storage approach, CSRF considerations (if cookies), XSS avoidance for markdown rendering, safe file download patterns. - ` A` + - Store tokens in secure HTTP-only cookies when possible. + - Sanitize markdown rendering to prevent XSS. + - Use signed URLs for secure file downloads. diff --git a/GenAI.md b/GenAI.md index 3c1fd31b..edb3d9f9 100644 --- a/GenAI.md +++ b/GenAI.md @@ -26,7 +26,34 @@ No code required. We want a **clear, practical proposal** with architecture and ### Your Solution for problem 1: -You need to put your solution here. +Proposal: Video-to-Notes Platform + +Idea +The Video-to-Notes platform converts long educational videos into structured learning artifacts such as transcripts, summaries, highlights, and screenshots. The goal is to help users quickly understand key ideas without watching the entire video. + +Workflow +1. User uploads a video. +2. The system extracts audio from the video. +3. A speech-to-text model generates a transcript. +4. The transcript is analyzed by a language model to produce: + - concise summary + - key highlights with timestamps + - topic segmentation +5. Important frames are extracted as screenshots. 
+ +AI Models Used +- Speech-to-Text:Whisper or similar ASR model +- Text Summarization: Large Language Model (LLM) +- Highlight Detection: LLM + timestamp mapping + +Output +The user receives a structured notes package containing: +- Full transcript +- Short summary +- Highlight timestamps +- Screenshots referencing key moments + +This helps students and professionals review long videos quickly and efficiently. ## Problem 2: **Zero-Shot Prompt to generate 3 LinkedIn Post** @@ -36,7 +63,34 @@ Design a **single zero-shot prompt** that takes a user’s persona configuration ### Your Solution for problem 2: -You need to put your solution here. +Prompt Design for LinkedIn Post Generation + +Prompt +You are a professional LinkedIn content writer. + +Generate 3 LinkedIn post drafts based on the following input. + +Topic: {topic} +User Persona: +- Tone: professional but conversational +- Audience: technology professionals +- Style: insightful and engaging + +Each post should: +- be between 120–180 words +- start with a strong hook +- include a clear idea or insight +- end with a question or call to action +- avoid emojis and excessive hashtags + +Return the results as three separate drafts. + +Safety +The system should filter outputs to prevent: +- spam-like promotional content +- misleading claims +- inappropriate or offensive language + ## Problem 3: **Smart DOCX Template → Bulk DOCX/PDF Generator (Proposal + Prompt)** @@ -54,7 +108,30 @@ Submit a **proposal** for building this system using GenAI (OpenAI/Gemini) for ### Your Solution for problem 3: -You need to put your solution here. +Smart DOCX Template Field Detection + +Idea +When a user uploads a DOCX template, the system automatically detects editable placeholders and converts them into structured fields that can be filled through a form or spreadsheet. + +Approach +1. Parse the DOCX document and extract text blocks. +2. Detect placeholders such as: + - {{name}} + - {{date}} + - {{amount}} +3. 
Use a language model to identify potential variable fields that are not explicitly marked. +4. Convert detected fields into a structured schema. + +Output +Example detected fields: +- name +- address +- date +- invoice_amount + +These fields are presented to the user for confirmation and editing. +This approach allows the platform to transform static templates into dynamic document generation workflows. + ## Problem 4: Architecture Proposal for 5-Min Character Video Series Generator @@ -66,4 +143,28 @@ Create a **small, clear architecture proposal** (no code, no prompts) describing ### Your Solution for problem 4: -You need to put your solution here. +Character-Based Video Series Generator + +Idea +The platform allows users to define characters once and generate multiple episodes using those characters. Each episode is created from a story prompt. + +Workflow +1. User defines characters with attributes: + - name + - personality + - relationships +2. User submits an episode prompt. +3. A language model generates: + - episode storyline + - scene breakdown + - character dialogue +4. Image or video generation models produce visual assets for each scene. + +Output +Each episode produces a package containing: +- script +- scene plan +- character dialogue +- generated visual assets + +This enables scalable creation of story-based video content using reusable characters.