GenAI developer task completed #9

Open
ShivaKumarKaranam2 wants to merge 1 commit into divamtech:main from ShivaKumarKaranam2:ShivaKumarKaranam-GenAI

ShivaKumarKaranam2 commented Feb 21, 2026

GenAI Developer Assignment Submission - Shiva Kumar Karanam

Video Summary - LinkedIn Generator - DOCX Bulk Engine - Character Video Pipeline

Overview

This PR contains my complete submission for the GenAI Developer assignment. It covers four GenAI use cases end to end, with a focus on architectural clarity, structured prompting, scalability, and production readiness.

The solutions emphasize practical trade-offs (privacy, cost, quality, bulk processing), structured JSON outputs, multimodal reasoning, and modular pipeline design aligned with real-world deployment considerations.


Scope of Work

Problem 1: Video-to-Notes System

  • Compared three architectures: SaaS-based, hybrid cloud LLM, and fully offline.

  • Included Mermaid diagrams, trade-off analysis, cost considerations, and structured JSON schema for summarization.

  • Addressed privacy, scalability, and review-loop design.
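
To make the structured-summary idea concrete, here is a minimal sketch of validating a model's JSON output before it reaches the review loop. The field names (`title`, `summary`, `key_points`, `action_items`) are illustrative assumptions, not the schema from the submission itself, and only the top-level shape is checked.

```python
import json

# Hypothetical schema: these field names are illustrative assumptions,
# not taken from the submitted design.
REQUIRED_FIELDS = {
    "title": str,
    "summary": str,
    "key_points": list,
    "action_items": list,
}


def validate_notes(raw: str) -> dict:
    """Parse model output and check its top-level shape.

    Raises ValueError on malformed JSON (json.JSONDecodeError is a
    ValueError subclass), a missing field, or a wrongly typed field.
    """
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("top-level JSON value must be an object")
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in data:
            raise ValueError(f"missing field: {name}")
        if not isinstance(data[name], expected_type):
            raise ValueError(f"field {name!r} should be {expected_type.__name__}")
    return data
```

A rejected payload could be retried or routed to a human reviewer, which is one way the review loop mentioned above might be triggered.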

Problem 2: Zero-Shot LinkedIn Post Generator

  • Designed a single-call prompt producing three stylistically distinct drafts.

  • Enforced a strict JSON schema for structured output.

  • Prioritized persona alignment and formatting consistency.
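
As a rough illustration of the single-call approach, the sketch below builds one prompt that asks for three drafts and then checks the reply against the expected shape. The style labels, the `drafts`/`style`/`post` keys, and both function names are assumptions made for this example; the actual prompt in the submission may differ.

```python
import json

# Hypothetical style labels, chosen only for illustration.
STYLES = ("storytelling", "data-driven", "conversational")


def build_prompt(topic: str, persona: str) -> str:
    """Build a single zero-shot prompt that requests one draft per style."""
    shape = {"drafts": [{"style": s, "post": "<text>"} for s in STYLES]}
    return (
        f"You are writing LinkedIn posts as: {persona}.\n"
        f"Topic: {topic}\n"
        f"Write exactly {len(STYLES)} drafts, one per style: {', '.join(STYLES)}.\n"
        "Respond with JSON only, matching this shape:\n"
        f"{json.dumps(shape, indent=2)}"
    )


def parse_drafts(raw: str) -> dict:
    """Return {style: post}; raise ValueError if the reply breaks the shape."""
    data = json.loads(raw)
    drafts = data.get("drafts") if isinstance(data, dict) else None
    if not isinstance(drafts, list) or len(drafts) != len(STYLES):
        raise ValueError(f"expected a list of {len(STYLES)} drafts")
    by_style = {}
    for draft in drafts:
        if not isinstance(draft, dict):
            raise ValueError("each draft must be an object")
        style, post = draft.get("style"), draft.get("post")
        if not isinstance(style, str) or not isinstance(post, str) or not post.strip():
            raise ValueError("each draft needs a string style and a non-empty post")
        by_style[style] = post
    if set(by_style) != set(STYLES):
        raise ValueError("drafts must cover each style exactly once")
    return by_style
```

Checking that every style appears exactly once is a cheap guard against the model returning three near-identical drafts under the same label.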

Problem 3: Smart DOCX Template → Bulk DOCX/PDF Engine

  • Multimodal GenAI-based field detection from template files.

  • Automated bulk document generation via Excel/Sheets.

  • Fidelity-focused rendering using docxtpl + LibreOffice PDF conversion.

  • Error handling, reporting, and scalability considerations included.
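
The bulk step can be sketched as a loop that validates each spreadsheet row, renders it, and records failures instead of aborting the batch. This is a sketch under stated assumptions: the function names, the `row_<n>.docx` naming, and the report layout are hypothetical, and the GenAI field-detection step is not shown. The `docxtpl` calls (`DocxTemplate`, `render`, `save`) and the `soffice` flags are the standard ones; `docxtpl` is imported lazily so the validation logic runs without it.

```python
from pathlib import Path


def missing_fields(row: dict, required: list[str]) -> list[str]:
    """Return the required field names that are absent or empty in a row."""
    return [name for name in required if row.get(name) in (None, "")]


def pdf_command(docx_path: Path, out_dir: Path) -> list[str]:
    """Build the headless LibreOffice command that converts a DOCX to PDF."""
    return ["soffice", "--headless", "--convert-to", "pdf",
            "--outdir", str(out_dir), str(docx_path)]


def render_row(template_path: Path, row: dict, out_path: Path) -> None:
    """Fill one template with one row of data using docxtpl."""
    from docxtpl import DocxTemplate  # third-party; imported only when rendering
    doc = DocxTemplate(str(template_path))
    doc.render(row)
    doc.save(str(out_path))


def run_batch(rows, required, template_path, out_dir, render=render_row) -> dict:
    """Render every valid row and collect per-row failures in a report."""
    report = {"ok": [], "failed": []}
    for index, row in enumerate(rows):
        gaps = missing_fields(row, required)
        if gaps:
            report["failed"].append({"row": index, "error": "missing: " + ", ".join(gaps)})
            continue
        out_path = Path(out_dir) / f"row_{index}.docx"
        try:
            render(Path(template_path), row, out_path)
        except Exception as exc:  # keep the batch going; report the failure
            report["failed"].append({"row": index, "error": str(exc)})
            continue
        report["ok"].append(str(out_path))
    return report
```

Each path in `report["ok"]` could then be passed to `subprocess.run(pdf_command(...), check=True)` for the PDF step. Passing `render` as a parameter keeps the loop testable without Word templates on disk.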

Problem 4: 5-Min Character Video Series Pipeline

  • Modular pipeline: script → storyboard → assets → audio → render.

  • Character consistency via structured "Bible" injection.

  • Forward-looking 2026 tool stack integration (Runway Gen-4.5, Kling 2.6+, Gemini 2.5/3, Qwen models, etc.).

  • Emphasis on repeatability and production workflows.
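
One way to picture the "Bible" injection is as a helper that prepends the same character definition to every stage prompt, with stages chained as plain functions. The character, the stage names, and the prompt wording below are all hypothetical, and only the first two stages are sketched; no specific video or image tool API is called.

```python
import json

# Hypothetical character bible, used only to illustrate the injection step.
CHARACTER_BIBLE = {
    "name": "Mira",
    "appearance": "short silver hair, yellow raincoat",
    "voice": "calm, dry humor",
}


def with_bible(prompt: str, bible: dict) -> str:
    """Prefix a stage prompt with the serialized character bible."""
    serialized = json.dumps(bible, indent=2, sort_keys=True)
    return f"CHARACTER BIBLE (do not deviate):\n{serialized}\n\nTASK:\n{prompt}"


def script_stage(state: dict) -> dict:
    """Add the script prompt for the episode topic."""
    task = f"Write a 5-minute episode script about {state['topic']}."
    return {**state, "script_prompt": with_bible(task, state["bible"])}


def storyboard_stage(state: dict) -> dict:
    """Add the storyboard prompt, carrying the same bible forward."""
    task = "Split the script into numbered shots."
    return {**state, "storyboard_prompt": with_bible(task, state["bible"])}


def run_pipeline(state: dict, stages: list) -> dict:
    """Run stages in order, recording each completed stage name."""
    for stage in stages:
        state = stage(state)
        state = {**state, "completed": state.get("completed", []) + [stage.__name__]}
    return state
```

Because every stage receives the identical serialized bible, a change to the character is made in one place, and the `completed` list gives a simple resume point if a later stage fails.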


Design Principles Demonstrated

  • Structured JSON outputs for reliability

  • Zero-shot prompt robustness

  • Clear architecture trade-offs and recommendations

  • Human-in-the-loop review stages

  • Bulk processing and error management

  • Practical production constraints (cost, latency, privacy)


Status

  • All four problems fully addressed

  • Trade-offs and recommendations clearly articulated

  • Prompts and schemas validated for structural reliability

  • Content self-contained and professionally structured

I welcome feedback on architectural decisions, prompt robustness, or areas where deeper technical validation would strengthen the proposal.
