- About
- How It Works
- Architecture
- Tech Stack
- Features
- Getting Started
- Project Structure
- Contributing
- License
- Acknowledgments
BookNote is an AI-powered platform that lets you have real-time voice conversations with your books. Upload a PDF, and BookNote extracts, chunks, and embeds its text so you can ask questions aloud and hear answers in a natural synthesized voice.
- Voice-First Experience: Have natural conversations with your books using AI-powered voice technology
- Smart Document Processing: Automatic text extraction, intelligent chunking, and embeddings for precise context retrieval
- Multiple AI Personas: Choose from various AI personalities powered by ElevenLabs
- Real-time Transcripts: Get live transcripts of all your conversations
- Modern Tech Stack: Built with Next.js 16, TypeScript, and modern UI components
User Upload → PDF Parsing → Text Extraction → Chunking → Embedding Generation → Vector Storage
When a user uploads a PDF:
- The PDF is processed using pdfjs-dist for text extraction
- Text is split into meaningful chunks (overlapping for context)
- Each chunk is converted to vector embeddings using Mistral AI API
- Embeddings are stored in Convex for similarity search
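The overlapping chunking step can be illustrated with a minimal sketch. The function name `chunkText` and the default sizes are assumptions made for illustration; they are not taken from the BookNote codebase, which may split on sentences or tokens rather than raw characters.

```typescript
// Hypothetical helper: split text into fixed-size character chunks where each
// chunk repeats the last `overlap` characters of the previous one.
// The defaults below are illustrative, not BookNote's actual settings.
function chunkText(text: string, chunkSize = 1000, overlap = 200): string[] {
  if (chunkSize <= 0) {
    throw new Error("chunkSize must be positive");
  }
  if (overlap < 0 || overlap >= chunkSize) {
    throw new Error("overlap must be in the range [0, chunkSize)");
  }

  const chunks: string[] = [];
  const step = chunkSize - overlap; // how far the window advances each time

  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once the current window has reached the end of the text.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.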
User Voice Input → Vapi (Voice AI) → Query Processing → Similarity Search → Context Retrieval → AI Response → Voice Output
- User speaks to the AI via Vapi's voice interface
- The spoken query is transcribed
- Convex performs similarity search against book embeddings
- Relevant context is retrieved and sent to the AI
- ElevenLabs generates the voice response
- User hears the AI's response in real time
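The similarity search step can be sketched in memory as a cosine-similarity ranking. This is an illustration of the idea only: in BookNote the search runs inside Convex, and the `Segment` shape, `cosineSimilarity`, and `topKSegments` below are hypothetical names, not part of the project.

```typescript
// Hypothetical shape: one embedded chunk of a book.
interface Segment {
  content: string;
  embedding: number[];
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction, so treat its similarity as 0.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k segments whose embeddings are closest to the query embedding.
function topKSegments(query: number[], segments: Segment[], k = 3): Segment[] {
  return segments
    .map((segment) => ({ segment, score: cosineSimilarity(query, segment.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((entry) => entry.segment);
}
```

The `content` of the returned segments is what would be passed to the AI as context. A linear scan like this is fine for a sketch, while a vector index avoids comparing the query against every segment.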
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│    Frontend     │────▶│   Next.js API   │────▶│     Convex      │
│  (Next.js 16)   │     │     Routes      │     │   (Database)    │
└─────────────────┘     └─────────────────┘     └─────────────────┘
         │                                               │
         │                                               ▼
         │                                      ┌─────────────────┐
         │                                      │  Vector Store   │
         │                                      │  (Embeddings)   │
         │                                      └─────────────────┘
         ▼
┌─────────────────┐     ┌─────────────────┐
│      Vapi       │────▶│   ElevenLabs    │
│   (Voice AI)    │     │      (TTS)      │
└─────────────────┘     └─────────────────┘
{
  _id: Id<"books">,
  title: string,
  author: string,
  description: string,
  coverURL?: string,
  coverBlobKey?: string,
  userId: string,
  persona?: string,
  isPublic: boolean,
  totalSegments: number,
  _creationTime: number
}
{
  _id: Id<"bookSegments">,
  bookId: Id<"books">,
  content: string,
  embedding: number[],
  _creationTime: number
}
- Next.js 16 - Full-stack React framework with App Router
- TypeScript - Type-safe JavaScript
- Convex - Real-time database and serverless functions
- Vapi - Voice AI platform for real-time conversations
- ElevenLabs - AI-powered text-to-speech
- Clerk - Authentication and user management
- Tailwind CSS - Utility-first CSS framework
- Shadcn UI - Accessible UI component library
- Mistral AI - AI embeddings for semantic search
- PDF Upload & Ingestion: Seamlessly upload PDF books with automated text extraction
- Voice-First Conversations: Engage in natural, real-time voice dialogues with your books
- AI Voice Personas: Choose from distinct AI personalities with ElevenLabs voices
- Smart Summaries: Extract key insights and summaries from any chapter
- Session Transcripts: Auto-generated text transcripts of all conversations
- Library Management: Organize and search through your personal collection
- Authentication: Secure access via Clerk with social login support
Ensure you have the following installed:
- Git
- Node.js (v20.9 or later; Next.js 16 no longer supports Node.js 18)
- npm or yarn
- Convex CLI (npm install -g convex)
- Clone the repository:
git clone https://github.com/yourusername/booknote.git
cd booknote
- Install dependencies:
npm install
- Set up Convex:
npx convex dev
Create a .env.local file in the root directory:
# Deployment (from npx convex dev)
CONVEX_DEPLOYMENT=your_convex_deployment
NEXT_PUBLIC_CONVEX_URL=your_convex_url
# Base URL
NEXT_PUBLIC_BASE_URL=http://localhost:3000
# Clerk Authentication
NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY=your_clerk_publishable_key
CLERK_SECRET_KEY=your_clerk_secret_key
NEXT_PUBLIC_CLERK_SIGN_IN_URL=/sign-in
NEXT_PUBLIC_CLERK_SIGN_UP_URL=/sign-up
NEXT_PUBLIC_CLERK_SIGN_IN_FALLBACK_REDIRECT_URL=/
NEXT_PUBLIC_CLERK_SIGN_UP_FALLBACK_REDIRECT_URL=/
# Vercel Blob (legacy - using Convex Storage)
BLOB_READ_WRITE_TOKEN=your_blob_token
# Vapi Voice AI
NEXT_PUBLIC_VAPI_API_KEY=your_vapi_key
VAPI_SERVER_SECRET=your_vapi_secret
# AI Services
MISTRAL_API_KEY=your_mistral_key
ELEVENLABS_API_KEY=your_elevenlabs_key
| Service | Sign Up Link |
|---|---|
| Clerk | clerk.com |
| Vapi | vapi.ai |
| ElevenLabs | elevenlabs.io |
| Mistral AI | console.mistral.ai |
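A missing key usually surfaces as a confusing runtime error, so it can help to check the environment at startup. The sketch below is one possible way to do that. The helper `findMissingEnvVars` is hypothetical, and the choice of which variables count as required is an assumption based on the .env.local listing (the legacy Vercel Blob token and the Clerk redirect URLs are left out).

```typescript
// Assumed set of required variables, taken from the .env.local listing.
// Adjust this list to match what the app actually reads.
const REQUIRED_ENV_VARS = [
  "NEXT_PUBLIC_CONVEX_URL",
  "NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY",
  "CLERK_SECRET_KEY",
  "NEXT_PUBLIC_VAPI_API_KEY",
  "VAPI_SERVER_SECRET",
  "MISTRAL_API_KEY",
  "ELEVENLABS_API_KEY",
] as const;

// Hypothetical helper: return the names of required variables that are
// unset or blank in the given environment object.
function findMissingEnvVars(env: Record<string, string | undefined>): string[] {
  return REQUIRED_ENV_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}
```

In a Node.js context the function could be called with `process.env`, logging or throwing when the returned list is not empty.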
Start the development server:
npm run dev
Open http://localhost:3000 in your browser.
booknote/
├── app/                      # Next.js App Router
│   ├── (root)/               # Main routes
│   │   ├── page.tsx          # Home/Library page
│   │   └── books/
│   │       ├── new/          # Add new book
│   │       └── [slug]/       # Book landing page
│   ├── api/                  # API routes
│   │   └── upload/           # File upload handler
│   ├── read/
│   │   └── [id]/             # Reading/chat page
│   ├── layout.tsx            # Root layout
│   └── globals.css           # Global styles
├── components/               # React components
│   ├── ui/                   # Shadcn UI components
│   ├── BookCard.tsx          # Book display card
│   ├── Navbar.tsx            # Navigation
│   ├── HeroSection.tsx       # Landing hero
│   ├── Search.tsx            # Book search
│   ├── UploadForm.tsx        # PDF upload form
│   └── VapiControls.tsx      # Voice controls
├── convex/                   # Convex backend
│   ├── books.ts              # Book queries/mutations
│   └── ...                   # Other Convex functions
├── lib/                      # Utility functions
│   ├── actions/              # Server actions
│   ├── constants.ts          # App constants
│   └── utils.ts              # Helper utilities
├── public/                   # Static assets
│   └── assets/               # Images, icons
├── types/                    # TypeScript types
├── .env.local                # Environment variables
├── package.json
├── tailwind.config.ts
└── tsconfig.json
We welcome contributions! Here's how you can help:
By participating in this project, you agree to follow our Code of Conduct. Please be respectful and inclusive.
- Fork the Repository: Click the "Fork" button on GitHub, or run the following with the GitHub CLI (git itself has no fork command):
gh repo fork yourusername/booknote
- Clone Your Fork:
git clone https://github.com/yourusername/booknote.git
cd booknote
- Create a Feature Branch:
git checkout -b feature/your-feature-name
# or
git checkout -b fix/your-bug-fix
- Make Changes:
  - Follow the existing code style
  - Write meaningful commit messages
  - Add tests if applicable
- Push to GitHub:
git push origin feature/your-feature-name
- Create a Pull Request:
  - Go to the original repository
  - Click "New Pull Request"
  - Fill out the template
  - Link any related issues
- Setup: Follow the Getting Started guide
- Coding Standards:
- Use TypeScript for all new code
- Follow ESLint rules (npm run lint)
- Use meaningful variable and function names
- Testing:
- Test your changes locally
- Verify no linting errors
- Commit Messages:
feat: add new voice persona selection
fix: resolve PDF upload timeout issue
docs: update API documentation
refactor: simplify book chunking logic
- Bug Reports: Found a bug? Open an issue
- Features: Suggest new features
- Documentation: Improve docs
- UI/UX: Improve the interface
- Code: Submit pull requests
This project is licensed under the MIT License.
MIT License
Copyright (c) 2024 BookNote
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
- JavaScript Mastery for the original inspiration
- Vapi for the voice AI infrastructure
- ElevenLabs for exceptional text-to-speech
- Clerk for authentication
- Convex for the backend infrastructure
- Shadcn for beautiful UI components
Built with ❤️ using Next.js, Vapi, and ElevenLabs
Star us on GitHub if you find this project useful!