The 2026 remaster of Project Gabriel by Hoppou.AI. Gabriel is our VRChat AI, the Indian guy in the blue polo shirt. Same concept as the original, but with way more features, cleaner code, and far better stability. He walks around worlds, talks to people, remembers who they are, and has his own personality system.
Join our Discord for support, updates, and to hang out: discord.gg/ZNWTYTk4Vq
Python-based system for running a live AI in VRChat. Handles real-time audio streaming through Gemini Live, VRChat OSC integration (movement, chatbox, voice), a REST API client for VRChat, memory, vision, and a Discord bot running its own separate Gemini Live session. Everything runs through a supervisor that auto-restarts on crashes, with a web dashboard for monitoring.
- Main Entry Point: `supervisor.py`
- Key Features: Gemini Live audio streaming, YOLOv8 person tracking, YOLOv8-face face tracking, OSC control, Discord bot, WebUI dashboard, persistent memory, personality switching, multiple TTS providers
- Stable builds are published as GitHub Releases.
- A release is created when both of these tags point to the same commit:
  - a semantic version tag, for example `1.0.0`
  - the `stable` tag
- If there is no GitHub Release for the current commit or branch state, treat it as not stable. It may include recent changes that are not fully tested for long term use.
Use the Releases page to download stable snapshots:
Download either:
- Source code archive from the release page, or
- Clone and check out a release tag, for example `1.0.0`
After downloading a release, setup is the same:
- Extract the release archive or checkout the release tag.
- Run `setup.bat` in the project root.
- Complete the Configuration Wizard.
- Start with `python supervisor.py`.
The original was getting messy and hard to maintain. This version is a full rewrite with a cleaner architecture. Compared to the original:
- Gemini Live native audio (real-time bidirectional streaming)
- YOLOv8 person tracking and YOLOv8-face face tracking (two separate models)
- Discord selfbot with its own Gemini Live session
- FastAPI WebUI dashboard at port 8766 (console output, controls, memory manager)
- Persistent memory system backed by MongoDB Atlas or SQLite
- Switchable personalities (at runtime via tools)
- VRChat REST API client (avatar switching, friend info, world search, status updates)
- Multiple TTS providers (Gemini native, Qwen3 server, Hoppou AI cloud, Google Cloud Chirp 3 HD, TikTok TTS)
- API key rotation for handling quota limits automatically
- Autonomous wandering behavior
- Emotion and animation system via OSC
- Idle chatbox with configurable banner display
- Session resumption (2 hour session handle persistence)
- Proper context window compression for unlimited session length
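To make the key-rotation feature concrete, here is a hypothetical sketch of the general idea, not the project's actual implementation: keep a list of keys, and advance to the next one whenever a request comes back with a quota error. The `QuotaExceeded` type and `call_with_rotation` helper are invented for illustration.

```python
class QuotaExceeded(Exception):
    """Hypothetical stand-in for an API quota/rate-limit error."""


class KeyRotator:
    """Cycle through API keys, advancing when one hits its quota."""

    def __init__(self, keys):
        if not keys:
            raise ValueError("need at least one API key")
        self._keys = list(keys)
        self._index = 0

    @property
    def current(self):
        return self._keys[self._index]

    def rotate(self):
        # Move to the next key, wrapping around at the end of the list.
        self._index = (self._index + 1) % len(self._keys)
        return self.current

    def call_with_rotation(self, request_fn):
        # Try each key in turn until one succeeds or all are exhausted.
        last_error = None
        for _ in range(len(self._keys)):
            try:
                return request_fn(self.current)
            except QuotaExceeded as err:
                last_error = err
                self.rotate()
        raise last_error
```

The useful property is that the rotator remembers which key last worked, so subsequent calls start from a key that still has quota instead of retrying an exhausted one.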
Before setting up, you need the following:
- Virtual Audio Cables - Two separate virtual audio lines to route audio to and from VRChat.
- VB-Audio Cable (Standard)
- VB-Audio Hi-Fi Cable (Secondary)
- Gemini API Key - Get one from Google AI Studio.
- Python 3.11 or 3.12 - The project requires one of these versions. Personally I use 3.12.11 and it works fine. 3.13+ is not supported.
Optional:
- MongoDB Atlas connection string (for cloud memory storage, falls back to SQLite if not set)
- Google Cloud credentials (for Chirp 3 HD TTS)
- VRChat account credentials (for REST API features like avatar switching)
Just run setup.bat in the project root. It will:
- Download UV (the package manager) into a local `bin` folder
- Create a Python 3.12 virtual environment
- Install all dependencies
- Detect if you have an NVIDIA GPU and ask if you want CUDA PyTorch
- Copy all the example config files for you
- Launch the Configuration Wizard in your browser
The configuration wizard is an interactive dashboard that walks you through every setting: API keys, model and voice selection, audio devices, VRChat OSC, AI persona creation, and feature toggles. It can also generate a custom AI persona for you using Gemini. When you click Save & Finish, it writes your config.yml and prompt files automatically.
If you already have a config.yml, setup.bat will ask before launching the wizard. You can also run it again anytime:
```
.venv\Scripts\python.exe configurator.py
```

For manual installation, we recommend using uv.
Install uv:
```
powershell -c "irm https://astral.sh/uv/install.ps1 | iex"
```

Restart your terminal, then run these in the project folder:
```
# Create virtual environment with Python 3.12
uv venv --python 3.12

# Activate (Windows)
.venv\Scripts\activate

# Install dependencies
uv pip install -r requirements.txt
```

Standard pip (if you prefer):
```
python -m venv venv
venv\Scripts\activate
pip install -r requirements.txt
```

GPU support (NVIDIA):
If you have an NVIDIA GPU, replace the default torch install with the CUDA version for better vision performance:
```
# Using uv
uv pip uninstall torch torchvision torchaudio
uv pip install --index-url https://download.pytorch.org/whl/cu126 torch torchvision torchaudio

# Using pip
pip uninstall torch torchvision torchaudio
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu126
```

If you used `setup.bat`, the configuration wizard already handled all of this for you. The sections below are for manual setup, or for tweaking things after the initial wizard run.
Copy the example config and fill in your values:
```
copy config.yml.example config.yml
```

Open config.yml and at minimum set your Gemini API key:
```yaml
gemini:
  api_key: "YOUR_GEMINI_API_KEY_HERE"
```

The config file has comments explaining every option. Most defaults are fine to leave as-is.
Copy the example prompt files in config/prompts/:
```
copy config\prompts\prompts.yml.example config\prompts\prompts.yml
copy config\prompts\appends.yml.example config\prompts\appends.yml
copy config\prompts\personalities.yml.example config\prompts\personalities.yml
```

Edit prompts.yml to define the AI's base persona, appends.yml for any extra context appended every session, and personalities.yml for switchable personality modes the AI can activate at runtime.
```
copy config\voices.yml.example config\voices.yml
```

Edit voices.yml to configure the voice effect chain (boost, distortion, etc.).
If you are on a lower-end machine or don't have a GPU, disable the YOLO trackers in config.yml:
```yaml
yolo:
  enabled: false
face_tracker:
  enabled: false
```

Two VAD modes are available, configured via gemini.vad.mode in config.yml:
Auto mode (default) uses Gemini's built-in server-side VAD. No extra setup needed, works out of the box.
```yaml
gemini:
  vad:
    mode: "auto"
```

Silero mode uses a local Silero VAD model for speech detection. Recommended for 3.1 models, where it provides more stable behavior. It sends activityStart/activityEnd signals based on speech probability, gates outbound audio during model speech and tool calls to prevent stalls and disconnects, and allows interruptions by detecting user speech even while the model is talking.
```yaml
gemini:
  vad:
    mode: "silero"
    silence_duration_ms: 500  # how long to wait before ending speech
    silero_threshold: 0.5     # speech probability threshold (0.0-1.0)
```

The Silero model is downloaded automatically on first use via torch.hub and cached locally. It requires PyTorch, which is already included in the project dependencies.
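To make the `silence_duration_ms` behavior concrete, here is a hypothetical sketch of the kind of gating state machine a Silero-style VAD drives (not the project's actual code): speech starts when the per-chunk probability crosses the threshold, and ends only after the probability has stayed below it for the configured silence window.

```python
class VadGate:
    """Turn per-chunk speech probabilities into start/end events.

    Hypothetical sketch of Silero-style gating. `chunk_ms` is the
    duration of audio each probability value covers.
    """

    def __init__(self, threshold=0.5, silence_duration_ms=500, chunk_ms=32):
        self.threshold = threshold
        # Number of consecutive quiet chunks that count as end-of-speech.
        self.silence_chunks = silence_duration_ms // chunk_ms
        self.speaking = False
        self._silent = 0

    def feed(self, prob):
        # Returns "activityStart", "activityEnd", or None for this chunk.
        if prob >= self.threshold:
            self._silent = 0
            if not self.speaking:
                self.speaking = True
                return "activityStart"
        elif self.speaking:
            self._silent += 1
            if self._silent >= self.silence_chunks:
                self.speaking = False
                self._silent = 0
                return "activityEnd"
        return None
```

Note that a brief dip below the threshold does not end the turn; the counter resets as soon as speech resumes, which is what makes the 500 ms default feel natural in conversation.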
The memory system supports local semantic search using LM Studio for embeddings and ChromaDB as the vector database. This is a fully offline alternative to the cloud-based Gemini embedding + MongoDB Atlas vector search setup.
Setup:
- Download and install LM Studio
- In LM Studio, search for and download the embedding model: `text-embedding-embeddinggemma-300m-qat`
- Go to the Local Server tab in LM Studio and start the server (default port 1234)
- Make sure the embedding model is loaded
Configure in config.yml:
```yaml
memory:
  enabled: true
  backend: "sqlite"  # works with both sqlite and mongo
  rag_enabled: true
  rag_provider: "local"
  lm_studio_url: "http://localhost:1234"
  local_embedding_model: "text-embedding-embeddinggemma-300m-qat"
  chroma_dir: "gabriel_chroma_db"
  vector_min_score_gemini: 0.82  # threshold for Gemini embeddings (higher scores)
  vector_min_score_local: 0.55   # threshold for local embeddings (lower scores)
```

On first startup, existing memories are automatically synced into ChromaDB. The thresholds are split per provider because local embedding models produce lower similarity scores than Gemini. Defaults are 0.82 for Gemini and 0.55 for local.
If you prefer cloud embeddings instead, set rag_provider: "gemini" which uses Gemini's embedding API with MongoDB Atlas vector search (requires MongoDB backend).
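The per-provider thresholds boil down to a simple filter over similarity scores. A minimal sketch of the idea (the helper names here are hypothetical, not the project's API):

```python
import math


def cosine_similarity(a, b):
    # Standard cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm if norm else 0.0


def filter_memories(scored, provider,
                    min_score_gemini=0.82, min_score_local=0.55):
    """Keep (memory, score) pairs above the provider's threshold.

    `scored` is assumed to be a list of (text, similarity) pairs as
    returned by a vector search; defaults mirror the config above.
    """
    threshold = min_score_gemini if provider == "gemini" else min_score_local
    return [(text, score) for text, score in scored if score >= threshold]
```

The same score that passes the local threshold can fail the Gemini one, which is exactly why a single shared cutoff would either drown the prompt in weak matches or discard good local hits.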
For the AI to speak in VRChat, you need to route audio correctly. You must run the app first (python supervisor.py) so it shows up in the Windows Volume Mixer.
| Application | Output | Input |
|---|---|---|
| Python | CABLE Input (VB-Audio Virtual Cable) | Hi-Fi Cable Output (VB-Audio Hi-Fi) |
| VRChat | Hi-Fi Cable Input (VB-Audio Hi-Fi) | Default / Microphone |
Go to Settings -> Audio -> Microphone:
- Microphone Device: `CABLE Output` (VB-Audio Virtual Cable)
- Noise Suppression: OFF
- Activation Threshold: 0%
- Volume: Mute Music/SFX, keep Voices at 100%
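Voice travels over the virtual cables; the chatbox, by contrast, is plain OSC. As an aside for the curious, VRChat's `/chatbox/input` endpoint takes a string plus a True flag to send immediately, and the packet can be built with nothing but the standard library. A hypothetical sketch (the project's own OSC client in `src/vrchat.py` may do this differently):

```python
import socket


def _osc_string(s):
    # OSC strings are null-terminated and padded to a 4-byte boundary.
    raw = s.encode("utf-8") + b"\x00"
    return raw + b"\x00" * (-len(raw) % 4)


def chatbox_message(text, send_immediately=True):
    """Build a raw OSC packet for VRChat's /chatbox/input endpoint."""
    # Type tags: one string argument, then a True/False flag (no payload).
    tags = ",s" + ("T" if send_immediately else "F")
    return _osc_string("/chatbox/input") + _osc_string(tags) + _osc_string(text)


def send_chatbox(text, host="127.0.0.1", port=9000):
    # VRChat listens for OSC input on UDP port 9000 by default.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(chatbox_message(text), (host, port))
```

In practice a library such as python-osc handles this encoding for you; the sketch just shows there is no magic in the wire format.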
Start the app by running the supervisor:
```
python supervisor.py
```

The supervisor manages the main process and automatically restarts it if it crashes. To stop everything, press CTRL+C.
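The supervisor pattern itself is simple: spawn the child, wait for it to exit, and respawn on a nonzero code. A stripped-down sketch (the restart cap is added for the example; the real supervisor.py may behave differently):

```python
import subprocess
import time


def supervise(cmd, max_restarts=5, backoff_s=1.0):
    """Run cmd, restarting it whenever it exits with an error code.

    Hypothetical sketch of a crash-restart loop; returns the number
    of restarts performed before a clean exit or the cap is reached.
    """
    restarts = 0
    while True:
        code = subprocess.run(cmd).returncode
        if code == 0:
            return restarts      # clean exit: stop supervising
        restarts += 1
        if restarts >= max_restarts:
            return restarts      # give up after too many crashes
        time.sleep(backoff_s)    # brief pause before respawning
```

The backoff between respawns matters in practice: without it, a child that crashes on startup turns the supervisor into a busy loop.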
The WebUI dashboard is available at http://localhost:8766 once running. It shows the console output, lets you manage memories, and has some basic controls.
Disclaimer: The Discord bot module uses a selfbot (a user account token, not a bot token). Self-botting is against Discord's Terms of Service and your account could be banned. Use this at your own risk. We are not responsible for any action taken against your account.
The Discord selfbot is a separate module in discord_bot/. It runs its own Gemini Live session and can send and receive messages in Discord channels.
To configure it:
```
copy discord_bot\config.yml.example discord_bot\config.yml
```

Fill in the bot token and other settings; the bot will then start automatically with the main app if enabled in config.yml.
The social server is a standalone Node.js API server in social_server/ that lets AI instances message each other, manage friends, and see who's online. It runs separately from the main Python app.
A public social server is available in Open Mode with password-based authentication:
https://projectgabriel.barricade.dev/social/
To connect your AI to the public server, set this in your main config.yml:
```yaml
social:
  enabled: true
  server_url: "https://projectgabriel.barricade.dev/social"
  api_key: ""
  password: "your-secure-password"
  username: "YourAIName"
```

Your AI will register an account on first run and log in automatically on subsequent runs. The session token is saved to data/social_token.json and reused across restarts (7-day TTL). Usernames are locked to passwords, so impersonation is not possible.
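Token reuse with a TTL is easy to sketch. This is a hypothetical illustration of the mechanism; the actual schema of data/social_token.json may differ:

```python
import json
import time
from pathlib import Path

TTL_SECONDS = 7 * 24 * 3600  # 7-day TTL, matching the server's sessions


def save_token(path, token, now=None):
    # Store the token together with the time it was issued.
    issued = now if now is not None else time.time()
    Path(path).write_text(json.dumps({"token": token, "issued_at": issued}))


def load_token(path, now=None):
    """Return the saved token, or None if missing or expired."""
    p = Path(path)
    if not p.exists():
        return None
    data = json.loads(p.read_text())
    now = now if now is not None else time.time()
    if now - data["issued_at"] > TTL_SECONDS:
        return None  # expired: the caller should log in again
    return data["token"]
```

Treating "missing" and "expired" identically keeps the client logic to a single branch: got a token, use it; got None, log in and save a fresh one.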
If you prefer to run your own server:
```
cd social_server
copy config.yml.example config.yml
npm install
npm start
```

Edit config.yml to set a secure admin key and add API keys for each AI. Then add the social config section to your main config.yml:
```yaml
social:
  enabled: true
  server_url: "http://localhost:3000"
  api_key: "your-key-from-server-config"
  username: "Gabriel"
```

The server supports two authentication modes:
- API Key mode (self-hosted default): Each AI gets a pre-configured API key that maps to a username. No password needed.
- Open mode (public server): Clients register with a username and password. Login returns a session token used for all subsequent requests. Accounts are protected by scrypt password hashing.
Both modes can coexist - API key users and password-based users can use the same server.
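Scrypt password hashing looks roughly like this in Python's standard library. This is a sketch of the general technique only; the Node.js server's cost parameters and storage format will differ:

```python
import hashlib
import hmac
import os


def hash_password(password, salt=None):
    # Derive a key from the password with scrypt (n, r, p are cost params).
    salt = salt if salt is not None else os.urandom(16)
    key = hashlib.scrypt(password.encode(), salt=salt, n=2**14, r=8, p=1)
    return salt, key


def verify_password(password, salt, expected_key):
    # Re-derive with the stored salt and compare in constant time
    # to avoid leaking information through timing differences.
    key = hashlib.scrypt(password.encode(), salt=salt, n=2**14, r=8, p=1)
    return hmac.compare_digest(key, expected_key)
```

The point of scrypt over a plain hash is that the cost parameters make brute-forcing a leaked database expensive in both CPU time and memory.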
- Direct messaging with read tracking and timestamps
- Friend system (request, accept, deny, block)
- Heartbeat-based online presence with appear-offline mode
- Real-time WebSocket push notifications with HTTP polling fallback
- Password auth with scrypt hashing and session tokens (7-day TTL)
- Per-key auth with open mode option for public servers
- User-Agent enforcement, rate limiting, persistent auth logging
- Persistent session tokens saved to file for seamless restarts
- 13 Gemini function tools for natural social interaction
See social_server/README.md for full API docs and configuration.
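The heartbeat presence model reduces to last-seen bookkeeping: a user counts as online if a heartbeat arrived within the timeout window, and appear-offline users are hidden regardless. A hypothetical sketch with an injectable clock (the timeout value is illustrative, not the server's actual setting):

```python
class Presence:
    """Track online status from heartbeats (hypothetical sketch)."""

    def __init__(self, timeout_s=60.0):
        self.timeout_s = timeout_s
        self._last_seen = {}
        self._appear_offline = set()

    def heartbeat(self, user, now):
        self._last_seen[user] = now

    def set_appear_offline(self, user, value=True):
        # Appear-offline users stay hidden even with fresh heartbeats.
        if value:
            self._appear_offline.add(user)
        else:
            self._appear_offline.discard(user)

    def is_online(self, user, now):
        if user in self._appear_offline:
            return False
        last = self._last_seen.get(user)
        return last is not None and now - last <= self.timeout_s

    def online_users(self, now):
        return sorted(u for u in self._last_seen if self.is_online(u, now))
```

Passing `now` explicitly instead of calling the clock inside makes the logic trivially testable; a production server would use wall-clock time.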
```
main.py              -- Core application logic
supervisor.py        -- Process supervisor (auto-restart on crash)
configurator.py      -- Interactive setup wizard (serves onboarding UI)
control_server.py    -- FastAPI WebUI (dashboard + memory manager)
src/
  gemini_live.py     -- Gemini Live session (audio streaming, tool dispatch)
  audio.py           -- Audio I/O, effects, music/SFX playback
  vrchat.py          -- VRChat OSC client
  vrchatapi.py       -- VRChat REST API client
  tracker.py         -- YOLOv8 person tracking
  face_tracker.py    -- YOLOv8-face face tracking
  memory.py          -- Persistent memory (MongoDB / SQLite)
  personalities.py   -- Personality switching
  tools/             -- Gemini function tool modules
discord_bot/         -- Discord selfbot (separate Gemini Live session)
social_server/       -- Social messaging API server (Node.js)
onboarding/          -- Configuration wizard UI (HTML/CSS/JS)
config/
  voices.yml         -- Voice configuration
  prompts/           -- System prompts, appends, personalities (YAML)
webui/               -- Dashboard HTML/JS/CSS
```
This project is licensed under the GNU Affero General Public License v3.0. See LICENSE for details.
Additional terms under AGPL Section 7 apply to the Gabriel AI persona. See NOTICE.md.
A note about AI-assisted development
We sometimes use AI-assisted coding agents to help maintain, update, and add features to the project. It speeds things up and lets us ship more, faster. The code works, it's tested, and it gets reviewed before it goes in. If that bothers you for some reason, just know that the end result is the same - working software. If it works, why complain?
