Open-Cluely is an Electron desktop copilot for technical interviews and live meetings. It combines AssemblyAI streaming transcription, screenshot capture, and Gemini-powered responses in a compact always-on-top window.
Use it only in environments where recording, transcription, screenshots, and AI assistance are allowed.
Open-source alternative to Cluely and Parakeetai: your real-time AI interview assistant.
- Dual-source live transcription for host/system audio and microphone input, with per-source toggles and a live monitor.
- Four AI action buttons, each with a distinct purpose, described in detail below.
- Per-message `AI/Off` controls let you keep transcript chunks, screenshots, and prior AI replies visible while excluding them from future prompts.
- Multiple Gemini API keys are supported as a comma-separated list, with automatic failover on quota or authentication errors.
- Settings support Gemini model selection, AssemblyAI speech model selection, programming language preference, and window opacity.
- Session state is persisted to `cache/app-state.json`, and screenshot retention is bounded by `MAX_SCREENSHOTS`.
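The key failover mentioned in the feature list could work roughly like the sketch below. This is an illustration under stated assumptions, not the app's actual implementation: `parseKeys`, `withKeyFailover`, and the numeric `status` property on thrown errors are hypothetical, and treating HTTP 401, 403, and 429 as "try the next key" is an assumption.

```javascript
// Hypothetical sketch: rotate through comma-separated API keys
// when a request fails with a quota or authentication error.

// Split "key1, key2,,key3" into a clean list of non-empty keys.
function parseKeys(raw) {
  return String(raw || "")
    .split(",")
    .map((key) => key.trim())
    .filter((key) => key.length > 0);
}

// Assumption: the request function throws an Error carrying a numeric
// `status` (401/403 for authentication problems, 429 for quota).
const FAILOVER_STATUSES = new Set([401, 403, 429]);

// Call `request(key)` with each key in order. Only failover-type errors
// advance to the next key; any other error is rethrown immediately.
async function withKeyFailover(keys, request) {
  let lastError = new Error("No API keys configured");
  for (const key of keys) {
    try {
      return await request(key);
    } catch (error) {
      if (!FAILOVER_STATUSES.has(error.status)) throw error;
      lastError = error;
    }
  }
  throw lastError;
}
```

A caller would pass its own request function, for example `withKeyFailover(parseKeys(savedKeys), (key) => callModel(key, prompt))`, where `savedKeys` and `callModel` are placeholders for whatever the app uses.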
Each button sends a different slice of context to the AI and is designed for a different moment in the workflow.
Ask AI is the full-context answer button. Use it when you want a complete, thorough response.
What it sends: all enabled transcript messages + all enabled screenshots + full conversation history.
What it does: reads the entire context as one unified thread, silently corrects speech-to-text recognition errors, identifies the actual question being asked (even across fragmented or imperfect transcript messages), and produces a complete answer.
Output:
- Understanding: one sentence confirming what it understood the question to be
- Answer: full response, as deep as the question requires
- For coding and algorithmic questions: Approach → Full solution code → Time/Space complexity → Key points
Use Ask AI when you need the complete answer, not just the opening move.
Screen AI is the screenshot interpreter. Use it when the question or problem is visible on screen.
What it sends: only the screenshots currently enabled in context.
What it does: reads all visible text in the screenshot (constraints, function signatures, error messages, sample I/O), identifies what type of content it is (LeetCode problem, stack trace, terminal output, UI layout, architecture diagram), and responds accordingly.
Output (for coding/debugging):
- Understanding → Approach → Complexity → Full runnable solution code → Explanation (only if it adds value)
Output (for non-coding screenshots such as UI, architecture, docs):
- What I see → Answer → Key Points
Use Screen AI when the problem is on your screen and you want a direct solution without needing to describe it in words.
Suggest is the opening-move button. Use it when you want something ready to say right now, without the full depth of Ask AI.
What it sends: only the enabled transcript messages.
What it does: reads the full conversation flow, identifies where the discussion stands, and generates a concise spoken response: something you can say out loud immediately, not a written essay.
Output:
- Best response (say this): 2-4 sentences, natural spoken language, technically accurate but not exhaustive
- Key points: 2-3 short anchor concepts to hold in mind and expand on if pushed
- Optional follow-ups: questions or angles the other person is likely to raise next
Use Suggest to open confidently. Use Ask AI when the interviewer pushes deeper and you need the full answer.
Notes is the structured record button. Use it at any point to capture what has happened in the session.
What it sends: all enabled transcript messages and context.
What it does: organizes the conversation into a clean, topic-grouped document, correcting for STT noise throughout. It does not add inferences or assumptions that are not grounded in the actual conversation.
Output (always all five sections, even if empty):
- Key Discussion Points: main topics covered
- Decisions Made: with owner if mentioned
- Action Items: checkboxed, with owner and deadline if mentioned
- Open Questions / Unresolved Items: what was raised but not resolved
- Next Steps: what happens next based on the conversation
Use Notes to produce a shareable record at the end of a meeting or interview debrief.
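The context slices described for the four buttons can be summarized in a small sketch. Everything here is hypothetical and only mirrors the descriptions above: the message shape (`type`, `enabled`), the action names, and `buildContext` are invented for illustration. The Notes slice is described only as "transcript messages and context", so treating it as everything enabled is an assumption.

```javascript
// Hypothetical sketch of per-button context selection.
// Assumed message shape: { type: "transcript" | "screenshot" | "ai", enabled: boolean }
const ACTION_TYPES = {
  askAi: ["transcript", "screenshot", "ai"], // full context
  screenAi: ["screenshot"],                  // screenshots only
  suggest: ["transcript"],                   // transcript only
  // Assumption: "transcript messages and context" is read as everything enabled.
  notes: ["transcript", "screenshot", "ai"],
};

// Keep only messages that are toggled on and belong to the action's slice.
function buildContext(action, messages) {
  const allowed = ACTION_TYPES[action];
  if (!allowed) throw new Error(`Unknown action: ${action}`);
  return messages.filter((m) => m.enabled && allowed.includes(m.type));
}

const session = [
  { type: "transcript", enabled: true, text: "Explain a trie." },
  { type: "transcript", enabled: false, text: "(crosstalk)" },
  { type: "screenshot", enabled: true, path: "shot-1.png" },
  { type: "ai", enabled: true, text: "A trie is a prefix tree." },
];

console.log(buildContext("suggest", session).length);  // 1
console.log(buildContext("screenAi", session).length); // 1
console.log(buildContext("askAi", session).length);    // 3
```

The disabled transcript chunk stays in `session` (so it can remain visible in the UI) but never reaches any prompt, which is the behavior the `AI/Off` toggle is described as providing.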
- Windows 10/11 is the primary development target for this repo.
- Node.js `20.x` is recommended. The existing docs and environment were prepared around `20.20.1`.
- npm `10+`
- At least one Gemini API key (configured in the app Settings UI)
- One AssemblyAI API key (configured in the app Settings UI)
nvm install 20.20.1
nvm use 20.20.1
npm ci
Copy-Item .env.example .env

API keys are configured from the in-app Settings panel after launch.
Start the app:
npm start

Useful variants:
npm run dev
npm run start:hidden

For day-to-day use on Windows, prefer building the portable app and running the generated `.exe` instead of launching from source every time.

npm run build:win

This creates:
dist/GoogleChrome.exe
You can then run the packaged app directly by double-clicking dist/GoogleChrome.exe.
This app depends on native modules. If npm ci fails with node-gyp or Visual Studio toolchain errors, install the C++ build tools and Python:
winget install --id Microsoft.VisualStudio.2022.BuildTools --exact --accept-package-agreements --accept-source-agreements --override "--passive --wait --add Microsoft.VisualStudio.Workload.VCTools --includeRecommended"

| Variable | Required | Notes |
|---|---|---|
| `HIDE_FROM_SCREEN_CAPTURE` | No | Defaults to `true`. Controls `BrowserWindow.setContentProtection(...)`. |
| `START_HIDDEN` | No | Defaults to `false`. Also available at runtime via `npm run start:hidden` or `--start-hidden`. |
| `MAX_SCREENSHOTS` | No | Defaults to `50`. Old screenshots are deleted when the limit is exceeded. |
| `SCREENSHOT_DELAY` | No | Defaults to `300` ms. Delay used while briefly hiding the window before capture. |
| `NODE_ENV` | No | Defaults to `production`. `development` opens DevTools automatically. |
| `NODE_OPTIONS` | No | Defaults to `--max-old-space-size=4096`. |
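One way to read the `MAX_SCREENSHOTS` row: parse the variable with its documented default, then pick the oldest files beyond the limit for deletion. The sketch below is an assumption about how such retention could work, not the app's actual cleanup code; `readMaxScreenshots`, `selectScreenshotsToDelete`, and the `{ name, createdAt }` record shape are hypothetical.

```javascript
// Hypothetical sketch of bounded screenshot retention.

// Read MAX_SCREENSHOTS, falling back to the documented default of 50
// when the variable is missing, non-numeric, or not positive.
function readMaxScreenshots(env) {
  const parsed = Number.parseInt(env.MAX_SCREENSHOTS ?? "", 10);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : 50;
}

// Given screenshots as { name, createdAt } records, return the oldest ones
// that exceed the limit. The input array is not modified.
function selectScreenshotsToDelete(screenshots, limit) {
  const newestFirst = [...screenshots].sort((a, b) => b.createdAt - a.createdAt);
  return newestFirst.slice(limit);
}

const shots = [
  { name: "a.png", createdAt: 1 },
  { name: "b.png", createdAt: 3 },
  { name: "c.png", createdAt: 2 },
];
const limit = readMaxScreenshots({ MAX_SCREENSHOTS: "2" });
console.log(selectScreenshotsToDelete(shots, limit).map((s) => s.name)); // [ 'a.png' ]
```

Keeping the selection step as a pure function separates "which files go" from the actual file deletion, which makes the retention rule easy to test in isolation.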
src/config.js defines the app's configurable lists and defaults:
- Gemini models
- AssemblyAI speech models
- Programming language options for code-oriented prompts
- Global keyboard shortcuts
The first item in each model/language list is treated as the default.
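The "first item is the default" convention might look like the sketch below. The list contents and the `resolveSelection` helper are placeholders for illustration, not the actual values or code in `src/config.js`.

```javascript
// Hypothetical config shape; the entries are placeholders, not the real lists.
const config = {
  programmingLanguages: ["Python", "JavaScript", "Java"],
};

// Return the saved choice if it is still in the list, otherwise the first item.
function resolveSelection(options, saved) {
  return options.includes(saved) ? saved : options[0];
}

console.log(resolveSelection(config.programmingLanguages, "Java"));    // Java
console.log(resolveSelection(config.programmingLanguages, undefined)); // Python
```

Under this convention, reordering a list is enough to change the default, and a saved value that has been removed from the list falls back to the first entry.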
- In development, state is written to `cache/app-state.json` at the repo root. Portable builds create the same `cache/app-state.json` structure next to the executable.
- Development screenshots are stored in `.stealth_screenshots/`. Packaged builds store screenshots under the app's user-data path.
- Saving settings from the UI writes API-key values and selection state to `cache/app-state.json`.
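A minimal sketch of JSON state persistence along these lines is shown below. The helper names (`stateFilePath`, `saveState`, `loadState`) and the state fields used in the usage note are hypothetical; only the `cache/app-state.json` location comes from this README.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical helper: state lives at <baseDir>/cache/app-state.json.
function stateFilePath(baseDir) {
  return path.join(baseDir, "cache", "app-state.json");
}

// Write state as pretty-printed JSON, creating cache/ if needed.
function saveState(baseDir, state) {
  const file = stateFilePath(baseDir);
  fs.mkdirSync(path.dirname(file), { recursive: true });
  fs.writeFileSync(file, JSON.stringify(state, null, 2), "utf8");
}

// Read state back, merged over `defaults`. A missing or unparsable file
// yields a copy of the defaults instead of throwing.
function loadState(baseDir, defaults = {}) {
  try {
    const raw = fs.readFileSync(stateFilePath(baseDir), "utf8");
    return { ...defaults, ...JSON.parse(raw) };
  } catch {
    return { ...defaults };
  }
}
```

For example, `saveState(process.cwd(), { opacity: 0.8 })` followed by `loadState(process.cwd(), { opacity: 1 })` would return `{ opacity: 0.8 }`, where `opacity` is an illustrative field name.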
- Launch the app and confirm your API keys and models in Settings.
- Start transcription and enable whichever sources you need: `Host`, `Mic`, or both.
- Take screenshots when visual context is needed, such as a problem statement, error, or UI.
- Use the right button for the moment:
  - Suggest to get a quick, speakable opening response from the transcript
  - Ask AI when you need the full, complete answer from all context
  - Screen AI when the problem is on your screen and you want a direct solution
  - Notes to capture a structured record of what was discussed and decided
- Toggle noisy messages to `Off` before the next AI call so the prompt stays focused on what matters.
- `src/main-process/` is the Electron control plane (startup flow, window behavior, global shortcuts, and IPC registration).
- `src/services/` contains reusable domain logic (Gemini prompts/runtime behavior, AssemblyAI streaming/transcript history, persisted app-state).
- `src/windows/assistant/preload/` is the renderer-safe API boundary (`window.electronAPI` invoke + event wrappers).
- `src/windows/assistant/renderer/features/` contains modular UI logic (chat, listeners, settings, transcription, context bundling, layout).
- `src/windows/legacy/` contains old experiments and is not part of the active runtime path.
Detailed, file-by-file ownership is documented in notes.md.
src/
  bootstrap/              Environment loading, validation, and persistence
  main-process/           Startup orchestration, IPC wiring, window control, assistant runtime
  services/
    ai/                   Gemini service + prompt builders
    assembly-ai/          Streaming STT service + transcript history manager
    state/                App-state load/save helpers
  windows/
    assistant/
      preload/            `window.electronAPI` invoke/listener bridge
      renderer/features/  Renderer feature modules (chat, listeners, settings, transcription, AI context, layout)
      window.js           BrowserWindow creation/config
      renderer.js         Renderer composition root
    legacy/               Older experimental files kept out of the active flow
assets/                   Build icons and packaging assets
cache/                    Generated app state in development
.stealth_screenshots/     Session screenshots in development
dist/                     Packaged build output
repomix-output.txt        Single-file repository snapshot for AI/code review tooling
All keyboard shortcuts are customizable. Configure them in src/config.js to match your preference before building or running the app.
- `npm start` runs the app from source.
- `npm run start:hidden` launches it in background mode from source.
- `npm run dev` enables Electron logging.
- `npm run build:win` creates the portable Windows executable.
- `npm run build` runs the default `electron-builder` flow.
The recommended Windows build is the portable executable:
npm run build:win

Expected output:
dist/GoogleChrome.exe
Notes:
- This is the recommended way to use the app outside development because it gives you a standalone `.exe` to launch directly.
- `.env` is bundled as an extra resource during packaging.
- The current Windows build is configured as a portable `x64` target with:
  - Product name: `Google Chrome (2)`
  - Executable name: `GoogleChrome.exe`
  - App ID: `com.google.chrome`
  - Publisher name: `Google LLC`
- If the build fails with a symlink privilege error, enable Windows Developer Mode or run the build from an elevated terminal.
- The repo already includes `assets/chrome.ico` for the Windows target. Add `assets/chrome.icns` and `assets/chrome.png` before relying on the macOS or Linux targets defined in `package.json`.
After building:
- Open the `dist/` folder.
- Run `GoogleChrome.exe`.
- If you want background launch behavior, either set `START_HIDDEN=true` before building or launch with:

.\dist\GoogleChrome.exe --start-hidden

After packaging, verify:
- `dist/GoogleChrome.exe` exists
- the executable shows the Chrome icon
- the app launches correctly without needing `npm start`
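The two ways of requesting a hidden start (the `START_HIDDEN` variable and the `--start-hidden` flag) could be combined as in the sketch below. `resolveStartHidden` is a hypothetical helper, and giving the command-line flag precedence over the environment variable is an assumption rather than documented behavior.

```javascript
// Hypothetical sketch: decide whether the app should start hidden.
// Assumption: the CLI flag wins; otherwise START_HIDDEN decides (default false).
function resolveStartHidden(argv, env) {
  if (argv.includes("--start-hidden")) return true;
  return String(env.START_HIDDEN ?? "false").trim().toLowerCase() === "true";
}

console.log(resolveStartHidden(["--start-hidden"], {}));       // true
console.log(resolveStartHidden([], { START_HIDDEN: "true" })); // true
console.log(resolveStartHidden([], {}));                       // false
```

In a real entry point this would typically be called as `resolveStartHidden(process.argv.slice(1), process.env)`.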
For a build-focused walkthrough, see BUILD_INSTRUCTIONS.md.
- Keep `src/config.js` as the single source of truth for model lists, programming languages, and keyboard shortcuts.
- When adding or changing environment variables, update all three places together: `src/bootstrap/environment.js`, `.env.example`, and this README.
- Preserve Electron boundaries: renderer code should go through `preload` and IPC, not import main-process modules directly.
- Keep cursor behavior stealth-safe: interactive controls intentionally do not switch to per-button pointer cursors. This prevents screen-sharing viewers from inferring user actions from cursor-shape changes while hidden mode is active.
- Add new UI logic under `src/windows/assistant/renderer/features/` and new domain logic under `src/services/` or `src/main-process/features/`.
- Treat `src/windows/legacy/` as reference material unless you are intentionally reviving an old experiment.
- Re-test both `npm start` and the relevant packaging path when changing startup flow, window behavior, screenshots, IPC, or global shortcuts.
- Keep real keys out of Git. Use `.env`, and rely on `.env.example` for the documented contract.
To regenerate the packed repository snapshot:
npx repomix . --style plain -o repomix-output.txt

If you want to exclude generated artifacts while experimenting:
npx repomix . --style plain -o repomix-output.txt -i "repomix-output.txt,cache/**"