DeepChat is a small Python CLI that launches an Ollama-backed chat session. It supports two runtime modes:
- Local mode: uses the Ollama server installed on your machine.
- Container mode: starts the official
ollama/ollamaDocker image and talks to it overhttp://localhost:11434.
On startup, the script prepares the selected runtime, pulls the requested model, runs a quick test prompt, and then drops into an interactive prompt loop.
- Python 3
- Internet access for the first model pull
- For local mode: Ollama installed and available on your
PATH - For container mode: Docker installed and running
python main.py [model] [--local|-l]modelis optional. If omitted, the script usesdeepseek-r1:1.5b.--localor-lis optional.- If no flag is provided, the script uses Docker mode.
python main.py deepseek-r1:7b --local
python main.py deepseek-r1:7b -lLocal mode:
- upgrades
pip - installs the Python
ollamapackage if needed - starts
ollama serve - pulls the requested model locally
- opens an interactive chat session through the Python Ollama client
python main.py deepseek-coder-v2Docker mode:
- installs the Python
requestspackage if needed - stops and removes any existing container named
ollama - starts a fresh
ollama/ollamacontainer - pulls the requested model inside the container
- opens an interactive chat session through the Ollama HTTP API
# Use the default model in Docker mode
python main.py
# Run a local Ollama model
python main.py deepseek-r1:7b --local
# Run a model in Docker mode
python main.py deepseek-coder-v2If you want to paste larger prompts or code blocks, you may need a model variant with a higher context window. This repository includes a modelfile that sets:
num_ctx 24576num_predict 8192
Create a patched model like this:
ollama pull deepseek-coder-v2
ollama create deepseek-coder-v2-patch -f modelfile
python main.py deepseek-coder-v2-patch --localYou can replace deepseek-coder-v2 with any compatible base model.
- The script rejects unknown flags and exits if too many arguments are provided.
- In Docker mode, the container is always named
ollama. - The interactive output strips
<think>...</think>content before printing visible answers.
