DrawAI: AI-Powered Image Generation from Natural Language

DrawAI is a Python application that translates natural language prompts into images. An agentic architecture interprets each request, selects the appropriate tools, and generates artwork with one of several drawing backends.

Key Features

  • Natural Language to Image: Describe what you want to see, and DrawAI will generate it.
  • Multiple Drawing Backends: Supports different rendering engines, including Pillow (for raster images), SVG (for vector graphics), and Turtle (for procedural drawings).
  • Intelligent Strategy Selection: Automatically chooses between a direct "one-go" generation for simple prompts and a more detailed "tool-calling" approach for complex scenes.
  • Human-in-the-Loop: If a prompt is ambiguous, the system can ask for clarification to ensure the output matches the user's intent.

Architecture and Technology

This project is built around an observable agentic workflow: LangGraph orchestrates the control flow and LangFuse traces it, managing complexity while giving clear insight into each step of the generation process.

Orchestration with LangGraph

The core of DrawAI is an agentic state machine orchestrated by LangGraph. This allows for a highly modular and resilient workflow where each step of the image generation process is represented as a node in a graph. The state, including the prompt, intermediate analysis, and selected strategy, flows through the graph, enabling complex logic like conditional branching and loops.

The graph consists of several key nodes:

  • analyze_prompt: Evaluates the user's prompt for clarity and determines if human feedback is needed.
  • select_strategy: Decides whether to use a simple, direct generation method or a more intricate tool-based approach.
  • route_backend: Selects the most suitable drawing library (Pillow, SVG, or Turtle) based on the prompt's requirements.
  • one_go_executor / tool_call_executor: Executes the chosen strategy to produce the final image.

This graph-based architecture makes the system easy to extend and debug, as each logical step is isolated and independently testable.
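
For orientation, here is a minimal sketch of how such a graph can be wired with LangGraph. The state fields and node bodies below are illustrative placeholders, not the project's actual code; see graph/drawing_graph.py for the real construction.

    # Illustrative sketch only -- node bodies here are placeholders.
    # The real construction lives in graph/drawing_graph.py.
    from typing import TypedDict

    from langgraph.graph import END, StateGraph

    class DrawState(TypedDict):
        prompt: str    # the user's natural-language request
        strategy: str  # "one_go" or "tool_call"
        backend: str   # "pillow", "svg", or "turtle"

    def analyze_prompt(state: DrawState) -> dict:
        return {}  # real node: LLM call that judges prompt clarity

    def select_strategy(state: DrawState) -> dict:
        return {"strategy": "tool_call"}  # real node: LLM picks the strategy

    def route_backend(state: DrawState) -> dict:
        return {"backend": "pillow"}  # real node: LLM picks the backend

    def one_go_executor(state: DrawState) -> dict:
        return {}  # real node: single-shot image generation

    def tool_call_executor(state: DrawState) -> dict:
        return {}  # real node: iterative primitive-tool calls

    graph = StateGraph(DrawState)
    for name, fn in [
        ("analyze_prompt", analyze_prompt),
        ("select_strategy", select_strategy),
        ("route_backend", route_backend),
        ("one_go_executor", one_go_executor),
        ("tool_call_executor", tool_call_executor),
    ]:
        graph.add_node(name, fn)

    graph.set_entry_point("analyze_prompt")
    graph.add_edge("analyze_prompt", "select_strategy")
    graph.add_edge("select_strategy", "route_backend")
    # Branch to the executor that matches the strategy chosen earlier.
    graph.add_conditional_edges(
        "route_backend",
        lambda state: state["strategy"],
        {"one_go": "one_go_executor", "tool_call": "tool_call_executor"},
    )
    graph.add_edge("one_go_executor", END)
    graph.add_edge("tool_call_executor", END)

    app = graph.compile()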

Observability with LangFuse

To ensure reliability and provide deep insight into the agent's behavior, the project integrates with LangFuse. Every graph run is traced, capturing detailed information about the inputs, outputs, and transitions between nodes. This allows for:

  • Debugging: Visualizing the execution flow to pinpoint errors and inefficiencies.
  • Performance Monitoring: Analyzing the latency and success rate of different strategies and backends.
  • Evaluation: Tracking the quality of generated images and the accuracy of the agent's decisions over time.

This level of observability is crucial for understanding and improving the performance of a complex, multi-step AI system.
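
As an illustration, tracing can be attached by passing a Langfuse callback handler into the graph invocation. The snippet below is a sketch, not the project's actual wiring (which lives in observability.py): it assumes Langfuse credentials are set in the environment, that `app` is the compiled graph from the sketch above, and that the v2 import path for the handler is in use (it differs in other langfuse versions).

    # Sketch: attach Langfuse tracing to a graph run.
    # Assumes LANGFUSE_PUBLIC_KEY, LANGFUSE_SECRET_KEY, and LANGFUSE_HOST are
    # set in the environment, and `app` is a compiled LangGraph (see above).
    from langfuse.callback import CallbackHandler  # langfuse v2 import path

    handler = CallbackHandler()  # reads credentials from the environment

    # Passing the handler via `config` traces every node, input, and output.
    result = app.invoke(
        {"prompt": "A red house with a blue door"},
        config={"callbacks": [handler]},
    )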

How It Works

  1. Prompt Input: The user provides a natural language description of the desired image.
  2. Graph Execution: The run_graph.py script initializes the LangGraph state machine with the prompt.
  3. Analysis and Clarification: The analyze_prompt node processes the prompt. If it's ambiguous, it can pause and wait for user clarification.
  4. Strategy and Backend Selection: The graph dynamically selects the best strategy (one-go or tool-call) and the most appropriate backend (Pillow, SVG, or Turtle).
  5. Image Generation: The corresponding executor node is invoked.
    • In a one-go scenario, the model generates the image directly.
    • In a tool-call scenario, the model is given access to a set of primitive drawing functions (e.g., draw_circle, draw_rectangle) and orchestrates them to build the image step-by-step.
  6. Output: The final image is saved to the outputs directory.
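
To give a flavor of the tool-call path, here is a sketch of what Pillow-backed primitives might look like. The names and signatures are illustrative; the project's actual tool definitions live in primitives/definitions.py, with backend implementations such as primitives/pillow_impl.py.

    # Sketch of Pillow-backed primitives; names and signatures are illustrative.
    # The real definitions live in primitives/definitions.py, with the Pillow
    # implementations in primitives/pillow_impl.py.
    import os

    from PIL import Image, ImageDraw

    canvas = Image.new("RGB", (512, 512), "white")
    pen = ImageDraw.Draw(canvas)

    def draw_circle(cx: int, cy: int, radius: int, color: str) -> str:
        """Draw a filled circle; returning a string lets the model see feedback."""
        pen.ellipse((cx - radius, cy - radius, cx + radius, cy + radius), fill=color)
        return f"circle at ({cx}, {cy}), r={radius}"

    def draw_rectangle(x0: int, y0: int, x1: int, y1: int, color: str) -> str:
        """Draw a filled axis-aligned rectangle."""
        pen.rectangle((x0, y0, x1, y1), fill=color)
        return f"rectangle ({x0}, {y0})-({x1}, {y1})"

    # A tool-call run for "a red house with a blue door" might sequence:
    draw_rectangle(150, 250, 350, 450, "red")   # house body
    draw_rectangle(230, 350, 270, 450, "blue")  # door
    os.makedirs("outputs", exist_ok=True)
    canvas.save("outputs/house.png")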

Project Structure

.
├── graph/                # LangGraph state, nodes, and graph definition
│   ├── nodes/            # Individual nodes for the graph
│   └── drawing_graph.py  # Graph construction
├── primitives/           # Drawing function definitions and implementations
│   ├── definitions.py    # Abstract drawing tool definitions
│   ├── pillow_impl.py    # Pillow backend implementation
│   └── ...
├── main.py               # Main application entry point
├── run_graph.py          # Script to execute the LangGraph workflow
├── observability.py      # LangFuse integration
└── requirements.txt      # Project dependencies

Setup and Usage

Prerequisites

  • Python 3.8+
  • An OpenAI API key (or a key for another compatible LLM provider)
  • LangFuse account (optional, for tracing)

Installation

  1. Clone the repository:

    git clone https://github.com/JohnnyPro/DrawAi.git
    cd DrawAi
  2. Create a virtual environment and install dependencies:

    python -m venv venv
    source venv/bin/activate  # On Windows, use `venv\Scripts\activate`
    pip install -r requirements.txt
  3. Configure environment variables: Copy the provided .env.example to .env and fill in your API keys:

    cp .env.example .env

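As an illustration, a filled-in .env might look like the following. The variable names shown are the conventional ones for OpenAI and LangFuse, but .env.example is the authoritative list, and the values below are placeholders:

    OPENAI_API_KEY=sk-...
    LANGFUSE_PUBLIC_KEY=pk-lf-...
    LANGFUSE_SECRET_KEY=sk-lf-...
    LANGFUSE_HOST=https://cloud.langfuse.com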

Running the Application

To generate an image, run the run_graph.py script with a prompt:

    python run_graph.py "A red house with a blue door and two windows"

The generated image will be saved in the outputs/ directory.

License

This project is licensed under the MIT License. See the LICENSE file for details.
