Merged
22 changes: 18 additions & 4 deletions docs/src/examples/llm_inference.md
@@ -19,7 +19,12 @@ This guide shows how to run a **router + worker** LLM service with Pulsing, and
The router needs an **actor system address** so workers can join the same cluster:

```bash
-pulsing actor router --addr 0.0.0.0:8000 --http_port 8080 --model_name my-llm
+pulsing actor pulsing.actors.router.RouterActor \
+  --addr 0.0.0.0:8000 \
+  --http_host 0.0.0.0 \
+  --http_port 8080 \
+  --model_name my-llm \
+  --worker_name worker
```

## 2) Start workers
@@ -29,21 +34,30 @@ You can run **one or more** workers. Each worker should join the router node via
### Option A: Transformers worker (Terminal B)

```bash
-pulsing actor transformers --model gpt2 --device cpu --addr 0.0.0.0:8001 --seeds 127.0.0.1:8000
+pulsing actor pulsing.actors.worker.TransformersWorker \
+  --model_name gpt2 \
+  --device cpu \
+  --addr 0.0.0.0:8001 \
+  --seeds 127.0.0.1:8000 \
+  --name worker
```

### Option B: vLLM worker (Terminal C)

```bash
-pulsing actor vllm --model Qwen/Qwen2.5-0.5B --addr 0.0.0.0:8002 --seeds 127.0.0.1:8000
+pulsing actor pulsing.actors.vllm.VllmWorker \
+  --model Qwen/Qwen2.5-0.5B \
+  --addr 0.0.0.0:8002 \
+  --seeds 127.0.0.1:8000 \
+  --name worker
```

## 3) Verify cluster + workers

### List actors (observer mode)

```bash
-pulsing actor list --endpoint 127.0.0.1:8000
+pulsing inspect actors --endpoint 127.0.0.1:8000
```

### Inspect cluster
22 changes: 18 additions & 4 deletions docs/src/examples/llm_inference.zh.md
@@ -19,7 +19,12 @@
The Router needs an **actor system address** so that workers started by other processes can join the same cluster:

```bash
-pulsing actor router --addr 0.0.0.0:8000 --http_port 8080 --model_name my-llm
+pulsing actor pulsing.actors.router.RouterActor \
+  --addr 0.0.0.0:8000 \
+  --http_host 0.0.0.0 \
+  --http_port 8080 \
+  --model_name my-llm \
+  --worker_name worker
```

## 2) Start workers
@@ -29,21 +34,30 @@ pulsing actor router --addr 0.0.0.0:8000 --http_port 8080 --model_name my-llm
### Option A: Transformers Worker (Terminal B)

```bash
-pulsing actor transformers --model gpt2 --device cpu --addr 0.0.0.0:8001 --seeds 127.0.0.1:8000
+pulsing actor pulsing.actors.worker.TransformersWorker \
+  --model_name gpt2 \
+  --device cpu \
+  --addr 0.0.0.0:8001 \
+  --seeds 127.0.0.1:8000 \
+  --name worker
```

### Option B: vLLM Worker (Terminal C)

```bash
-pulsing actor vllm --model Qwen/Qwen2.5-0.5B --addr 0.0.0.0:8002 --seeds 127.0.0.1:8000
+pulsing actor pulsing.actors.vllm.VllmWorker \
+  --model Qwen/Qwen2.5-0.5B \
+  --addr 0.0.0.0:8002 \
+  --seeds 127.0.0.1:8000 \
+  --name worker
```

## 3) Verify cluster and workers

### List actors (observer mode)

```bash
-pulsing actor list --endpoint 127.0.0.1:8000
+pulsing inspect actors --endpoint 127.0.0.1:8000
```

### Inspect cluster
209 changes: 174 additions & 35 deletions docs/src/guide/operations.md
@@ -1,74 +1,208 @@
-# CLI Operations
+# CLI Commands

-Pulsing ships with built-in CLI tools for running, inspecting, and benchmarking distributed systems.
+Pulsing ships with built-in CLI tools for starting actors, inspecting systems, and benchmarking distributed services.

---

-## Running Services
+## Starting Actors

-### Router (OpenAI-compatible HTTP API)
+The `pulsing actor` command starts actors by providing their full class path. The CLI automatically matches command-line arguments to the Actor's constructor parameters.

+### Format
+
+Actor type must be a full class path:
+- Format: `module.path.ClassName`
+- Example: `pulsing.actors.router.RouterActor`
+- Example: `pulsing.actors.worker.TransformersWorker`
+- Example: `pulsing.actors.vllm.VllmWorker`
+- Example: `my_module.my_actor.MyCustomActor`
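
A dotted class path in the `module.path.ClassName` format described above is commonly resolved with the standard library's `importlib`. The following is a minimal sketch of that general technique, not Pulsing's actual loader; the helper name `load_class` is hypothetical, and a standard-library class stands in for an actor class.

```python
import importlib


def load_class(class_path: str) -> type:
    """Resolve a 'module.path.ClassName' string to a class object.

    Hypothetical helper for illustration; Pulsing's real loader may differ.
    """
    # Split on the last dot: everything before it is the module path.
    module_path, sep, class_name = class_path.rpartition(".")
    if not sep or not module_path:
        raise ValueError(f"expected 'module.path.ClassName', got {class_path!r}")
    module = importlib.import_module(module_path)
    obj = getattr(module, class_name)  # AttributeError if the name is missing
    if not isinstance(obj, type):
        raise TypeError(f"{class_path!r} does not refer to a class")
    return obj


# A standard-library class stands in for an actor class here.
cls = load_class("collections.OrderedDict")
print(cls.__name__)
```

A bare name without a module part (for example `RouterActor`) is rejected by this sketch, which mirrors the requirement that the actor type be a full class path.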

+### Examples
+
+#### Router (OpenAI-compatible HTTP API)
+
+```bash
+pulsing actor pulsing.actors.router.RouterActor \
+  --addr 0.0.0.0:8000 \
+  --http_host 0.0.0.0 \
+  --http_port 8080 \
+  --model_name my-llm \
+  --worker_name worker \
+  --scheduler stream_load
+```
+
+#### Transformers Worker
+
+```bash
+pulsing actor pulsing.actors.worker.TransformersWorker \
+  --model_name gpt2 \
+  --device cpu \
+  --addr 0.0.0.0:8001 \
+  --seeds 127.0.0.1:8000 \
+  --name worker
+```
+
+#### vLLM Worker
+
```bash
-pulsing actor router --addr 0.0.0.0:8000 --http_port 8080 --model_name my-llm
+pulsing actor pulsing.actors.vllm.VllmWorker \
+  --model Qwen/Qwen2 \
+  --addr 0.0.0.0:8002 \
+  --seeds 127.0.0.1:8000 \
+  --name worker \
+  --role aggregated \
+  --max_new_tokens 512
```

-### Transformers Worker
+#### Multiple Workers

```bash
-pulsing actor transformers --model gpt2 --addr 0.0.0.0:8001 --seeds 127.0.0.1:8000
+# Start multiple workers with different names
+pulsing actor pulsing.actors.worker.TransformersWorker \
+  --model_name gpt2 \
+  --name worker-1 \
+  --seeds 127.0.0.1:8000
+
+pulsing actor pulsing.actors.worker.TransformersWorker \
+  --model_name gpt2 \
+  --name worker-2 \
+  --seeds 127.0.0.1:8000
+
+# Router targeting specific worker name
+pulsing actor pulsing.actors.router.RouterActor \
+  --worker_name worker-1 \
+  --seeds 127.0.0.1:8000
```

-### vLLM Worker
+### Common Options

+- `--name NAME`: Actor name (default: "worker")
+- `--addr ADDR`: Actor System bind address
+- `--seeds SEEDS`: Comma-separated list of seed nodes
+- Any other `--param value` pairs matching the Actor's constructor signature
+
+### How It Works
+
```bash
-pulsing actor vllm --model Qwen/Qwen2 --addr 0.0.0.0:8002 --seeds 127.0.0.1:8000
+# Pass parameters directly as command-line arguments
+pulsing actor pulsing.actors.worker.TransformersWorker \
+  --model_name gpt2 \
+  --device cpu \
+  --preload true \
+  --name my-worker \
+  --seeds 127.0.0.1:8000
+
+# Start vLLM worker with all parameters
+pulsing actor pulsing.actors.vllm.VllmWorker \
+  --model Qwen/Qwen2 \
+  --role aggregated \
+  --max_new_tokens 512 \
+  --name vllm-worker \
+  --seeds 127.0.0.1:8000
```

+Options:
+- `--name NAME`: Actor name (default: "worker")
+- `--addr ADDR`: Actor System bind address
+- `--seeds SEEDS`: Comma-separated list of seed nodes
+- Any other `--param value` pairs matching the Actor's constructor signature
+
+The Actor class must:
+- Be importable from the specified module path
+- Inherit from `pulsing.actor.Actor`
+- Have a constructor with named parameters (the CLI automatically matches arguments to constructor parameters)
+
+**How it works:**
+The CLI inspects the Actor class constructor signature and automatically extracts matching parameters from command-line arguments. You can use `--help` to see available parameters, or check the Actor class documentation.
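
The signature matching described above can be illustrated with Python's `inspect.signature`. This is a sketch of the general technique under stated assumptions, not the Pulsing CLI's code: `split_cli_args` and `DemoWorker` are hypothetical names, and `DemoWorker` is a plain class rather than a real `pulsing.actor.Actor` subclass.

```python
import inspect


def split_cli_args(cls: type, cli_args: dict) -> tuple:
    """Split parsed CLI arguments into constructor kwargs and everything else.

    Hypothetical helper: keeps only the keys that name a parameter of
    cls.__init__ and returns the remaining keys separately.
    """
    params = inspect.signature(cls.__init__).parameters
    accepted = {
        name
        for name, param in params.items()
        if name != "self"
        and param.kind
        in (inspect.Parameter.POSITIONAL_OR_KEYWORD, inspect.Parameter.KEYWORD_ONLY)
    }
    matched = {k: v for k, v in cli_args.items() if k in accepted}
    rest = {k: v for k, v in cli_args.items() if k not in accepted}
    return matched, rest


class DemoWorker:
    """Stand-in for an actor class (assumption: not a real Pulsing actor)."""

    def __init__(self, model_name: str, device: str = "cpu"):
        self.model_name = model_name
        self.device = device


# Values as they might look after parsing `--model_name gpt2 --device cpu ...`.
cli = {"model_name": "gpt2", "device": "cpu", "seeds": "127.0.0.1:8000"}
matched, rest = split_cli_args(DemoWorker, cli)
worker = DemoWorker(**matched)
print(matched, rest)
```

In this sketch, keys that do not appear in the constructor (here `seeds`) are returned separately so that a caller could treat them as system-level options.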

---

-## Actor List
+---

+## Inspect
+
+`pulsing inspect` is a lightweight **observer** tool that queries actor systems via HTTP (no cluster join required). It provides multiple subcommands for different inspection needs.
+
+### Subcommands

-`pulsing actor list` is a lightweight **observer** that queries actors via HTTP (no cluster join required).
+#### Cluster Status

-### Single Node
+Inspect cluster members and their status:

```bash
-pulsing actor list --endpoint 127.0.0.1:8000
+pulsing inspect cluster --seeds 127.0.0.1:8000
```

-### Cluster (via Seeds)
+Output includes:
+- Total nodes and alive count
+- Status summary (Alive, Suspect, Failed, etc.)
+- Detailed member list with node ID, address, and status
+
+#### Actors Distribution
+
+Inspect named actors distribution across the cluster:

```bash
-pulsing actor list --seeds 127.0.0.1:8000,127.0.0.1:8001
+pulsing inspect actors --seeds 127.0.0.1:8000
```

-### Options
+Options:
+- `--top N`: Show top N actors by instance count
+- `--filter STR`: Filter actor names by substring
+- `--all_actors True`: Include internal/system actors

-| Flag | Description |
-|------|-------------|
-| `--all_actors True` | Include internal/system actors |
-| `--json True` | Output as JSON |
+Examples:
+```bash
+# Show top 10 actors
+pulsing inspect actors --seeds 127.0.0.1:8000 --top 10

-!!! note
-    Uses HTTP/2 (h2c). Node must expose HTTP endpoints.
+# Filter actors by name
+pulsing inspect actors --seeds 127.0.0.1:8000 --filter worker
+```

----
+#### Metrics

-## Inspect
+Inspect Prometheus metrics from cluster nodes:

+```bash
+pulsing inspect metrics --seeds 127.0.0.1:8000
+```
+
-`pulsing inspect` joins a cluster (via seeds) and prints a human-friendly snapshot of members and actors.
+Options:
+- `--raw True`: Output raw metrics (default)
+- `--raw False`: Show summary only (key metrics)

+#### Watch Mode
+
+Watch cluster state changes in real-time:
+
```bash
-pulsing inspect --seeds 127.0.0.1:8000
+pulsing inspect watch --seeds 127.0.0.1:8000
```

-Output includes:
+Options:
+- `--interval 1.0`: Refresh interval in seconds (default: 1.0)
+- `--kind all`: What to watch: `cluster`, `actors`, `metrics`, or `all` (default: `all`)
+- `--max_rounds N`: Maximum number of refresh rounds (None = infinite)
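
The `--interval` and `--max_rounds` semantics listed above amount to a bounded polling loop. The sketch below shows one way such a loop could be structured; it is an illustration only, and the `watch` function, its `fetch` callback, and the change-detection behaviour are assumptions rather than Pulsing's implementation.

```python
import time
from typing import Callable, Optional


def watch(
    fetch: Callable[[], object],
    interval: float = 1.0,
    max_rounds: Optional[int] = None,
    on_change: Callable[[object], None] = print,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll fetch() every `interval` seconds and report changed snapshots.

    Hypothetical helper. max_rounds=None means "run until interrupted".
    Returns the number of rounds executed.
    """
    previous = object()  # sentinel that never equals a real snapshot
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        snapshot = fetch()
        if snapshot != previous:
            on_change(snapshot)
            previous = snapshot
        rounds += 1
        # Skip the final sleep once the round limit has been reached.
        if max_rounds is None or rounds < max_rounds:
            sleep(interval)
    return rounds


# Demo with canned snapshots and a no-op sleep so it finishes immediately.
states = iter(["2 nodes alive", "2 nodes alive", "3 nodes alive"])
seen = []
rounds_run = watch(
    lambda: next(states),
    interval=0.0,
    max_rounds=3,
    on_change=seen.append,
    sleep=lambda seconds: None,
)
print(rounds_run, seen)
```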

+Examples:
+```bash
+# Watch cluster member changes
+pulsing inspect watch --seeds 127.0.0.1:8000 --kind cluster --interval 2.0

-- **Cluster members**: node id, addr, status
-- **Named actors**: distribution across nodes
+# Watch actor changes
+pulsing inspect watch --seeds 127.0.0.1:8000 --kind actors
+```

+### Common Options
+
+All subcommands support:

-!!! tip
-    For local seeds (`127.0.0.1`), the CLI auto-binds to `127.0.0.1:0`.
+- `--timeout 10.0`: Request timeout in seconds (default: 10.0)
+- `--best_effort True`: Continue even if some nodes fail (default: False)
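
The difference between strict and best-effort behaviour in the two options above can be sketched as follows. This is an assumption-laden illustration: `query_nodes` and the fake `query` callback are hypothetical, and the real CLI queries nodes over HTTP rather than through a Python callback.

```python
from typing import Callable, Dict, Iterable, Tuple


def query_nodes(
    nodes: Iterable[str],
    query: Callable[[str, float], object],
    timeout: float = 10.0,
    best_effort: bool = False,
) -> Tuple[Dict[str, object], Dict[str, str]]:
    """Query each node, passing the per-request timeout to `query`.

    Hypothetical helper. With best_effort=False the first failure is
    re-raised; with best_effort=True failures are collected and the
    remaining nodes are still queried.
    """
    results: Dict[str, object] = {}
    errors: Dict[str, str] = {}
    for node in nodes:
        try:
            results[node] = query(node, timeout)
        except Exception as exc:
            if not best_effort:
                raise
            errors[node] = str(exc)
    return results, errors


def fake_query(node: str, timeout: float) -> str:
    """Stand-in for an HTTP request; one node is made to fail."""
    if node.endswith(":8001"):
        raise TimeoutError(f"no response from {node} within {timeout}s")
    return "Alive"


results, errors = query_nodes(
    ["127.0.0.1:8000", "127.0.0.1:8001"], fake_query, best_effort=True
)
print(results, errors)
```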

+!!! note
+    Observer mode uses HTTP/2 (h2c) and does NOT join the gossip cluster, making it lightweight and suitable for production monitoring.

---

@@ -93,10 +227,15 @@ pulsing bench gpt2 --url http://localhost:8080

| Task | Command |
|------|---------|
-| Start router | `pulsing actor router --addr 0.0.0.0:8000 --http_port 8080` |
-| Start worker | `pulsing actor transformers --model gpt2 --seeds ...` |
-| List actors | `pulsing actor list --endpoint 127.0.0.1:8000` |
-| Inspect cluster | `pulsing inspect --seeds 127.0.0.1:8000` |
+| Start router | `pulsing actor pulsing.actors.router.RouterActor --addr 0.0.0.0:8000 --http_port 8080` |
+| Start worker | `pulsing actor pulsing.actors.worker.TransformersWorker --model_name gpt2 --seeds ...` |
+| Start multiple workers | `pulsing actor pulsing.actors.worker.TransformersWorker --model_name gpt2 --name worker-1 --seeds ...` |
+| Router with custom worker | `pulsing actor pulsing.actors.router.RouterActor --worker_name worker-1 --seeds ...` |
+| List actors | `pulsing inspect actors --endpoint 127.0.0.1:8000` |
+| Inspect cluster | `pulsing inspect cluster --seeds 127.0.0.1:8000` |
+| Inspect actors | `pulsing inspect actors --seeds 127.0.0.1:8000 --top 10` |
+| Inspect metrics | `pulsing inspect metrics --seeds 127.0.0.1:8000` |
+| Watch cluster | `pulsing inspect watch --seeds 127.0.0.1:8000` |
| Benchmark | `pulsing bench gpt2 --url http://localhost:8080` |

---