164 changes: 164 additions & 0 deletions docs/developer-guide/query-exporter.md
@@ -0,0 +1,164 @@
---
sidebar_position: 18
---

# Query Exporter

The query-exporter service monitors the OpenRemote PostgreSQL database and exposes metrics on port 9560 for Prometheus scraping. It uses [query-exporter](https://github.com/albertodonato/query-exporter) to collect database health metrics.

## Available Metrics

### Table and Index Bloat
- `pg_table_bloat_count` - Number of tables/indexes with bloat exceeding thresholds
- `pg_table_bloat_ratio` - Bloat ratio per table/index (1.0 = no bloat, 2.0 = 100% bloat)
- `pg_table_bloat_bytes` - Estimated bloat size in bytes per table/index
- `pg_table_bloat_wasted_mb` - Estimated wasted space in megabytes per table/index

### Autovacuum Workers
- `pg_autovacuum_workers_active` - Number of currently active autovacuum workers
- `pg_autovacuum_workers_max` - Maximum number of autovacuum workers configured
- `pg_autovacuum_running` - Running autovacuum processes (labels: database, table_schema, table_name, phase)

### Datapoint Query Performance
- `pg_datapoint_query_duration_seconds` - Histogram of execution times for the attribute with most datapoints
- `pg_datapoint_count` - Total number of datapoints for the top attribute
Comment on lines +23 to +24

Copilot AI Jan 6, 2026

The "Datapoint Query Performance" section mentions "the attribute with most datapoints" and "the top attribute", but it's unclear whether this is determined dynamically or requires configuration. Consider clarifying that this automatically identifies and monitors the attribute with the highest datapoint count.

Suggested change
- `pg_datapoint_query_duration_seconds` - Histogram of execution times for the attribute with most datapoints
- `pg_datapoint_count` - Total number of datapoints for the top attribute
- `pg_datapoint_query_duration_seconds` - Histogram of execution times for the attribute that is automatically identified as having the highest datapoint count
- `pg_datapoint_count` - Total number of datapoints for the attribute that is automatically identified as having the highest datapoint count

### Database Health
- `pg_database_size_megabytes` - Total database size in megabytes
- `pg_connections_active` - Number of active connections
- `pg_connections_idle` - Number of idle connections
- `pg_locks_count` - Number of locks by type

## Configuration

### Environment Variables
The service uses the following environment variables (automatically configured in `profile/deploy.yml`):

**Database Connection:**
- `POSTGRES_HOST` - Database host (default: `postgresql`)
- `POSTGRES_PORT` - Database port (default: `5432`)
- `POSTGRES_DB` - Database name (default: `openremote`)
- `POSTGRES_USER` - Database user (default: `postgres`)
- `POSTGRES_PASSWORD` - Database password (default: `postgres`)

**Bloat Thresholds:**
- `TABLE_BLOAT_THRESHOLD` - Table bloat ratio threshold (default: `1.2` = 20% bloat)
- `INDEX_BLOAT_THRESHOLD` - Index bloat ratio threshold (default: `1.5` = 50% bloat)

:::note

Indexes typically bloat faster than tables, so the default index threshold is higher.

:::

### Customize Thresholds
Set environment variables before starting services:
```bash
export TABLE_BLOAT_THRESHOLD=1.3 # 30% table bloat
export INDEX_BLOAT_THRESHOLD=2.0 # 100% index bloat
```

### Query Intervals
- Table bloat queries: Every 5 minutes
- Autovacuum queries: Every 30 seconds
- Datapoint performance: Every 60 seconds
- Database size: Every 5 minutes
- Connection/lock stats: Every 30 seconds

## Accessing Metrics

### View Metrics Endpoint
```bash
curl http://localhost:9560/metrics
```
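
The endpoint returns metrics in the Prometheus text format, one sample per line, so the output can be filtered with standard tools. The following is a minimal sketch, not part of query-exporter: `bloat_above` is a hypothetical helper name, the sample input is made up, and the `table_name` label is an illustrative assumption about how the series are labelled. It also assumes each sample line ends with the value (no trailing timestamp).

```bash
# bloat_above is a hypothetical helper: it reads Prometheus text-format
# metrics on stdin and prints pg_table_bloat_ratio samples above a limit.
bloat_above() {
  awk -v limit="$1" '/^pg_table_bloat_ratio/ && $NF + 0 > limit + 0'
}

# Typical use against the live endpoint:
#   curl -s http://localhost:9560/metrics | bloat_above 1.2

# Made-up sample input; the label names are illustrative assumptions.
bloat_above 1.2 <<'EOF'
# HELP pg_table_bloat_ratio Bloat ratio per table/index
# TYPE pg_table_bloat_ratio gauge
pg_table_bloat_ratio{table_name="asset"} 1.05
pg_table_bloat_ratio{table_name="asset_datapoint"} 1.8
EOF
```

With the sample input, only the `asset_datapoint` line is printed because its value exceeds the limit of 1.2.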

### Expose on Private Network
To expose on a private network in production, uncomment this line in `profile/deploy.yml`:
```yaml
- "${PRIVATE_IP:-127.0.0.1}:9560:9560"
```

## Prometheus Integration

Add this scrape configuration to your Prometheus config:

```yaml
scrape_configs:
  - job_name: 'openremote-postgres'
    static_configs:
      - targets: ['localhost:9560']
    scrape_interval: 30s
```

Copilot AI Jan 6, 2026

The target 'localhost:9560' will only work if Prometheus is running on the same host as the query-exporter service. For Docker deployments (which is the primary deployment method mentioned throughout the documentation), this should use the Docker service name instead. Consider adding a note about using the service name (e.g., 'query-exporter:9560') when Prometheus runs in the same Docker network, or providing examples for both scenarios.

## Customizing Queries

To modify queries or add new metrics:

1. Edit the `config.yaml` file located in the `query-exporter` configuration directory (by default, this is `/deployment/query-exporter/config.yaml` which is mounted as a Docker volume at the container's `/config/config.yaml` path—see your `profile/deploy.yml` for the exact path).

Copilot AI Jan 6, 2026

This sentence is overly complex and difficult to parse. The nested parenthetical clauses and multiple path references make it hard to understand. Consider breaking this into multiple sentences or a bulleted list that clearly separates the host path, container path, and reference to the deploy.yml file.

Suggested change
1. Edit the `config.yaml` file located in the `query-exporter` configuration directory (by default, this is `/deployment/query-exporter/config.yaml` which is mounted as a Docker volume at the container's `/config/config.yaml` path—see your `profile/deploy.yml` for the exact path).
1. Edit the `config.yaml` file in the `query-exporter` configuration directory.
   - Default host path: `/deployment/query-exporter/config.yaml`
   - Container path (Docker volume mount): `/config/config.yaml`
   - For the exact host path in your environment, see the `query-exporter` volume mapping in `profile/deploy.yml`.

2. Restart the service:
```bash
docker-compose -f profile/deploy.yml restart query-exporter
```

## Troubleshooting

### Check Service Logs
```bash
docker-compose -f profile/deploy.yml logs -f query-exporter
```

### Test Database Connectivity
```bash
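# Open a shell inside the query-exporter container
docker-compose -f profile/deploy.yml exec query-exporter sh
# The remaining commands run inside that shell
apk add postgresql-client
psql -h "$POSTGRES_HOST" -p "$POSTGRES_PORT" -U "$POSTGRES_USER" -d "$POSTGRES_DB"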
```

### Verify Metrics Endpoint
```bash
curl http://localhost:9560/metrics
```

## Performance Tuning

If bloat detection queries impact database performance:

- **Increase query interval** - Change from 300s to 600s or higher in `config.yaml`
- **Limit to specific schemas** - Modify queries to target specific schemas only
- **Schedule off-peak runs** - Use `schedule` option instead of `interval`
- **Reduce sample size** - Lower the datapoint query sample size (default: 100)

Copilot AI Jan 6, 2026

The sample size value mentioned here (100) should be documented in the "Query Intervals" section above for consistency. Currently, the query intervals section (lines 61-66) documents time intervals but doesn't mention sample sizes. Consider adding a note about the sample size to that section or creating a "Query Parameters" subsection.

### Query Complexity
- Bloat detection scans `pg_stats` and `pg_class` catalogs (limited to top 50 results)
- Datapoint performance samples 100 most recent datapoints from the largest attribute

Copilot AI Jan 6, 2026

This line states "100 most recent datapoints" but line 129 refers to "sample size (default: 100)". These should use consistent terminology - either both refer to it as a "sample" or both as "most recent datapoints". Using "sample size" is more technically accurate since it's a configurable parameter.

Suggested change
- Datapoint performance samples 100 most recent datapoints from the largest attribute
- Datapoint performance uses a sample size of 100 recent datapoints from the largest attribute (configurable)

- All queries exclude PostgreSQL system schemas (`pg_%` and `information_schema`)

## Understanding Bloat

### Bloat Ratio Values
- `1.0` - No bloat (optimal size)
- `1.2` - 20% bloat (default table threshold)
- `1.5` - 50% bloat (default index threshold)
- `2.0` - 100% bloat (object is twice the optimal size)
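
The ratios above convert to a bloat percentage as (ratio - 1) × 100. A minimal sketch of that arithmetic in shell, where `bloat_percent` is a hypothetical helper name and not part of query-exporter:

```bash
# bloat_percent is a hypothetical helper, not part of query-exporter.
# Converts a bloat ratio (1.0 = no bloat) to a percentage of excess size.
bloat_percent() {
  awk -v ratio="$1" 'BEGIN { printf "%.0f\n", (ratio - 1) * 100 }'
}

bloat_percent 1.2   # default table threshold
bloat_percent 2.0   # object is twice the optimal size
```

The two calls print `20` and `100`, matching the list above.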

### Maintenance Actions
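- **Tables > 1.2** - Run `VACUUM FULL` during a maintenance window; it takes an `ACCESS EXCLUSIVE` lock, so the table cannot be read or written until it finishes
- **Indexes > 1.5** - Run `REINDEX` on affected indexes; plain `REINDEX` blocks writes to the table, and PostgreSQL 12+ offers `REINDEX CONCURRENTLY` to avoid that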
- **Critical bloat (> 2.0)** - Immediate maintenance recommended
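
The list above can be sketched as a small decision helper. This is an illustration only: `bloat_action` is a hypothetical function name, and it assumes the thresholds come from `TABLE_BLOAT_THRESHOLD` and `INDEX_BLOAT_THRESHOLD` with the documented defaults, plus the fixed critical level of 2.0.

```bash
# bloat_action is a hypothetical helper that mirrors the list above.
# Usage: bloat_action table|index RATIO
bloat_action() {
  local kind="$1" ratio="$2" threshold action
  # Fall back to the documented defaults when the variables are unset.
  if [ "$kind" = "index" ]; then
    threshold="${INDEX_BLOAT_THRESHOLD:-1.5}"; action="REINDEX"
  else
    threshold="${TABLE_BLOAT_THRESHOLD:-1.2}"; action="VACUUM FULL"
  fi
  awk -v r="$ratio" -v t="$threshold" -v a="$action" 'BEGIN {
    r = r + 0; t = t + 0   # force numeric comparison
    if (r > 2.0)    print "critical: " a " recommended immediately"
    else if (r > t) print "above threshold: schedule " a
    else            print "ok"
  }'
}

bloat_action table 1.3
bloat_action index 1.3
```

With the default thresholds, a ratio of 1.3 is above the table threshold but still below the index threshold, so the same value leads to different actions depending on the object type.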

### PostgreSQL Constants
The bloat detection queries use these PostgreSQL internal constants:
- `1048576` - Bytes per megabyte (1024 × 1024)
- `8` - Bits per byte (for null bitmap calculation)
- `20` - Page header size in bytes
- `12` - Index header overhead in bytes
- `4` - Item pointer size in bytes
- `23` - Tuple header size for PostgreSQL 14+ (Linux)

Copilot AI Jan 6, 2026

The tuple header size of 23 bytes is documented as specific to "PostgreSQL 14+ (Linux)" but the documentation doesn't specify which PostgreSQL version OpenRemote uses or whether this value might differ on other platforms. Consider adding a note about verifying compatibility if users are running different PostgreSQL versions or operating systems.

Suggested change
- `23` - Tuple header size for PostgreSQL 14+ (Linux)
- `23` - Typical tuple header size for PostgreSQL 14+ on Linux (this value may differ for other PostgreSQL versions or operating systems; verify for your deployment)

- `4` - Memory alignment for Linux containers

## References

- [Query Exporter Documentation](https://github.com/albertodonato/query-exporter)
- [Configuration Format](https://github.com/albertodonato/query-exporter/blob/main/docs/configuration.rst)
- [PostgreSQL Statistics Views](https://www.postgresql.org/docs/current/monitoring-stats.html)
- [PostgreSQL Bloat Detection](https://wiki.postgresql.org/wiki/Show_database_bloat)