Skip to content

Improve autovacuum "queue and progress" howto #30

@NikolayS

Description

@NikolayS

Context

The howto at docs/postgres-howtos/database-administration/maintenance/autovacuum-queue-and-progress.md covers autovacuum queue depth and live progress monitoring. A review against the Cybertec autovacuum monitoring article and the postgres-ai SQL style guide surfaced a number of gaps and improvements.

Correctness / coverage

  • Add PG 18+ autovacuum_vacuum_max_threshold to the trigger formula — effective threshold is least(max_threshold, threshold + scale_factor * reltuples); without it the query overstates urgency on large tables under PG 18
  • Handle autovacuum_vacuum_insert_threshold / autovacuum_vacuum_insert_scale_factor (PG 13+) — insert-mostly tables are invisible to the current query
  • Fix inverted autovacuum_enabled label — case when autovacuum_enabled then 'DISABLED' else 'enabled' end is backwards
  • Add anti-wraparound urgency — compute age(relfrozenxid) vs. autovacuum_freeze_max_age and mxid_age(relminmxid) vs. autovacuum_multixact_freeze_max_age, surface wraparound_urgency and per-table overrides
  • Add a trigger_reason column — {dead_tuples, inserts, freeze, multixact, in_progress} so each queue row says why it qualifies

Query quality

  • Replace regex-based reloption parsing with LATERAL unnest(reloptions) + FILTER — shorter, faster, easier to extend
  • Compute explicit vacuum_urgency = n_dead_tup / effective_threshold ratio and order by it, replacing the current boolean in/out-of-queue approach
  • Make query version-portable — use current_setting('autovacuum_vacuum_max_threshold', true) with 'Infinity'::float8 fallback for PG 13–17
  • Handle reltuples = -1 (PG 14+ initial value for freshly created tables) — urgency is meaningless in that case
  • Filter out temp tables / unlogged tables that won't be autovacuumed
  • Use format('%I.%I', nspname, relname) for the relation column instead of string concatenation
  • Simplify the full outer join — a left join + optional union handles the "another DB" edge case more clearly

Adjacent diagnostics

  • Add a pg_stat_progress_analyze companion query (the "natural extension" already mentioned in the page)
  • Add visibility-map health query via pg_visibility_map_summary — different signal from dead-tuple count, relevant for index-only scans
  • Cross-link last_autoanalyze / n_mod_since_analyze alongside last_autovacuum
  • Surface throttle math next to in-progress rows — effective MB/s ceiling from [auto]vacuum_cost_limit, [auto]vacuum_cost_delay, page-cost params

Tuning guidance / prose

  • Add a concrete worked example of the "load average" analogy (e.g., "10 tables in queue, 3 workers → effective depth of 3.3")
  • Explain when per-table overrides are better than bumping autovacuum_max_workers globally
  • Add a "why is this table stuck in the queue?" checklist: long-running transactions, replication slots, prepared transactions, hot-standby feedback, DDL locks, corruption
  • Note vacuum_buffer_usage_limit (PG 16+) when discussing "make workers run faster"
  • Cross-link how-to-monitor-xmin-horizon.md as the natural next step when cleanup is blocked
  • Add a version-compatibility callout (which columns/settings require which PG version)
  • Cite the Cybertec article as further reading

Output / UX

  • Refresh the screenshot (0067_tables_to_be_autovacuumed_2.png)
  • Provide a compact variant (no progress columns) for the common case when nothing is actively vacuuming
  • Provide a \watch-friendly variant for staring at an in-progress vacuum

References

Metadata

Metadata

Assignees

No one assigned

    Labels

    documentationImprovements or additions to documentationenhancementNew feature or request

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions