nginx intermittently 404s wp-json/* routes (tries to serve as static files)

## Summary

In production, nginx **intermittently** fails to route `/wp-json/*`
requests to PHP-FPM and instead tries to serve them as static files
from disk, returning 404. Same endpoint, same client, same minute —
some requests reach PHP and succeed, others 404 at the filesystem
layer.

## Evidence

From `logs/cms.catholicdigitalcommons.org/proxy_error_log` on 2026-04-29
during a `workflow_dispatch` deploy:

**Successful PHP-routed request (16:25:23):**
```
[error] FastCGI sent in stderr: "PHP message: cdcf_process_translation:
  Translation complete for post 858 (es)" while reading response header
  from upstream, ... request: "POST /wp-json/cdcf/v1/process-queue
  HTTP/2.0", upstream:
  "fastcgi://unix:/var/www/vhosts/system/cms.catholicdigitalcommons.org/php-fpm.sock"
```

**404 from same client minutes later (16:28:32):**
```
[error] openat() "/var/www/vhosts/catholicdigitalcommons.org/cms.catholicdigitalcommons.org/wp-json/cdcf/v1/process-queue"
  failed (2: No such file or directory),
  client: 2001:41d0:52:bff::14c, server: cms.catholicdigitalcommons.org,
  request: "POST /wp-json/cdcf/v1/process-queue HTTP/2.0"
```

The 404 case shows nginx never reached the FastCGI upstream — it tried
`openat()` on the URL path as a literal file. That only happens when
the WordPress URL-rewrite rules don't match the request, which on a
correctly-configured Plesk + nginx + WP setup should never happen for
`/wp-json/*`.

Affected endpoints I've seen 404 in the logs include:
- `POST /wp-json/cdcf/v1/process-queue` (the redis-queue cron-like
  worker trigger)
- `GET /wp-json/wp/v2/pages/<id>` (verify step's polling reads)

PHP-routed (successful) and filesystem-404 responses are interleaved
seconds apart. The 404 case appears to be ~30-50% of requests during
heavy traffic windows.

## Hypotheses

1. **Plesk's nginx config has a `try_files` order that races.** A
   common Plesk pattern is something like
   `try_files $uri $uri/ /index.php?$args`, which checks for a real
   file first. If the request is somehow routed to a stale or
   uninitialized worker, `try_files` may fail without falling through
   to PHP.
2. **Caching or worker-lifecycle issue.** During the burst of cron
   POSTs from the GitHub Actions runner, nginx may be hitting a
   per-worker code path that doesn't have the WP rewrite rules loaded.
3. **Plesk's "Apache + nginx" hybrid mode interaction.** The site uses
   Plesk's nginx-in-front-of-Apache setup; each layer has its own URL
   rewrite handling and they can fall out of sync after a config
   reload.

## Workarounds in place

The clients we control already retry — the GitHub Actions verify step
(`foundation-docs/.github/workflows/deploy-docs.yml`) now uses
`curl --retry 3 --retry-all-errors --retry-delay 5` plus an outer
30-minute polling window, which is enough to ride out the 404 wave.
But this masks the underlying issue.

## What we don't know

- How exactly the cron driver is set up (it hits
  `/cdcf/v1/process-queue` from `2001:41d0:52:bff::14c`, which is a
  GitHub Actions IPv6 — but no scheduled workflow in our repos calls
  this endpoint, so it must be configured outside the repo, possibly
  via the Plesk panel or a third party).
- Whether the 404 rate correlates with any Plesk maintenance event
  or nginx config reload.
- Plesk's actual nginx + Apache vhost config for `cms.catholicdigitalcommons.org`.

## Suggested next steps

1. Pull the actual nginx config off the production VPS:
   `/etc/nginx/plesk.conf.d/vhosts/cms.catholicdigitalcommons.org*.conf`
   and `/var/www/vhosts/system/cms.catholicdigitalcommons.org/conf/`
2. Check the WordPress permalink rewrite rules are present in the
   `additional nginx directives` slot in the Plesk panel
3. If using Plesk's "static files served by nginx" optimization,
   verify it doesn't intercept `/wp-json/*`
4. Identify the cron driver hitting `/process-queue` (Plesk
   scheduler? Cloudflare Workers? UptimeRobot?) and confirm where it's
   configured

## Severity

Low for now — clients retry, recoveries complete, no end-user impact.
But it's noise in the logs and an availability landmine if/when the
404 rate spikes further.

## Related

- Surfaced during recovery deploys for the foundation-docs translation
  pipeline (see CatholicOS/foundation-docs#22, #23).

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

nginx intermittently 404s wp-json/* routes (tries to serve as static files) #41

Summary

Evidence

Hypotheses

Workarounds in place

What we don't know

Suggested next steps

Severity

Related

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

nginx intermittently 404s wp-json/* routes (tries to serve as static files) #41

Description

Summary

Evidence

Hypotheses

Workarounds in place

What we don't know

Suggested next steps

Severity

Related

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions