Skip to content

fix(authkit): decode HTML-encoded query strings on incoming password reset requests#19

Closed
rahulacaleffi wants to merge 4 commits into
mainfrom
fix/password-reset-url-html-normalize
Closed

fix(authkit): decode HTML-encoded query strings on incoming password reset requests#19
rahulacaleffi wants to merge 4 commits into
mainfrom
fix/password-reset-url-html-normalize

Conversation

@rahulacaleffi
Copy link
Copy Markdown
Collaborator

@rahulacaleffi rahulacaleffi commented May 14, 2026

Fixes issue #259 in the triage spreadsheet.

Summary

Password reset links emailed through WorkOS can arrive at wp-login.php with HTML-encoded ampersands as parameter separators (?workos_action=reset-password&profile=default&token=…).

When a user clicks such a link, PHP's query-string parser treats & as a literal 5-character sequence and produces $_GET['amp;token'] instead of $_GET['token'], leaving reset_token empty and silently stranding the user on the login screen.

Root cause:
WorkOS HTML-encodes the reset URL when embedding it into the email template, inserting it as an href attribute with & escaped to & (standard HTML practice). Most email clients decode this transparently on click. However, when the link is forwarded through security gateways (e.g. Microsoft SafeLinks) or otherwise handled as raw HTML, the literal & can survive into the browser's address bar. PHP then misparses the query string, producing $_GET['amp;token'] instead of $_GET['token'].

Fix
LoginTakeover::maybe_takeover() now calls normalize_query_string() as its first action (before any $_GET reads). If $_SERVER['QUERY_STRING'] contains & or &, it decodes and re-parses the string, merging the correctly-named keys back into $_GET. Existing, correctly-parsed keys are preserved; the stale amp;token / amp;profile keys remain but are harmless.

Scoped to wp-login.php only - the normaliser never runs on other pages.

Tests
Adds 4 wpunit regression tests covering:

  • & encoded separators → reset-confirm step is reached with correct token
  • & numeric encoded separators → same
  • Clean query string → no-op (no context mutation)
  • Encoded separators present but no token → reset-confirm step not started

@rahulacaleffi rahulacaleffi requested a review from bordoni May 14, 2026 23:12
@rahulacaleffi rahulacaleffi self-assigned this May 14, 2026
@rahulacaleffi rahulacaleffi requested a review from redscar May 14, 2026 23:13
Comment on lines +213 to +222
$raw = \wp_unslash( $_SERVER['QUERY_STRING'] ?? '' ); // phpcs:ignore WordPress.Security.ValidatedSanitizedInput.InputNotSanitized
if ( '' === $raw ) {
return;
}
if ( ! str_contains( $raw, '&' ) && ! str_contains( $raw, '&' ) ) {
return;
}
$decoded = html_entity_decode( $raw, ENT_QUOTES | ENT_HTML5, 'UTF-8' );
parse_str( $decoded, $params );
$_GET = array_merge( $_GET, $params ); // phpcs:ignore WordPress.Security.NonceVerification.Recommended
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code could get some use. Its a security nightmare.

@redscar
Copy link
Copy Markdown
Collaborator

redscar commented May 15, 2026

Wanted to drop a discussion point before merging — not sure this is the direction we want either, just want to put it on the table.

What's bothering me about the current shape is that maybe_takeover() is on login_init, so normalize_query_string() runs on every hit to wp-login.php — normal logins, logout, register, lost-password, fallback, confirmaction — even though only the reset-password callback ever needs the repair. And once it does run, it mutates $_GET globally for the rest of the request, so anything that reads $_GET after login_init priority 5 — audit plugins, third-party hooks, WP core code — sees a rewritten request with no way to tell it was rewritten. The html_entity_decode + parse_str + array_merge chain also feels like more machinery than the bug calls for.

Alternative worth chewing on: instead of rewriting $_GET in place, 302 to a canonical URL once. Detect the mangled shape at the entry point, replace & / & / & / & with & in the raw query string, redirect to the same path with the cleaned query. Browser re-requests, PHP parses cleanly, every downstream consumer (our code, WP core, third-party plugins) sees a normal request — no special handling anywhere else.

Rough sketch:

private function maybe_canonicalize_query(): bool {
    if ( 'GET' !== ( \$_SERVER['REQUEST_METHOD'] ?? '' ) ) {
        return false;
    }

    \$raw = (string) wp_unslash( \$_SERVER['QUERY_STRING'] ?? '' );
    if ( false === strpos( \$raw, '&' ) && false === strpos( \$raw, '&#' ) ) {
        return false;
    }

    // Only canonicalize when an entity-mangled \`workos_action\` is present.
    // If \`workos_action\` parsed cleanly, nothing's broken. If it's absent
    // in every form, this request isn't ours to touch.
    \$is_workos = false;
    foreach ( \$_GET as \$key => \$_ ) {
        if ( 'workos_action' === \$key ) {
            return false;
        }
        if ( is_string( \$key ) && preg_match( '/^[A-Za-z0-9#]+;workos_action\$/', \$key ) ) {
            \$is_workos = true;
            break;
        }
    }
    if ( ! \$is_workos ) {
        return false;
    }

    \$canonical = strtr( \$raw, [
        '&'  => '&',
        '&' => '&',
        '&'  => '&',
        '&' => '&',
    ] );
    if ( \$canonical === \$raw ) {
        return false;
    }

    \$path = strtok( (string) wp_unslash( \$_SERVER['REQUEST_URI'] ?? '' ), '?' );
    nocache_headers();
    wp_safe_redirect( \$path . '?' . \$canonical, 302 );
    exit;
}

Then `maybe_takeover()` just calls it first:

public function maybe_takeover(): void {
    \$this->maybe_canonicalize_query();  // exits on redirect

    if ( ! \$this->should_takeover() ) {
        return;
    }
    // ... rest unchanged
}

Trade-off is one extra 302 on the malformed case — which is currently a broken-page experience for the user, so net UX improves. Cost on every clean request is one `strpos` and an early return, basically free. Side benefit: the browser address bar ends up showing the correct URL, so reloads / bookmarks / shares work.

Could be totally wrong about this — happy to stay with an in-place fix if there's a reason a redirect would be worse here. Just wanted to surface it because mutating `$_GET` for every `wp-login.php` hit was the bit that felt off.

@rahulacaleffi
Copy link
Copy Markdown
Collaborator Author

Implemented the replacement of the in-place $_GET mutation with a one-time 302 to a cleaned URL when wp-login.php is hit with HTML entity-encoded separators (&, &, &, &).

@redscar , this aligns with your cleaner suggestion, with one deviation: the detection gate looks at the raw query string only, not at whether workos_action was parsed cleanly. The failing URLs always have workos_action as the first param, so PHP parses it correctly even when profile/token are mangled, so gating on a clean workos_action wouldn't trigger the fix.

@rahulacaleffi rahulacaleffi requested a review from bordoni May 15, 2026 16:19
@rahulacaleffi
Copy link
Copy Markdown
Collaborator Author

Closing PR due to another fix already implemented.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants