Skip to content

Add raw argument collections for URL-encoded parameters (#3501)#3564

Open
Easton97-Jens wants to merge 7 commits into
owasp-modsecurity:v3/masterfrom
Easton97-Jens:v3/master_pr3501
Open

Add raw argument collections for URL-encoded parameters (#3501)#3564
Easton97-Jens wants to merge 7 commits into
owasp-modsecurity:v3/masterfrom
Easton97-Jens:v3/master_pr3501

Conversation

@Easton97-Jens
Copy link
Copy Markdown
Contributor

Fixes #3501

what

This PR adds raw argument collections for URL-encoded GET and POST parameters.

New collections:

  • ARGS_RAW
  • ARGS_GET_RAW
  • ARGS_POST_RAW
  • ARGS_NAMES_RAW
  • ARGS_GET_NAMES_RAW
  • ARGS_POST_NAMES_RAW

The raw values are captured in Transaction::extractArguments(...) before
utils::urldecode_nonstrict_inplace(...) is applied. Existing ARGS,
ARGS_GET, ARGS_POST, and ARGS_NAMES behavior remains unchanged.

This also includes parser/scanner updates needed to recognize the new
collections and keeps the generated parser sources compatible with modern
Bison/Flex versions.

why

Issue #3501 describes problems caused by relying only on already-decoded
argument collections. In cases such as double-encoded input, rules may need
access to the original parameter name or value before libModSecurity performs
its internal URL decoding.

Adding explicit raw collections allows rules to inspect the pre-decoded
URL-encoded input without changing the behavior of existing collections.

testing

  • Added regression coverage for raw GET and POST arguments
  • Verified encoded names and values remain raw in the new collections
  • Verified existing decoded ARGS* collections still behave as before
  • Verified duplicate arguments and empty values are preserved
  • Verified parser/scanner generation succeeds with Bison/Flex

Easton97-Jens and others added 6 commits May 13, 2026 22:22
Removed a large block of variable evaluation logic and added new variable cases for ARGS_GET_NAMES_RAW and ARGS_POST_NAMES_RAW.
…r-security-hotspots

Use direct empty-string checks instead of `strlen(...) > 0`
@airween airween added the 3.x Related to ModSecurity version 3.x label May 14, 2026
@airween airween requested a review from Copilot May 14, 2026 13:16
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds raw (pre-URL-decoded) argument collections (ARGS_RAW, ARGS_GET_RAW, ARGS_POST_RAW, ARGS_NAMES_RAW, ARGS_GET_NAMES_RAW, ARGS_POST_NAMES_RAW) so rules can inspect original encoded names/values, addressing #3501. The collections are populated in Transaction::extractArguments before URL decoding, and the parser/scanner are updated to recognize the new tokens. The Bison parser sources were also regenerated against a newer Bison version.

Changes:

  • New *_raw variable headers and corresponding members in TransactionAnchoredVariables; extractArguments now stores the raw key/value via a new addRawArgument helper before decoding.
  • Parser/scanner (seclang-parser.yy, seclang-scanner.ll) extended with the new RAW tokens and grammar rules; regenerated seclang-parser.hh updated accordingly; obsolete position.hh/stack.hh removed; configure.ac normalizes bison -y to bison.
  • Added regression test JSON variable-ARGS_RAW.json (and registers it in test-suite.in); minor strlen(buf) > 0buf[0] != '\0' cleanups in pm.cc and modsecurity_test.cc.

Reviewed changes

Copilot reviewed 20 out of 23 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
headers/modsecurity/transaction.h Adds raw AnchoredSetVariable/translation-proxy members and constructor wiring.
src/transaction.cc Captures raw key/value and calls addRawArgument after successful addArgument.
src/transaction_raw_args.h New inline helper that sets the raw variables for ARGS_RAW/ARGS_GET_RAW/ARGS_POST_RAW.
src/variables/args_raw.h, args_get_raw.h, args_post_raw.h, args_names_raw.h, args_get_names_raw.h, args_post_names_raw.h New variable definitions for raw collections.
src/variables/variable.h Adds resolution branches for the new RAW collections.
src/args_names_raw.h Misplaced duplicate of src/variables/args_names_raw.h (flagged).
src/parser/seclang-parser.yy Adds RAW tokens/grammar; updates Bison directives and modernization.
src/parser/seclang-parser.hh Regenerated parser header reflecting new tokens and offsets.
src/parser/seclang-scanner.ll Adds scanner rules for the RAW tokens; small ordering fix.
src/parser/position.hh, stack.hh Removed (obsolete with newer Bison).
src/operators/pm.cc Replaces strlen(...)>0 with [0] != '\0'.
test/common/modsecurity_test.cc Same strlen → first-byte check cleanup.
test/test-cases/regression/variable-ARGS_RAW.json New regression suite covering RAW collections.
test/test-suite.in Registers the new regression test file.
configure.ac Normalizes YACC value when autoconf returns bison -y.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/args_names_raw.h Outdated
@airween
Copy link
Copy Markdown
Member

airween commented May 14, 2026

@Easton97-Jens, thanks again this great PR.

Please take a look at Copilot's suggestions.

Beside of the suggestions, I have a feeling that we should check this modification with a "real-world" example, I mean add an Nginx build to the CI workflow, and try these variables through it. I hope we won't be surprised (eg. the server converts the ARGS and the library will never access them... or similar).

First, let me check this (don't need to add any extra tests).

@sonarqubecloud
Copy link
Copy Markdown

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

3.x Related to ModSecurity version 3.x

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Feature Request: Add ARGS_RAW, ARGS_GET_RAW, ARGS_POST_RAW, and ARGS_NAMES_RAW Variables

3 participants