tests: migrate to data-driven fixtures for feature presence by williballenthin · Pull Request #2986 · mandiant/capa

williballenthin · 2026-04-02T12:12:05Z

Checklist

No CHANGELOG update needed

No new tests needed

No documentation update needed

This submission includes AI-generated code and I have provided details in the description.

github-actions

Please add bug fixes, new features, breaking changes and anything else you think is worthwhile mentioning to the master (unreleased) section of CHANGELOG.md. If no CHANGELOG update is needed add the following to the PR description: [x] No CHANGELOG update needed

tests/fixtures/feature-presence.json

williballenthin · 2026-04-02T12:14:50Z

tests/fixtures.py

+    if key == "basic block":
+        return capa.features.basicblock.BasicBlock()
+
+    if key.startswith("operand[") and key.endswith("].number"):
+        index = int(key[len("operand[") : -len("].number")])
+        return capa.features.insn.OperandNumber(index, int(raw_value, 0))
+
+    if key.startswith("operand[") and key.endswith("].offset"):
+        index = int(key[len("operand[") : -len("].offset")])
+        return capa.features.insn.OperandOffset(index, int(raw_value, 0))
+
+    if key == "regex":
+        return capa.features.common.Regex(raw_value)
+
+    if key in ("number", "offset"):
+        value = int(raw_value, 0)
+    elif key == "bytes":
+        value = capa.rules.parse_bytes(raw_value)
+    else:
+        value = raw_value
+
+    FeatureClass = capa.rules.parse_feature(key)


i wonder if we can/should merge this functionality into capa.rules.parse_feature

gemini-code-assist

Code Review

This pull request migrates test feature definitions from hardcoded Python lists to a centralized fixtures.json file, enabling a more data-driven testing framework. New utility functions, parse_feature_string and _load_feature_tests, were added to handle the dynamic loading and parsing of these tests. Review feedback highlights a bug in regex pattern initialization, the omission of test marks (skips/xfails) from the loading logic, and opportunities to improve performance and maintainability by sorting all test lists and consolidating file path definitions to avoid duplication.

tests/fixtures.py

gemini-code-assist · 2026-04-02T12:14:57Z

tests/fixtures.py

+
+    for entry in data["features"]:
+        feature = parse_feature_string(entry["feature"])
+        test_tuple = (entry["file"], entry["location"], feature, entry["expected"])


The marks field defined in fixtures.json (e.g., lines 95-106) is currently ignored by the loading logic. These marks are intended to skip or xfail tests for specific backends (like idalib or freeze). Ignoring them may lead to unexpected test failures when running against those backends.

gemini-code-assist · 2026-04-02T12:14:57Z

tests/fixtures.py

+    presence_tests.sort(key=lambda t: (t[0], t[1]))
+    return presence_tests, symtab_tests


Sorting the test cases by file and location helps improve performance by making the LRU cache for extractors more effective. This should be applied to symtab_tests as well for consistency.

Suggested change

presence_tests.sort(key=lambda t: (t[0], t[1]))

return presence_tests, symtab_tests

presence_tests.sort(key=lambda t: (t[0], t[1]))

symtab_tests.sort(key=lambda t: (t[0], t[1]))

return presence_tests, symtab_tests

gemini-code-assist · 2026-04-02T12:14:57Z

tests/fixtures.py

+    elif name.startswith("2d3edc"):
+        return CD / "data" / "2d3edc218a90f03089cc01715a9f047f.exe_"


The file path for 2d3edc is now defined in both fixtures.json (lines 64-67) and hardcoded here. To maintain a single source of truth and fully leverage the data-driven approach introduced in this PR, consider refactoring get_data_path_by_name to load paths from the files section of fixtures.json instead of continuing to add hardcoded entries.

wip

github-actions bot requested changes Apr 2, 2026

View reviewed changes

williballenthin commented Apr 2, 2026

View reviewed changes

tests/fixtures/feature-presence.json Show resolved Hide resolved

williballenthin commented Apr 2, 2026

View reviewed changes

gemini-code-assist bot reviewed Apr 2, 2026

View reviewed changes

williballenthin force-pushed the data-fixtures-2 branch from 71b4687 to dc455d1 Compare April 2, 2026 14:06

williballenthin force-pushed the data-fixtures-2 branch 2 times, most recently from 3c61284 to 463ff59 Compare April 13, 2026 14:48

rules: introduce helper to parse features from parts

ecd85b1

williballenthin force-pushed the data-fixtures-2 branch 2 times, most recently from 283ff40 to 6e4bb6b Compare April 14, 2026 12:46

tests: migrate to data-driven fixtures for feature presence

f8d4c8b

wip

williballenthin force-pushed the data-fixtures-2 branch from 6e4bb6b to f8d4c8b Compare April 14, 2026 12:59

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

tests: migrate to data-driven fixtures for feature presence#2986

tests: migrate to data-driven fixtures for feature presence#2986
williballenthin wants to merge 2 commits intomasterfrom
data-fixtures-2

williballenthin commented Apr 2, 2026 •

edited

Loading

Uh oh!

github-actions bot left a comment

Uh oh!

Uh oh!

williballenthin Apr 2, 2026

Uh oh!

gemini-code-assist bot left a comment

Uh oh!

Uh oh!

gemini-code-assist bot Apr 2, 2026

Uh oh!

gemini-code-assist bot Apr 2, 2026

Uh oh!

gemini-code-assist bot Apr 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

		presence_tests.sort(key=lambda t: (t[0], t[1]))
		return presence_tests, symtab_tests

		elif name.startswith("2d3edc"):
		return CD / "data" / "2d3edc218a90f03089cc01715a9f047f.exe_"

Conversation

williballenthin commented Apr 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Checklist

Uh oh!

github-actions bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

williballenthin Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

gemini-code-assist bot Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

williballenthin commented Apr 2, 2026 •

edited

Loading