@w0rk3r
Contributor

@w0rk3r w0rk3r commented Dec 26, 2025

Proposed commit message

windows: refine PowerShell script entropy pipeline

Replace code-point HashMap counting with a fixed 65k UTF-16 char histogram
and skip truncated signature fragments before entropy is computed. Add a
normalized entropy field scaled by script length (0–1).

Summary

Related issue:

This PR:

  • Replaces code‑point HashMap counting with a fixed 65k UTF‑16 char histogram for script entropy, cutting the script processor's average time per document (172.8µs → 53.2µs) and raising throughput from 2924 to 4873 eps in the warm run.
  • Skips truncated signature fragments before entropy is computed.
  • Adds powershell.file.script_block_entropy_normalized = entropy_bits / log2(script_block_length) (0–1).
  • Adds benchmark fixtures to track performance regressions during our research.
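
For reviewers less familiar with the approach, here is a minimal Python sketch of the idea behind the first and third bullets. It is not the pipeline's Painless script: the function names are hypothetical, it assumes `script_block_length` is counted in UTF-16 code units, and the guard for lengths below 2 is an assumption added to avoid dividing by `log2(1) = 0`.

```python
import math


def utf16_units(text: str) -> list[int]:
    """Return the UTF-16 code units of text (a surrogate pair counts as two)."""
    data = text.encode("utf-16-le", errors="surrogatepass")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]


def entropy_bits(text: str) -> float:
    """Shannon entropy in bits per code unit, counted in a fixed 65,536-slot array."""
    units = utf16_units(text)
    n = len(units)
    if n == 0:
        return 0.0
    counts = [0] * 65536  # one slot per possible UTF-16 code unit, no hashing
    for unit in units:
        counts[unit] += 1
    h = 0.0
    for c in counts:
        if c:
            p = c / n
            h -= p * math.log2(p)
    return h


def normalized_entropy(text: str) -> float:
    """entropy_bits / log2(length), which falls in the range 0..1."""
    n = len(utf16_units(text))
    if n < 2:  # assumption: treat empty and single-unit scripts as 0
        return 0.0
    return entropy_bits(text) / math.log2(n)
```

The normalization works because a script of `n` code units can have at most `log2(n)` bits of entropy (every unit distinct), so the ratio stays within 0–1 regardless of script length.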

Old pipeline:

image

Improved pipeline:

image
Complete benchmark output

Old:

PS C:\Users\Jonhnathan\Documents\Github\integrations\packages\windows> .\..\..\elastic-package.exe benchmark pipeline --data-streams powershell_operational --use-test-samples=false
Run pipeline benchmarks for the package
--- Benchmark results for package: windows - START ---
╭─────────────────────────╮
│ parameters              │
├──────────────────┬──────┤
│ source_doc_count │   11 │
│ doc_count        │ 2500 │
╰──────────────────┴──────╯
╭───────────────────────────╮
│ pipeline_performance      │
├─────────────────┬─────────┤
│ processing_time │   1.10s │
│ eps             │ 2278.94 │
╰─────────────────┴─────────╯
╭────────────────────────────────────────╮
│ procs_by_total_time                    │
├───────────────────────────────┬────────┤
│ script @ default.yml:322      │ 47.49% │
│ gsub @ default.yml:305        │ 30.36% │
│ fingerprint @ default.yml:311 │  3.19% │
│ set @ default.yml:60          │  2.10% │
│ script @ default.yml:13       │  1.82% │
│ gsub @ default.yml:316        │  1.09% │
│ script @ default.yml:30       │  1.00% │
│ remove @ default.yml:575      │  0.55% │
│ rename @ default.yml:290      │  0.18% │
│ trim @ default.yml:302        │  0.18% │
╰───────────────────────────────┴────────╯
╭─────────────────────────────────────────╮
│ procs_by_avg_time_per_doc               │
├───────────────────────────────┬─────────┤
│ script @ default.yml:322      │ 208.4µs │
│ gsub @ default.yml:305        │ 133.2µs │
│ fingerprint @ default.yml:311 │    14µs │
│ set @ default.yml:60          │   9.2µs │
│ script @ default.yml:13       │     8µs │
│ gsub @ default.yml:316        │   4.8µs │
│ script @ default.yml:30       │   4.4µs │
│ remove @ default.yml:575      │   2.4µs │
│ rename @ default.yml:290      │   800ns │
│ trim @ default.yml:302        │   800ns │
╰───────────────────────────────┴─────────╯

--- Benchmark results for package: windows - END   ---
Done
--- Benchmark results for package: windows - START ---
╭─────────────────────────╮
│ parameters              │
├──────────────────┬──────┤
│ source_doc_count │   11 │
│ doc_count        │ 2500 │
╰──────────────────┴──────╯
╭───────────────────────────╮
│ pipeline_performance      │
├─────────────────┬─────────┤
│ processing_time │   0.85s │
│ eps             │ 2923.98 │
╰─────────────────┴─────────╯
╭────────────────────────────────────────╮
│ procs_by_total_time                    │
├───────────────────────────────┬────────┤
│ script @ default.yml:322      │ 50.53% │
│ gsub @ default.yml:305        │ 34.15% │
│ fingerprint @ default.yml:311 │  2.57% │
│ gsub @ default.yml:316        │  1.17% │
│ script @ default.yml:13       │  0.70% │
│ set @ default.yml:60          │  0.58% │
│ remove @ default.yml:575      │  0.35% │
│ script @ default.yml:30       │  0.35% │
│ rename @ default.yml:290      │  0.12% │
╰───────────────────────────────┴────────╯
╭─────────────────────────────────────────╮
│ procs_by_avg_time_per_doc               │
├───────────────────────────────┬─────────┤
│ script @ default.yml:322      │ 172.8µs │
│ gsub @ default.yml:305        │ 116.8µs │
│ fingerprint @ default.yml:311 │   8.8µs │
│ gsub @ default.yml:316        │     4µs │
│ script @ default.yml:13       │   2.4µs │
│ set @ default.yml:60          │     2µs │
│ remove @ default.yml:575      │   1.2µs │
│ script @ default.yml:30       │   1.2µs │
│ rename @ default.yml:290      │   400ns │
╰───────────────────────────────┴─────────╯

--- Benchmark results for package: windows - END   ---
Done

Improved:

PS C:\Users\Jonhnathan\Documents\Github\integrations\packages\windows> .\..\..\elastic-package.exe benchmark pipeline --data-streams powershell_operational --use-test-samples=false
Run pipeline benchmarks for the package
--- Benchmark results for package: windows - START ---
╭─────────────────────────╮
│ parameters              │
├──────────────────┬──────┤
│ source_doc_count │   11 │
│ doc_count        │ 2500 │
╰──────────────────┴──────╯
╭───────────────────────────╮
│ pipeline_performance      │
├─────────────────┬─────────┤
│ processing_time │   0.51s │
│ eps             │ 4892.37 │
╰─────────────────┴─────────╯
╭────────────────────────────────────────╮
│ procs_by_total_time                    │
├───────────────────────────────┬────────┤
│ gsub @ default.yml:305        │ 55.19% │
│ script @ default.yml:322      │ 28.18% │
│ fingerprint @ default.yml:311 │  4.11% │
│ gsub @ default.yml:316        │  1.96% │
│ script @ default.yml:13       │  0.59% │
│ remove @ default.yml:657      │  0.39% │
│ rename @ default.yml:290      │  0.20% │
│ set @ default.yml:60          │  0.20% │
╰───────────────────────────────┴────────╯
╭─────────────────────────────────────────╮
│ procs_by_avg_time_per_doc               │
├───────────────────────────────┬─────────┤
│ gsub @ default.yml:305        │ 112.8µs │
│ script @ default.yml:322      │  57.6µs │
│ fingerprint @ default.yml:311 │   8.4µs │
│ gsub @ default.yml:316        │     4µs │
│ script @ default.yml:13       │   1.2µs │
│ remove @ default.yml:657      │   800ns │
│ rename @ default.yml:290      │   400ns │
│ set @ default.yml:60          │   400ns │
╰───────────────────────────────┴─────────╯

--- Benchmark results for package: windows - END   ---
Done
--- Benchmark results for package: windows - START ---
╭─────────────────────────╮
│ parameters              │
├──────────────────┬──────┤
│ source_doc_count │   11 │
│ doc_count        │ 2500 │
╰──────────────────┴──────╯
╭───────────────────────────╮
│ pipeline_performance      │
├─────────────────┬─────────┤
│ processing_time │   0.51s │
│ eps             │ 4873.29 │
╰─────────────────┴─────────╯
╭────────────────────────────────────────╮
│ procs_by_total_time                    │
├───────────────────────────────┬────────┤
│ gsub @ default.yml:305        │ 57.89% │
│ script @ default.yml:322      │ 25.93% │
│ fingerprint @ default.yml:311 │  3.51% │
│ gsub @ default.yml:316        │  1.95% │
│ script @ default.yml:13       │  0.78% │
│ remove @ default.yml:657      │  0.39% │
│ set @ default.yml:60          │  0.19% │
╰───────────────────────────────┴────────╯
╭─────────────────────────────────────────╮
│ procs_by_avg_time_per_doc               │
├───────────────────────────────┬─────────┤
│ gsub @ default.yml:305        │ 118.8µs │
│ script @ default.yml:322      │  53.2µs │
│ fingerprint @ default.yml:311 │   7.2µs │
│ gsub @ default.yml:316        │     4µs │
│ script @ default.yml:13       │   1.6µs │
│ remove @ default.yml:657      │   800ns │
│ set @ default.yml:60          │   400ns │
╰───────────────────────────────┴─────────╯

--- Benchmark results for package: windows - END   ---
Done
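
As a quick sanity check on the headline numbers, the warm-run (second) results above can be compared directly. This is a small Python sketch using only figures copied from the benchmark output; the variable names are illustrative.

```python
# Warm-run figures copied from the second benchmark block of each section.
old = {"eps": 2923.98, "script_us": 172.8}   # old pipeline
new = {"eps": 4873.29, "script_us": 53.2}    # improved pipeline

# Ratio of events per second (higher is better).
throughput_gain = new["eps"] / old["eps"]

# Ratio of average per-document time for `script @ default.yml:322` (lower is better).
script_speedup = old["script_us"] / new["script_us"]

print(f"throughput: {throughput_gain:.2f}x")              # throughput: 1.67x
print(f"script processor: {script_speedup:.2f}x faster")  # script processor: 3.25x faster
```

With the entropy script no longer dominating, `gsub @ default.yml:305` becomes the largest share of total time in the improved runs.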

Checklist

  • I have reviewed tips for building integrations and this pull request is aligned with them.
  • I have verified that all data streams collect metrics or logs.
  • I have added an entry to my package's changelog.yml file.
  • I have verified that Kibana version constraints are current according to guidelines.
  • I have verified that any added dashboard complies with Kibana's Dashboard good practices.

@w0rk3r w0rk3r self-assigned this Dec 26, 2025
@w0rk3r w0rk3r requested review from a team as code owners December 26, 2025 22:21
@w0rk3r w0rk3r added the enhancement, Integration:windows, and Team:Security-Windows Platform labels Dec 26, 2025
@w0rk3r w0rk3r requested review from faec and mauri870 December 26, 2025 22:21
@elasticmachine

Pinging @elastic/sec-windows-platform (Team:Security-Windows Platform)

@elasticmachine

elasticmachine commented Dec 26, 2025

💔 Build Failed

cc @w0rk3r

@pierrehilbert pierrehilbert added the Team:Elastic-Agent-Data-Plane label Jan 4, 2026
@elasticmachine

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

@mauri870 mauri870 self-requested a review January 5, 2026 12:15
Member

@mauri870 mauri870 left a comment

LGTM, but I'm not very proficient with PowerShell. The code looks fine, but it needs a deeper look from the Windows team.
