Skip to content

Conversation

@harp-intel
Copy link
Contributor

@harp-intel harp-intel commented Dec 18, 2025

This pull request introduces several improvements and refactorings to signal handling, duration flag behavior, and command execution throughout the codebase. The most significant changes are the addition of a robust signal handler to ensure proper cleanup of child processes (especially when running in parallel or on remote targets), and updates to how duration flags are handled for indefinite runs. Additionally, the interface for running commands on targets has been simplified.

Signal Handling and Process Cleanup:

  • Added a new configureSignalHandler function in internal/common/common.go to catch SIGINT and SIGTERM, send signals to child processes and remote collection scripts, and wait for cleanup before exiting. This ensures proper resource management, especially when running scripts in parallel or remotely. [1] [2]

  • Integrated the signal handler into the main data collection flow, ensuring it is set up when collecting from targets.

Duration Flag and Indefinite Collection:

  • Changed the default value of the duration flag in both flamegraph and telemetry commands from 30 to 0, allowing for indefinite collection by default. Updated help text to clarify that a duration of 0 means the collection will run until interrupted. [1] [2] [3]

  • Updated validation logic to allow a duration of 0 (indefinite), and removed the restriction that prevented indefinite duration on remote targets for telemetry. [1] [2]

  • Enhanced the collection status messaging to indicate when Ctrl+C can be used to stop an indefinite collection. [1] [2]

Command Execution Interface Simplification:

  • Refactored calls to RunCommand throughout the codebase by removing unused parameters, simplifying the interface for running commands on targets. This affects several files, including cmd/metrics/metadata.go, cmd/metrics/nmi_watchdog.go, cmd/metrics/process.go, and internal/common/targets.go. [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13]

Code Organization and Imports:

  • Added missing imports (context, os/exec, time) required for new signal handling and process management functionality. [1] [2]

Other Minor Refactoring:

  • Cleaned up the initialization of variables in internal/common/common.go for improved readability.

These changes collectively improve the robustness of signal handling, make it easier to run indefinite collections, and simplify the code for executing commands on targets.

Signed-off-by: Harper, Jason M <jason.m.harper@intel.com>
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request enables indefinite duration telemetry collection on remote targets by improving signal handling and process management. The key changes include:

  • Implementing robust signal handling to properly propagate SIGINT/SIGTERM to child processes, particularly for parallel script execution via parallel_master.sh
  • Simplifying the command execution API by introducing separate RunCommand and RunCommandEx methods, removing unused parameters from numerous call sites
  • Changing the default telemetry duration from 30 seconds to 0 (indefinite) and relaxing validation requirements for remote target collection

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
internal/target/target.go Splits RunCommand interface into simple and extended versions, adds newProcessGroup parameter
internal/target/remote_target.go Implements new RunCommand and RunCommandEx methods, updates all call sites to use simplified API
internal/target/local_target.go Implements new RunCommand and RunCommandEx methods with process group support
internal/target/helpers.go Adds newProcessGroup parameter to command execution helpers, implements process group isolation via Setpgid
internal/script/script.go Updates script execution to use RunCommandEx with new process group for parallel_master.sh, adds PID file creation
internal/common/targets.go Updates all RunCommand call sites to use simplified API without unused parameters
internal/common/common.go Adds configureSignalHandler function for robust signal propagation and cleanup, integrates with reporting workflow
cmd/telemetry/telemetry.go Changes default duration from 30 to 0 seconds, removes validation requiring positive duration for remote targets
cmd/metrics/process.go Updates RunCommand call sites to use simplified API
cmd/metrics/nmi_watchdog.go Updates RunCommand call sites to use simplified API
cmd/metrics/metadata.go Updates RunCommand call sites to use simplified API

Signed-off-by: Harper, Jason M <jason.m.harper@intel.com>
Signed-off-by: Harper, Jason M <jason.m.harper@intel.com>
…ut duration

Signed-off-by: Harper, Jason M <jason.m.harper@intel.com>
Signed-off-by: Harper, Jason M <jason.m.harper@intel.com>
Signed-off-by: Harper, Jason M <jason.m.harper@intel.com>
@harp-intel harp-intel changed the title enable indefinite duration telemetry collection on remote targets enable indefinite duration telemetry and flamegraph collection on remote targets Dec 22, 2025
@harp-intel harp-intel requested a review from Copilot December 22, 2025 17:39
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 4 comments.

Signed-off-by: Harper, Jason M <jason.m.harper@intel.com>
Signed-off-by: Harper, Jason M <jason.m.harper@intel.com>
Signed-off-by: Harper, Jason M <jason.m.harper@intel.com>
Signed-off-by: Harper, Jason M <jason.m.harper@intel.com>
Signed-off-by: Harper, Jason M <jason.m.harper@intel.com>
Signed-off-by: Harper, Jason M <jason.m.harper@intel.com>
@harp-intel harp-intel merged commit c4c7f11 into main Dec 24, 2025
5 checks passed
@harp-intel harp-intel deleted the any_duration_telemetry branch December 24, 2025 17:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Enable unlimited duration for telemetry command when collecting from remote targets

2 participants