Skip to content

Conversation

@smuppand
Copy link
Contributor

This PR updates the Boot Systemd KPI tests to be fully LAVA-friendly by delivering complete, configurable run.sh + YAML test definitions, and aligning CLI options ↔ LAVA variables/env without changing core KPI logic.
Included tests

  • Boot_Systemd_Validate
    • single-boot KPI capture + validation + rich artifacts
    • LAVA YAML added/updated to expose parameters cleanly
  • Boot_Systemd_KPI_Loop
    • multi-boot KPI aggregation wrapper (optional auto-reboot)
    • LAVA YAML added/updated to control iterations, boot type, reboot behavior

@smuppand smuppand force-pushed the systemd-boot branch 2 times, most recently from a396de2 to d26c52c Compare December 22, 2025 16:34
@smuppand smuppand requested a review from vnarapar December 22, 2025 16:37
# Put all CPUs into performance governor, saving previous governor for restore.
# Uses SAVED_GOV_FILE (auto set if not provided).
set_performance_governor() {
SAVED_GOV_FILE="${SAVED_GOV_FILE:-/tmp/perf_saved_governors.$$}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if its already set to performance? should we check that first and return if its already set?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if its already set to performance? should we check that first and return if its already set?

QLI uses the default arm64 debug defconfig, which does not set the performance governor by default.


# Put all CPUs into performance governor, saving previous governor for restore.
# Uses SAVED_GOV_FILE (auto set if not provided).
set_performance_governor() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this fnction be made generic to userspace/shedutil aswell?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this fnction be made generic to userspace/shedutil aswell?

Wht do you mean generic? It is already generic, and run.sh can import these functions by calling lib_performance.sh.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i meant to have a function which will set to performance/userspace or shedutil based on the argument instead of having for performance. Let me know your thought?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i meant to have a function which will set to performance/userspace or shedutil based on the argument instead of having for performance. Let me know your thought?

We can handle this in a separate PR. You can create a GitHub issue ticket for this requirement.

if [ "$disable_getty" = "1" ]; then
systemctl disable serial-getty@ttyS0.service >/dev/null 2>&1 || true
systemctl stop serial-getty@ttyS0.service >/dev/null 2>&1 || true
if command -v log_info >/dev/null 2>&1; then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this be removed to optimize script as log_warn/log_info will always be there and if not execution wont reach till here

timeout="$1"

if ! command -v systemctl >/dev/null 2>&1; then
if command -v log_warn >/dev/null 2>&1; then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this be removed to optimize script as log_warn/log_info will always be there and if not execution wont reach till here

ts=$(nowstamp)
echo "$ts,$bt,$PERF_KPI_ITERATIONS_HINT,$PERF_KPI_CLOCKSOURCE,$PERF_KPI_UEFI_TIME_SEC,$PERF_KPI_FIRMWARE_SEC,$PERF_KPI_BOOTLOADER_SEC,$PERF_KPI_KERNEL_SEC,$PERF_KPI_USERSPACE_SEC,$PERF_KPI_USERSPACE_EFFECTIVE_SEC,$PERF_KPI_BOOT_TOTAL_SEC,$PERF_KPI_BOOT_TOTAL_EFFECTIVE_SEC" >>"$csv" 2>/dev/null || true

if command -v log_info >/dev/null 2>&1; then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this be removed to optimize script as log_warn/log_info will always be there and if not execution wont reach till here

perf_kpi_request_reboot() {
msg=$1

if command -v log_info >/dev/null 2>&1; then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this be removed to optimize script as log_warn/log_info will always be there and if not execution wont reach till here


# If we are still alive after a short delay, try once more.
sleep 5
if command -v systemctl >/dev/null 2>&1; then
Copy link
Contributor

@vnarapar vnarapar Dec 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we try to run in loop with timeout/iteration instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we try to run in loop with timeout/iteration instead?

This function is not completely optimized since full optimization isn't necessary for Boot_Systemd_Validate.

fi

# Boot-id changed → reboot successful, just log it and clear pending flag in state.
if command -v log_info >/dev/null 2>&1; then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this be removed to optimize script as log_warn/log_info will always be there and if not execution wont reach till here

log_warn "Unknown option: $1"
usage >&2
echo "$TESTNAME FAIL" >"$RES_FILE"
exit 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we use exit 0 only?

mkdir -p "$OUT_DIR" || {
log_error "Cannot create $OUT_DIR"
echo "$TESTNAME FAIL" >"$RES_FILE"
exit 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update to exit 0


# --- ensure CPU governors restored on exit ---
cleanup() {
restore_governor
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check if this is reallyrequired as we are not setting the performance governer above


BOOT_NOT_FINISHED=0

if command -v systemd-analyze >/dev/null 2>&1; then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is already checked in check_dependencies and test will be skipped if not present. So this check can be removed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There’s no harm in checking since it won’t be redundant.

fi

# ---------- Failed units + journal ----------
if command -v systemctl >/dev/null 2>&1; then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i see there is a lot of systemctl dependency in the script. why not check in check_dependency and skip the test if not available?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There’s no harm in checking since it won’t be redundant.

# ---------- required units gating ----------
suite_rc=0
if [ -n "$REQ_UNITS_FILE" ]; then
if command -v systemctl >/dev/null 2>&1; then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move to check_dependencies and remove here

Expose wait_analyze_ready() to poll systemd-analyze time safely.
Record systemctl list-jobs whenever boot is not yet finished.
Allow configurable timeout and poll interval via environment variables.
Prepare shared helpers for reuse across performance KPI test suites.

Signed-off-by: Srikanth Muppandam <smuppand@qti.qualcomm.com>
- Replace the ad-hoc systemd-analyze polling loop with the shared helper.
- Capture list-jobs when boot is unfinished and log this explicitly.
- Honor extended boot times on slow platforms before parsing KPIs.
- Preserve existing outputs while avoiding misleading 'boot not finished' results.

Signed-off-by: Srikanth Muppandam <smuppand@qti.qualcomm.com>
Switch auto-reboot orchestration to a oneshot systemd service + timer.
Let the loop manage iterations/state while keeping the boot path clean.
Ensure the KPI service exits quickly so FinishTimestampMonotonic is not blocked.
Document usage while preserving behavior for manual single-shot runs.

Signed-off-by: Srikanth Muppandam <smuppand@qti.qualcomm.com>
Copy link
Contributor

@vnarapar vnarapar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@vnarapar vnarapar merged commit 69d4fbb into qualcomm-linux:main Dec 26, 2025
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants