Skip to content

Add SLO workflow for the Java SDK#644

Open
polRk wants to merge 1 commit intomasterfrom
slo-tests
Open

Add SLO workflow for the Java SDK#644
polRk wants to merge 1 commit intomasterfrom
slo-tests

Conversation

@polRk
Copy link
Copy Markdown
Member

@polRk polRk commented Apr 26, 2026

Summary

Adds two GitHub Actions workflows that drive ydb-platform/ydb-slo-action against the Java SDK. The actual workload sources live in ydb-java-examples (companion PR: ydb-platform/ydb-java-examples#61).

Workflows

.github/workflows/slo.yml

  • Builds Docker images for the current PR and the baseline (auto-detected as merge-base with master, overridable via workflow_dispatch).
  • Hands both images to ydb-slo-action/init@v2 with KV read/write RPS flags.
  • Triggers: PRs labelled SLO, push to master, manual workflow_dispatch.
  • Concurrency-grouped per ref + matrix entry, with cancel-in-progress.
  • Currently a single matrix entry (java-query-kv); easy to extend to other workloads later.

.github/workflows/slo-report.yml

  • Listens via workflow_run for the SLO workflow to finish.
  • On success, calls ydb-slo-action/report@v2 to post the comparison report to the PR.
  • Removes the SLO label from the PR after publishing.

Build script: .github/scripts/build-slo-image.sh

The Java SDK under test cannot be assumed to be published to a remote Maven repository, so the script:

  1. Assembles a temporary build context with ydb-java-sdk/ and ydb-java-examples/ checkouts side by side.
  2. Feeds that context to the Dockerfile shipped in ydb-java-examples/slo/.
  3. The multi-stage Dockerfile then builds the SDK from source, installs it into an in-image local Maven repo, and pins ydb.sdk.version in the examples parent pom to that version before building the workload.

The script accepts a --fallback-image flag for the baseline build (mirroring the equivalent script in ydb-go-sdk) so a historical SDK commit that no longer compiles doesn't break the SLO run.

Triggering convention

Same as in ydb-go-sdk and ydb-js-sdk: a PR opts into SLO testing by getting the SLO label. The label is removed automatically once the report is posted.

Local verification

The workload image was built from the same script and end-to-end tested against ydb-slo-action/deploy/compose.yml (1 storage + 5 database + Prometheus + Grafana + chaos-monkey). Results, including chaos-induced ydb/transport_unavailable errors and a clean p99 latency measurement, are documented in the companion PR.

Companion PR

Workload sources: ydb-platform/ydb-java-examples#61.

Add two GitHub Actions workflows that drive ydb-platform/ydb-slo-action
against the Java SDK:

- slo.yml: builds Docker images for the current PR and the merge-base
  baseline, then hands them to ydb-slo-action/init@v2 with KV
  read/write RPS flags. Triggered on PRs with the `SLO` label, on
  push to master and via workflow_dispatch.
- slo-report.yml: waits on the SLO workflow via workflow_run,
  publishes the comparison report through ydb-slo-action/report@v2,
  and removes the `SLO` label from the PR.

The workload sources live in ydb-platform/ydb-java-examples, in a
new `slo` Maven module. The build script in this commit
(.github/scripts/build-slo-image.sh) assembles a temporary build
context with both checkouts side by side and feeds it to the
Dockerfile shipped with the workload, so the Java SDK under test is
built from source and pinned into the workload build without ever
needing to publish snapshots.

The workflow checks out ydb-java-examples at `master` by default;
`workflow_dispatch` exposes an `examples_ref` input for testing
against unmerged workload changes.
Copilot AI review requested due to automatic review settings April 26, 2026 10:48
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds GitHub Actions automation to run YDB SLO comparisons for the Java SDK by building “current” and “baseline” workload images and publishing a report back to the originating PR.

Changes:

  • Added an SLO workflow to build workload images for current and baseline refs and run ydb-platform/ydb-slo-action/init@v2.
  • Added a workflow_run-based report workflow to publish results via ydb-platform/ydb-slo-action/report@v2 and remove the SLO label.
  • Added a build helper script to assemble a combined Docker build context from ydb-java-sdk and ydb-java-examples.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.

File Description
.github/workflows/slo.yml Defines the SLO run pipeline: dependency setup, baseline selection, image builds, and init@v2 execution.
.github/workflows/slo-report.yml Publishes the SLO comparison report after successful runs and removes the SLO label.
.github/scripts/build-slo-image.sh Builds the workload image from two adjacent checkouts and supports a fallback image for baseline failures.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread .github/workflows/slo.yml

jobs:
ydb-slo-action:
if: contains(github.event.pull_request.labels.*.name, 'SLO') || github.event_name == 'workflow_dispatch' || github.event_name == 'push'
Copy link

Copilot AI Apr 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The job-level if: evaluates contains(github.event.pull_request.labels.*.name, 'SLO') even on push and workflow_dispatch events where github.event.pull_request is null. This can cause an expression evaluation error and skip/break the workflow. Guard the label check behind an github.event_name == 'pull_request' condition (or restructure the expression so it only references github.event.pull_request for PR events).

Suggested change
if: contains(github.event.pull_request.labels.*.name, 'SLO') || github.event_name == 'workflow_dispatch' || github.event_name == 'push'
if: github.event_name == 'workflow_dispatch' || github.event_name == 'push' || (github.event_name == 'pull_request' && contains(github.event.pull_request.labels.*.name, 'SLO'))

Copilot uses AI. Check for mistakes.
Comment thread .github/workflows/slo.yml
Comment on lines +154 to +157
uses: ydb-platform/ydb-slo-action/init@v2
timeout-minutes: 30
with:
github_issue: ${{ github.event.inputs.github_issue }}
Copy link

Copilot AI Apr 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

github_issue is only provided for workflow_dispatch, but this workflow also runs on pull_request/push. For those events, ${{ github.event.inputs.github_issue }} will be empty, which is likely to break ydb-slo-action/init@v2 (and makes push runs impossible to report anywhere). Consider deriving it from the PR number when event_name == 'pull_request', and either require an explicit input (or skip reporting) on push runs.

Suggested change
uses: ydb-platform/ydb-slo-action/init@v2
timeout-minutes: 30
with:
github_issue: ${{ github.event.inputs.github_issue }}
if: ${{ github.event_name == 'workflow_dispatch' || github.event_name == 'pull_request' }}
uses: ydb-platform/ydb-slo-action/init@v2
timeout-minutes: 30
with:
github_issue: ${{ github.event_name == 'workflow_dispatch' && github.event.inputs.github_issue || github.event.pull_request.number }}

Copilot uses AI. Check for mistakes.
Comment thread .github/workflows/slo.yml
Comment on lines +97 to +100
run: |
set -euo pipefail
if [[ -n "${{ inputs.baseline_ref }}" ]]; then
BASELINE="${{ inputs.baseline_ref }}"
Copy link

Copilot AI Apr 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential bash injection: ${{ inputs.baseline_ref }} is interpolated directly into the shell script inside double quotes. A workflow_dispatch input containing a " or newline could break quoting and inject shell syntax. Safer pattern is to pass the input via env: and read it as a normal shell variable (still quoted), or use ${{ toJSON(inputs.baseline_ref) }} style escaping when embedding into bash.

Suggested change
run: |
set -euo pipefail
if [[ -n "${{ inputs.baseline_ref }}" ]]; then
BASELINE="${{ inputs.baseline_ref }}"
env:
INPUT_BASELINE_REF: ${{ inputs.baseline_ref }}
run: |
set -euo pipefail
if [[ -n "${INPUT_BASELINE_REF}" ]]; then
BASELINE="${INPUT_BASELINE_REF}"

Copilot uses AI. Check for mistakes.
Comment thread .github/workflows/slo.yml
Comment thread .github/workflows/slo.yml
Comment on lines +69 to +70
sudo curl -L "https://github.com/mikefarah/yq/releases/download/${YQ_VERSION}/yq_linux_amd64" \
-o /usr/local/bin/yq
Copy link

Copilot AI Apr 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

curl for downloading yq does not use -f/--fail, so a 404/redirect-to-HTML could silently write an invalid binary and only fail later in a less obvious way. Use curl -fLo (consistent with the buildx/compose downloads below) to fail fast on HTTP errors.

Suggested change
sudo curl -L "https://github.com/mikefarah/yq/releases/download/${YQ_VERSION}/yq_linux_amd64" \
-o /usr/local/bin/yq
sudo curl -fLo /usr/local/bin/yq \
"https://github.com/mikefarah/yq/releases/download/${YQ_VERSION}/yq_linux_amd64"

Copilot uses AI. Check for mistakes.
Comment on lines +10 to +23
publish-slo-report:
runs-on: ubuntu-latest
name: Publish YDB SLO Report
permissions:
checks: write
contents: read
pull-requests: write
if: github.event.workflow_run.conclusion == 'success'
steps:
- name: Publish YDB SLO Report
uses: ydb-platform/ydb-slo-action/report@v2
with:
github_token: ${{ secrets.GITHUB_TOKEN }}
github_run_id: ${{ github.event.workflow_run.id }}
Copy link

Copilot AI Apr 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This workflow triggers on every successful SLO run, including push and workflow_dispatch. For non-PR runs there may be no associated PR to comment on, and ydb-slo-action/report@v2 may fail or post nowhere. Consider gating publish-slo-report on github.event.workflow_run.event == 'pull_request' (or otherwise handling the non-PR reporting path explicitly).

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for testing only

@codecov-commenter
Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 70.83%. Comparing base (8556216) to head (248305a).

Additional details and impacted files
@@             Coverage Diff              @@
##             master     #644      +/-   ##
============================================
- Coverage     70.87%   70.83%   -0.04%     
+ Complexity     3314     3310       -4     
============================================
  Files           374      374              
  Lines         15699    15699              
  Branches       1650     1650              
============================================
- Hits          11126    11120       -6     
- Misses         3931     3934       +3     
- Partials        642      645       +3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants