
Retry initial certs rather than fail. Monitor agent rather than exit.#1721

Merged
terickson-nvidia merged 1 commit into
NVIDIA:mainfrom
terickson-nvidia:tom/otel-agent-service
May 18, 2026

Conversation

@terickson-nvidia
Contributor

Description

Improves the behavior of the otel-agent service: it now remains in the Running state while retrying the initial certs, rather than failing and relying on systemd to retry from the failed state.
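The retry-instead-of-fail behavior described above can be illustrated with a minimal sketch. This is not the code from this PR: the function name `wait_for_certs`, the file paths, and the polling interval are all hypothetical, and it assumes the initial certs show up as files on disk.

```python
import os
import time
from typing import Callable, Sequence


def wait_for_certs(
    cert_paths: Sequence[str],
    poll_interval_s: float = 5.0,
    exists: Callable[[str], bool] = os.path.isfile,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Block until every path in cert_paths exists.

    Instead of exiting with an error (which would leave the unit in a
    failed state and depend on systemd to restart it), the process stays
    alive and polls. Returns the number of retries that were needed.
    """
    retries = 0
    while not all(exists(path) for path in cert_paths):
        retries += 1
        sleep(poll_interval_s)  # hypothetical fixed interval; a backoff would also work
    return retries
```

The `exists` and `sleep` parameters are only there so the loop can be exercised without touching the filesystem or actually waiting.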

Type of Change

  • Add - New feature or capability
  • Change - Changes in existing functionality
  • Fix - Bug fixes
  • Remove - Removed features or deprecated functionality
  • Internal - Internal changes (refactoring, tests, docs, etc.)

Related Issues (Optional)

Breaking Changes

  • This PR contains breaking changes

Testing

  • Unit tests added/updated
  • Integration tests added/updated
  • Manual testing performed
  • No testing required (docs, internal refactor, etc.)

Verified that `sudo crictl ps` reported the container as "Running" both before and after the initial certs were provided. Verified that after `sudo systemctl stop otel-agent.service` the container was no longer listed. Verified that `sudo systemctl status` reported the expected state before and after `systemctl start` and `systemctl stop`.
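The `crictl` part of the manual check above could be scripted along these lines. This is a sketch under assumptions, not something that was run for this PR: it assumes the JSON shape produced by `sudo crictl ps -o json`, and the container name `otel-agent` used in the usage note is a guess.

```python
import json


def container_is_running(crictl_json: str, name: str) -> bool:
    """Return True if the crictl JSON listing contains a running container
    whose metadata name equals `name` (the name is supplied by the caller)."""
    data = json.loads(crictl_json)
    return any(
        c.get("metadata", {}).get("name") == name
        and c.get("state") == "CONTAINER_RUNNING"
        for c in data.get("containers", [])
    )
```

For example, the output of `sudo crictl ps -o json` could be captured with `subprocess.run(..., capture_output=True, text=True)` and passed in as `crictl_json`, once before and once after providing the certs.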

Additional Notes

@terickson-nvidia terickson-nvidia requested a review from a team as a code owner May 15, 2026 18:39
Contributor

@jabdulvahid jabdulvahid left a comment

Looks good to me.

@terickson-nvidia terickson-nvidia merged commit 55cf83e into NVIDIA:main May 18, 2026
43 checks passed

2 participants