Skip to content

Add FW-Compatible Bare Metal Support#95

Open
JustinKovacich wants to merge 100 commits into
mainfrom
feature/phase17_cleanup
Open

Add FW-Compatible Bare Metal Support#95
JustinKovacich wants to merge 100 commits into
mainfrom
feature/phase17_cleanup

Conversation

@JustinKovacich
Copy link
Copy Markdown
Contributor

@JustinKovacich JustinKovacich commented Apr 28, 2026

This update does a massive refactor on the underlying Simple Someip components to allow compilation on bare metal with no allocations and allowing traits to be used to handle alternative socket implementations.

JustinKovacich and others added 30 commits April 24, 2026 18:40
Covers the send path end-to-end (SOME/IP envelope, SD flags, OfferService
entry fields, and the IPv4 endpoint option) plus session-id advancement
and wrap-through-zero exercised via send_offer_service itself, and a
smoke test for Server::start_announcing. All `#[ignore]`d pending the
loopback MULTICAST-flag fix on this branch; without that fix, hosts
drop the multicast packet silently and the tests time out on recv.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Rustfmt on stable wraps the let-else continue blocks and the
assert_offer_matches signature differently than I hand-wrote.
Let cargo fmt normalize the style.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Per AUTOSAR SOME/IP-SD, the reboot bit on emitted SD messages must
flip from RecentlyRebooted to Continuous once the session counter
wraps past 0xFFFF. SdStateManager already owns the counter, so it
also tracks wrap (new has_wrapped: AtomicBool latched exactly on
the 0xFFFF -> 0x0001 transition) and exposes reboot_flag() as the
single source of truth.

The four SD emission paths — SdStateManager::send_offer_service,
Server::send_unicast_offer, send_subscribe_ack_from_view, and
send_subscribe_nack_from_view — all now consume the tracked flag
instead of hardcoding Flags::new(true, true).

Coverage: four new non-ignored unit tests cover the state machine
(fresh, sub-wrap, exactly-on-wrap, monotonic-after-wrap);
assert_offer_matches takes an expected RebootFlag; the existing
wrap multicast test now asserts first-emit=RecentlyRebooted and
second-emit=Continuous across the boundary.

Responds to PR #75 feedback on src/server/sd_state.rs:90.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Round-2 Copilot review caught a real correctness bug in the prior
reboot-flag refactor: the four SD emission paths read
`reboot_flag()` BEFORE advancing the session counter, so the
message whose session_id crosses 0xFFFF -> 0x0001 (where
`has_wrapped` actually latches) still advertises
`RebootFlag::RecentlyRebooted`. The flip only lands on the NEXT
emission — violating AUTOSAR SOME/IP-SD semantics that say the
wrap message itself should carry `Continuous`.

Reordered in all four sites: call `next_session_id()` first so
`has_wrapped` latches, then read `reboot_flag()` for this specific
message. Sites:

- `SdStateManager::send_offer_service` (sd_state.rs)
- `Server::send_unicast_offer` (mod.rs)
- `Server::send_subscribe_ack_from_view` (mod.rs)
- `Server::send_subscribe_nack_from_view` (mod.rs)

Added short comments at each site pointing at the canonical
ordering note on `send_offer_service`.

Also reworded the multicast-loopback `#[ignore]` comment block and
per-test message to remove the stale branch-name reference
(`feature/firmware_someip_conversion`) — the underlying dependency
is the `lo` MULTICAST flag, not a branch-specific fix. New wording
says "skipped on hosts whose `lo` lacks the MULTICAST flag" with
the `ip link show lo` diagnostic pointer.

Coverage: the existing ignore-gated wrap test
`send_offer_service_wraps_session_id_through_zero_on_send` already
asserts the pre-wrap/post-wrap flag transition on-the-wire; with
the ordering fix it now passes in environments that run ignored
tests (would have FAILED before this commit — which is why the
bug slipped past the first round). The non-ignored state-machine
tests (`reboot_flag_flips_to_continuous_exactly_on_wrap` et al.)
are unaffected and still green.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Copilot round-3 flagged the `#[ignore]` reason on
start_announcing_emits_first_offer_within_timeout for still
carrying a branch-specific phrase ("re-enable after lo fix on
this branch"), which becomes stale once merged. Replaced with
a durable prerequisite description:

  requires loopback multicast support (MULTICAST on lo)

Matches the companion rewording in a638a4b on the sd_state.rs
multicast-loopback harness comment block.

Addresses Copilot comment 3132878961.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- inner.rs: on request_queue overflow, reject the ControlMessage by
  sending Err(Error::Capacity("request_queue")) on every oneshot it
  carries (via new ControlMessage::reject_with_capacity helper) rather
  than silently dropping it — the drop previously cancelled the oneshot
  and panicked callers awaiting with .unwrap().
- inner.rs: on pending_responses.insert saturation, destructure the
  returned (request_id, response) and send
  Err(Error::Capacity("pending_responses")) on the response sender;
  previously the sender was dropped, panicking PendingResponse::response.
- mod.rs: update the send_to_service saturation doc to match the new
  explicit-error behavior.
- mod.rs: fix the module-header footprint doc — SESSION_CAP lives in
  session.rs, not inner.rs.
- session.rs: gate the saturation warn! behind a one-shot saturation_warned
  latch; previously the warning fired on every check() against every new
  key once the map was full, which meant log-spam at the packet rate.
- error.rs: drop #[non_exhaustive] on client::Error; downstream crates
  relying on exhaustive matches shouldn't silently break before a planned
  breaking release.

New tests: reject_with_capacity_notifies_every_sender covers every
ControlMessage variant (including SendToService's dual senders);
capacity_overflow_warns_only_on_first_hit locks in the saturation latch.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Commit cb1d0d1 added explicit capacity-error delivery when
pending_responses.insert overflows — without the explicit Err send,
the dropped Sender would cause PendingResponse::response().await to
panic on RecvError instead of surfacing a clean Result.

The recovery logic was inline in the SendToService run-loop arm,
which made it hard to exercise without driving 64 live sockets. Lift
it to a small `track_or_reject_pending_response` helper on Inner so
the SendToService arm delegates, and the branch is testable against
the exposed `pending_responses` map directly.

Added two tests:
- `track_or_reject_pending_response_inserts_when_room_available`:
  happy path — entry lands in the map, the sender is still live
  (receiver stays pending).
- `track_or_reject_pending_response_rejects_on_saturation`: fill
  map to PENDING_RESPONSES_CAP, invoke the helper with one more
  request, assert the map is unchanged and the caller's receiver
  resolves to Err(Error::Capacity("pending_responses")) — which is
  the invariant cb1d0d1 guarantees.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Round-2 Copilot feedback on PR #76 caught 5 resolvable issues (6th
rejected — see below):

- src/client/inner.rs: the three `push_front(...).ok()` re-enqueue
  sites (SetInterface x2, SendSD, Subscribe) silently dropped the
  returned Err. Swapped each to `if let Err(rejected) = ... { ...
  rejected.reject_with_capacity("request_queue"); }`, matching the
  primary `push_back` overflow arm. These branches are defensive —
  by construction the slot we just popped is free — but a future
  refactor that changes queue usage would otherwise reintroduce the
  silent-drop-of-oneshot-senders regression cb1d0d1 was specifically
  written to prevent.

- src/server/subscription_manager.rs: `list.push(...).ok()` on a
  freshly-allocated `SubscribersList` (first subscriber in a new
  event group) swapped to `.expect(...)` with a message naming the
  `SUBSCRIBERS_PER_GROUP >= 1` invariant. Tripwires a future cap-0
  regression at test time instead of silently losing the only
  subscriber.

- src/client/session.rs: reword the `SessionTracker` doc comment's
  reference to non-existent "module docs" — point directly at the
  `SESSION_CAP` constant instead.

Rejected: src/client/error.rs `Capacity` variant as a breaking
exhaustive-match break. This is consistent with the earlier
decision recorded on #75/#80 to accept the breaking change rather
than carry deprecated shims or wrap everything in `Error::Io` —
the current release is a breaking version bump by design, and
hiding the variant in `Io` would lose the lowercase-snake_case tag
semantics the new error was specifically designed to carry.

No production behavior changes — just defensive tripwires, docs.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Address two clippy::pedantic warnings introduced by this branch's
own commits:

- missing_panics_doc on SubscriptionManager::subscribe — the
  heapless::Vec::push.expect on the first-insert path can only
  trip if SUBSCRIBERS_PER_GROUP is set to zero; document that.
- cast_possible_truncation on 'i as u32' in the saturation test
  for pending_responses — use u32::try_from with an expect that
  documents why the 64-cap fits.
Copilot round-3 on PR #76 (comment 3138669839) caught a real gap
in `track_or_reject_pending_response`: the helper only handled
`Err((key, value))` (map-full + new-key), but `heapless::IndexMap::
insert` on an existing key returns `Ok(Some(old_sender))` and the
old sender was silently dropped. If `request_id` is ever reused
while an older pending entry is still live — the documented
example is `session_counter` wrap-around — the caller awaiting
the original request would see a `RecvError` on channel
cancellation, which `PendingResponse::response()` turns into a
panic.

Reworked the `if let Err(...)` into a full `match` with all
three arms:

- `Ok(None)`            — normal insert, nothing to do.
- `Ok(Some(displaced))` — `warn!` that the slot was replaced,
                          complete the displaced sender with
                          `Err(Error::Capacity("pending_responses"))`
                          so the original caller gets a clean
                          `Result`, not a `RecvError` panic.
- `Err((_req_id, r))`   — existing saturation path (unchanged).

Also updated the doc comment on `track_or_reject_pending_response`
to describe the displacement contract.

Coverage: new test
`track_or_reject_pending_response_completes_displaced_sender`
explicitly inserts the same key twice and asserts the first
receiver resolves to `Err(Error::Capacity("pending_responses"))`
before the second sender still sits pending in the map.

Addresses Copilot comment 3138669839.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The old docstring explained why client::Error is not #[non_exhaustive]
but didn't acknowledge that adding variants to a non-#[non_exhaustive]
enum is itself a breaking change for downstream exhaustive matches.
Rewrite the note to call that out explicitly and flag future
#[non_exhaustive] as a planned breaking-release change, and record the
new Capacity variant under the Unreleased Added section.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
SubscriptionManager::subscribe previously returned () and logged a
warn! on capacity failure, which let handle_sd_message send
SubscribeAck to a client whose subscription had silently been dropped
— the client would believe it was subscribed but never receive any
events.

- subscribe now returns Result<(), SubscribeError> with
  SubscribersPerGroupFull / EventGroupsFull variants; existing
  capacity/refresh semantics are unchanged.
- handle_sd_message inspects the result and emits SubscribeNack with a
  reason derived from the SubscribeError variant on rejection.
- EventPublisher::register_subscriber surfaces the same Result so
  external SD dispatchers can take the same corrective action.
- SubscribeError is re-exported from server::mod.
- Tests updated to consume the Result; subscription_manager overflow
  tests now assert the specific error variant.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ording

- session::check: expand the doc comment with an explicit "Capacity
  behavior" section — the saturation path does not update stored state
  for new keys, which contradicts the old "Always updates the stored
  state" wording. Describe the actual behavior (normal path updates,
  new-key inserts under saturation are silently dropped and return
  Initial repeatedly) so callers don't rely on a strict guarantee the
  implementation doesn't hold.
- session::check saturation warn!: include sender + transport in the
  log line so diagnosing which peer lost reboot-detection state no
  longer requires out-of-band correlation.
- SubscriptionManager::subscribe dedup branch: rename the log from
  "Refreshed existing subscriber" to "already subscribed; skipping"
  and add a comment making it explicit that no per-subscriber state
  is modified. The old wording implied a refresh (TTL bump, etc) the
  code never performed.
- inner.rs test comment: tokio::sync::oneshot doesn't drop the sender
  when the receiver is dropped — it flips send() to a failure mode.
  Rewrite the comment to describe the actual stash purpose (keep
  channels open for later observation).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- server::run NACK arm: stop allocating a String with format!() held
  across the SubscribeNack await; log the SubscribeError at warn!
  separately and pass a static "subscription rejected" reason string
  to send_subscribe_nack_from_view. Cleaner and avoids the borrow-
  across-await style issue Copilot flagged.
- EventPublisher::register_subscriber: add # Errors section describing
  the two SubscribeError variants, that the subscriber is NOT
  registered on Err, and that external dispatchers should NACK on Err
  just like the server's own run() loop does.
- CHANGELOG: add server::SubscribeError to Added, and list the
  breaking signature changes on SubscriptionManager::subscribe and
  EventPublisher::register_subscriber under Changed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ubscribe docs

- CHANGELOG.md: drop the two "Changed" bullets about `std` becoming the
  default feature and `thiserror`/`tracing` moving to
  `default-features = false`. Those landed in 0.6.0/0.6.1 (commit
  f161980, already on main) and are not changes made in this PR.
- server/mod.rs: when a SubscribeError rejects a subscription, match on
  the variant and pass a specific `&'static str` reason to
  `send_subscribe_nack_from_view` (`"subscribers_per_group_full"` /
  `"event_groups_full"`) instead of the generic
  `"subscription rejected"`. The NACK log line now reflects the real
  cause, and the static-str choice avoids any String-across-await
  allocation.
- server/subscription_manager.rs: rewrite the `subscribe` docs to say
  the duplicate path is idempotent / deduplicated rather than implying
  TTL-refresh semantics that don't exist today. Flag the future-TTL
  extension point explicitly so the doc and the log wording stay in
  sync if that behavior is added later.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Commit 343da67 added a `required_size() > UDP_BUFFER_SIZE` pre-check
to `EventPublisher::publish_event` but left the new branch uncovered.
Regression guard added as `publish_event_pre_encode_exceeds_udp_buffer_returns_capacity_error`:
registers a subscriber (the pre-check sits after the `subscribers.is_empty()`
early return, so the test needs one or else hits the false-positive
Ok(0) path), constructs a 1501-byte fixture (16-byte header + 1485-byte
payload, one over the cap), calls publish_event, asserts
Err(Error::Capacity("udp_buffer")). Mirrors the fixture pattern from
`send_raw_message_exceeding_udp_buffer_returns_capacity_error` on the
client side.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- src/lib.rs: UDP_BUFFER_SIZE doc now enumerates exactly which send
  paths honor this cap (SocketManager::send, publish_event,
  publish_raw_event) and explicitly calls out the SD announcement /
  SubscribeAck / SubscribeNack paths that still use heap Vec buffers
  as a known gap planned for the bare-metal no_alloc refactor.
- src/server/event_publisher.rs: reworded "stack buffer sized to MTU"
  comments at the two buffer-allocation sites — the buffer lives in
  the async future's state, not literally on the stack, and the cap
  is a UDP payload limit, not an Ethernet MTU. New wording points at
  the UDP_BUFFER_SIZE docs for the distinction.

The two `use std::vec` comments (event_publisher.rs:335,
socket_manager.rs:362) were verified to be false positives: removing
the imports breaks the lib-test build with 4 errors about `vec!`
macro not in scope. Same no_std mechanics as the prior #75-1
resolution — reply posted on the comment threads.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Three doc-only Copilot round-3 nits on PR #77:

- src/client/mod.rs: the per-`SocketManager` memory-footprint blurb
  implied the second `[u8; UDP_BUFFER_SIZE]` is always allocated.
  In fact `socket_loop_future` only allocates that scratch buffer
  when the send path actually needs E2E protection (the destination
  key is in the `E2ERegistry`); plain sends pay only ~1.5 KiB.
  Reworded the always-live vs peak budget so the ~24 KiB number is
  no longer presented as the steady-state cost.

- src/client/socket_manager.rs: the E2E-overflow test comment said
  "8 over MTU", but `UDP_BUFFER_SIZE` is documented as a UDP payload
  cap, not an Ethernet-MTU-safe size. Reworded to "8 bytes over
  UDP_BUFFER_SIZE".

- src/lib.rs: the `UDP_BUFFER_SIZE` doc referenced bare
  `Error::Capacity("udp_buffer")`, which is ambiguous at the crate
  root (no `crate::Error` exists). Qualified to
  `client::Error::Capacity(...)` / `server::Error::Capacity(...)`.

Addresses Copilot comments 3138697909, 3138698042, 3138698156.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…event

Adds a unit test that registers a Profile4 E2E profile for the message
key and publishes a message whose raw encoded size fits UDP_BUFFER_SIZE
(1496 bytes) but whose protected size does not (1508 bytes, after the
12-byte Profile4 header). Asserts that publish_event returns
Error::Capacity("udp_buffer") — exercising the post-protect guard that
was previously only covered on the client send path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- publish_event: log now says "E2E-protected datagram (... header +
  protected payload)" so the 16+protected_len value is identified as
  the full SOME/IP datagram size, not the payload.
- test fixture comment: "8 over MTU" → "8 bytes over UDP_BUFFER_SIZE"
  for terminology consistency with the rest of the PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…_len

`header_len + payload.len()` used unchecked `usize` addition. On a
system with large enough `payload.len()` the sum can wrap, silently
bypassing the `> UDP_BUFFER_SIZE` guard and corrupting the slice
operations that follow. Switch to `checked_add` and treat overflow the
same as exceeding `UDP_BUFFER_SIZE` — return
`Error::Capacity("udp_buffer")`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Client/server "E2E-protected payload" logs now say
  "E2E-protected datagram (header + protected payload)" since the
  logged value is the full SOME/IP datagram size, not the payload.
- publish_raw_event checked_add-fail log now describes the real
  condition (usize overflow) and includes the input lengths, instead
  of falsely pointing at UDP_BUFFER_SIZE.
- Oversize fixtures in socket_manager + event_publisher tests are now
  sized from UDP_BUFFER_SIZE (and PROFILE4_HEADER_SIZE implicitly for
  the E2E case) instead of hardcoded 1480/1485. Fixtures stay valid
  if the cap is retuned.
- client/mod.rs memory-footprint doc no longer hardcodes "(1500
  bytes)" or "~1.5 KiB" as load-bearing numbers; the scaling is now
  expressed in terms of UDP_BUFFER_SIZE with the current-default
  numbers as a parenthetical reference.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The module-level "# `Send` and multithreaded executors" section showed
a HRTB bound on `<T as TransportSocketSendFut<'a>>::Fut: Send` as the
way consumers should bind `Send`. No such trait exists in this crate
— with RPITIT the returned future type is anonymous and cannot be
named, and introducing a GAT-style escape hatch would pollute the
trait for the common single-threaded case.

Replaced with the reviewer-preferred pattern: wrap the call in an
`async move` block and require `T: Send + 'static` on the captured
state. A tokio-backed implementation whose underlying `UdpSocket` is
already `Send + Sync` produces `Send` futures automatically via
async-block capture inference, so no trait-level bound is required.
Implementations holding `!Send` state fail the `T: Send` bound at
the `tokio::spawn` call site, which is the actionable location.

Docs-only change; `cargo test --doc` passes on the new ignore-fenced
example.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 28, 2026 19:28
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 51 out of 53 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/traits.rs
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 51 out of 53 changed files in this pull request and generated 4 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread tests/static_channels_alloc_witness.rs
Comment thread tests/no_alloc_witness.rs
Comment thread CHANGELOG.md Outdated
Comment thread src/tokio_transport.rs Outdated
- cargo fmt: remove extra blank lines left by deleted deliver_inbound blocks
- static_channels_alloc_witness: fix typo "heap-back" -> "heap-backed"
- no_alloc_witness: doc says "panic"; impl actually calls process::abort()
- CHANGELOG: bare_metal feature desc incorrectly listed EmbassySyncChannels;
  EmbassySyncChannels is gated by embassy_channels (which implies bare_metal)
- CHANGELOG: document Server::unicast_local_addr breaking return-type change
  (Result<_, std::io::Error> -> Result<_, server::Error>)
- tokio_transport: bind impl missing explicit + Send; add for clarity
- tokio_transport: comment said bare_metal gates embassy_channels module;
  correct to embassy_channels feature
- event_publisher: MAX_FANOUT duplicated SUBSCRIBERS_PER_GROUP; remove
  MAX_FANOUT and use pub(crate) SUBSCRIBERS_PER_GROUP from subscription_manager

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 51 out of 53 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread tests/no_alloc_witness.rs
Comment thread src/tokio_transport.rs
…9 Low)

Critical:
- C1: gate StaticE2EHandle/StaticE2EStorage behind cfg(all(bare_metal, std));
  AtomicInterfaceHandle remains no_std. cargo build --no-default-features
  --features bare_metal now compiles. CI gate added.
- C2: bump version to 0.8.0 so cargo-semver-checks classifies the breaking
  changes correctly; adds matching CHANGELOG section header.
- C3: fix static-pool first-claim race in OneshotPool/MpscPool ensure_seeded
  (concurrent first claimers no longer panic with "pool exhausted"). New
  regression test asserts 4 concurrent first claims all succeed.

High:
- H1: replace single-slot AtomicWaker with MultiWakerRegistration<8> on the
  bounded send-close path in both static_channels and embassy_channels;
  cloned senders blocked on a full channel are all woken on receiver drop.
  New regression test covers multi-sender wake.
- H2: pack (session_id, has_wrapped) into one AtomicU32 in SdStateManager;
  concurrent emitters around the 0xFFFF -> 0x0001 wrap boundary can no
  longer disagree. New stress test runs 32 concurrent emitters across 20
  trials and asserts the (sid, flag) invariant.
- H3: handle_sd_message now rolls back a committed subscription when the
  ACK send fails, and never propagates transient SD-socket I/O errors via
  ?, so a single SD hiccup cannot tear down run().
- H4: announcement_loop is now idempotent — second call returns
  Error::Io(InvalidInput) via an AtomicBool latch.
- H5: validate event_group_id against ServerConfig::event_group_ids in the
  Subscribe handler; unknown groups now NACK with "unknown_event_group"
  instead of being silently ACKed (opt-in via populated Vec).
- H6: convert Timer::sleep and TransportFactory::bind to GAT-based
  associated future types. Multi-threaded callers add
  for<'a> F::BindFuture<'a>: Send + for<'a> Tm::SleepFuture<'a>: Send;
  bare-metal !Send backends are no longer blocked. TokioTransport gets
  named TokioBindFuture and TokioSleep; tests use BoxFuture / Ready.
- H7: SocketOptions::multicast_loop_v4 is now Option<bool>. Pinning an
  outbound interface no longer silently disables IP_MULTICAST_LOOP when
  the caller had no opinion.
- H8: receive_any_unicast and receive_discovery now evict dead socket
  managers (poll_receive returns Ready(None)) instead of busy-looping
  on Error::SocketClosedUnexpectedly.
- H9: re-enqueued Subscribe carries the just-bound unicast_port, so
  pass-2 hits the bind_unicast dedupe instead of leaking another
  ephemeral socket.
- H10: split recv-error counter into transient/fatal classes via
  IoErrorKind::is_transient_recv. Inbound ICMP storms (ConnectionRefused),
  WouldBlock, Interrupted, TimedOut, NetworkUnreachable no longer count
  toward MAX_CONSECUTIVE_RECV_ERRORS. Added IoErrorKind::WouldBlock
  variant.
- H11: rewrite intra-doc links that target feature-gated items as
  code literals. cargo doc with partial feature subsets is now
  warning-free; CI runs --features client and --features server,bare_metal
  doc builds with -D warnings.
- H12: publish_event / publish_raw_event now return Err(Transport(_))
  when every send failed, instead of masking total failure as Ok(0).

Medium:
- M1: rephrase CHANGELOG known-issue bullet to point at .config/nextest.toml
  (which serializes the client_server suite) instead of stale
  --test-threads=1 advice.
- M3: clear stale waker registrations on slot release in OneshotPool /
  MpscPool so the next tenant's first registration cannot poke a
  defunct task.
- M4: Client::set_interface(current_iface) is now a no-op; previously
  it silently bound the discovery socket as a side effect.
- M5: SocketManager::shut_down drains the receiver until None instead
  of returning after one buffered message, ensuring the loop has
  actually dropped the underlying socket before we proceed.
- M6: drop dead "overflow" branch in publish_event / publish_raw_event
  and add a const_assert tying the snapshot buffer cap to
  SUBSCRIBERS_PER_GROUP.
- M8: document that register_e2e / unregister_e2e bypass the run-loop
  control channel and are therefore not subject to Error::Shutdown.
- M9: Inner SendToService advances session_counter only on Ok send,
  so transient transport failure cannot chew through 16-bit space.
- M10: lib.rs feature table now spells out that bare_metal alone is
  no_std-friendly, StaticE2EHandle additionally requires std, and
  embassy_channels users on no_std must wire up #[global_allocator].
- M11/M13: rewrite client::Error::Capacity tag list with one-line
  semantics for each tag and a note that "udp_buffer" can fire
  post-E2E-protect.

Low:
- AtomicInterfaceHandle uses Release/Acquire instead of Relaxed.
- TokioSpawner::spawn wraps its future in catch_unwind and
  tracing::error!-logs panics so they are visible in the operator's
  log pipeline.
- IoErrorKind::WouldBlock added; map_io_error routes std::io::ErrorKind::WouldBlock to it.
- StaticUnboundedSender::send_now docstring documents the unified
  Err(value) for "closed" vs "full".
- no_alloc_witness ARMED uses Acquire load (matches SeqCst stores)
  for weak-memory correctness.
- transport.rs:1056 stale ControlMessage link rewritten as code literal.

Deferred (with rationale documented in code/CHANGELOG):
- M2 Client run-loop alloc witness — needs a custom no-alloc spawner
  harness; the existing static_channels_alloc_witness covers the
  channel layer.
- L: configurable client_id, session_id move out of SocketManager,
  drop unused ChannelFactory bounds, route MTU through
  max_datagram_size — substantive API changes flagged for follow-up.

Verification:
- cargo fmt --check clean
- cargo clippy --all-features --all-targets -- -D warnings -D clippy::pedantic clean
- cargo doc --no-deps --all-features and partial-feature subsets clean
- cargo nextest run --all-features: 524/524 pass, 8 skipped
- cargo semver-checks check-release: no semver update required (0.7.0 -> 0.8.0)
- 13-config build matrix: all green, including standalone bare_metal

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ture

The previous commit landed the GAT-based future types on TransportFactory
and Timer (H6 from the adversarial review), but missed the workspace
example crates: cargo clippy --workspace --all-features (CI's command)
exercises every workspace member, including the example binaries, so
their mock TransportFactory / Timer impls also need the new associated
types.

Also adds the new ServerConfig::event_group_ids field (H5) to the
client_server example via struct-update syntax over ServerConfig::new.

Verified locally:
- cargo build --workspace --all-features clean
- cargo clippy --workspace --all-features -- -D warnings -D clippy::pedantic clean
- cargo clippy --no-default-features -- -D warnings -D clippy::pedantic clean
- cargo fmt --all --check clean

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 50 out of 53 changed files in this pull request and generated 4 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread examples/bare_metal_server/src/main.rs
Comment thread examples/bare_metal_client/src/main.rs
Comment thread examples/bare_metal_client/src/main.rs
Comment thread examples/bare_metal_server/src/main.rs
JustinKovacich and others added 2 commits April 28, 2026 17:36
Adds targeted unit tests for the five higher-leverage behaviors that
landed in this PR's adversarial-review pass without coverage:

- H10 (`io_error_kind_transient_classification`): pure-function check
  that `IoErrorKind::is_transient_recv` returns true for
  `ConnectionRefused` / `NetworkUnreachable` / `WouldBlock` /
  `Interrupted` / `TimedOut`, and false for `PermissionDenied` /
  `Other`. Locks in the classification driving the recv-loop
  fatal-error counter.

- H5 (`server_config_accepts_event_group_*`): two pure-function tests
  on `ServerConfig::accepts_event_group` — empty `event_group_ids`
  accepts any group (back-compat), populated `event_group_ids`
  validates strictly.

- H4 (`announcement_loop_second_call_returns_invalid_input`): builds a
  Server, calls `announcement_loop()` twice, asserts the second call
  returns `Err(Error::Io(InvalidInput))` with "already started" in the
  diagnostic. Prevents regressions of the AtomicBool latch.

- H12 (`publish_event_returns_err_when_every_send_fails`,
  `publish_raw_event_returns_err_when_every_send_fails`): mock
  `TransportSocket` whose `send_to` always returns
  `Err(NetworkUnreachable)` / `Err(ConnectionRefused)`, registers a
  subscriber, calls `publish_event` / `publish_raw_event`, asserts the
  result is `Err(Transport(Io(_)))` rather than the previous
  `Ok(0)` masking total failure.

- H3 (`handle_sd_message_rolls_back_subscription_on_failed_ack_send`):
  builds a Server via `new_with_deps` with a `FailingFactory` whose
  sockets always fail `send_to`. Drives a Subscribe through
  `handle_sd_message` and asserts the function returns `Ok(())` (the
  H3 fix log-and-continues instead of propagating via `?`) AND the
  subscription manager has been rolled back to 0 entries.

Other already-covered behaviors (C3, H1, H2) had regression tests
added in the previous commit; remaining client-side gaps (H8, H9, M4,
M9) are deferred — each needs a sizable Client + mock harness and is
better addressed as a separate test-infrastructure task.

Verification:
- cargo test --lib --all-features: 510 pass (was 503; +7 new tests)
- cargo nextest run --all-features: 531/531 pass, 8 skipped
- cargo clippy --workspace --all-features -- -D warnings -D clippy::pedantic clean
- cargo clippy --no-default-features -- -D warnings -D clippy::pedantic clean
- cargo fmt --all --check clean

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Both files were swept in by a `git add -A` in the previous commit and
should not have been part of this PR's review-fix scope:

- `.vscode/settings.json` — per-developer editor auto-approve list
- `.github/copilot-instructions.md` — Copilot guidance file (also stale
  with respect to this PR's behavior changes; not the right time to
  refresh it)

Removed via `git rm --cached`, so local working copies are preserved.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 28, 2026 21:36
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 50 out of 53 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread tests/no_alloc_witness.rs
Two unaddressed comments on PR #95 after the previous round of fixes:

- `src/tokio_transport.rs:411-414` (TokioChannels docstring): said
  active under `client` or `server` features, but the entire
  `tokio_transport` module is `#[cfg(any(feature = "client-tokio",
  feature = "server-tokio"))]`. Reword to name the actual gating
  features and clarify the bare `client`/`server` features only
  expose the trait surface.

- `tests/no_alloc_witness.rs:81` (cosmetic eprintln! trailing comma):
  the trailing comma after the format string is valid Rust but
  Copilot flagged it as syntactic noise. Removed for readability.

Other comments in the latest batch were either already addressed or
made obsolete by my prior commits:

- The four "TransportFactory/Timer trait surface mismatch" comments on
  examples/bare_metal_{client,server}/src/main.rs (Copilot 21:32-21:33Z)
  pre-dated commit fe618cf, which ported those exact mocks to the new
  GAT pattern. Verified current state — the examples now match the
  trait signature.
- All comments from earlier in the day already carry "Fixed" replies
  from a previous round.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot review on #95 flagged that `assert_no_alloc`'s doc still said
forbidden allocations "panic" and exit with a non-zero status. The
implementation actually routes through `diagnose_and_abort`, which
disarms the allocator, writes the diagnostic to stderr, and then
calls `std::process::abort()` — no unwinding. (Panicking would
re-allocate via the panic-unwind machinery and re-trip the trap,
which is exactly why we abort instead.)

Updated the docstring to match the abort semantics so CI failures
are interpreted correctly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@JustinKovacich JustinKovacich marked this pull request as ready for review April 28, 2026 22:34
@JustinKovacich JustinKovacich requested a review from zheylmun April 28, 2026 22:34
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 50 out of 53 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +317 to +327
let run_handle = tokio::spawn(run_fut);

// After construction (and after a yield to give the spawn loop a
// chance to do its one-time setup allocs), measure pure-handle
// operations under the serialization lock.
tokio::task::yield_now().await;
let _guard = MEASURE_LOCK.lock().unwrap();
let baseline = alloc_count();
for _ in 0..16 {
assert_eq!(client.interface(), Ipv4Addr::LOCALHOST);
}
@JustinKovacich
Copy link
Copy Markdown
Contributor Author

Will a user be able to compile this on bare metal? What do adoption steps look like?

● Honest answer: partially yes, with significant gaps. Let me be specific about what works today and what doesn't.

What compiles on a true no_std target today

cargo build --no-default-features --features bare_metal:

  • ✅ protocol::* — full SOME/IP wire format encode/decode (zero-alloc)
  • ✅ e2e::{E2EChecker, E2EProtector, E2EProfile, ...} — the per-message protect/check operations
  • ✅ transport::* traits (TransportSocket, TransportFactory, Timer, Spawner, LocalSpawner, ChannelFactory, InterfaceHandle, E2ERegistryHandle)
  • ✅ static_channels::* no-alloc channel pool primitives + define_static_channels! macro
  • ✅ AtomicInterfaceHandle
  • ✅ traits::{WireFormat, PayloadWireFormat}

What does NOT compile on a true no_std target today

Look at Cargo.toml:
client = ["std", "dep:futures"]
server = ["std", "dep:futures"]

The client and server features force std. So:

  • ❌ Client::new_with_deps / Client::new_with_deps_local — requires std
  • ❌ Server::new_with_deps — requires std
  • ❌ EventPublisher — requires std
  • ❌ SubscriptionManager — requires std (uses std::sync internally even though most state is heapless)
  • ❌ E2ERegistry — uses HashMap, fundamentally std
  • ❌ StaticE2EHandle — gated on + std for the above reason

The bare_metal feature is honest about its contribution (no-alloc channels, atomic interface handle), but the Client/Server engines themselves are not bare-metal-ready today. A cortex-m / RISC-V
firmware target without std cannot drive a Client run-loop.

Adoption paths today

Path A — std-on-embedded host (Linux on Cortex-A, Hifive Unmatched, container):
simple-someip = { version = "0.8", default-features = false,
features = ["client", "bare_metal"] }
embassy-sync = "0.6"

  1. Implement TransportFactory + TransportSocket over your UDP stack (smoltcp, embassy-net, raw socket, etc.).
  2. Implement Timer over your runtime's sleep.
  3. Implement Spawner (or LocalSpawner for single-task).
  4. define_static_channels! { name: MyChannels, oneshot: [...], bounded: [...], unbounded: [...] } sized to your workload's high-water mark.
  5. Provide an E2ERegistryHandle (use StaticE2EHandle if std is on, or provide your own).
  6. Build a ClientDeps, call Client::<MyPayload, _, _, MyChannels>::new_with_deps(deps, false).
  7. spawner.spawn(run_fut) and proceed.

This works. examples/bare_metal_client/ demonstrates it end-to-end (the example name is somewhat misleading — it's actually "trait-surface bare-metal-shaped, but std-hosted").

Path B — true no_std cortex-m / RISC-V firmware:
You can use the protocol layer + traits + static_channels, but you have to write your own session loop because Client/Server are unreachable. Concretely:

  1. Use protocol::Header + Message to encode outgoing requests.
  2. Use protocol::MessageView + e2e::E2EChecker to parse and verify incoming bytes.
  3. Use static_channels-backed OneshotPool / MpscPool directly to wire up request→response correlation.
  4. Drive everything from your own embassy_executor task.

You're getting wire-format library + channel primitives but not the orchestration layer. That's still useful for many firmware embedders, but it's not "drop in Client::new_with_deps and you're done."

Path C — workspace example as a starting template:
examples/bare_metal_client/src/main.rs shows the full Path A wiring including custom MockFactory, MockSocket, MockTimer, Spawner — copy-pasting these and replacing the mock with a real network backend
is the realistic starting point.

What's missing for true no_std cortex-m

To actually support the cortex-m no_std story, the work needed is:

  1. Port E2ERegistry off HashMap to heapless::FnvIndexMap (~1 day).
  2. Drop std from client / server features (loosen lock-handle traits and SubscriptionManager defaults; ~3-5 days, mostly hunting std::sync references and replacing with the trait abstractions).
  3. Add a no_std CI gate (cargo build --target thumbv7em-none-eabihf --no-default-features --features client,bare_metal) to keep regressions out.

That's cortex-m readiness — phase 18 worth of work; the trait-surface scaffolding and the static-pool channels are in place, but the Client engine itself still requires std.

Bottom line for a prospective adopter

  • "My target is Linux on Cortex-A and I want zero-alloc after init": ✅ Works today via Path A. Roughly half a day of integration work for the four trait impls.
  • "My target is no_std cortex-m and I want to drop in Client::new_with_deps": ❌ Not today. You can use the protocol library and channel primitives but must write your own state machine.
  • "My target is no_std cortex-m and I want the full Client engine": That's a future port; the trait surface from this PR makes it tractable but the work isn't done.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants