[BUG] Socket UDP --mode both deadlocks on peer learning; only ~1000 packets / 30 s trickle through #75

@RamyaGuru

Symptom

Running daqiri_bench_socket against examples/daqiri_bench_socket_udp_tx_rx.yaml with --mode both --seconds 30 produces only ~1000 packets and ~390 kbps effective throughput. The following line is printed to stderr repeatedly for the duration of the run:

[ERROR] /workspace/daqiri/src/managers/socket/daqiri_socket_mgr.cpp:735: UDP server has no learned peer yet; cannot transmit

Reproduced on DGX Spark (GB10) during PR #15 / commit 5e57a5b data-fill on 2026-05-12.

Repro

./build/examples/daqiri_bench_socket \
    examples/daqiri_bench_socket_udp_tx_rx.yaml \
    --seconds 30 --mode both

Root cause

Both the server and client workers attempt to send before either has received an inbound packet. UDP is connectionless, so the server has no peer address to send to until the client's first packet arrives, and every server send before that point fails with the error above. With both ends spinning in send-then-receive in the same process, only a trickle of packets gets through, and the server's [ERROR] log spam masks the real throughput.
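
To make the peer-learning dependency concrete, here is a minimal POSIX sketch, not taken from the daqiri sources. The helper names bind_loopback_udp and learn_peer are hypothetical. It assumes the server learns its peer from the source address that recvfrom reports for the first inbound datagram, which is the usual pattern for an unconnected UDP socket.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

#include <cassert>

// Hypothetical helper: open a UDP socket bound to 127.0.0.1 on an ephemeral
// port and report the bound address through `bound`. Returns -1 on failure.
int bind_loopback_udp(sockaddr_in* bound) {
  assert(bound != nullptr);
  int fd = socket(AF_INET, SOCK_DGRAM, 0);
  if (fd < 0) return -1;
  sockaddr_in addr{};
  addr.sin_family = AF_INET;
  addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
  addr.sin_port = 0;  // let the kernel pick a free port
  socklen_t len = sizeof(*bound);
  if (bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) < 0 ||
      getsockname(fd, reinterpret_cast<sockaddr*>(bound), &len) < 0) {
    close(fd);
    return -1;
  }
  return fd;
}

// Hypothetical helper: wait up to timeout_ms for one datagram and store the
// sender's address in `peer`. Returns false if nothing arrived, in which case
// the caller still has no address to transmit to.
bool learn_peer(int fd, int timeout_ms, sockaddr_in* peer) {
  pollfd pfd{};
  pfd.fd = fd;
  pfd.events = POLLIN;
  if (poll(&pfd, 1, timeout_ms) <= 0) return false;
  char buf[2048];
  socklen_t len = sizeof(*peer);
  return recvfrom(fd, buf, sizeof(buf), 0,
                  reinterpret_cast<sockaddr*>(peer), &len) >= 0;
}
```

Until learn_peer returns true, a server built this way has no destination for sendto, which matches the "no learned peer yet" error in the symptom.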

Fix sketch (one or both)

  • In the bench (examples/socket_bench.cpp), defer the server's send loop until the first inbound packet has populated the peer address.
  • In the manager (src/managers/socket/daqiri_socket_mgr.cpp:735), downgrade the "no learned peer" log to DEBUG and emit it once per peer. It is currently logged at CRITICAL/ERROR level on every attempt, which is ~30 lines per second of attempted sends.
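
Both ideas can be combined in one small gate object. The following is a rough sketch under stated assumptions, not the actual bench or manager code: PeerGate, count_sendable, and the log text are hypothetical names for illustration. The receive path marks the peer as known, and the send path checks the gate first and logs at most once while it is still closed.

```cpp
#include <atomic>
#include <cassert>
#include <cstdio>

// Hypothetical gate shared by the receive path and the send path.
class PeerGate {
 public:
  // Call from the receive path once an inbound packet has populated the peer.
  void on_inbound() { peer_known_.store(true, std::memory_order_release); }

  // Call before each send. Returns true once a peer is known. While no peer
  // is known, logs a single DEBUG-style line instead of one per attempt.
  bool ready_to_send() {
    if (peer_known_.load(std::memory_order_acquire)) return true;
    if (!warned_.exchange(true)) {
      std::fprintf(stderr, "[DEBUG] no learned peer yet; deferring transmit\n");
      ++log_lines_;
    }
    return false;
  }

  // Number of log lines emitted so far (exposed only for illustration).
  int log_lines() const { return log_lines_.load(); }

 private:
  std::atomic<bool> peer_known_{false};
  std::atomic<bool> warned_{false};
  std::atomic<int> log_lines_{0};
};

// Simulated send loop: counts how many of `attempts` would reach the socket.
int count_sendable(PeerGate& gate, int attempts) {
  assert(attempts >= 0);
  int sent = 0;
  for (int i = 0; i < attempts; ++i) {
    if (gate.ready_to_send()) ++sent;
  }
  return sent;
}
```

A real send loop would likely block or back off while the gate is closed rather than spin. The atomics are used here only so the receive and send workers can share the gate without a lock.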

Scope

src/managers/socket/daqiri_socket_mgr.cpp and examples/socket_bench.cpp. Fixing this unblocks the Socket UDP rows of docs/performance-dgx-spark.md.

Context

This issue was uncovered during PR #15 verification on DGX Spark. The Socket UDP rows in the performance report are footnoted as _deferred_[^2] pending this fix.
