14 cross-service protocol linkers (PR 2/5) #377

Draft
Shidfar wants to merge 7 commits into DeusData:main from hodizoda:oss/pr2-protocol-linkers

Shidfar commented May 26, 2026

Summary

Adds the 14 protocol linkers that populate the empty dispatch table introduced by #376.

Stacked on #376 — please review that first. Until #376 merges, this PR's diff includes both #376 and PR 2 commits; it'll narrow to PR-2-only after #376 lands.

Commits

  1. feat: add GraphQL and gRPC protocol linkers
  2. feat: add Kafka, SQS, SNS, and EventBridge protocol linkers
  3. feat: add Pub/Sub, RabbitMQ, MQTT, NATS, and Redis Pub/Sub linkers
  4. feat: add WebSocket, SSE, and tRPC protocol linkers
  5. build: wire 14 protocol linkers into pipeline — restores the full LINKERS table in pass_servicelinks.c, adds source files to PIPELINE_SRCS, registers the 14 suite_servicelink_* tests, and fixes a pre-existing latent UB bug in cbm_pipeline_run (see below)
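For readers unfamiliar with the dispatch-table shape that commit 5 restores, here is a minimal sketch. Everything in it is an assumption made for illustration: the type names (sl_ctx_t, sl_linker_t), the linker functions, and the three-entry table are hypothetical and are not copied from pass_servicelinks.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pass context; the real one carries far more state. */
typedef struct { int links_found; } sl_ctx_t;

/* Every protocol linker exposes the same entry-point signature. */
typedef int (*sl_linker_fn)(sl_ctx_t *ctx);

typedef struct {
    const char *name;   /* protocol label, used for logging */
    sl_linker_fn run;   /* per-protocol entry point */
} sl_linker_t;

/* Placeholder linkers: each just records that it ran. */
static int link_graphql(sl_ctx_t *ctx) { ctx->links_found += 1; return 0; }
static int link_grpc(sl_ctx_t *ctx)    { ctx->links_found += 1; return 0; }
static int link_kafka(sl_ctx_t *ctx)   { ctx->links_found += 1; return 0; }

/* The dispatch table: adding a protocol means adding one row. */
static const sl_linker_t LINKERS[] = {
    { "graphql", link_graphql },
    { "grpc",    link_grpc },
    { "kafka",   link_kafka },
};

#define LINKER_COUNT (sizeof(LINKERS) / sizeof(LINKERS[0]))

/* Runs every registered linker in table order and stops at the
 * first non-zero return code. Returns 0 when all linkers succeed. */
static int run_all_linkers(sl_ctx_t *ctx) {
    assert(ctx != NULL);
    for (size_t i = 0; i < LINKER_COUNT; i++) {
        int rc = LINKERS[i].run(ctx);
        if (rc != 0) {
            return rc;
        }
    }
    return 0;
}
```

With a table like this, the pass itself stays unchanged as protocols are added or trimmed; only the rows change.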

Pre-existing UB bug fix in cbm_pipeline_run

The plumbing commit in #376 added cbm_sl_endpoint_list_free((cbm_sl_endpoint_list_t *)ctx.endpoints); to the cleanup: block. But ctx is declared after the early-cancel check (if (rc != 0 || check_cancel(p)) { goto cleanup; }), so when cancellation fires before any pipeline work, the cleanup block reads uninitialized stack memory through ctx.endpoints.

I caught this with ASan in test_integ_pipeline_cancel. The uninitialized slot happened to contain a stale address that sqlite had freed during the previous test, test_integ_mcp_list_projects: sqlite's pcache1EnforceMaxPage freed memory during cbm_store_close, and that address landed in the slot of ctx.endpoints on the cancel test's stack frame. ASan surfaced it as a heap-use-after-free in cbm_sl_endpoint_list_free. The UB is latent: the full oss/clean-features stack passed because stack-layout shifts from subsequent commits happened to put NULL or otherwise safe values there.

Fix: declare cbm_sl_endpoint_list_t *endpoints = NULL; at the top of cbm_pipeline_run alongside path_aliases, allocate the list after the cancel check, assign to ctx.endpoints, and free endpoints (not ctx.endpoints) in the cleanup block. Safe regardless of which goto fires, since endpoints is declared before all gotos and initialized to NULL (and cbm_sl_endpoint_list_free(NULL) is a no-op).
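The pattern behind the fix can be sketched as follows. This is a reduced illustration, not the actual cbm_pipeline_run code: endpoint_list_t, pipeline_ctx_t, pipeline_run, and the g_free_calls counter are hypothetical stand-ins introduced only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the real endpoint list and pipeline context. */
typedef struct { int count; } endpoint_list_t;
typedef struct { endpoint_list_t *endpoints; } pipeline_ctx_t;

static int g_free_calls = 0; /* counts non-NULL frees, for demonstration only */

static endpoint_list_t *endpoint_list_new(void) {
    return calloc(1, sizeof(endpoint_list_t));
}

/* Like free(), passing NULL is a no-op. */
static void endpoint_list_free(endpoint_list_t *list) {
    if (list == NULL) {
        return;
    }
    g_free_calls++;
    free(list);
}

static int pipeline_run(bool cancelled) {
    int rc = 0;
    /* Declared and initialized before any goto, so the cleanup block
     * always sees either NULL or a valid allocation. */
    endpoint_list_t *endpoints = NULL;

    if (cancelled) {
        rc = -1;
        goto cleanup; /* early exit: endpoints is still NULL */
    }

    endpoints = endpoint_list_new();
    if (endpoints == NULL) {
        rc = -2;
        goto cleanup;
    }

    {
        /* The context only borrows the pointer; it never owns it. */
        pipeline_ctx_t ctx = { .endpoints = endpoints };
        assert(ctx.endpoints == endpoints);
        ctx.endpoints->count = 14;
    }

cleanup:
    /* Free the function-scope pointer, never a field of a struct that
     * may not have been initialized on this path. */
    endpoint_list_free(endpoints);
    return rc;
}
```

Calling pipeline_run(true) takes the early-cancel path and reaches cleanup with endpoints still NULL, so nothing is freed; pipeline_run(false) allocates the list and frees it exactly once.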

Test plan

  • ./scripts/test.sh passes (3796/3796, ASan + UBSan)
  • All 14 suite_servicelink_* test suites green
  • ASan no longer reports use-after-free in test_integ_pipeline_cancel

Upstream overlap audit (re-checked against upstream/main @ 6226972)

Since this PR was opened, the audit has been re-run against current upstream. Findings:

  • Already covered upstream (12 of 14 protocols):
    • internal/cbm/service_patterns.c (lines 152-248) — library detection tables for kafka, sqs, sns, eventbridge, pubsub, rabbitmq/amqp, nats, redis, mqtt, grpc, graphql, trpc
    • src/pipeline/pass_calls.c:253-257 — emits HTTP_CALLS and ASYNC_CALLS edges with broker metadata
    • src/pipeline/pass_parallel.c:1221-1327 — emits GRPC_CALLS, GRAPHQL_CALLS, TRPC_CALLS edges
    • src/pipeline/pass_cross_repo.c:match_typed_routes (line 492) — cross-repo typed-route matching for gRPC/GraphQL/tRPC
    • src/pipeline/pass_cross_repo.c:match_async_routes (line 330) — cross-repo matching for messaging protocols
  • Net-new in this PR:
    • WebSocket and Server-Sent Events (SSE) linker coverage — no upstream equivalent
    • Regex-source-scan enrichment for the other 12 protocols that produces topic/exchange/QoS metadata not derivable from upstream's library-name + call-resolution path
  • Recommended path: trim the 12 overlapping protocol files; keep WS + SSE only. Alternatively, migrate the 12 to enrichment hooks layered on top of upstream's existing *_CALLS edges rather than emitting them in parallel.

This PR remains a draft until it is reviewed against this audit. PR #380 establishes the architectural reconciliation (cedes 4 protocols to upstream); the consolidated shape of this PR depends on how that lands.

Shidfar added 6 commits May 25, 2026 14:04
Core framework for 14 protocol linkers:
- servicelink.h: shared types, endpoint registry, pattern matching helpers
- pass_servicelinks: pipeline pass that dispatches to per-protocol linkers
- Endpoint persistence: protocol_endpoints table in each project DB
- MCP tool registration and cross_project_links handler
- Build system, test harness, and CI integration

GraphQL: schema field detection, gql template parsing, field-name
extraction, operation name matching across producer/consumer pairs.
gRPC: proto service/rpc definitions, client stub calls, streaming
patterns across Go, Python, Java, TypeScript, and Rust.

Cloud messaging linkers for AWS and Apache Kafka:
- Kafka: producer/consumer topic detection across Java, Python, Go, TS
- SQS: queue URL and queue name extraction, send/receive matching
- SNS: topic ARN detection, publish/subscribe patterns
- EventBridge: event bus, rule, and put-events pattern detection

Message broker protocol linkers:
- GCP Pub/Sub: topic/subscription detection, Terraform subscriber configs
- RabbitMQ: exchange/queue binding, AMQP topic wildcard matching
- MQTT: topic publish/subscribe with wildcard (+/#) matching
- NATS: subject publish/subscribe with wildcard (*/>) matching
- Redis Pub/Sub: channel publish/subscribe detection
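As an illustration of the MQTT wildcard (+/#) matching mentioned above, here is a minimal sketch of topic-filter matching that follows the MQTT specification's semantics: + matches exactly one topic level, and # matches the parent level plus any number of child levels. It is not taken from this PR's MQTT linker, and the function name mqtt_topic_matches is hypothetical. It assumes a well-formed filter (each wildcard occupies a whole level, and # appears only as the last level) and does not handle the special rules for topics beginning with $.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Returns true when `topic` matches the MQTT topic filter `filter`.
 * Assumes `filter` is well-formed; see the notes above. */
static bool mqtt_topic_matches(const char *filter, const char *topic) {
    assert(filter != NULL && topic != NULL);

    while (*filter != '\0') {
        if (*filter == '#') {
            /* Multi-level wildcard: matches the rest of the topic,
             * but only when it is the final character of the filter. */
            return filter[1] == '\0';
        }
        if (*filter == '+') {
            /* Single-level wildcard: consume one level of the topic. */
            while (*topic != '\0' && *topic != '/') {
                topic++;
            }
            filter++;
            continue;
        }
        if (*topic == '\0') {
            /* Topic exhausted: only a trailing "/#" can still match,
             * because "a/#" also matches the parent topic "a". */
            return strcmp(filter, "/#") == 0;
        }
        if (*filter != *topic) {
            return false;
        }
        filter++;
        topic++;
    }
    /* Filter exhausted: match only if the topic is exhausted too. */
    return *topic == '\0';
}
```

For example, the filter sensors/+/temp matches sensors/kitchen/temp but not sensors/kitchen/humidity, and sensors/# matches both sensors and sensors/a/b. NATS wildcards (*/>) follow a similar shape with . as the separator, but > requires at least one trailing token, so the parent-match branch would not apply there.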

Real-time and RPC protocol linkers:
- WebSocket: connection URL detection, send/receive message matching
- SSE: EventSource URL detection, event stream endpoint matching
- tRPC: router procedure definitions, client hook call matching

Activates the linker files added by the prior cherry-picks:

- Makefile.cbm: add 14 servicelink_*.c to PIPELINE_SRCS, add 14
  TEST_SERVICELINK_*_SRCS test declarations, extend ALL_TEST_SRCS
- pass_servicelinks.c: restore the LINKERS dispatch table to the
  full 14-entry list and remove the empty-table guard
- pipeline.c: allocate cbm_sl_endpoint_list_t at function top
  (alongside path_aliases) so cleanup can free it safely even when
  the early cancel check goto's into cleanup before ctx is declared
- test_main.c: register the 14 suite_servicelink_* test suites

Removes stale-fact drift from the fork era (language/agent counts,
install one-liner, feature bullets) flagged in PR DeusData#295's close comment.
No URL substitutions involved — README's links already pointed at
DeusData; this only reverts the content body.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>