Add Architecture & Design Documentation: FlatBuffers-based Zero-Copy Data Plane for WAMP #1819

@oberstet

See also: crossbario/zlmdb#98

Summary

Both zlmdb and autobahn-python implement a shared architectural vision:

A schema-first, zero-copy, end-to-end data plane for WAMP, using FlatBuffers as the single source of truth for data representation — across storage and transport, or data-at-rest and data-in-transit.

This design is deliberate and spans multiple layers of the system.
At the moment, it exists only implicitly in the code and build system, and is not documented in any one place.

This issue proposes adding a dedicated Sphinx documentation page explaining the high-level goals, architecture, and design rationale.


Motivation

New contributors and advanced users currently have to infer the design from:

  • vendored FlatBuffers sources
  • bundled flatc compiler
  • native wheels (x86-64 + ARM64)
  • CFFI-based LMDB integration
  • FlatBuffers reflection usage
  • version synchronization checks between projects

Each of these pieces is deliberate, but the reasoning behind them is not obvious without context.

A clear architecture document would:

  • Explain why FlatBuffers is central (not incidental)
  • Explain why flatc is bundled
  • Explain why both projects vendor FlatBuffers
  • Clarify how data-at-rest and data-in-transit share the same schema
  • Make the zero-copy goal explicit
  • Reduce the learning curve for advanced users
  • Serve as a reference for future maintenance decisions

High-Level Architecture

At a conceptual level, the system looks like this:

LMDB (data-at-rest)
   ↓ zero-copy
FlatBuffers object graph
   ↓ zero-copy
WAMP RPC / PubSub (data-in-transit)
   ↓ zero-copy
WebSocket (or other WAMP transport)

Key properties:

  • FlatBuffers schemas are the single source of truth

  • The same schema is used for:

    • persistent storage (zlmdb)
    • network transport (autobahn-python)
  • No intermediate serialization formats are introduced

  • No JSON / MsgPack / Protobuf translation layers

  • Memory layouts are shared as much as possible


Design Principles

1. Schema-First

  • FlatBuffers schemas define:

    • structure
    • evolution rules
    • compatibility guarantees
  • Code generation is derived from schemas — not vice versa

  • Reflection (reflection.fbs / .bfbs) enables dynamic and tooling use cases

2. Zero-Copy by Design

  • LMDB provides memory-mapped access to data
  • FlatBuffers allows reading structured data directly from memory
  • WAMP messages can be transmitted without re-encoding
  • Goal: avoid deserialize → reserialize cycles
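
To make the zero-copy idea concrete for the docs page, here is a minimal sketch using only the Python standard library. It is not the zlmdb or FlatBuffers API: the fixed-size record layout, `RECORD`, `read_record` and `demo` are assumptions made up for illustration, and a temporary file mapped with `mmap` stands in for an LMDB memory map. The point is only that a field can be decoded in place from mapped memory, without first copying the record into an intermediate object.

```python
import mmap
import struct
import tempfile

# Stand-in record layout (NOT the FlatBuffers wire format):
# little-endian uint32 id followed by a float64 value, 12 bytes in total.
RECORD = struct.Struct("<Id")


def read_record(buf, index):
    """Decode record `index` in place; no slice of `buf` is copied first."""
    return RECORD.unpack_from(buf, index * RECORD.size)


def demo():
    """Write three records to a file, map it read-only, and read one back."""
    with tempfile.TemporaryFile() as f:
        for i in range(3):
            f.write(RECORD.pack(i, i * 1.5))
        f.flush()
        # The mapping plays the role of LMDB's memory-mapped database pages.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # The memoryview is released before the mapping is closed.
            with memoryview(mm) as view:
                return read_record(view, 2)


print(demo())
```

FlatBuffers generalizes this idea: offsets stored in the buffer replace the fixed layout assumed here, so nested and variable-length data can be read the same way.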

3. Unified Data Model for Storage and Transport

  • zlmdb focuses on data-at-rest

  • autobahn-python focuses on data-in-transit

  • Both operate on the same FlatBuffers data model

  • This enables patterns such as:

    • reading a database record
    • returning it directly as a WAMP RPC result
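
A rough sketch of that pattern, assuming nothing about the real zlmdb or autobahn-python APIs: a plain dict stands in for the database, `get_record` stands in for a WAMP procedure handler, and `send` stands in for the transport. All three names, the key and the one-byte frame header are hypothetical. The handler hands back a view of the stored buffer instead of decoding it and building a new result object.

```python
# Hypothetical stand-in for the LMDB-backed store: key -> serialized record.
STORE = {b"sensor:1": b"\x01\x02\x03\x04"}


def get_record(key):
    """Stand-in RPC handler: return a view of the stored bytes, unmodified."""
    return memoryview(STORE[key])


def send(payload):
    """Stand-in transport: prepend a made-up 1-byte header and emit a frame."""
    frame = bytearray(b"\x00")
    frame += payload  # the only copy happens here, when the frame is written
    return bytes(frame)


result = get_record(b"sensor:1")
# The view references the stored object itself, not a decoded copy.
print(result.obj is STORE[b"sensor:1"])
print(send(result))
```

In this sketch the stored bytes reach the frame without a decode and re-encode step; whether the final write can also avoid a copy depends on the transport.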

4. Hermetic Tooling

  • FlatBuffers is vendored to ensure:

    • deterministic builds
    • version consistency
    • reproducibility
  • flatc is bundled inside wheels:

    • avoids system dependencies
    • avoids PATH issues
    • ensures schema/compiler compatibility
  • Version synchronization is explicitly checked at runtime (e.g. between zlmdb and autobahn-python)
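
For the docs page, the kind of check meant here could be illustrated roughly as follows. This sketch deliberately does not call `zlmdb.check_autobahn_flatbuffers_version_in_sync()`, since its signature and behaviour are not described in this issue; `parse_version`, `check_in_sync` and the version strings are hypothetical and only show the idea of failing fast when two vendored copies disagree.

```python
def parse_version(text):
    """Turn a dotted version string such as '1.2.3' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def check_in_sync(ours, theirs):
    """Return the shared version, or raise if the two versions differ."""
    a, b = parse_version(ours), parse_version(theirs)
    if a != b:
        raise RuntimeError(
            f"FlatBuffers version mismatch: {ours} != {theirs}"
        )
    return a


# Example version strings are invented for illustration.
print(check_in_sync("1.2.3", "1.2.3"))
```

Comparing parsed tuples rather than raw strings avoids false mismatches from formatting differences such as leading zeros.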

5. First-Class Native Support

  • Native wheels for:

    • x86-64
    • ARM64 (AArch64)
  • manylinux-compliant builds

  • PyPy supported via CFFI (no CPython-specific APIs)

  • C++ usage isolated behind stable interfaces


Why This Matters

This architecture enables:

  • High-performance WAMP applications
  • Predictable memory usage
  • Efficient large-payload handling
  • Long-term schema evolution without breaking consumers
  • A coherent mental model across persistence and messaging

It is intentionally not a generic “serialization library integration”, but a vertically integrated data plane.


Proposed Documentation Work

Add a new Sphinx documentation page (in both projects) covering:

  • Architectural goals
  • Data flow diagram
  • Role of FlatBuffers
  • Relationship between zlmdb and autobahn-python
  • Rationale for bundling flatc
  • Zero-copy design considerations
  • Version synchronization guarantees

Possible page titles:

  • “Architecture & Design”
  • “FlatBuffers-based Data Plane”
  • “Zero-Copy Data Flow for WAMP”

Related Code / References

  • FlatBuffers version synchronization check:

    • zlmdb.check_autobahn_flatbuffers_version_in_sync()
  • Vendored FlatBuffers + bundled flatc

  • Native wheel build & auditwheel verification

  • CFFI-based LMDB integration


Outcome

Documenting this architecture will:

  • Make the design explicit and durable
  • Improve onboarding for advanced users
  • Provide context for future contributors
  • Reduce accidental regressions
  • Showcase the uniqueness of the approach

Checklist

  • I have searched existing issues to avoid duplicates
  • I have described the problem clearly
  • I have provided use cases
  • I have considered alternatives
  • I have assessed impact and breaking changes
