Skip to content

Add content to README.md#4

Open
bettio wants to merge 1 commit into
atomvm:mainfrom
bettio:add-readme
Open

Add content to README.md#4
bettio wants to merge 1 commit into
atomvm:mainfrom
bettio:add-readme

Conversation

@bettio

@bettio bettio commented May 12, 2026

Copy link
Copy Markdown
Collaborator

No description provided.

Signed-off-by: Davide Bettio <davide@uninstall.it>
@petermm

petermm commented May 15, 2026

Copy link
Copy Markdown

PR Review — 4df91fa "Add content to README.md"

Author: Davide Bettio <davide@uninstall.it>
Date: 2026‑05‑12
Diff: README.md only, +214 / −0
Reviewer notes: documentation-only commit; reviewed the prose against
src/aprotobuf_encoder.erl, src/aprotobuf_decoder.erl,
src/aprotobuf.app.src, rebar.config and the two CI workflows.


Summary

The README is well‑structured, accurate on the high‑level API, and the
encoder/decoder registry example is correct as written. The points below are
precision and consistency fixes — there is no functional code in this
commit. The two issues most worth addressing before merge are:

  1. Overbroad feature claims. The decoder’s transform_schema/1 only
    normalizes a narrower set of repeated and map shapes than the prose
    implies (no {repeated, {enum, …}}, no {repeated, MapSchema} inline,
    no {map, K, MapSchema} inline, no {map, K, {enum, …}}).
  2. Enum decode behavior is mis‑stated. cast/3 returns the raw integer
    when an unknown enum value is on the wire — the table currently says it
    always decodes to a label.

Everything else is minor: dependency wording, encoder return type
("iolist" vs the iodata-shaped value aprotobuf_encoder:encode/2,3
actually produces), an unused-variable shadow in the registry example, and a
slight overclaim about test coverage (no direct enum encode/parse round
trip exists).


Findings

1. Schema feature matrix overstates what the decoder normalizes — medium

README.md lists composite types in a single sentence:

{enum, …}, {repeated, ElemType}, {map, KeyType, ValueType},
{oneof, …}, nested map schemas for inline sub-messages, and
{ref, Name} for references…

But aprotobuf_decoder:decode_schema/2
only has clauses for:

  • {repeated, ElemType} where ElemType is an atom
  • {repeated, {ref, Name}}
  • {map, KeyType, ValueType} where both are atoms
  • {map, KeyType, {ref, Name}}

There is no clause for {repeated, {enum, …}}, {repeated, MapSchema}
(repeated inline sub-messages), {map, K, {enum, …}}, or
{map, K, MapSchema}. Any of those will fall through to the
error({badarg, K, T}) clause at line 74 the moment the user calls
transform_schema/1.

Fix: narrow the prose to the shapes the decoder actually accepts.

--- a/README.md
+++ b/README.md
@@
-Pure Erlang, no NIFs, no code generation (schemas are plain Erlang maps
-consumed at runtime), and no dependencies beyond `kernel` and `stdlib`.
-Implements the subset of the Protobuf wire format needed by typical embedded
-consumers: all proto3 scalar types, sub-messages, `repeated` (packed and
-unpacked), `map<K, V>`, `enum`, `oneof`, and message references for
-recursive or mutually-recursive schemas.
+Pure Erlang, no NIFs, no code generation (schemas are plain Erlang maps
+consumed at runtime), and no runtime dependencies beyond `kernel` and
+`stdlib`. Implements the subset of the Protobuf wire format needed by typical
+embedded consumers: all proto3 scalar types, singular inline sub-messages,
+`repeated` scalar fields (the decoder accepts both packed and unpacked wire
+forms), `repeated` referenced sub-messages, `map<K, V>` with scalar or
+referenced-message values, `enum`, `oneof`, and message references for
+recursive or mutually-recursive schemas.
@@
-Supported scalar types: `int32`, `int64`, `uint32`, `uint64`, `sint32`,
-`sint64`, `bool`, `bytes`, `string`, `float`, `double`, `fixed32`, `fixed64`,
-`sfixed32`, `sfixed64`. Composite types: `{enum, #{Label => Int}}`,
-`{repeated, ElemType}`, `{map, KeyType, ValueType}`, `{oneof, #{Variant =>
-{FieldNum, Type}}}`, nested map schemas for inline sub-messages, and
-`{ref, Name}` for references resolved through a registry.
+Supported scalar types: `int32`, `int64`, `uint32`, `uint64`, `sint32`,
+`sint64`, `bool`, `bytes`, `string`, `float`, `double`, `fixed32`, `fixed64`,
+`sfixed32`, `sfixed64`. Composite types:
+
+- `{enum, #{Label => Int}}`
+- `{repeated, ElemType}` for a scalar `ElemType`
+- `{repeated, {ref, Name}}` for repeated sub-messages via a registry
+- `{map, KeyType, ValueType}` when both are scalar atoms
+- `{map, KeyType, {ref, Name}}` for message values via a registry
+- `{oneof, #{Variant => {FieldNum, Type}}}`
+- a nested map schema for an inline singular sub-message
+- `{ref, Name}` for singular sub-message references resolved through a registry

2. Enum row in the representation table is incomplete — small

cast/3
returns the raw integer when an enum value is not in the (reverse-mapped)
table:

cast(Value, {enum, IntToLabels}, _Registry) ->
    case maps:find(Value, IntToLabels) of
        {ok, Label} -> Label;
        error       -> Value
    end;

The README table currently reads:

enum — label (atom or string, matching how it appears in the schema)

--- a/README.md
+++ b/README.md
@@
-| `enum`                | label (atom or string, matching how it appears in the schema)            |
+| `enum`                | encoded with the schema label term; decoded to that label for known values, or to the raw integer for unknown values |

3. Encoder/decoder schema-form asymmetry deserves a single sentence — small

The README correctly shows the asymmetry by example, but never names it.
A new reader cannot tell from the prose that only the decoder consumes a
transformed schema
— the encoder always takes the user-written form
(verified at
aprotobuf_encoder:encode/3).

--- a/README.md
+++ b/README.md
@@
-- **`aprotobuf_encoder`**: encodes an Erlang map into Protobuf wire bytes
-  against a user-written schema (`encode/2`, `encode/3`).
+- **`aprotobuf_encoder`**: encodes an Erlang map into Protobuf wire iodata
+  against a user-written schema or registry (`encode/2`, `encode/3`).
 - **`aprotobuf_decoder`**: parses Protobuf wire bytes into an Erlang map
   (`parse/2`, `parse/3`). The user-written schema is first run through
-  `transform_schema/1` (or `transform_schemas/1` for a registry).
+  `transform_schema/1` (or `transform_schemas/1` for a registry). Only the
+  decoder uses the transformed form; the encoder takes the original
+  user-written schema or registry.
@@
-The arity-2 functions are convenience wrappers that build a single-entry
-registry under the sentinel name `root`.
+The arity-2 functions are convenience wrappers that build a single-entry
+registry under the sentinel name `root`; `encode/2` takes the original
+schema, while `parse/2` expects the transformed one.

4. Tree example rebinds Treesmall

Wire = iolist_to_binary(aprotobuf_encoder:encode(Tree, 'Node', Registry)),
Tree = aprotobuf_decoder:parse(Wire, 'Node', DecRegistry).

In Erlang the second occurrence of Tree is a pattern match, not an
assertion of equality with the original. It will succeed at runtime here,
but it reads as if it were intentional pattern matching against the
encoded value, which is fragile to copy/paste and confusing to newcomers.

--- a/README.md
+++ b/README.md
@@
-Wire = iolist_to_binary(aprotobuf_encoder:encode(Tree, 'Node', Registry)),
-Tree = aprotobuf_decoder:parse(Wire, 'Node', DecRegistry).
+Wire        = iolist_to_binary(aprotobuf_encoder:encode(Tree, 'Node', Registry)),
+DecodedTree = aprotobuf_decoder:parse(Wire, 'Node', DecRegistry).

5. "no dependencies" is slightly too strong — cosmetic

Runtime is kernel + stdlib (per
aprotobuf.app.src),
but rebar.config does pull in erlfmt and rebar3_hex as project
plugins.

-and no dependencies beyond `kernel` and `stdlib`.
+and no runtime dependencies beyond `kernel` and `stdlib`.

(Rolled into the diff in finding 1.)

6. "Encoder returns an iolist" — cosmetic

encode_field/4 returns lists that contain other iolists and binaries
(see e.g.
lines 50, 123, 140),
which is iodata, not strictly an iolist (though Erlang's
iolist_to_binary/1 accepts both, so it works). Saying "iodata" is more
accurate and matches the encoder’s signature shape; "iolist" is also
acceptable in common usage. Optional:

-The encoder returns an iolist; flatten it with `iolist_to_binary/1` when you
-need a binary.
+The encoder returns iodata; flatten it with `iolist_to_binary/1` when you
+need a binary. Pass the original user-written schema or registry to
+`encode/2` or `encode/3` — do not pre-transform it.

7. "Packs primitive repeated fields by default" — small

The encoder packs only repeated whose element type is one of the
packable scalars in
is_packable/1
string, bytes, {ref, …} and inline sub-messages are not packed.
"Primitive" is informal and has historically caused confusion in protobuf
docs because string/bytes are scalars but never packable on the wire.

-The encoder packs primitive `repeated` fields by default; the decoder
-transparently accepts both packed and unpacked wire forms.
+The encoder packs packable scalar `repeated` fields by default (numeric
+types, `bool`, `enum`-as-int — never `string`/`bytes`/sub-messages); the
+decoder transparently accepts both packed and unpacked wire forms for those
+fields.

(Note: the encoder doesn’t actually special‑case enum for packing today;
{enum, …} is not in is_packable/1, so wording can drop the enum
mention if you prefer to describe the code as-is.)

8. Test-coverage claim — cosmetic

The test/ directory is extensive (~2.4k lines) but, on a quick skim,
there is no direct encode‑then‑parse round trip for enumenum
appears in transform_schema_test/0 and indirectly elsewhere, but not as
a first‑class round trip. Soften slightly:

-See [`test/`](test/) for runnable examples covering every supported type and
-schema form.
+See [`test/`](test/) for runnable examples covering most supported types
+and schema forms.

9. Minor prose nits — cosmetic

  • "the equivalent aprotobuf schema" → capitalize ("The equivalent…") since
    it follows a code block and stands as its own sentence.
  • The "Build, test, format" snippet shows rebar3 fmt …, which requires
    erlfmt as a plugin. That plugin is declared in rebar.config, so
    the instructions work — but it’s worth a word that the plugin is fetched
    on first invocation (a one‑line note, optional).

What looks good

  • The registry/{ref, Name} example is correct: encoder reads the
    user‑written registry directly, decoder reads the transformed registry —
    matches the code paths in
    aprotobuf_encoder:encode/3
    and
    aprotobuf_decoder:parse/3.
  • The non‑finite float representation (infinity, '-infinity', nan)
    in the table matches both the encoder and decoder
    (encoder lines 152‑168,
    decoder lines 223‑242).
  • The oneof representation table row matches the encoder ({Variant, V}
    at encoder line 42)
    and decoder ({Variant, cast(...)} at
    decoder line 131).
  • The CI claim "OTP 28.0; separate REUSE workflow" matches
    .github/workflows/test.yaml
    and .github/workflows/reuse-lint.yaml.
  • rebar3 and mix.exs snippets are syntactically correct.

Recommendation

Approve with minor changes. Findings 1 and 2 are worth folding in
before merge so users don’t hit error({badarg, …}) from
transform_schema/1 after copying patterns from the README. The remaining
items (3–9) are quality‑of‑life polish.

Out-of-scope follow-ups (optional)

If you want to keep the broader README claims as-is, extend
aprotobuf_decoder:decode_schema/2 to also normalize:

  • {repeated, {enum, …}}
  • {repeated, MapSchema} (inline repeated sub-messages)
  • {map, K, {enum, …}}
  • {map, K, MapSchema} (inline message values in maps)

…and add an explicit enum encode↔parse round‑trip test. Not required
for this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants