feat: Streaming JSON Parsing API#8

Open
vnixx wants to merge 1 commit into main from feat/streaming-parser

Conversation

@vnixx vnixx commented Apr 25, 2026

Summary

Add streaming/incremental JSON parsing support to ReerJSON.

New Types

Bottom layer (JSONValue):

  • JSONStreamParser — push-based streaming parser with two modes:
    • .jsonLines: extract multiple JSON documents from a byte stream (NDJSON/SSE)
    • .jsonArray: parse elements of a large JSON array one by one, with O(1) memory overhead
  • JSONIncrementalReader — accumulate chunks for large single-document parsing
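To illustrate the push model behind the .jsonLines mode, here is a minimal stand-in sketch in Python (not the ReerJSON implementation; the `LinesParser`/`feed` names are invented for illustration, and it splits on newlines rather than using yyjson): callers push arbitrary byte chunks, and each complete document is emitted as soon as the buffered bytes contain it, even when a document arrives split across chunks.

```python
import json

class LinesParser:
    """Illustrative push-based JSON Lines splitter (stand-in sketch)."""
    def __init__(self):
        self.buf = bytearray()

    def feed(self, chunk: bytes):
        """Append a chunk and yield every complete document now available."""
        self.buf += chunk
        while (nl := self.buf.find(b"\n")) != -1:
            line = bytes(self.buf[:nl])
            del self.buf[:nl + 1]   # consume the line plus its newline
            if line.strip():        # skip blank lines
                yield json.loads(line)

p = LinesParser()
docs = list(p.feed(b'{"a": 1}\n{"b": '))  # second document is still incomplete
docs += p.feed(b'2}\n')                   # completing bytes arrive in a later chunk
print(docs)  # [{'a': 1}, {'b': 2}]
```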

Codable layer:

  • StreamingJSONLinesDecoder<T> / StreamingJSONArrayDecoder<T> — typed streaming decoders

AsyncSequence adapters:

  • JSONValueStream / JSONValueByteStream / DecodingStream
  • AsyncSequence.jsonValues() / .decode() convenience extensions

Implementation

  • Uses yyjson's YYJSON_READ_STOP_WHEN_DONE combined with yyjson_doc_get_read_size() (via an internal Document.streamParse API) for accurate byte-level buffer management
  • Internal buffer with lazy compaction: the consumed prefix is reclaimed with a single memmove only once readOffset exceeds half the buffer size
  • JSON Array mode uses a state machine to strip the enclosing brackets and the element-separating commas, then parses each element individually
  • Zero impact on existing parsing paths
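The array-mode state machine described above can be modeled as follows (an illustrative Python sketch, not the Swift implementation): it tracks nesting depth and string state so that the outer brackets and top-level commas are stripped, while brackets, braces, and commas nested inside elements or strings are left alone.

```python
import json

def array_elements(data: bytes):
    """Yield the elements of a top-level JSON array one by one
    (illustrative sketch of the bracket/comma-stripping state machine)."""
    depth = 0             # current [ { nesting depth
    in_str = esc = False  # inside a string literal / just saw a backslash
    start = None          # start offset of the element currently being scanned
    for i, b in enumerate(data):
        c = chr(b)
        if in_str:
            if esc:         esc = False
            elif c == "\\": esc = True
            elif c == '"':  in_str = False
            continue
        if c == '"':
            in_str = True
            if start is None: start = i
        elif c in "[{":
            depth += 1
            if depth == 2 and start is None: start = i
        elif c in "]}":
            depth -= 1
            if depth == 0 and start is not None:  # closing ] of the array
                yield json.loads(data[start:i])
                start = None
        elif c == ",":
            if depth == 1 and start is not None:  # top-level separator
                yield json.loads(data[start:i])
                start = None
        elif not c.isspace():
            if start is None and depth == 1: start = i  # scalar element begins

print(list(array_elements(b'[1, {"a": [2, 3]}, "x,y"]')))
# [1, {'a': [2, 3]}, 'x,y']
```

Note how the nested `[2, 3]` and the comma inside the string "x,y" are untouched: only separators at depth 1 and outside strings delimit elements.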

Testing

  • 33 new tests covering JSON Lines, JSON Array, incremental reader, edge cases, and Codable layer
  • All 755 tests pass (722 existing + 33 new), zero regressions

Note on yyjson_incr_* API

yyjson 0.12.0's incremental API (yyjson_incr_new/read/free) requires all data to be pre-loaded in the buffer; the len parameter only controls how far each parse step reads. It cannot handle data appended dynamically between reads, so network streaming uses STOP_WHEN_DONE instead.
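Conceptually, the STOP_WHEN_DONE approach amounts to: parse one complete document from the front of the buffer, ask the parser how many bytes it consumed, advance a read offset, and compact the buffer lazily. A hedged Python model of this loop (using the standard library's `json.JSONDecoder.raw_decode`, which reports the consumed length, as a stand-in for yyjson_doc_get_read_size()):

```python
import json

class StreamParser:
    """Illustrative model of STOP_WHEN_DONE streaming (not the real code):
    parse one document at a time from the buffer front, track consumed
    bytes, and compact only when the dead prefix exceeds half the buffer,
    mirroring the readOffset > buffer/2 heuristic."""
    def __init__(self):
        self.buf = ""
        self.read_off = 0
        self.decoder = json.JSONDecoder()

    def feed(self, chunk: str):
        self.buf += chunk
        while True:
            # Skip whitespace between concatenated documents.
            while self.read_off < len(self.buf) and self.buf[self.read_off].isspace():
                self.read_off += 1
            try:
                # raw_decode returns the value plus the end offset, i.e.
                # how many bytes the parse actually consumed.
                value, end = self.decoder.raw_decode(self.buf, self.read_off)
            except json.JSONDecodeError:
                break  # incomplete document: wait for more bytes
            self.read_off = end
            # Lazy compaction: drop the consumed prefix only occasionally.
            if self.read_off > len(self.buf) // 2:
                self.buf = self.buf[self.read_off:]
                self.read_off = 0
            yield value

p = StreamParser()
print(list(p.feed('{"a": 1} {"b"')))  # [{'a': 1}]
print(list(p.feed(': 2}')))           # [{'b': 2}]
```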
