asun-csharp

A high-performance serialization/deserialization library for ASUN (Array-Schema Unified Notation) on .NET. The library is zero-copy and SIMD-accelerated; ASUN itself is a schema-driven data format designed for LLM interactions and large-scale data transmission.

Chinese documentation (README_CN.md)

What is ASUN?

ASUN separates schema from data, eliminating repetitive keys found in JSON. The schema is declared once, and data rows carry only values:

JSON (100 tokens):
{"users":[{"id":1,"name":"Alice","active":true},{"id":2,"name":"Bob","active":false}]}

ASUN (~35 tokens, 65% saving):
[{id@int, name@str, active@bool}]:(1,Alice,true),(2,Bob,false)

Aspect	JSON	ASUN
Token count (relative to JSON)	100%	30–70% ✓
Key repetition	Every object	Declared once ✓
Human readable	Yes	Yes ✓
Nested structs	✓	✓
Type annotations	No	Optional ✓
Serialization speed	1x	~1.2–8x faster ✓
Data size	100%	40–60% ✓

Why ASUN?

Standard JSON repeats every field name in every record. When you send structured data to an LLM, over an API, or across services, that repetition wastes tokens, bytes, and attention:

[
  { "id": 1, "name": "Alice", "active": true },
  { "id": 2, "name": "Bob", "active": false },
  { "id": 3, "name": "Carol", "active": true }
]

ASUN declares the schema once and streams data as compact tuples:

[{id, name, active}]:
  (1,Alice,true),
  (2,Bob,false),
  (3,Carol,true)

Fewer tokens. Smaller payloads. Clearer structure, and faster parsing than repeated-object JSON.
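To make the comparison concrete, here is a minimal sketch that writes the field names once and then one positional tuple per row, and measures the result against `System.Text.Json` output for the same rows. `SchemaOnceSketch`, `Row`, `Encode`, and `CompareLengths` are hypothetical names used only for illustration; they are not part of the Asun package, and the sketch assumes values that need no quoting or escaping.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Hypothetical helper for illustration only; not part of the Asun package.
public static class SchemaOnceSketch
{
    public record Row(long Id, string Name, bool Active);

    // Writes the field names once, then one positional tuple per row.
    // Assumes values need no quoting or escaping.
    public static string Encode(IEnumerable<Row> rows)
    {
        var tuples = rows.Select(r => $"({r.Id},{r.Name},{(r.Active ? "true" : "false")})");
        return "[{id,name,active}]:" + string.Join(",", tuples);
    }

    // Returns the character length of the schema-once text and of the JSON text
    // produced by System.Text.Json for the same rows.
    public static (int Asun, int Json) CompareLengths(IReadOnlyList<Row> rows)
    {
        string json = JsonSerializer.Serialize(rows);
        return (Encode(rows).Length, json.Length);
    }
}
```

Because the header cost is paid once, the gap between the two lengths grows with the number of rows. Character counts are only a rough proxy for token counts, which depend on the tokenizer.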

Quick Start

Add the Asun NuGet package:

dotnet add package Asun

The published NuGet package is a single package that includes assets for both net8.0 and net10.0.

If your app targets a specific runtime, you can pin it explicitly in your project file:

<TargetFramework>net8.0</TargetFramework>

or:

<TargetFramework>net10.0</TargetFramework>

Define a Schema Type

using Asun;

record User(long Id, string Name, bool Active) : IAsunSchema
{
    static readonly string[] _names = ["id", "name", "active"];
    static readonly string?[] _types = ["int", "str", "bool"];
    public ReadOnlySpan<string> FieldNames => _names;
    public ReadOnlySpan<string?> FieldTypes => _types;
    public object?[] FieldValues => [Id, Name, Active];

    public static User FromFields(Dictionary<string, object?> m) =>
        new(Convert.ToInt64(m["id"]), (string)m["name"]!, Convert.ToBoolean(m["active"]));
}

Serialize & Deserialize

var user = new User(1, "Alice", true);

// Encode
var s = Asun.Asun.encode(user);
// => "{id,name,active}:(1,Alice,true)"

// Encode with scalar type hints
var typed = Asun.Asun.encodeTyped(user);
// => "{id@int,name@str,active@bool}:(1,Alice,true)"

// Decode
var u2 = Asun.Asun.decodeWith(s, User.FromFields);
// u2 == user ✓

Vec Serialization (Schema-Driven)

For List<T>, ASUN writes the schema once and emits each element as a compact tuple — the key advantage over JSON:

var users = new List<User> {
    new(1, "Alice", true),
    new(2, "Bob", false),
};

var s = Asun.Asun.encode<User>(users);
// => "[{id,name,active}]:(1,Alice,true),(2,Bob,false)"

var users2 = Asun.Asun.decodeListWith(s, User.FromFields);
// users2.Count == 2 ✓

Binary Format

// Zero-copy binary encoding (BinaryPrimitives, no intermediate allocation)
var bin = Asun.Asun.encodeBinary(user);

var u3 = Asun.Asun.decodeBinaryWith(bin,
    new[] { "id", "name", "active" },
    new[] { FieldType.Int, FieldType.String, FieldType.Bool },
    User.FromFields);

Pretty Format

var pretty = Asun.Asun.encodePretty(user);
// => "{id, name, active}:(1, Alice, true)"

var prettyTyped = Asun.Asun.encodePrettyTyped(user);
// => "{id@int, name@str, active@bool}:(1, Alice, true)"

Supported Types

Type	ASUN Representation	Example
int	Plain number	`42`, `-100`
float	Decimal number	`3.14`, `-0.5`
bool	Literal	`true`, `false`
str	Unquoted or quoted	`Alice`, `"Carol Smith"`
Optional	Value or empty	`hello` or (blank)
List<T>	`[v1,v2,v3]`	`[rust,go,python]`
Nested struct	`(field1,field2)`	`(Engineering,500000)`

Native Dictionary<K,V> / map fields are intentionally unsupported in the current ASUN format. If you need keyed collections, model them explicitly as entry-list arrays such as:

{attrs@[{key@str,value@int}]}:([(age,30),(score,95)])
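One way to produce that shape from an existing dictionary is sketched below. `EntryListSketch.ToEntryList` is a hypothetical helper, not an Asun API; it only renders the data part of the `attrs` field and assumes keys that need no quoting. Keys are sorted so the output does not depend on dictionary enumeration order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only; not part of the Asun package.
public static class EntryListSketch
{
    // Renders a map as an ASUN-style array of (key,value) tuples.
    // Keys are sorted with ordinal comparison so the output is deterministic.
    // Assumes keys contain no characters that would require quoting.
    public static string ToEntryList(IReadOnlyDictionary<string, int> map)
    {
        var entries = map
            .OrderBy(kv => kv.Key, StringComparer.Ordinal)
            .Select(kv => $"({kv.Key},{kv.Value})");
        return "[" + string.Join(",", entries) + "]";
    }
}
```

In a real schema type, the same idea would more likely be expressed as a `List<T>` of a small entry record with `key` and `value` fields, so that the library's own encoder handles quoting.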

Nested Structs

record Dept(string Title) : IAsunSchema { /* ... */ }
record Employee(string Name, Dept Dept) : IAsunSchema { /* ... */ }

// Schema reflects nesting:
// {name@str,dept@{title@str}}:(Alice,(Engineering))

Optional Fields

// With value: {id,label}:(1,hello)
// With null:  {id,label}:(1,)

Arrays

{name,tags}:(Alice,[rust,go,python])

Comments

/* user list */
[{id@int, name@str, active@bool}]:(1,Alice,true),(2,Bob,false)

Multiline Format

[{id@int, name@str, active@bool}]:
  (1, Alice, true),
  (2, Bob, false),
  (3, "Carol Smith", true)

API Reference

Function	Description
`Asun.encode(T)`	Serialize struct → schema without scalar hints
`Asun.encodeTyped(T)`	Serialize struct → schema with scalar type hints
`Asun.encode<T>(List<T>)`	Serialize list → schema without scalar hints (written once)
`Asun.encodeTyped<T>(List<T>)`	Serialize list → schema with scalar type hints
`Asun.decode(string)`	Deserialize → field bag (`Dictionary<string, object?>`)
`Asun.decodeWith<T>(s, fn)`	Deserialize → typed T via factory
`Asun.decodeListWith<T>(s, fn)`	Deserialize → List<T> via factory
`Asun.encodeBinary(T)`	Binary encode (zero-copy BinaryPrimitives)
`Asun.decodeBinaryWith<T>(…)`	Binary decode → typed T
`Asun.encodePretty(T)`	Pretty-format encode
`Asun.encodePrettyTyped(T)`	Pretty-format with scalar type hints

Benchmark Output

Run the bundled benchmark with:

dotnet run --project examples/Bench/Asun.Examples.Bench.csproj -c Release

Headline numbers:

  Flat struct × 500 (8 fields, vec)
    Serialize:   JSON 16.22ms/60784B | ASUN 10.11ms(1.6x)/28327B(46.6%) | BIN 4.92ms(3.3x)/37230B(61.2%)
    Deserialize: JSON    22.09ms | ASUN     5.70ms(3.9x) | BIN     2.11ms(10.5x)

Actual timings vary by runtime, CPU, and whether you run Debug or Release.

Why ASUN Performs Well

Zero key-hashing — Schema parsed once; fields mapped by position index O(1), no per-row key string hashing.
Schema-driven parsing — The deserializer knows the expected type of each field, so it can parse values directly without probing for the type first, which keeps branches predictable.
Minimal allocation — All rows share one schema reference. ArrayPool, stackalloc, ReadOnlySpan<char> everywhere.
SIMD acceleration — SearchValues<char> auto-selects SSE2/AVX2/AdvSimd for character scanning.
Zero-copy decode — Parsing operates directly on ReadOnlySpan<char>, no intermediate string allocation.
Schema caching — Encoder caches schema header strings per type; decoder caches parsed field name arrays.
Zero-boxing WriteValues — Direct typed field writes bypass object?[] allocation entirely.
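The positional mapping in the first point can be sketched in a few lines. This is not the library's decoder: `PositionalDecodeSketch` is a hypothetical name, the sketch handles only a flat schema without type hints, quoting, escaping, or nesting, and it allocates substrings for clarity instead of working on spans.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not the library's decoder.
public static class PositionalDecodeSketch
{
    // Parses text such as "[{a,b,c}]:(1,x,true),(2,y,false)" into raw string values.
    // Simplifications: flat schema, no type hints, no quoting, escaping, or nesting.
    public static (string[] Fields, List<string[]> Rows) Decode(string text)
    {
        int open = text.IndexOf('{');
        int close = text.IndexOf('}');
        if (open < 0 || close < open)
            throw new FormatException("Schema header not found.");

        // The header is split once; every row is then matched against this array.
        string[] fields = text.Substring(open + 1, close - open - 1).Split(',');

        var rows = new List<string[]>();
        int pos = text.IndexOf(':', close) + 1;
        while (true)
        {
            int start = text.IndexOf('(', pos);
            if (start < 0) break;
            int end = text.IndexOf(')', start);
            if (end < 0)
                throw new FormatException("Unterminated row.");

            string[] values = text.Substring(start + 1, end - start - 1).Split(',');
            if (values.Length != fields.Length)
                throw new FormatException("Row width does not match the schema.");

            // values[i] belongs to fields[i]; no per-row key lookup is needed.
            rows.Add(values);
            pos = end + 1;
        }
        return (fields, rows);
    }
}
```

Each row is matched to the schema by index, so the field names are read exactly once no matter how many rows follow.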

C# Performance Techniques Used

ArrayPool<char> / ArrayPool<byte> for writer buffers — zero GC pressure
ThreadLocal writer reuse for single-struct encode — no rent/return overhead
Schema header caching via ConcurrentDictionary<Type, string>
Decoded schema caching via ConcurrentDictionary<int, string[]>
Zero-boxing WriteValues / WriteBinaryValues interface methods
stackalloc for integer/float formatting
ReadOnlySpan<char> for all parsing — no string copies
BinaryPrimitives for little-endian binary I/O — direct memory operations
SearchValues<char> (.NET 8+, package targets net8.0 and net10.0) — hardware-accelerated character scanning
ref struct for decoder state — fully stack-allocated
[MethodImpl(MethodImplOptions.AggressiveInlining)] on hot paths
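Two of the techniques listed above can be tried in isolation with standard .NET 8 APIs. The sketch below is illustrative only: the delimiter set and the 8-byte integer layout are assumptions, not the library's actual internals or wire format, and `TechniqueSketch` is a hypothetical name.

```csharp
using System;
using System.Buffers;
using System.Buffers.Binary;

// Illustrative only; the delimiter set and the 8-byte layout are assumptions,
// not the library's internals.
public static class TechniqueSketch
{
    // SearchValues precomputes a lookup structure that IndexOfAny can vectorize
    // where the hardware allows it.
    private static readonly SearchValues<char> Delimiters =
        SearchValues.Create(",()[]\"");

    // Returns the length of the leading run of characters that contains no delimiter.
    public static int ScanToken(ReadOnlySpan<char> input)
    {
        int i = input.IndexOfAny(Delimiters);
        return i < 0 ? input.Length : i;
    }

    // Writes a long as 8 little-endian bytes into a stack buffer and reads it back.
    public static long RoundTrip(long value)
    {
        Span<byte> buffer = stackalloc byte[sizeof(long)];
        BinaryPrimitives.WriteInt64LittleEndian(buffer, value);
        return BinaryPrimitives.ReadInt64LittleEndian(buffer);
    }
}
```

`ScanToken` works on a span, so slicing the token out of the input afterwards does not copy it, and `RoundTrip` touches no heap memory.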

Examples

# Basic usage
dotnet run --project examples/Basic

# Complex nested structures, escaping, 5-level deep nesting
dotnet run --project examples/Complex

# Performance benchmark (ASUN vs JSON)
dotnet run --project examples/Bench -c Release

If you have both target frameworks enabled locally, you can run a specific one:

dotnet run --project examples/Basic -f net8.0
dotnet run --project examples/Basic -f net10.0

ASUN Format Specification

See the full ASUN Spec for syntax rules, BNF grammar, escape rules, type system, and LLM integration best practices.

Syntax Quick Reference

Element	Schema	Data
Object	`{field1@type,field2@type}`	`(val1,val2)`
Array	`field@[type]`	`[v1,v2,v3]`
Object array	`field@[{f1@type,f2@type}]`	`[(v1,v2),(v3,v4)]`
Nested object	`field@{f1@type,f2@type}`	`(v1,(v3,v4))`
Null	—	(blank)
Empty string	—	`""`
Comment	—	`/* ... */`

License

MIT

Contributors

Athan
