bitc

Plain C inference for PrismML's 1-bit Bonsai Qwen3-based GGUF models. Reads Q1_0_g128 quantized weights and runs the forward pass. No dependencies beyond libc and libm.
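To give a rough feel for what reading Q1_0_g128 weights involves, here is a minimal sketch. It assumes each group of 128 weights shares one scale and that each weight is a single bit selecting +scale or -scale. The struct layout, the LSB-first bit order, the float scale (the real format stores FP16), and the names q1_block, q1_dequantize_block and q1_dot_block are all assumptions for illustration, not the code in this repo. llama.cpp remains the reference for the actual layout.

```c
#include <stddef.h>
#include <stdint.h>

#define GROUP_SIZE 128

/* Hypothetical block: one scale plus 128 sign bits (16 bytes).
 * Assumption: the real format keeps the scale as FP16; a float is
 * used here only to keep the sketch short. */
typedef struct {
    float scale;
    uint8_t bits[GROUP_SIZE / 8];
} q1_block;

/* Expand one block into 128 floats.
 * Assumed mapping: bit 1 -> +scale, bit 0 -> -scale, LSB first. */
static void q1_dequantize_block(const q1_block *b, float *out) {
    for (size_t i = 0; i < GROUP_SIZE; i++) {
        int bit = (b->bits[i / 8] >> (i % 8)) & 1;
        out[i] = bit ? b->scale : -b->scale;
    }
}

/* Dot product of one block with 128 activations, without
 * materializing the weights: add or subtract each activation,
 * then multiply by the scale once at the end. */
static float q1_dot_block(const q1_block *b, const float *x) {
    float sum = 0.0f;
    for (size_t i = 0; i < GROUP_SIZE; i++) {
        int bit = (b->bits[i / 8] >> (i % 8)) & 1;
        sum += bit ? x[i] : -x[i];
    }
    return b->scale * sum;
}
```

Under these assumptions a matrix row is a sequence of such blocks, and a matmul is the sum of q1_dot_block over the row, which is why no multiplications are needed inside the inner loop.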

What it do?

Nothing here is new. This is just a from-scratch C implementation written to learn how inference works. The architecture is Qwen3, the file format is GGUF, the quantization scheme is Q1_0_g128, and RoPE uses YaRN scaling. All of this already exists in llama.cpp, which is the source of truth. If you want a real engine, use that.
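As a small taste of the GGUF file format, the sketch below checks the fixed-size header at the start of a file: the 4-byte magic "GGUF", a 32-bit version, then 64-bit tensor and metadata key/value counts, all little-endian (this is the layout for GGUF v2 and later). The names gguf_header and gguf_parse_header are hypothetical and are not taken from gguf.c in this repo.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fixed-size fields at the start of a GGUF v2+ file. */
typedef struct {
    uint32_t version;
    uint64_t tensor_count;
    uint64_t metadata_kv_count;
} gguf_header;

/* Read little-endian integers byte by byte so the code does not
 * depend on host endianness or alignment. */
static uint32_t read_u32_le(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static uint64_t read_u64_le(const uint8_t *p) {
    return (uint64_t)read_u32_le(p) | ((uint64_t)read_u32_le(p + 4) << 32);
}

/* Hypothetical helper: returns 0 on success, -1 if the buffer is
 * shorter than the 24-byte header or the magic does not match. */
static int gguf_parse_header(const uint8_t *buf, size_t len, gguf_header *out) {
    if (len < 24 || memcmp(buf, "GGUF", 4) != 0) {
        return -1;
    }
    out->version = read_u32_le(buf + 4);
    out->tensor_count = read_u64_le(buf + 8);
    out->metadata_kv_count = read_u64_le(buf + 16);
    return 0;
}
```

The metadata key/value pairs and tensor descriptors follow this header; parsing them is the bulk of the work and is not shown here.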

Build!

gcc -O3 -o bitc main.c gguf.c model.c inference.c tokenizer.c -lm

Run!

Drop a Bonsai-1.7B.gguf into models/ and run:

./bitc

It will prompt for input and stream a reply.

What's next?

I plan to keep learning how inference works and to figure out what other weird things I can try.

Files!

gguf.{c,h}: GGUF parser
model.{c,h}: Qwen3 weight loading
inference.{c,h}: forward pass, RoPE, attention, SwiGLU, sampling
tokenizer.{c,h}: BPE merges + byte-level encoding
main.c: chat loop with the <|im_start|>/<|im_end|> template
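For the chat template mentioned in the last entry, a single user turn can be sketched roughly as below. This assumes the usual ChatML-style layout (role name after <|im_start|>, message, <|im_end|>, then an opened assistant turn); the exact string main.c builds, including any system prompt, may differ. format_user_turn is a hypothetical helper, not a function from this repo.

```c
#include <stdio.h>

/* Hypothetical helper: wrap one user message in the
 * <|im_start|>/<|im_end|> template and open the assistant turn so
 * the model continues from there.
 * Returns the number of bytes written (excluding the NUL), or -1 if
 * the buffer is too small. */
static int format_user_turn(char *buf, size_t cap, const char *user_msg) {
    int n = snprintf(buf, cap,
                     "<|im_start|>user\n%s<|im_end|>\n<|im_start|>assistant\n",
                     user_msg);
    if (n < 0 || (size_t)n >= cap) {
        return -1; /* output was truncated or formatting failed */
    }
    return n;
}
```

The resulting string would then be tokenized and fed through the forward pass, with generation stopping once the model emits <|im_end|>.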
