Summary
GGUFTokenizer.encodeBPE() (in llm-core/.../tokenizer/GGUFTokenizer.kt) does not implement byte-level BPE correctly for GPT-2/Qwen-family tokenizers.
When used to encode text containing chat-template special tokens (e.g., <|im_start|>) or arbitrary Unicode, it produces broken token sequences that cause the model to generate nonsense output (CJK characters, URL-encoded fragments, HTML entities).
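The mojibake in the output is characteristic of byte-level BPE: GPT-2-family tokenizers remap every raw byte to a printable Unicode codepoint before merging, so a space becomes `Ġ` (U+0120) and a newline becomes `Ċ` (U+010A). A minimal sketch of that byte-to-unicode table (the function name here is illustrative, not the project's API):

```kotlin
// Sketch of GPT-2's byte-to-unicode mapping, which byte-level BPE
// tokenizers (GPT-2, Qwen) require before any merges are applied.
fun bytesToUnicode(): Map<Int, Char> {
    // Printable bytes that map to themselves: '!'..'~', '¡'..'¬', '®'..'ÿ'.
    val bs = (('!'.code..'~'.code) + ('¡'.code..'¬'.code) + ('®'.code..'ÿ'.code)).toMutableList()
    val cs = bs.toMutableList()
    var n = 0
    for (b in 0..255) {
        if (b !in bs) {
            // Remap unprintable bytes into the 256+ range to keep them visible.
            bs.add(b)
            cs.add(256 + n)
            n++
        }
    }
    return bs.zip(cs.map { it.toChar() }).toMap()
}

fun main() {
    val map = bytesToUnicode()
    println(map[' '.code])   // Ġ — why spaces show up as Ġ in vocab entries
    println(map['\n'.code])  // Ċ — why newlines show up as Ċ
}
```

If `encodeBPE()` skips this mapping (or its inverse on decode), the merge table never matches and the model emits exactly the kind of `Ġ`/`Ċ`/CJK salad seen in the Symptoms section below.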
This affects both file formats, not just GGUF:
GGUFTokenizer.fromRandomAccessSource(gguf) — broken for Qwen GGUF models
GGUFTokenizer.fromTokenizerJson(json) — broken for Qwen SafeTensors models (same code path)
The bug hasn't surfaced on SafeTensors simply because no Qwen SafeTensors model has been tested in this project yet. All SafeTensors testing so far used LLaMA/Gemma (SentencePiece), which goes through a different, working code path.
Tokenizer selection should be per-architecture (per tokenizer type), not per file format.
A Qwen model needs byte-level BPE whether its weights come from .gguf or .safetensors. A LLaMA model needs SentencePiece regardless of format.
This blocks tool calling and chat mode for Qwen2, Qwen2.5, Qwen3, Mistral-Nemo, and any other model that uses GPT-2-style byte-level BPE.
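As a sketch of what per-tokenizer-type selection could look like: GGUF carries the tokenizer family in the `tokenizer.ggml.model` metadata key (`"gpt2"` vs `"llama"`), and tokenizer.json carries it in `model.type` (`"BPE"` vs `"Unigram"`). The enum and function names below are hypothetical, not existing project API:

```kotlin
// Hypothetical sketch: choose the tokenizer from metadata, never from
// the weight file's extension.
enum class TokenizerKind { BYTE_LEVEL_BPE, SENTENCEPIECE }

fun tokenizerKindFor(tokenizerModel: String): TokenizerKind = when (tokenizerModel) {
    // GGUF "tokenizer.ggml.model" = "gpt2", tokenizer.json model.type = "BPE"
    "gpt2", "BPE" -> TokenizerKind.BYTE_LEVEL_BPE      // Qwen2/2.5/3, Mistral-Nemo
    // GGUF "tokenizer.ggml.model" = "llama", tokenizer.json model.type = "Unigram"
    "llama", "Unigram" -> TokenizerKind.SENTENCEPIECE  // LLaMA, Gemma
    else -> error("unknown tokenizer model: $tokenizerModel")
}
```

With this shape, both `fromRandomAccessSource` and `fromTokenizerJson` would dispatch to the same byte-level BPE implementation for a Qwen model regardless of whether the weights came from .gguf or .safetensors.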
Symptoms
Qwen2.5-0.5B-Instruct tool-calling demo
./gradlew :llm-apps:kllama-cli:run --args="-m Qwen2.5-0.5B-Instruct-Q8_0.gguf --demo 'What is 2 + 2?'"
Model loads successfully (tied embeddings, Q8_0 SIMD, qwen chat template auto-detected). The agent loop then produces:
Assistant: footingök JSONExceptionzm.bzéħ¬Ùĥتreckæľ¬ç½ijaira ?>>
<?ä¸ĢæĹ¥/ĊĊĊĊannppers ... (continues with random CJK, HTML, URL fragments)
Expected: either a plain answer (2 + 2 = 4) or a <tool_call>{"name":"calculator",...}</tool_call> XML block.
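Part of the fix is that special tokens such as `<|im_start|>` must be split out of the input before byte-level BPE runs; otherwise they get byte-encoded like ordinary text and the chat template is destroyed. A minimal pre-splitting sketch (the function name is illustrative, not the project's API):

```kotlin
// Hypothetical sketch: isolate special tokens as whole segments so the
// BPE stage only ever sees the plain-text spans between them.
fun splitOnSpecials(text: String, specials: Set<String>): List<String> {
    if (specials.isEmpty()) return listOf(text)
    val re = Regex(specials.joinToString("|") { Regex.escape(it) })
    val out = mutableListOf<String>()
    var last = 0
    for (m in re.findAll(text)) {
        if (m.range.first > last) out.add(text.substring(last, m.range.first))
        out.add(m.value)           // emit the special token as its own segment
        last = m.range.last + 1
    }
    if (last < text.length) out.add(text.substring(last))
    return out
}

fun main() {
    val parts = splitOnSpecials(
        "<|im_start|>user\nhi<|im_end|>",
        setOf("<|im_start|>", "<|im_end|>")
    )
    println(parts)  // [<|im_start|>, user\nhi, <|im_end|>]
}
```

Each special segment is then looked up directly in the vocab as a single token id, and only the in-between segments go through the byte-to-unicode mapping and merge loop.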