[Feature]: Support Gemma 4 Jinja Templates (Fixes MissingTemplateException)

### Background & Description

### **Description**

Currently, attempting to load a Gemma 4 GGUF model and evaluate its prompt template using `LLamaTemplate` (or high-level wrappers like `ChatSession`) results in a crash. LLamaSharp throws a `MissingTemplateException` because the native `llama_chat_apply_template` function returns `-1`.

**The Error:**
```text
LLama.Exceptions.MissingTemplateException: llama_chat_apply_template failed: 
template not found for '{%- macro format_parameters(properties, required) -%}
{%- set standard_keys = ['description', 'type', 'properties', 'required', 'nullable'] -%}
{%- set ns = namespace(found_first=false) -%} ...
```

**Root Cause Analysis:**
Gemma 4 introduced a highly complex Jinja template to handle multimodality and native function/tool calling. It expects to iterate over complex objects like `properties`, `required`, and `tools`.

The failure occurs because the upstream `llama.cpp` core C API (`llama_chat_message` struct) currently only accepts `role` and `content`. LLamaSharp's P/Invoke interop struct perfectly mirrors this:
```csharp
_nativeChatMessages[i] = new LLamaChatMessage
{
    role = (byte*)r.Pointer,
    content = (byte*)c.Pointer
};
```
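The native `llama_chat_apply_template` function does not evaluate Jinja at all: per the note in `llama.h`, it does not use a Jinja parser and only supports a predefined list of templates, which it identifies by heuristic matching on the template string. Gemma 4's template does not match any known format, so the function returns `-1`, which LLamaSharp bubbles up as a `MissingTemplateException`. Even with a real Jinja engine, the template would be starved of the tool-calling variables it expects, because only `role` and `content` cross the interop boundary.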

Crucially, **the upstream `llama.cpp` repository currently bypasses `llama_chat_apply_template` for Gemma 4.** It relies on hardcoded C++ workarounds (e.g., specialized template functions) to format Gemma 4 prompts manually, because the template cannot be handled through this API without struct updates.

**Proposed Action Plan:**
Since LLamaSharp cannot marshal tool definitions until `llama.cpp` updates its `llama_chat_message` struct upstream, we need a two-phased approach to support Gemma 4:

**1. Short-Term Fix (Actionable PR):**
Implement a C# equivalent of `llama.cpp`'s specialized template bypass. If a Gemma 4 model is detected (or its specific Jinja signature is read), LLamaSharp should intercept it and apply the `"gemma"` fallback template internally, rather than passing the unparsable Jinja string to the C++ backend and crashing.

**2. Long-Term Fix (Upstream Dependency):**
Once `llama.cpp` overhauls the `llama_chat_apply_template` API to accept tool/JSON schemas, update LLamaSharp's `LLamaChatMessage` P/Invoke struct and the `LLamaTemplate.Apply()` method to marshal C# tool definitions across the boundary.
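
The detection step of the short-term fix could look roughly like the sketch below. It is only an illustration under stated assumptions: `GemmaTemplateFallback`, `LooksLikeGemma4`, and `Resolve` are hypothetical names that do not exist in LLamaSharp, and the marker string is simply the opening macro visible in the error message above, so it may also match other tool-calling templates and would need a more robust signature in a real PR.

```csharp
#nullable enable
using System;

// Hypothetical helper, not part of LLamaSharp.
public static class GemmaTemplateFallback
{
    // Assumption: the opening macro from the error message is used as the
    // Gemma 4 signature. A real implementation may need a stricter check.
    private const string Gemma4Marker = "macro format_parameters(";

    // Name of the built-in template used in the current workaround.
    public const string FallbackTemplateName = "gemma";

    // Returns true when the embedded template text contains the marker.
    public static bool LooksLikeGemma4(string? embeddedTemplate)
    {
        return embeddedTemplate != null
            && embeddedTemplate.IndexOf(Gemma4Marker, StringComparison.Ordinal) >= 0;
    }

    // Returns the fallback name for Gemma 4 style templates and leaves
    // every other template (including null) untouched.
    public static string? Resolve(string? embeddedTemplate)
    {
        return LooksLikeGemma4(embeddedTemplate)
            ? FallbackTemplateName
            : embeddedTemplate;
    }
}
```

The resolved value could then be handed to `new LLamaTemplate(...)` in the same way as the manual workaround, so that callers of high-level wrappers never see the exception.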

**Current Workaround:**
Developers can avoid the exception by skipping extraction of the model's embedded Jinja template and forcing the backend to use its built-in Gemma format:
```csharp
var template = new LLamaTemplate("gemma"); 
```
While this prevents the crash and allows standard text generation, it strips out Gemma 4's native tool-calling and multimodality formatting. Furthermore, developers using high-level wrappers like `ChatSession` often hit this crash before they realize they need to override the template.

**Additional context**
This issue will become a widespread blocker as more developers adopt Gemma 4. Implementing the short-term bypass will prevent the immediate crashes while we wait for the upstream C API to accommodate native tool calling.


[Feature]: Support Gemma 4 Jinja Templates (Fixes MissingTemplateException) #1375
