Context
The current default decoder-only model for enrichment and consolidation is mlx-community/Ministral-3-3B-Instruct-2512-4bit. Google released Gemma 4 on April 2, 2026, and the E4B variant is a strong candidate to replace it.
Why Gemma 4 E4B
- Native structured JSON output / function calling -- reduces parsing failures in the enrichment pipeline (currently needs the fallback _parse_enrichment_text parser)
- 128K context window (vs ~32K) -- could eliminate the multi-window enrichment workaround for long transcripts
- ~4B effective params at similar memory footprint (~2.5 GB 4-bit)
- Apache 2.0 license (same as Ministral)
- Configurable reasoning mode could improve consolidation quality
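The memory claim above can be sanity-checked with a back-of-envelope estimate. This is a sketch only; the 10% overhead for quantization scales and any unquantized layers is an assumption, and the real footprint depends on group size and how PLE layers end up being handled:

```python
# Rough memory estimate for a 4-bit quantized model.
# Assumes 4 bits per weight plus ~10% overhead for quantization scales
# and unquantized layers (an assumption, not a measured figure).
def approx_4bit_footprint_gb(n_params: float, overhead: float = 0.10) -> float:
    bytes_per_param = 0.5  # 4 bits = half a byte
    return n_params * bytes_per_param * (1 + overhead) / 1e9

# ~4B effective params -> roughly 2.2 GB, consistent with the ~2.5 GB cited
print(f"{approx_4bit_footprint_gb(4e9):.1f} GB")
```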
Blocker: PLE quantization bug
As of April 11, 2026, standard MLX 4-bit quants from mlx-community and unsloth produce garbage output because they incorrectly quantize Per-Layer Embedding (PLE) layers. This is a novel architecture feature in Gemma 4's edge models.
Tracking: https://huggingface.co/mlx-community/gemma-4-e2b-4bit/discussions/1
Workarounds:
- bf16 versions work but use ~10 GB (too large for background enrichment)
- PLE-safe community quant exists: FakeRockert543/gemma-4-e4b-it-MLX-4bit
- Upstream fix expected soon (model is 9 days old)
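Before any candidate quant enters the A/B run, a cheap smoke test could flag the garbage-output failure mode. Runaway repetition is one common symptom of a broken quant; the heuristic and threshold below are assumptions, not a validated detector:

```python
def looks_degenerate(text: str, min_unique_ratio: float = 0.5) -> bool:
    """Flag output where too few distinct tokens repeat too often --
    a crude proxy for the degenerate output a mis-quantized model produces."""
    tokens = text.split()
    if len(tokens) < 10:
        return False  # too short to judge
    unique_ratio = len(set(tokens)) / len(tokens)
    return unique_ratio < min_unique_ratio

print(looks_degenerate("the the the the the the the the the the"))  # True
print(looks_degenerate("Gemma 4 E4B produced a coherent summary of the meeting notes"))  # False
```

A quant that trips this check on a handful of standard prompts can be rejected without running the full evaluation suite.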
Acceptance criteria
- Wait for stable PLE-safe 4-bit MLX quant (either upstream fix or validated community quant)
- Run controlled A/B on enrichment pipeline: same transcripts, compare JSON compliance rate and extraction quality between Ministral 3.3B and Gemma 4 E4B
- Run LOCOMO conv3 gate with Gemma 4 E4B enrichment to check for regression
- If results are equal or better, update DEFAULT_DECODER_MODEL in _model_router.py and DEFAULTS["consolidation"] in config.py
- Update eval_config.json with new model references
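The JSON compliance metric for the A/B comparison could be as simple as counting strict-parse successes per model. A minimal sketch; the sample outputs and field names are illustrative, not taken from the actual pipeline:

```python
import json

def json_compliance_rate(outputs: list[str]) -> float:
    """Fraction of raw model outputs that parse as a JSON object
    without needing any fallback text parser."""
    ok = 0
    for raw in outputs:
        try:
            if isinstance(json.loads(raw), dict):
                ok += 1
        except json.JSONDecodeError:
            pass
    return ok / len(outputs) if outputs else 0.0

# Illustrative outputs from the same transcripts through each model
ministral_outputs = ['{"entities": ["Alice"]}', 'Entities: Alice', '{"entities": []}']
gemma_outputs = ['{"entities": ["Alice"]}', '{"entities": []}', '{"entities": ["Bob"]}']
print(json_compliance_rate(ministral_outputs))  # 0.666...
print(json_compliance_rate(gemma_outputs))      # 1.0
```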
Files involved
- src/synapt/recall/_model_router.py (DEFAULT_DECODER_MODEL)
- src/synapt/recall/config.py (DEFAULTS dict)
- evaluation/eval_config.json (enrichment_model entries)
- src/synapt/recall/enrich.py (may simplify the _parse_enrichment_text fallback if JSON compliance improves)
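If Gemma 4 E4B's structured output proves reliable, enrich.py could collapse to a strict-parse-first path that keeps the text parser only as a safety net. A sketch under assumptions: the fallback below is a placeholder for the real _parse_enrichment_text, and the field names are hypothetical:

```python
import json

def parse_enrichment(raw: str) -> dict:
    """Try strict JSON first; fall back to lenient text parsing on failure."""
    try:
        parsed = json.loads(raw)
        if isinstance(parsed, dict):
            return parsed
    except json.JSONDecodeError:
        pass
    return _parse_enrichment_text_fallback(raw)

def _parse_enrichment_text_fallback(raw: str) -> dict:
    # Placeholder for the existing lenient parser in enrich.py (an assumption);
    # retained as a safety net even if strict parsing succeeds almost always.
    return {"raw_text": raw.strip()}

print(parse_enrichment('{"summary": "standup notes"}'))  # {'summary': 'standup notes'}
print(parse_enrichment("summary: standup notes"))        # {'raw_text': 'summary: standup notes'}
```

Logging how often the fallback branch fires would also give an ongoing compliance metric for free after the swap.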