feat(openviking-plugin): native Session+Search integration with memory-first retrieval and observable fallback #2

Open
mczabca-boop wants to merge 14 commits into steven1522:feature/plugin-system from mczabca-boop:feat/openviking-on-pr1-base

mczabca-boop commented Feb 24, 2026

Summary

This PR continues the OpenViking integration/pluginization line (TinyAGI#127 -> #1 -> #2) and adds the next stabilization layer focused on latency control, gate behavior, and runtime reliability.

In short: OpenViking remains fully pluginized, but prefetch is no longer “always-on”. It is now gated by rules first, with an optional LLM decision for ambiguous cases.

What This PR Adds

1) Rule-first prefetch gate inside openviking-context plugin

  • Adds prefetch gate modes:
    • always
    • never
    • rule (default)
    • rule_then_llm
  • Adds configurable gate controls:
    • prefetch_force_patterns
    • prefetch_skip_patterns
    • prefetch_rule_threshold
    • prefetch_llm_ambiguity_low
    • prefetch_llm_ambiguity_high
    • prefetch_llm_timeout_ms
  • Force/skip/rule/llm decisions are all plugin-internal (no OpenViking-specific branching in core queue flow).
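
The gate order described above could be sketched roughly as follows. This is an illustration only, not the code in prefetch-gate.ts: the mode names and config keys come from this description, while `evaluateGate`, `toyScore`, the substring matching, the `ambiguous` verdict, and the mapping of `always` to `force` are assumptions made for the example.

```typescript
// Hypothetical sketch; only the mode names and config keys are from the PR text.
type GateMode = "always" | "never" | "rule" | "rule_then_llm";
type RuleVerdict = "force" | "rule_yes" | "rule_no" | "disabled" | "ambiguous";

interface GateConfig {
  prefetch_gate_mode: GateMode;
  prefetch_force_patterns: string[];
  prefetch_skip_patterns: string[];
  prefetch_rule_threshold: number;
  prefetch_llm_ambiguity_low: number;
  prefetch_llm_ambiguity_high: number;
}

// Assumption: a stand-in heuristic. The real scorer is not described in the PR.
function toyScore(message: string): number {
  const cues = ["remember", "last time", "previously", "之前", "上次"];
  const text = message.toLowerCase();
  const hits = cues.filter((cue) => text.includes(cue)).length;
  return Math.min(1, hits / 2);
}

// Assumption: patterns are matched as case-insensitive substrings.
function matchesAny(message: string, patterns: string[]): boolean {
  const text = message.toLowerCase();
  return patterns.some((p) => text.includes(p.toLowerCase()));
}

function evaluateGate(message: string, cfg: GateConfig): RuleVerdict {
  if (cfg.prefetch_gate_mode === "never") return "disabled";
  if (cfg.prefetch_gate_mode === "always") return "force"; // assumed mapping
  // Explicit patterns win over the score; force is checked before skip.
  if (matchesAny(message, cfg.prefetch_force_patterns)) return "force";
  if (matchesAny(message, cfg.prefetch_skip_patterns)) return "rule_no";
  const score = toyScore(message);
  // Only rule_then_llm hands scores inside the ambiguity band to the LLM gate.
  if (
    cfg.prefetch_gate_mode === "rule_then_llm" &&
    score >= cfg.prefetch_llm_ambiguity_low &&
    score < cfg.prefetch_llm_ambiguity_high
  ) {
    return "ambiguous";
  }
  return score >= cfg.prefetch_rule_threshold ? "rule_yes" : "rule_no";
}
```

In this sketch the core queue flow never sees the patterns or the score; it would only receive the final verdict from the plugin.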

2) Expanded bilingual trigger vocabulary (CN/EN)

  • Extended force/skip phrase coverage significantly in:
    • plugin rule evaluator
    • setup defaults
    • runtime defaults
  • Improves practical recall/precision for real user phrasing in Chinese + English.

3) Optional LLM gate for ambiguous cases only

  • In rule_then_llm mode, the LLM is called only when the rule verdict is ambiguous.
  • LLM output is parsed as structured JSON (need_memory, reason).
  • On timeout or error, the gate falls back to no-prefetch (llm_no) so the main response path is never blocked.
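
A minimal sketch of that behavior, assuming the model call is passed in as a function. The names `llmGate`, `parseGateReply`, and `callModel`, and the reason strings `llm_timeout`, `llm_error`, and `parse_error`, are hypothetical; only the `need_memory` and `reason` fields and the `llm_yes`/`llm_no` outcomes come from this description.

```typescript
type LlmDecision = { decision: "llm_yes" | "llm_no"; reason: string };

// Anything that is not JSON with a boolean need_memory is treated as "no".
function parseGateReply(raw: string): LlmDecision {
  try {
    const parsed: unknown = JSON.parse(raw);
    if (typeof parsed === "object" && parsed !== null) {
      const rec = parsed as Record<string, unknown>;
      const need = rec["need_memory"];
      const why = rec["reason"];
      if (typeof need === "boolean") {
        return {
          decision: need ? "llm_yes" : "llm_no",
          reason: typeof why === "string" ? why : "",
        };
      }
    }
  } catch {
    // Invalid JSON falls through to the default below.
  }
  return { decision: "llm_no", reason: "parse_error" }; // hypothetical reason string
}

// Resolves with whichever comes first: the parsed reply, an error fallback,
// or the timeout fallback. It never rejects, so callers are not blocked.
function llmGate(
  callModel: (prompt: string) => Promise<string>,
  message: string,
  timeoutMs: number,
): Promise<LlmDecision> {
  return new Promise<LlmDecision>((resolve) => {
    const timer = setTimeout(
      () => resolve({ decision: "llm_no", reason: "llm_timeout" }),
      timeoutMs,
    );
    Promise.resolve()
      .then(() => callModel(message))
      .then(parseGateReply, (): LlmDecision => ({ decision: "llm_no", reason: "llm_error" }))
      .then((decision) => {
        clearTimeout(timer);
        resolve(decision); // a no-op if the timeout already resolved
      });
  });
}
```

Here `timeoutMs` plays the role of prefetch_llm_timeout_ms. A late model reply is simply ignored once the timeout has resolved the promise.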

4) Better hook-budget awareness in beforeModel

  • Gate logs now include:
    • hook_budget_ms
    • hook_remaining_ms
    • prefetch_timeout_effective_ms
  • If the remaining hook budget is insufficient, the plugin skips prefetch and records an explicit reason (instead of timing out silently).
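
One way to picture the budget check is the sketch below. The three logged field names are taken from the list above; `planPrefetch`, the safety margin, the minimum useful timeout, and the `insufficient_hook_budget` reason string are assumptions for illustration.

```typescript
interface BudgetDecision {
  run: boolean;
  reason: string;
  hook_budget_ms: number;
  hook_remaining_ms: number;
  prefetch_timeout_effective_ms: number;
}

// Hypothetical helper: clamps the prefetch timeout to what is left of the
// beforeModel hook budget and skips prefetch when too little time remains.
function planPrefetch(
  hookBudgetMs: number,
  elapsedMs: number,
  prefetchTimeoutMs: number,
  minUsefulMs = 200, // assumed floor below which prefetch is not worth starting
  safetyMarginMs = 50, // assumed headroom left for the rest of the hook
): BudgetDecision {
  const remaining = Math.max(0, hookBudgetMs - elapsedMs);
  const effective = Math.max(0, Math.min(prefetchTimeoutMs, remaining - safetyMarginMs));
  const run = effective >= minUsefulMs;
  return {
    run,
    reason: run ? "ok" : "insufficient_hook_budget",
    hook_budget_ms: hookBudgetMs,
    hook_remaining_ms: remaining,
    prefetch_timeout_effective_ms: effective,
  };
}
```

With a 3000 ms budget, 500 ms already spent, and a 2000 ms prefetch timeout, this sketch keeps the full 2000 ms. With 2900 ms already spent it skips prefetch up front rather than starting a call that cannot finish.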

5) Stability hardening around OpenViking runtime

  • Adds vectordb dimension consistency guard in daemon startup:
    • compares expected dim (config) vs actual dim (runtime data metadata)
    • on mismatch, automatically backs up and rebuilds the runtime data directory
  • Reduces recurring “expected X, got Y” instability loops.
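
The decision behind this guard can be sketched as a pure function. The actual guard lives in the daemon shell scripts, so the TypeScript below only illustrates the logic; `checkDimension`, the backup directory naming, and the choice to keep data when no metadata exists are all assumptions.

```typescript
type DimAction =
  | { kind: "keep" }
  | { kind: "rebuild"; backupDir: string; reason: string };

// Hypothetical decision helper: compares the configured dimension with the
// one recorded in runtime metadata and says what startup should do.
function checkDimension(
  expectedDim: number,
  actualDim: number | undefined,
  dataDir: string,
  stamp: string, // caller-supplied suffix, e.g. a date string
): DimAction {
  // Assumption: missing metadata (for example a fresh install) is left alone.
  if (actualDim === undefined || actualDim === expectedDim) {
    return { kind: "keep" };
  }
  return {
    kind: "rebuild",
    backupDir: `${dataDir}.bak-${stamp}`, // assumed naming scheme
    reason: `expected ${expectedDim}, got ${actualDim}`,
  };
}
```

A caller would move `dataDir` to `backupDir` and recreate an empty data directory when the result is `rebuild`, so the old vectors stay recoverable.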

6) Queue processor env loading fix

  • queue-processor now loads .env (dotenv/config), so plugin hook timeouts and other env knobs reliably take effect after a normal restart.
  • Fixes cases where .env had correct values but runtime still used defaults.

Observability Improvements

Per-turn logs now clearly expose gate behavior:

  • prefetch_decision=force|rule_yes|rule_no|llm_yes|llm_no|disabled
  • reason=...
  • LLM gate timing + result (or timeout reason)
  • Prefetch hit distribution remains visible (memory/resource/skill)

Key Files Touched

  • src/plugins/openviking-context/index.ts
  • src/plugins/openviking-context/prefetch-gate.ts
  • src/queue-processor.ts
  • lib/common.sh
  • lib/setup-wizard.sh
  • lib/daemon.sh
  • README.md

Compatibility / Rollback

No change to the core requirement:

  • OpenViking remains optional and pluginized.
  • Disabling OpenViking or the plugin leaves TinyClaw's normal behavior intact.

Feature flags remain compatible:

  • TINYCLAW_OPENVIKING_CONTEXT_PLUGIN=0
  • TINYCLAW_OPENVIKING_SESSION_NATIVE=1
  • TINYCLAW_OPENVIKING_SEARCH_NATIVE=1
  • TINYCLAW_OPENVIKING_PREFETCH=1
  • TINYCLAW_OPENVIKING_AUTOSYNC=1

Gate defaults for new installs remain conservative:

  • prefetch_gate_mode=rule (no LLM gate by default)

Validation Notes

Passed locally:

  • npm run build:main
  • npm run test:prefetch-parser
  • npm run test:openviking-plugin

Manual runtime validation completed for:

  • force path (prefetch_decision=force)
  • rule-no path (prefetch_decision=rule_no)
  • rule-then-llm path (prefetch_decision=llm_yes/llm_no)
  • commit-required memory ingestion behavior (memory appears after successful session commit)

Risk / Follow-up

  • LLM gate latency remains environment-dependent; timeout tuning still matters in production.
  • If teams prefer lower latency by default, keep rule mode and enable rule_then_llm selectively.
  • The current upstream OpenViking version still requires an explicit api_key in ov.conf; pure environment-variable substitution is not yet reliable.
