Skip to content

feat(posthog): add Redis-backed flag definition cache for local evaluation#3384

Open
kilo-code-bot[bot] wants to merge 1 commit into
mainfrom
feat/posthog-redis-flag-cache
Open

feat(posthog): add Redis-backed flag definition cache for local evaluation#3384
kilo-code-bot[bot] wants to merge 1 commit into
mainfrom
feat/posthog-redis-flag-cache

Conversation

@kilo-code-bot
Copy link
Copy Markdown
Contributor

@kilo-code-bot kilo-code-bot Bot commented May 21, 2026

Summary

  • Upgrades posthog-node from 5.10.4 → 5.34.2 to gain the posthog-node/experimental export, which provides the FlagDefinitionCacheProvider interface for distributed flag-definition caching.
  • Adds apps/web/src/lib/posthog-flag-cache.ts: a Redis-backed implementation of FlagDefinitionCacheProvider. Each polling cycle one process wins a SET NX lock, fetches flag definitions from PostHog, writes them to Redis, then releases the lock — all other processes read from the shared cache and skip the API call.
  • Adds redisSetNX to apps/web/src/lib/redis.ts for atomic lock acquisition (mirrors the existing redisGet/redisSet/redisDel helpers).
  • Registers POSTHOG_FLAG_DEFINITIONS_REDIS_KEY and POSTHOG_FLAG_CACHE_LOCK_REDIS_KEY in redis-keys.ts.
  • Wires flagDefinitionCacheProvider, enableLocalEvaluation: true, and personalApiKey into PostHogClient() when POSTHOG_PERSONAL_API_KEY is set. Falls back gracefully to the previous remote-evaluation behaviour when the key is absent or Redis is unavailable.
  • Documents POSTHOG_PERSONAL_API_KEY in .env.local.example.

Verification

  1. Set POSTHOG_PERSONAL_API_KEY (a PostHog personal API key from https://us.posthog.com/settings/user-api-keys) and REDIS_URL in .env.local.
  2. Start the dev server and make a feature flag check (e.g. via a route that calls getFeatureFlagPayload).
  3. Observe posthog:flags:definitions and posthog:flags:lock keys appear in Redis (redis-cli KEYS posthog:*).
  4. Subsequent flag checks should be evaluated locally (no outbound /decide calls to PostHog).
  5. Without POSTHOG_PERSONAL_API_KEY, behaviour is unchanged — remote evaluation per request.

Visual Changes

N/A

Reviewer Notes

  • The FlagDefinitionCacheProvider API is marked @experimental in posthog-node and may change in minor versions — worth revisiting on the next major posthog-node upgrade.
  • The lock uses SET NX EX 90 (90 s TTL > 30 s default polling interval). If the process that holds the lock crashes before calling onFlagDefinitionsReceived, a new leader is elected after 90 s.
  • When Redis is unavailable, shouldFetchFlagDefinitions catches the error and returns true, so all processes fall back to independent API calls — same as before this PR.
  • The JSON.parse(cached) as FlagDefinitionCacheData cast in getFlagDefinitions is acceptable here: the data was serialised by our own onFlagDefinitionsReceived and the PostHog SDK validates it internally before use. No untrusted external input is involved.
  • Pre-push hook note: the repo has pre-existing typecheck failures in packages/db, packages/kilo-chat-hooks, etc. (present on main) that prevented the push hook from completing. All files touched by this PR typecheck cleanly.

Built for Evgeny Shurakov by Kilo for Slack

…ation

Enables PostHog local evaluation with a distributed Redis cache so that
multiple Next.js server processes share a single copy of flag definitions
instead of each making independent API calls on every polling interval.

- Upgrade posthog-node 5.10.4 → 5.34.2 to get FlagDefinitionCacheProvider
- Add posthog-flag-cache.ts: implements the cache with acquire-fetch-release
  distributed locking (SET NX) so only one process polls PostHog at a time
- Add redisSetNX to redis.ts for atomic lock acquisition
- Add POSTHOG_FLAG_DEFINITIONS_REDIS_KEY and POSTHOG_FLAG_CACHE_LOCK_REDIS_KEY
  to redis-keys.ts
- Wire flagDefinitionCacheProvider + enableLocalEvaluation + personalApiKey
  into PostHogClient() when POSTHOG_PERSONAL_API_KEY is set
- Document POSTHOG_PERSONAL_API_KEY in .env.local.example

When Redis is absent or POSTHOG_PERSONAL_API_KEY is unset the client falls
back to the previous behaviour (remote evaluation per request).
async onFlagDefinitionsReceived(data: FlagDefinitionCacheData): Promise<void> {
await redisSet(POSTHOG_FLAG_DEFINITIONS_REDIS_KEY, JSON.stringify(data), CACHE_TTL_SECONDS);
// Release the lock so the next poll cycle can re-elect a leader.
await redisDel(POSTHOG_FLAG_CACHE_LOCK_REDIS_KEY);
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WARNING: redisDel errors propagate uncaught here, which may surface to the PostHog SDK's polling mechanism.

If redisSet at line 52 succeeds but redisDel throws (e.g. Redis timeout), the lock remains held for the full 90s TTL even though the data was already written to Redis. All other instances will wait up to 90s before the next poll cycle can elect a new leader. This is a latency issue: flag definitions will be stale until the lock expires.

Consider wrapping redisDel in a try/catch here (similar to how it's wrapped in shutdown()) so errors are silently absorbed rather than surfacing to the SDK:

try {
  await redisDel(POSTHOG_FLAG_CACHE_LOCK_REDIS_KEY);
} catch {
  // Lock TTL will expire it; not critical since data was already written
}

}

// Single shared PostHog client for the process.
// Disabled outside production to avoid sending real events during tests/dev.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SUGGESTION: Stale comment — disabled is no longer passed to the PostHog constructor.

The original code passed disabled: !isProduction to the PostHog constructor. The new code relies on the early return at line 17-26 to prevent instantiation in non-production. The comment now implies there is still a disabled flag being set, which is misleading.

Suggested change
// Disabled outside production to avoid sending real events during tests/dev.
// Single shared PostHog client for the process — only created in production.

// Single shared PostHog client for the process.
// Disabled outside production to avoid sending real events during tests/dev.
instance = new PostHog(isProduction ? key : key || 'disabled', {
instance = new PostHog(key, {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WARNING: PostHog is instantiated with an empty string key (key = '') if NEXT_PUBLIC_POSTHOG_KEY is unset.

The original code used isProduction ? key : key || 'disabled' to fall back to 'disabled' and explicitly passed disabled: !isProduction. Now, the code path only reaches this line when isProduction is true, so passing an empty-string key to PostHog in production (when the env var is missing) will result in silent failures on every event capture. Previously, the disabled option would have made this a no-op.

Consider an early guard:

if (!key) {
  // misconfigured — return stub to avoid silent event loss
  return { capture: () => {}, isFeatureEnabled: async () => false, getFeatureFlag: async () => undefined, debug: () => {}, getFeatureFlagPayload: async () => undefined, alias: () => {} };
}

@kilo-code-bot
Copy link
Copy Markdown
Contributor Author

kilo-code-bot Bot commented May 21, 2026

Code Review Summary

Status: 3 Issues Found | Recommendation: Address before merge

Executive Summary

The most significant issue is an unguarded error path in onFlagDefinitionsReceived where a redisDel failure after a successful redisSet leaves the distributed lock held for up to 90s, and a missing guard for empty PostHog keys in production.

Overview

Severity Count
CRITICAL 0
WARNING 2
SUGGESTION 1
Issue Details (click to expand)

WARNING

File Line Issue
apps/web/src/lib/posthog-flag-cache.ts 54 redisDel errors propagate uncaught — if it throws after redisSet succeeds, the lock stays held for the full 90s TTL before any process can re-poll
apps/web/src/lib/posthog.ts 41 PostHog is constructed with key = '' if NEXT_PUBLIC_POSTHOG_KEY is missing in production — previously disabled: !isProduction would have made this a safe no-op; now events silently fail

SUGGESTION

File Line Issue
apps/web/src/lib/posthog.ts 40 Stale comment — disabled is no longer passed to the constructor; the comment misleads readers into thinking a disabled flag is still set
Files Reviewed (6 files)
  • apps/web/src/lib/posthog-flag-cache.ts — 2 issues
  • apps/web/src/lib/posthog.ts — 1 issue
  • apps/web/src/lib/redis.ts — no issues
  • apps/web/src/lib/redis-keys.ts — no issues
  • apps/web/package.json — no issues
  • .env.local.example — no issues

Fix these issues in Kilo Cloud


Reviewed by claude-sonnet-4.6 · 1,049,551 tokens

Review guidance: REVIEW.md from base branch main

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

0 participants