feat(linter): add unknown-key rule for unknown top-level keys#84

Open
ryo-manba wants to merge 3 commits into
google-labs-code:mainfrom
ryo-manba:feat/unknown-key-lint

@ryo-manba

Closes #83

Adds an unknown-key lint rule (warning) that flags top-level YAML keys not in the known schema, so a typo like colours: is reported instead of silently discarded.

@davideast
Collaborator

Hey @ryo-manba! Thank you so much for the issue and PR.

I agree this is useful for typos. However, we need to consider the primary case of custom token values in DESIGN.md. Nothing is required except name; no token section is strictly required.

The only conditional requirement is in colors: if that section is present, the primary color palette must be defined. All major sections (colors, typography, rounded, spacing, components) are optional, and the recommended token names (primary, secondary, headline-lg, sm, etc.) are guidance, not requirements.

So the idea here would be to warn only on likely typos rather than on every unknown key, because custom fields shouldn't trigger warn logs.

Here's what I would explore for a typo-based warn rule.

The ParsedDesignSystem interface in spec.ts is the canonical source of known keys, so you can use Levenshtein-based typo detection with the known keys derived from that interface.

Derive keys from the schema, don't hardcode them

This PR currently hardcodes KNOWN_TOP_LEVEL_KEYS in handler.ts. Instead, derive them from the ParsedDesignSystem interface's property names. Since TypeScript interfaces don't exist at runtime, you need a single source-of-truth constant that both the type and the rule reference:

// packages/cli/src/linter/parser/spec.ts

/** Canonical top-level YAML keys per the DESIGN.md schema. */
export const SCHEMA_KEYS = [
  'version', 'name', 'description',
  'colors', 'typography', 'rounded', 'spacing', 'components',
] as const;

export type SchemaKey = typeof SCHEMA_KEYS[number];

This gives us runtime keys that also work for type inference.
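As a minimal sketch of how the same constant can serve both runtime checks and type narrowing, the helpers below (isSchemaKey and partitionKeys, both hypothetical names that are not part of this PR) split a list of parsed top-level keys into known and unknown ones. The constant is repeated here only so the block stands alone.

```typescript
// Repeated from the snippet above so this block is self-contained.
const SCHEMA_KEYS = [
  'version', 'name', 'description',
  'colors', 'typography', 'rounded', 'spacing', 'components',
] as const;

type SchemaKey = typeof SCHEMA_KEYS[number];

// Hypothetical helper (not in the PR): narrows an arbitrary string to SchemaKey.
function isSchemaKey(key: string): key is SchemaKey {
  // Widen the readonly tuple to readonly string[] so includes() accepts any string.
  return (SCHEMA_KEYS as readonly string[]).includes(key);
}

// Hypothetical helper: splits top-level keys into schema keys and everything else.
function partitionKeys(keys: string[]): { known: SchemaKey[]; unknown: string[] } {
  const known: SchemaKey[] = [];
  const unknown: string[] = [];
  for (const key of keys) {
    if (isSchemaKey(key)) {
      known.push(key); // key is typed as SchemaKey inside this branch
    } else {
      unknown.push(key);
    }
  }
  return { known, unknown };
}
```

Only the unknown list would then need to go through typo detection; whether the parser should populate state.unknownKeys this way is an assumption, not something the PR specifies.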

Add a zero-dependency Levenshtein function

This is ~15 lines, so no npm dependency needed:

// packages/cli/src/linter/linter/rules/levenshtein.ts

/** Levenshtein edit distance between two strings. */
export function levenshtein(a: string, b: string): number {
  const m = a.length, n = b.length;
  const dp: number[][] = Array.from({ length: m + 1 }, (_, i) =>
    Array.from({ length: n + 1 }, (_, j) => (i === 0 ? j : j === 0 ? i : 0))
  );
  for (let i = 1; i <= m; i++) {
    for (let j = 1; j <= n; j++) {
      dp[i][j] = a[i - 1] === b[j - 1]
        ? dp[i - 1][j - 1]
        : 1 + Math.min(dp[i - 1][j], dp[i][j - 1], dp[i - 1][j - 1]);
    }
  }
  return dp[m][n];
}
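If the full (m + 1) × (n + 1) table ever feels heavy, a two-row variant computes the same distance while keeping only the previous row. This is an alternative sketch, not the function proposed above, and the name levenshteinTwoRow is hypothetical. For keys this short, either version should be fine.

```typescript
/** Two-row Levenshtein sketch: same result as the full-table version, O(n) memory. */
function levenshteinTwoRow(a: string, b: string): number {
  // prev[j] is the distance between the first i - 1 chars of a and the first j chars of b.
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    // Distance from the first i chars of a to the empty string is i deletions.
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // delete a[i - 1]
        curr[j - 1] + 1,    // insert b[j - 1]
        prev[j - 1] + cost, // substitute, or keep on a match
      );
    }
    prev = curr;
  }
  return prev[b.length];
}
```

A quick sanity check is the classic pair: levenshteinTwoRow('kitten', 'sitting') evaluates to 3.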

Rewrite the rule: only warn when a close match exists

// packages/cli/src/linter/linter/rules/unknown-key.ts

import { SCHEMA_KEYS } from '../../parser/spec.js';
import { levenshtein } from './levenshtein.js';
import type { DesignSystemState } from '../../model/spec.js';
import type { RuleDescriptor, RuleFinding } from './types.js';

/** Max edit distance to consider a typo (not a custom key). */
const MAX_TYPO_DISTANCE = 2;

export function unknownKey(state: DesignSystemState): RuleFinding[] {
  const knownSet = new Set<string>(SCHEMA_KEYS);
  return (state.unknownKeys ?? []).flatMap(key => {
    if (knownSet.has(key)) return [];
    
    // Find closest known key
    let bestMatch: string | undefined;
    let bestDist = Infinity;
    for (const known of SCHEMA_KEYS) {
      const dist = levenshtein(key.toLowerCase(), known.toLowerCase());
      if (dist < bestDist) {
        bestDist = dist;
        bestMatch = known;
      }
    }

    if (bestDist <= MAX_TYPO_DISTANCE && bestMatch) {
      return [{
        path: key,
        message: `Unknown key "${key}" — did you mean "${bestMatch}"?`,
      }];
    }
    
    // Not close to any known key → intentional extension, stay silent
    return [];
  });
}

Key heuristics

  • Threshold: A distance ≤ 2 catches colours → colors (1) and typografy → typography (2), but ignores icons, motion, and brand.
  • Case handling: Compare lowercased, so Colors and COLORS match colors at distance 0 and still get a suggestion.
  • Silence when nothing is close: Emit no finding at all, which respects the spec's extensibility philosophy.

Test cases to cover

"colours"     → warns, suggests "colors"       (distance 1)
"typografy"   → warns, suggests "typography"    (distance 2)  
"icons"       → silent (distance 4 from "colors", the closest key)
"motion"      → silent (distance 4 from "version", the closest key)
"colors"      → silent (exact match, not unknown)
"nam"         → warns, suggests "name"          (distance 1)
"rounding"    → silent (distance 3 from "rounded")
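The cases above could be expressed as a table-driven check along these lines. This is a sketch under assumptions: suggestKey, editDistance, KNOWN_KEYS, and failingCases are hypothetical stand-ins for the rule's matching logic, written inline so the block runs on its own and not tied to the repository's actual test setup.

```typescript
// Stand-in for SCHEMA_KEYS so the block is self-contained.
const KNOWN_KEYS = [
  'version', 'name', 'description',
  'colors', 'typography', 'rounded', 'spacing', 'components',
] as const;

const MAX_DISTANCE = 2;

// Compact Levenshtein distance (rolling-row form).
function editDistance(a: string, b: string): number {
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Returns the suggested schema key, or undefined when the rule should stay silent.
function suggestKey(key: string): string | undefined {
  if ((KNOWN_KEYS as readonly string[]).includes(key)) return undefined;
  let best: string | undefined;
  let bestDist = Infinity;
  for (const known of KNOWN_KEYS) {
    const dist = editDistance(key.toLowerCase(), known.toLowerCase());
    if (dist < bestDist) {
      bestDist = dist;
      best = known;
    }
  }
  return bestDist <= MAX_DISTANCE ? best : undefined;
}

// [input key, expected suggestion]; undefined means no warning.
const cases: Array<[string, string | undefined]> = [
  ['colours', 'colors'],
  ['typografy', 'typography'],
  ['icons', undefined],
  ['motion', undefined],
  ['colors', undefined],
  ['nam', 'name'],
  ['rounding', undefined],
];

// Collects the inputs whose suggestion differs from the expectation.
function failingCases(): string[] {
  return cases
    .filter(([input, expected]) => suggestKey(input) !== expected)
    .map(([input]) => input);
}
```

Keeping the expectations in one table makes it cheap to add borderline keys later if the threshold is ever tuned.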

Collaborator

@davideast davideast left a comment

Requesting changes based on the larger comment I left.

ryo-manba added a commit to ryo-manba/design.md that referenced this pull request May 26, 2026
DESIGN.md is intentionally extensible, so warning on every unknown
top-level key flags legitimate custom fields. Restrict the rule to
likely typos of known schema keys (edit distance ≤ 2, case-insensitive)
and stay silent for unrelated extension keys.

Per @davideast review feedback on google-labs-code#84:
- Add SCHEMA_KEYS / SchemaKey in parser/spec.ts as the single source
  of truth and reference it from the model handler.
- Add a zero-dependency Levenshtein helper.
- Suggest the closest known key in the warning message.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@ryo-manba ryo-manba force-pushed the feat/unknown-key-lint branch from 0f3e337 to 9ebf3a2 Compare May 26, 2026 14:57
Cover empty strings, single edit operations (insert/delete/substitute),
symmetry, the classic kitten/sitting case, and the exact distances
used by the unknown-key typo threshold.
@ryo-manba ryo-manba requested a review from davideast May 26, 2026 15:02
@ryo-manba
Author

@davideast Thanks for the review! I've fixed it.

Development

Successfully merging this pull request may close these issues.

Feature: warn on unknown top-level schema keys (catch typos like colours:)

3 participants