Merged
2 changes: 0 additions & 2 deletions .gitignore
Expand Up @@ -12,6 +12,4 @@ vectorlint.ini
# .agent/
/.idea
/npm
VECTORLINT.md
.vectorlint/runs/
docs
77 changes: 77 additions & 0 deletions docs/VECTORLINT.md
@@ -0,0 +1,77 @@
# VectorLint example styling rules

{/* This file defines global style instructions for all VectorLint evaluations.
VectorLint prepends its contents to the system prompt for every rule it runs.
Keep this file under ~800 tokens to avoid performance and cost issues.
Adapt the rules below to match your team's style guide.
*/}

## Voice and tone

- Write in active voice. Never use passive constructions.
- Use present tense throughout.
- Address the reader as "you." Do not use "the user," "the developer," or "one."
- Do not call any step or feature "simple," "easy," "straightforward," or "just."
- Do not use marketing language, superlatives, or exclamation marks.
- Do not use filler phrases: "please note," "at this time," "it is worth mentioning."
- Do not use weasel words when the correct answer is known: "might," "could," "generally," "typically."
- Do not attribute motivations or feelings to software. Systems execute; they do not think, want, or know.
- Do not use "execute" to refer to using software. Use "run," "deploy," or "instantiate" depending on the context.
- Contractions are acceptable and preferred when they sound natural.

## Structure

- Every page must open with a short description (1–2 sentences) stating what the page covers and who it is for.
- Do not duplicate the page title in the short description.
- Use sentence case for all headings. Capitalize only the first word and proper nouns.
- Do not end headings with punctuation.
- Do not skip heading levels.
- Do not stack two headings with no body content between them.
- Start procedure headings with an imperative verb, not a gerund.

## Lists

- Introduce every list with a complete sentence ending in a colon.
- Do not let a list complete a sentence that begins above it.
- All items in a list must follow parallel grammatical structure.
- Do not create a list with fewer than two items. Use prose instead.
- Do not nest lists more than two levels deep.

## Procedures

- Use numbered steps for sequential tasks.
- Write one action per step.
- Start each step with an imperative verb.
- List prerequisites before step one under a "Before you begin" or "Prerequisites" heading.
- Include a verification step when the outcome is not visually obvious.

## Code and technical content

- Put every command the reader must type in a code block, not inline prose.
- Include a language identifier on every fenced code block.
- Format command names, file names, and parameter names as inline code.
- Format UI element names in bold, not inline code.
- Use `<PLACEHOLDER_NAME>` format for values the reader must replace.
- Explain every placeholder immediately after the code block.

## Language

- Define every acronym on first use: full term, then acronym in parentheses.
- After first use, use the acronym consistently. Do not alternate.
- Do not use "e.g.," "i.e.," or "etc." Use "for example," "that is," and "and so on."
- Spell out numbers zero through nine. Use numerals for 10 and above.
- Always use numerals for measurements, versions, and technical values.

## Terminology

- Use one term per concept. Do not introduce synonyms to avoid repetition.
- Use the official name of a product exactly as the project spells and capitalizes it.

## Admonitions

- Use admonitions sparingly. Do not use more than one per section.
- Keep admonition content to one or two sentences.
- Do not begin admonition content with the admonition type label ("Note:", "Warning:").
- Use Warning only for risks of data loss, system failure, or security vulnerability.
- Use Tip only for optional techniques. Do not put required steps in a Tip.
140 changes: 140 additions & 0 deletions docs/best-practices.mdx
@@ -0,0 +1,140 @@
---
title: Best practices
description: Proven patterns for getting accurate, low-noise results from VectorLint across your content workflow.
---

These practices are drawn from the patterns that appear throughout the VectorLint documentation — consolidated here so you can apply them without reading every page first.

## Start with VECTORLINT.md, not rule pack files

Start with `VECTORLINT.md`. It requires no configuration, activates immediately, and brings your most important standards in plain language before you commit to structured prompts.

After you run VectorLint against your content library with `VECTORLINT.md` in place and review the first round of findings, tune the rules in `VECTORLINT.md` based on what you see. This gives you a sense of whether you have styling gaps that warrant a dedicated rule pack file.
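A first pass might look like the following sketch. The provider and glob mirror the CI examples later in this guide; the key is a placeholder to replace:

```shell
# First full-library run. No rule packs are configured yet, so only
# VECTORLINT.md drives the evaluation. Replace the placeholder key.
export LLM_PROVIDER=openai
export OPENAI_API_KEY="<YOUR_API_KEY>"

vectorlint content/**/*.md
```

Review the findings as a team before changing anything; the first run is a baseline, not a gate.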

## Keep VECTORLINT.md under 800 tokens

VectorLint emits a warning at 4,000 tokens, but a practical target is *much* lower. Under 800 tokens leaves headroom for rule-specific prompts to add context without the combined system prompt becoming unwieldy. Long context degrades LLM precision and increases API costs on every evaluation.

If your `VECTORLINT.md` is growing, that's usually a sign that some rules are specific enough to belong in a dedicated rule pack file instead.
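As a rough budget check, you can estimate tokens from the word count. The heuristic below (about 1.3 tokens per word of English prose) is an approximation, not VectorLint's real tokenizer, and the sample file is for illustration only:

```shell
# Rough token-budget check for VECTORLINT.md.
# Heuristic: English prose averages ~1.3 tokens per word.
file=$(mktemp)
printf '# Rules\n- Write in active voice.\n- Use present tense.\n' > "$file"

words=$(wc -w < "$file")
tokens=$(( words * 13 / 10 ))
echo "approx tokens: $tokens"

if [ "$tokens" -gt 800 ]; then
  echo "over budget: move specific rules into a rule pack file"
fi
rm -f "$file"
```

Point `file` at your real `VECTORLINT.md` to track the budget over time.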

## Write specific prompts, not general ones

The quality of VectorLint's findings is directly proportional to the specificity of your prompts. Vague prompts produce vague findings — or worse, inconsistent findings that vary by run.

Instead of:
```
Check if the writing is clear.
```

Write:
```
You are a clarity evaluator for developer documentation. Flag sentences that:
1. Exceed 25 words
2. Use passive voice where active voice is possible
3. Contain filler phrases: "it is important to note", "please be aware", "in order to"
```

The second prompt gives the LLM a defined audience, measurable criteria, and specific examples. It will produce consistent, actionable findings.

## Give rules domain context

LLMs evaluate content relative to an implied standard. Make that standard explicit with a context block in your rule prompt. Tell the model who the audience is, what they value, and what good looks like for your specific content type.

```markdown
## CONTEXT BANK

**Audience**: Software engineers and DevOps practitioners who value:
- Technical precision over marketing language
- Practical examples over theory
- Direct answers without lengthy preambles
```

A grammar rule without context produces generic grammar findings. The same rule with a developer audience context produces findings calibrated to technical writing conventions.

## Use weights that reflect real priorities

In judge rules, weights are the single most important configuration decision. They determine which criteria actually control the final score. Treat them as a statement of your content team's values — not arbitrary numbers.

```yaml
criteria:
- name: Technical Accuracy
weight: 40 # Factual errors erode user trust — this matters most
- name: Clarity
weight: 30 # Unclear docs generate support tickets
- name: Tone
weight: 20 # Important but recoverable in editing
- name: SEO
weight: 10 # Nice to have, never at the expense of the above
```

If everything has the same weight, nothing is prioritized.
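To make the effect concrete, here is a sketch of how a weighted average turns per-criterion scores into one final score. The weights are the ones from the YAML above; the per-criterion scores (on a 0-10 scale) are invented for illustration:

```shell
# Sketch: weighted judge score using the weights above.
# Per-criterion scores are made up for illustration.
awk 'BEGIN {
  # score * weight for accuracy, clarity, tone, SEO
  total = 8*40 + 9*30 + 7*20 + 6*10
  sum_w = 40 + 30 + 20 + 10
  printf "weighted score: %.1f\n", total / sum_w
}'
```

With these weights, a weak SEO score barely moves the result, while a weak accuracy score dominates it, which is exactly the priority the YAML declares.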

## Tier strictness by content type

Not all content deserves the same quality bar. Apply strictness in proportion to how much a failure costs — measured in user trust, support load, or brand impact.

```ini
# Customer-facing API docs — every error matters
[content/docs/**/*.md]
GrammarChecker.strictness=strict

# Blog posts — quality matters, tone is flexible
[content/blog/**/*.md]
GrammarChecker.strictness=standard

# Internal drafts — let writers write
[content/drafts/**/*.md]
RunRules=
```

Setting the same strictness everywhere produces either too much noise on low-stakes content or too little signal on high-stakes content.

## Start permissive, tighten over time

When rolling VectorLint out to a team for the first time, resist the urge to apply strict settings immediately. A workflow that generates too many findings on day one loses the team's trust before it earns it.

1. Start with `CONFIDENCE_THRESHOLD=0.75` and `standard` strictness
2. Run against your existing content library and review findings as a team
3. Identify which findings are consistently useful vs. consistently dismissed
4. Raise strictness on your highest-stakes content first
5. Raise `CONFIDENCE_THRESHOLD` once your rules are stable

The goal is a workflow where every finding is worth reading. That takes iteration.

## Set a higher confidence threshold in CI than locally

In CI, a false positive blocks a merge. Set `CONFIDENCE_THRESHOLD` higher in your CI environment than in local development so only the highest-confidence findings gate a merge. Lower-confidence candidates still surface locally where a writer can evaluate them in context.

```bash
# Local development — catch more, review in context
CONFIDENCE_THRESHOLD=0.75

# CI environment — only high-confidence findings block merges
CONFIDENCE_THRESHOLD=0.85
```

## Gate CI only on production-bound content

Limit your CI workflow's `paths` filter to directories that actually ship to users. Checking drafts or work-in-progress in CI creates unnecessary friction and noise.

```yaml
on:
pull_request:
paths:
- 'content/docs/**'
- 'content/api/**'
```

Drafts should have `RunRules=` in `.vectorlint.ini` — VectorLint skips them entirely and they never reach CI.

## Validate new rules before raising strictness

When you write a new rule, run it at `lenient` strictness and low `CONFIDENCE_THRESHOLD` first. Review everything it flags. Once you're confident the rule's coverage is correct and its false positive rate is acceptable, raise both settings to production levels.

Skipping this step leads to rules that look correct on paper but produce noise in practice — which erodes team confidence in the entire workflow.
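In `.vectorlint.ini` terms, a trial phase might look like the sketch below. The rule pack name is hypothetical; the key format follows the strictness examples elsewhere in this guide:

```ini
# Trial settings for a new rule pack before promoting it.
[content/docs/**/*.md]
RunRules=NewRulePack
NewRulePack.strictness=lenient
```

Pair this with a lowered `CONFIDENCE_THRESHOLD` in your environment while you review what the rule flags, then raise both settings once the false-positive rate looks acceptable.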

## Next steps

- [Tuning evaluation precision](/false-positive-tuning) — detailed guidance on CONFIDENCE_THRESHOLD and strictness
- [CI Integration](/ci-integration) — set up content quality gates in your pipeline
- [Customize style rules](/customize-style-rules) — write effective prompts for rule pack files
160 changes: 160 additions & 0 deletions docs/ci-integration.mdx
@@ -0,0 +1,160 @@
---
title: CI integration
description: Gate merges on content quality by running VectorLint in your CI pipeline.
---

VectorLint exits with a non-zero status code when it finds violations, making it a natural fit for CI pipelines. Add it as a pre-merge check, and content that fails your quality thresholds never reaches production.

## How it works in CI

When VectorLint finds violations it exits with code `1`. When content passes all checks it exits with code `0`. Most CI systems treat a non-zero exit as a failed step and block the merge automatically — no additional configuration needed.

| Exit code | Meaning |
|-----------|---------|
| `0` | All files passed — no violations found |
| `1` | One or more violations found |
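This contract can be sketched in plain shell. A stub function stands in for the real `vectorlint` binary so the branching is visible and the example runs anywhere:

```shell
# Sketch of exit-code handling in a CI step. The stub simulates a run
# that finds violations (exit code 1); the real tool behaves the same way.
vectorlint() { return 1; }   # stub, not the real CLI

if vectorlint content/docs/*.md; then
  echo "content passed"
else
  echo "violations found: blocking merge"
fi
```

Because CI systems treat a non-zero exit as a failed step, the `else` branch above is what blocks the merge; no extra wiring is needed.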

## GitHub Actions

### Basic setup

Add a workflow file at `.github/workflows/vectorlint.yml`:

```yaml
name: Content quality check

on:
pull_request:
paths:
- 'content/**/*.md'

jobs:
lint:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4

- name: Set up Node.js
uses: actions/setup-node@v4
with:
node-version: lts/*

- name: Install VectorLint
run: npm install -g vectorlint

- name: Run content check
run: vectorlint content/**/*.md
env:
LLM_PROVIDER: openai
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
CONFIDENCE_THRESHOLD: 0.85
```

The `paths` filter limits the workflow to runs where content files actually changed — it won't fire on code-only PRs.

### Storing API keys

Never commit API keys to your repository. Store them as GitHub Actions secrets:

1. Go to your repository **Settings → Secrets and variables → Actions**
2. Add a new secret for each key (for example, `OPENAI_API_KEY` and `ANTHROPIC_API_KEY`)
3. Reference them in your workflow with `${{ secrets.SECRET_NAME }}`

### Checking only changed files

For large content libraries, checking every file on every PR is slow and expensive. Use `git diff` to check only the files changed in the PR. Note that `actions/checkout` makes a shallow clone by default, so set `fetch-depth: 0` in the checkout step to make `origin/${{ github.base_ref }}` available for the diff:

```yaml
- name: Get changed files
id: changed
run: |
echo "files=$(git diff --name-only origin/${{ github.base_ref }}...HEAD | grep '\.md$' | tr '\n' ' ')" >> $GITHUB_OUTPUT

- name: Run content check
if: steps.changed.outputs.files != ''
run: vectorlint ${{ steps.changed.outputs.files }}
env:
LLM_PROVIDER: openai
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
```

## Recommended CI configuration

CI environments should be stricter than local development. A finding a writer might dismiss in review becomes a merge blocker in CI — so surface only high-confidence violations.

**Raise `CONFIDENCE_THRESHOLD` in CI.** Set it higher than your local default so only the most certain findings block a merge:

```yaml
env:
CONFIDENCE_THRESHOLD: 0.85
```

**Gate checks only on production-bound content.** Limit the workflow `paths` filter to directories that ship to users, not drafts or internal docs:

```yaml
on:
pull_request:
paths:
- 'content/docs/**'
- 'content/api/**'
```

**Use strict strictness on gated content.** In your `.vectorlint.ini`, apply strict scoring to the files that CI checks:

```ini
[content/docs/**/*.md]
RunRules=TechDocs
GrammarChecker.strictness=strict

[content/drafts/**/*.md]
RunRules=
```

Drafts never reach CI — the empty `RunRules=` means VectorLint skips them entirely.

## Other CI systems

VectorLint works with any CI system that supports running shell commands. The pattern is the same: install VectorLint, set environment variables from secrets, run the check.

**GitLab CI:**

```yaml
content-quality:
image: node:lts
script:
- npm install -g vectorlint
- vectorlint content/docs/**/*.md
variables:
LLM_PROVIDER: openai
OPENAI_API_KEY: $OPENAI_API_KEY
rules:
- changes:
- content/**/*.md
```

**CircleCI:**

```yaml
jobs:
content-check:
docker:
- image: cimg/node:lts
steps:
- checkout
- run:
name: Install VectorLint
command: npm install -g vectorlint
- run:
name: Run content check
command: vectorlint content/**/*.md
environment:
LLM_PROVIDER: openai
```

Store API keys in your CI system's secret or environment variable manager — never in the workflow file itself.

## Next steps

- [Tuning evaluation precision](/false-positive-tuning) — set `CONFIDENCE_THRESHOLD` and strictness for CI
- [Project Configuration](/project-config) — configure which files CI checks
- [CLI reference](/cli-reference) — full command and exit code reference