🤝 Contributing to @xcrap/html-parser

Thank you for your interest in contributing! This document provides guidelines and instructions to help you get started.



Code of Conduct

Everyone participating in this project is expected to follow basic standards of respectful and inclusive communication. Harassment, discrimination, and disrespectful behavior of any kind will not be tolerated.


How Can I Contribute?

There are several ways to contribute to this project:

  • 🐛 Report bugs — Open an issue with a clear reproduction case.
  • 💡 Suggest features — Open an issue describing the use case and expected behavior.
  • 🔧 Fix bugs — Check the open issues and submit a pull request.
  • Implement features — Discuss first in an issue before implementing large changes.
  • 📖 Improve documentation — Fix typos, add examples, or clarify existing content.
  • 🧪 Write tests — Improve test coverage for edge cases.

Development Setup

Prerequisites

Ensure you have the following installed:

| Tool    | Minimum version | Install             |
|---------|-----------------|---------------------|
| Rust    | stable          | https://rustup.rs/  |
| Node.js | >= 18.0.0       | https://nodejs.org/ |
| Yarn    | >= 4.0.0        | npm install -g yarn |

Installation Steps

# 1. Fork the repository on GitHub, then clone your fork:
git clone https://github.com/<your-username>/html-parser.git
cd html-parser

# 2. Add the upstream remote:
git remote add upstream https://github.com/Xcrap-Cloud/html-parser.git

# 3. Install Node.js dependencies:
yarn install

# 4. Build the native addon:
yarn build

Verifying Your Setup

# Run the test suite to confirm everything is working:
yarn test

All tests should pass before you begin making changes.


Project Structure

@xcrap/html-parser/
├── src/                    # Rust source code
│   ├── lib.rs              # Crate entry point (NAPI export, parse() function)
│   ├── parser.rs           # HTMLParser struct (lazy-loaded engines)
│   ├── types.rs            # HTMLElement struct (properties + nested queries)
│   ├── engines.rs          # Core selection logic (CSS and XPath)
│   └── query_builders.rs   # css() and xpath() builder functions
├── __test__/
│   └── index.spec.ts       # AVA test suite
├── benchmark/
│   └── bench.ts            # Performance benchmarks
├── index.d.ts              # Auto-generated TypeScript declarations (do not edit)
├── index.js                # Auto-generated JS entry point (do not edit)
├── Cargo.toml              # Rust package manifest
├── package.json            # Node.js package manifest
├── build.rs                # NAPI build script
└── README.md               # Project documentation

⚠️ Files marked "do not edit" (index.d.ts, index.js) are auto-generated by napi build and will be overwritten on the next build.


Making Changes

Branching Strategy

Create a dedicated branch for your contribution:

# For new features:
git checkout -b feat/your-feature-name

# For bug fixes:
git checkout -b fix/issue-description

# For documentation-only changes:
git checkout -b docs/what-you-changed

Always branch off main and keep your branch up to date with upstream:

git fetch upstream
git rebase upstream/main
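
The naming convention above is easy to apply by hand, but it can also be sketched in code. The helper below is hypothetical (it is not part of this repository) and assumes only the three prefixes shown above are in use. It turns a free-form summary into a branch name of the form <kind>/<slug>.

```typescript
// Branch prefixes named in this guide.
type BranchKind = "feat" | "fix" | "docs"

// Hypothetical helper: builds `<kind>/<slug>` from a free-form summary.
function branchName(kind: BranchKind, summary: string): string {
  const slug = summary
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // turn every run of other characters into one hyphen
    .replace(/^-+|-+$/g, "") // trim leading and trailing hyphens
  if (slug === "") {
    throw new Error("summary must contain at least one letter or digit")
  }
  return `${kind}/${slug}`
}

console.log(branchName("fix", "Empty id attribute!")) // prints "fix/empty-id-attribute"
```

The result can be passed straight to git checkout -b.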

Commit Messages

We follow the Conventional Commits specification:

<type>(<scope>): <short description>

[optional body]

[optional footer]

Types:

| Type     | When to use                                        |
|----------|----------------------------------------------------|
| feat     | A new feature                                      |
| fix      | A bug fix                                          |
| docs     | Documentation-only changes                         |
| refactor | Code changes that don't fix a bug or add a feature |
| test     | Adding or correcting tests                         |
| chore    | Tooling, config, dependency updates                |
| perf     | A code change that improves performance            |

Examples:

git commit -m "feat: add support for data-* attribute queries"
git commit -m "fix: handle empty id attribute in HTMLElement.id getter"
git commit -m "docs: add XPath query examples to README"
git commit -m "test: add edge case for invalid CSS selector"
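
To make the header format concrete, here is a minimal sketch of a check for the first line of a commit message. parseCommitHeader and its return shape are assumptions for illustration, not tooling shipped with this project. The sketch accepts only the types listed above and does not cover other parts of the Conventional Commits specification, such as the `!` breaking-change marker.

```typescript
// Commit types listed in this guide.
const COMMIT_TYPES = ["feat", "fix", "docs", "refactor", "test", "chore", "perf"] as const
type CommitType = (typeof COMMIT_TYPES)[number]

interface ParsedHeader {
  type: CommitType
  scope: string | null
  description: string
}

// Hypothetical helper: parses `<type>(<scope>): <short description>`.
// Returns null when the header does not follow that shape or uses an unlisted type.
function parseCommitHeader(header: string): ParsedHeader | null {
  const match = /^([a-z]+)(?:\(([^()\s]+)\))?: (\S.*)$/.exec(header)
  if (match === null) return null

  // Accept the type only if it is one of the listed values.
  const type = COMMIT_TYPES.find((t) => t === match[1])
  if (type === undefined) return null

  const scope = match[2] ?? null // the scope group is absent when no "(scope)" was given
  const description = match[3] ?? ""
  return { type, scope, description }
}

console.log(parseCommitHeader("fix(parser): handle empty id attribute"))
console.log(parseCommitHeader("update stuff")) // null: no "<type>: " prefix
```

A check like this could run in a commit-msg hook, although this guide does not say whether the project enforces the format automatically.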

Coding Standards

Rust

  • Follow the Rust API Guidelines.
  • Run cargo fmt (or yarn format:rs) before committing.
  • Run cargo clippy and fix everything it reports: the crate sets #![deny(clippy::all)], so lints in Clippy's default groups are raised as errors rather than warnings.
  • Prefer Option over panics for recoverable failure modes (e.g., element not found).
  • Document public functions with /// doc comments.

/// Selects the first element matching `query` using the CSS engine.
///
/// Returns `None` if no element matches or the query is invalid.
pub fn select_first_by_css(document: &Html, query: String) -> Option<HTMLElement> {
    // ...
}

TypeScript

  • Follow the existing code style enforced by Prettier and OXLint.
  • Run yarn format:prettier and yarn lint before committing.
  • Avoid any — use explicit types or generics.
  • Do not manually edit index.d.ts or index.js — they are auto-generated.
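
As an illustration of the "avoid any" guideline, the sketch below uses a generic so that the element type reaches the caller. firstOrNull is a hypothetical helper written for this example, not part of this package's API.

```typescript
// Hypothetical helper for illustration; not part of this package's API.
// An `any`-based signature such as `(items: any[]) => any` would compile,
// but the compiler could no longer check how callers use the result.
function firstOrNull<T>(items: readonly T[]): T | null {
  for (const item of items) {
    return item // first element, typed as T
  }
  return null // empty input
}

const tag = firstOrNull(["div", "span"]) // inferred as string | null
console.log(tag === null ? "no tags" : tag.toUpperCase()) // prints "DIV"
```

Because the return type is string | null here, the compiler requires the null check before toUpperCase() is called. With any, that mistake would only surface at runtime.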

Writing Tests

Tests live in __test__/index.spec.ts and are run with AVA.

Guidelines:

  1. Each test should cover a single behavior.
  2. Use descriptive test titles: "HTMLElement.id returns null when element has no id".
  3. Group related tests with section comments (e.g., // ─── HTMLElement — properties ───).
  4. When fixing a bug, add a test that would have failed before the fix.
  5. Cover edge cases: empty strings, missing attributes, invalid selectors, null returns.

Example:

test("HTMLElement.getAttribute returns null for a missing attribute", (t) => {
  const parser = new HTMLParser("<div>Hello</div>")
  const el = parser.selectFirst({ query: css("div") }) as HTMLElement
  t.is(el.getAttribute("data-missing"), null)
})

Running tests:

yarn test

Pull Request Process

  1. Ensure all checks pass — tests, formatting, and linting.
  2. Update documentation — if your change affects the public API, update README.md.
  3. Keep the PR focused — one feature or fix per PR. Avoid bundling unrelated changes.
  4. Fill in the PR template — describe what changed and why, and link to related issues.
  5. Be responsive — address review comments promptly.

PRs are reviewed by maintainers. We aim to provide feedback within a few days.


Reporting Bugs

Before opening a bug report:

  • Check the existing issues to avoid duplicates.
  • Confirm the bug is reproducible on a supported Node.js version (>= 18).

When opening an issue, include:

  • A minimal reproduction (HTML snippet + code that triggers the bug).
  • The expected vs. actual behavior.
  • Your environment: OS, Node.js version, package version.

Requesting Features

Open an issue labeled enhancement with:

  • A clear description of the problem you're trying to solve.
  • The proposed API or behavior (with examples if possible).
  • Any alternatives you've considered.

Large features should be discussed in an issue before implementation to avoid wasted effort.


Thank you for contributing! 🎉