feat(tools): add SkimReaderTool (x402-paid clean web reader) #6266
Open: JessieJanie wants to merge 2 commits into crewAIInc:main from JessieJanie:feat/skim-reader-tool
69 changes: 69 additions & 0 deletions
lib/crewai-tools/src/crewai_tools/tools/skim_reader_tool/README.md
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,69 @@ | ||
| # SkimReaderTool | ||
|
|
||
| ## Description | ||
|
|
||
| [Skim](https://skim402.com) is an x402-native clean reader API for AI agents. | ||
| Give it a URL and it returns clean, agent-ready Markdown plus structured | ||
| metadata (title, byline, published date, language, excerpt) — nav, ads, and | ||
| boilerplate stripped out. | ||
|
|
||
| `SkimReaderTool` lets a CrewAI agent read any web page (articles, docs, blog | ||
| posts, GitHub READMEs, research papers) as Markdown. Reads are paid per call | ||
| over the [x402 protocol](https://x402.org) — $0.002 in USDC on Base — using a | ||
| wallet you control. There are no API keys and no signup: the wallet is the | ||
| identity, and payment happens automatically on the HTTP 402 handshake. | ||
|
|
||
| ## Installation | ||
|
|
||
| Install the tool with the `x402` extra, which pulls the x402 client with EVM | ||
| support: | ||
|
|
||
| ```shell | ||
| pip install 'crewai[tools]' | ||
| pip install 'crewai-tools[x402]' | ||
| ``` | ||
|
|
||
| ## Requirements | ||
|
|
||
| - A Base wallet private key, funded with a small amount of USDC on Base, exposed | ||
| as the `SKIM_WALLET_PRIVATE_KEY` environment variable (or passed via | ||
| `private_key=`). Use a dedicated wallet, never your personal one. The key is | ||
| used only to sign payment authorizations locally and never leaves your machine. | ||
|
|
||
| ## Example | ||
|
|
||
| ```python | ||
| from crewai_tools import SkimReaderTool | ||
|
|
||
| # Reads SKIM_WALLET_PRIVATE_KEY from the environment. | ||
| tool = SkimReaderTool() | ||
|
|
||
| markdown = tool.run(url="https://en.wikipedia.org/wiki/HTTP_402") | ||
| print(markdown) | ||
| ``` | ||
|
|
||
| Or wire it into an agent: | ||
|
|
||
| ```python | ||
| from crewai import Agent | ||
| from crewai_tools import SkimReaderTool | ||
|
|
||
| researcher = Agent( | ||
| role="Researcher", | ||
| goal="Read and summarize web pages", | ||
| backstory="An analyst who reads primary sources before drawing conclusions.", | ||
| tools=[SkimReaderTool()], | ||
| ) | ||
| ``` | ||
|
|
||
| ## Arguments | ||
|
|
||
| - `private_key` (`SecretStr`, optional): Hex private key for the paying Base | ||
| wallet (with or without `0x`). Falls back to `SKIM_WALLET_PRIVATE_KEY`. | ||
| - `base_url` (`str`, optional): Skim API base URL. Defaults to | ||
| `https://skim402.com`. | ||
| - `max_price_usd` (`float`, optional): Hard per-call price cap in USD. The wallet | ||
| refuses to sign for anything above this. Defaults to `0.01` (Skim is `$0.002`). | ||
| - `include_metadata` (`bool`, optional): When `True` (default), prepend a YAML | ||
| frontmatter block of the page metadata to the returned Markdown. | ||
| - `timeout` (`float`, optional): Per-request timeout in seconds. Defaults to `60`. |
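
As a reading aid for the `private_key` and `max_price_usd` arguments above, here is a minimal standalone sketch of the two checks the tool applies before it pays. It mirrors the logic in `skim_reader_tool.py` from this PR but does not import it. The helper names `normalize_private_key` and `usd_to_atomic_usdc` are hypothetical (the PR inlines this logic in `_get_session`), and the key below is a dummy value, not a real wallet.

```python
HEX_DIGITS = set("0123456789abcdefABCDEF")
USDC_DECIMALS = 6  # USDC amounts are expressed in millionths of a dollar


def normalize_private_key(key: str) -> str:
    """Strip an optional 0x prefix and require exactly 64 hex characters."""
    normalized = key[2:] if key.startswith("0x") else key
    if len(normalized) != 64 or any(c not in HEX_DIGITS for c in normalized):
        raise ValueError("private key must be a 64-character hex string")
    return normalized


def usd_to_atomic_usdc(usd: float) -> int:
    """Convert a USD amount to atomic USDC units, as the price cap does."""
    return round(usd * 10**USDC_DECIMALS)


dummy_key = "0x" + "ab" * 32  # placeholder only; never hard-code a real key
cap = usd_to_atomic_usdc(0.01)     # default max_price_usd
price = usd_to_atomic_usdc(0.002)  # Skim's stated per-call price

print(len(normalize_private_key(dummy_key)))
print(cap, price, price <= cap)
```

With the defaults, the stated per-call price sits well under the cap, so a payment is only refused if the server asks for more than `max_price_usd`.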
Empty file.
221 changes: 221 additions & 0 deletions
lib/crewai-tools/src/crewai_tools/tools/skim_reader_tool/skim_reader_tool.py
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,221 @@ | ||
| """CrewAI tool for Skim — the x402-native clean reader API for AI agents. | ||
|
|
||
| Skim (https://skim402.com) turns any URL into clean, agent-ready Markdown plus | ||
| structured metadata. Reads are paid per call over the x402 protocol ($0.002 in | ||
| USDC on Base) using a wallet you control — no API keys, no signup. | ||
| """ | ||
|
|
||
| from __future__ import annotations | ||
|
|
||
| import importlib | ||
| import os | ||
| from typing import Any | ||
|
|
||
| from crewai.tools import BaseTool, EnvVar | ||
| from pydantic import BaseModel, ConfigDict, Field, PrivateAttr, SecretStr | ||
| import requests | ||
|
|
||
| from crewai_tools.security.safe_path import validate_url | ||
|
|
||
|
|
||
| DEFAULT_BASE_URL = "https://skim402.com" | ||
|
|
||
|
|
||
| def _yaml_scalar(value: Any) -> str: | ||
| """Render a metadata value as a safe single-line YAML scalar. | ||
|
|
||
| Collapses internal whitespace/newlines and double-quotes the value when it | ||
| contains characters that could otherwise produce invalid or ambiguous YAML. | ||
| """ | ||
| text = " ".join(str(value).split()) | ||
| needs_quoting = ( | ||
| text == "" | ||
| or text[0] in "!&*?|>%@`\"'#,[]{}:-" | ||
| or ": " in text | ||
| or text.endswith(":") | ||
| or " #" in text | ||
| ) | ||
| if needs_quoting: | ||
| escaped = text.replace("\\", "\\\\").replace('"', '\\"') | ||
| return f'"{escaped}"' | ||
| return text | ||
|
|
||
|
|
||
| _TOOL_DESCRIPTION = ( | ||
| "Fetch any URL and return clean, agent-ready Markdown via Skim (skim402.com). " | ||
| "Strips nav, ads, and boilerplate; preserves the article body plus structured " | ||
| "metadata (title, byline, published date, language, excerpt). Pays $0.002 per " | ||
| "call in USDC on Base over the x402 protocol — no API keys, no signup. Use this " | ||
| "whenever you need to read web content: articles, docs, blog posts, GitHub " | ||
| "READMEs, research papers, and similar pages." | ||
| ) | ||
|
|
||
|
|
||
| class SkimReaderToolSchema(BaseModel): | ||
| """Input schema for :class:`SkimReaderTool`.""" | ||
|
|
||
| url: str = Field( | ||
| description="The fully-qualified URL to fetch and clean (https://...)." | ||
| ) | ||
|
|
||
|
|
||
| class SkimReaderTool(BaseTool): | ||
| """Read any URL as clean Markdown via Skim, paying per call over x402. | ||
|
|
||
| The tool lazily builds a payment-aware HTTP session the first time it runs, | ||
| using your Base wallet's private key to sign USDC authorizations on demand. | ||
| The key is used only to sign locally and never leaves your machine. | ||
|
|
||
| Args: | ||
| private_key (SecretStr): Hex private key (with or without ``0x``) for the | ||
| Base wallet that pays for reads. Falls back to the | ||
| ``SKIM_WALLET_PRIVATE_KEY`` environment variable. Use a dedicated | ||
| wallet, never your personal one. | ||
| base_url (str): Skim API base URL. Defaults to ``https://skim402.com``. | ||
| max_price_usd (float): Hard per-call price cap in USD. The wallet refuses | ||
| to sign for anything above this. Defaults to ``0.01`` (Skim is | ||
| ``$0.002``). | ||
| include_metadata (bool): When ``True`` (default), prepend a YAML | ||
| frontmatter block of the page metadata to the returned Markdown. | ||
| timeout (float): Per-request timeout in seconds. Defaults to ``60``. | ||
| """ | ||
|
|
||
| model_config = ConfigDict( | ||
| arbitrary_types_allowed=True, validate_assignment=True, frozen=False | ||
| ) | ||
| name: str = "Skim web reader" | ||
| description: str = _TOOL_DESCRIPTION | ||
| args_schema: type[BaseModel] = SkimReaderToolSchema | ||
|
|
||
| private_key: SecretStr | None = Field(default=None, exclude=True, repr=False) | ||
| base_url: str = DEFAULT_BASE_URL | ||
| max_price_usd: float = 0.01 | ||
| include_metadata: bool = True | ||
| timeout: float = 60.0 | ||
|
|
||
| package_dependencies: list[str] = Field( | ||
| default_factory=lambda: ["x402", "eth-account", "requests"] | ||
| ) | ||
| env_vars: list[EnvVar] = Field( | ||
| default_factory=lambda: [ | ||
| EnvVar( | ||
| name="SKIM_WALLET_PRIVATE_KEY", | ||
| description=( | ||
| "Hex private key for the Base wallet that pays for Skim reads. " | ||
| "Used only to sign x402 payment authorizations locally." | ||
| ), | ||
| required=False, | ||
| ), | ||
| ] | ||
| ) | ||
|
|
||
| _session: Any = PrivateAttr(default=None) | ||
|
|
||
| def _get_session(self) -> Any: | ||
| """Build (and cache) a requests Session that auto-pays 402 responses.""" | ||
| if self._session is not None: | ||
| return self._session | ||
|
|
||
| try: | ||
| account_factory = importlib.import_module("eth_account").Account | ||
| x402_client_sync = importlib.import_module("x402").x402ClientSync | ||
| max_amount = importlib.import_module("x402.client").max_amount | ||
| wrap_with_payment = importlib.import_module( | ||
| "x402.http.clients.requests" | ||
| ).wrapRequestsWithPayment | ||
| register_exact_evm_client = importlib.import_module( | ||
| "x402.mechanisms.evm.exact.register" | ||
| ).register_exact_evm_client | ||
| eth_account_signer = importlib.import_module( | ||
| "x402.mechanisms.evm.signers" | ||
| ).EthAccountSigner | ||
| except ImportError as exc: | ||
| raise ImportError( | ||
| "SkimReaderTool needs the x402 client with EVM support. Install it " | ||
| "with: pip install 'x402[evm]' requests eth-account" | ||
| ) from exc | ||
|
|
||
| key = ( | ||
| self.private_key.get_secret_value() | ||
| if self.private_key is not None | ||
| else os.environ.get("SKIM_WALLET_PRIVATE_KEY") | ||
| ) | ||
| if not key: | ||
| raise ValueError( | ||
| "Skim requires payment via x402. Provide a Base wallet funded with " | ||
| "USDC by setting the SKIM_WALLET_PRIVATE_KEY environment variable, " | ||
| "or by passing private_key=... to SkimReaderTool(). The key never " | ||
| "leaves your machine — it only signs payment authorizations locally." | ||
| ) | ||
|
|
||
| normalized = key[2:] if key.startswith("0x") else key | ||
| if len(normalized) != 64 or any( | ||
| c not in "0123456789abcdefABCDEF" for c in normalized | ||
| ): | ||
| raise ValueError( | ||
| "The wallet private key (SKIM_WALLET_PRIVATE_KEY or private_key=...) must " | ||
| "be a 64-character hex string (with or without a 0x prefix)." | ||
| ) | ||
|
|
||
| account = account_factory.from_key("0x" + normalized) | ||
| cap_atomic = round(self.max_price_usd * 1_000_000) # USDC has 6 decimals | ||
| client = x402_client_sync() | ||
| register_exact_evm_client( | ||
| client, | ||
| eth_account_signer(account), | ||
| policies=[max_amount(cap_atomic)], | ||
| ) | ||
| self._session = wrap_with_payment(requests.Session(), client) | ||
| return self._session | ||
|
|
||
| def _run(self, url: str) -> str: | ||
| url = validate_url(url) | ||
| session = self._get_session() | ||
| endpoint = self.base_url.rstrip("/") + "/api/v1/read" | ||
|
|
||
| try: | ||
| res = session.post( | ||
| endpoint, | ||
| json={"url": url, "mode": "basic"}, | ||
| timeout=self.timeout, | ||
| ) | ||
| except Exception as exc: | ||
| raise RuntimeError( | ||
| f"Skim request failed: {exc}. Common causes: the wallet has no USDC " | ||
| f"on Base, or the price exceeded max_price_usd (${self.max_price_usd})." | ||
| ) from exc | ||
|
|
||
| if not getattr(res, "ok", res.status_code < 400): | ||
| body = (res.text or "").strip() | ||
| raise RuntimeError( | ||
| f"Skim returned {res.status_code} {getattr(res, 'reason', '')}: " | ||
| f"{body or '(no body)'}" | ||
| ) | ||
|
|
||
| try: | ||
| data = res.json() | ||
| except ValueError as exc: | ||
| raise RuntimeError( | ||
| "Skim returned a non-JSON response. This usually means the request " | ||
| f"did not reach the Skim API. Underlying error: {exc}" | ||
| ) from exc | ||
|
|
||
| if not isinstance(data, dict): | ||
| raise RuntimeError( | ||
| "Skim returned an unexpected response shape (expected a JSON object). " | ||
| "This usually means the request did not reach the Skim API." | ||
| ) | ||
|
|
||
| markdown: str = data.get("markdown") or data.get("text") or "" | ||
|
|
||
| metadata = data.get("metadata") | ||
| if self.include_metadata and isinstance(metadata, dict): | ||
| meta_lines = [ | ||
| f"{k}: {_yaml_scalar(v)}" | ||
| for k, v in metadata.items() | ||
| if v is not None and v != "" | ||
| ] | ||
| if meta_lines: | ||
| markdown = "---\n" + "\n".join(meta_lines) + "\n---\n\n" + markdown | ||
|
|
||
| return markdown |
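
To make the `include_metadata` behavior in the diff above concrete, the following standalone sketch reproduces the frontmatter assembly on a made-up response. It is a simplified mirror of `_yaml_scalar` and the tail of `_run`, not the PR's code itself. The function names `yaml_scalar` and `with_frontmatter` and the sample metadata are assumptions for illustration.

```python
from typing import Any


def yaml_scalar(value: Any) -> str:
    """Render a value on one line, double-quoting it when YAML could misread it."""
    text = " ".join(str(value).split())
    needs_quoting = (
        text == ""
        or text[0] in "!&*?|>%@`\"'#,[]{}:-"
        or ": " in text
        or text.endswith(":")
    )
    if needs_quoting:
        escaped = text.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'
    return text


def with_frontmatter(markdown: str, metadata: dict[str, Any]) -> str:
    """Prepend non-empty metadata fields as a YAML frontmatter block."""
    lines = [
        f"{key}: {yaml_scalar(value)}"
        for key, value in metadata.items()
        if value is not None and value != ""
    ]
    if not lines:
        return markdown
    return "---\n" + "\n".join(lines) + "\n---\n\n" + markdown


sample = with_frontmatter(
    "# Hello\n\nBody text.",
    {"title": "HTTP 402: Payment Required", "byline": None, "language": "en"},
)
print(sample)
```

The title is quoted because it contains `": "`, the empty `byline` is dropped, and a response with no usable metadata is returned unchanged.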
The new optional extra uses lower-bound-only dependency specifiers. These are direct dependencies resolved when users install `crewai-tools[x402]`; without upper bounds and an updated root `uv.lock`, a future compromised or malicious release could be pulled into builds automatically. The project supply-chain guardrail requires bounded (`~=` or `>=,<`) constraints and committed lockfile updates.

Remediation: Use bounded version constraints for each new direct dependency and regenerate/commit the repository `uv.lock`.

For more details, see the finding in Corridor.
Provide feedback: Reply with whether this is a valid vulnerability or false positive to help improve Corridor's accuracy.
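
To illustrate the distinction the finding draws, the sketch below classifies a version specifier string as bounded or lower-bound-only. It is a simplified string check, not a full PEP 440 parser. The function name `has_upper_bound` and the sample version numbers are hypothetical and are not the specifiers from this PR.

```python
def has_upper_bound(specifier: str) -> bool:
    """Return True if any comma-separated clause caps the version."""
    for clause in specifier.split(","):
        clause = clause.strip()
        # "<", "<=", "==", and "~=" all limit how far the version can drift upward.
        if clause.startswith(("<", "==", "~=")):
            return True
    return False


for spec in (">=1.0", ">=1.0,<2.0", "~=1.2"):
    label = "bounded" if has_upper_bound(spec) else "lower-bound-only"
    print(f"{spec}: {label}")
```

A check like this could run in CI against the extras table, though the committed lockfile is still what pins the exact versions that get installed.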