ScrapeGraphAI JS SDK

Official TypeScript SDK for the ScrapeGraphAI API.

Install

npm i scrapegraph-js
# or
bun add scrapegraph-js

Quick Start

API key

Log in to the ScrapeGraphAI dashboard to create an API key. The dashboard also shows your request history, usage, credits, and crawl/monitor activity.

Set it in your environment:

export SGAI_API_KEY=...

import { ScrapeGraphAI } from "scrapegraph-js";

// reads SGAI_API_KEY from env, or pass explicitly: ScrapeGraphAI({ apiKey: "..." })
const sgai = ScrapeGraphAI();

const result = await sgai.scrape({
  url: "https://example.com",
  formats: [{ type: "markdown" }],
});

if (result.status === "success") {
  console.log(result.data?.results.markdown?.data);
} else {
  console.error(result.error);
}

Every function returns an ApiResult<T> instead of throwing, so there are no exceptions to catch:

type ApiResult<T> = {
  status: "success" | "error";
  data: T | null;
  error?: string;
  elapsedMs: number;
};
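Because failures arrive as values, callers who prefer exceptions can wrap the result themselves. The following is a minimal sketch, not part of the SDK: `unwrap` is a hypothetical helper, and the `ApiResult` type is copied locally from the definition above so the snippet runs on its own.

```typescript
// Local copy of the result shape documented above.
type ApiResult<T> = {
  status: "success" | "error";
  data: T | null;
  error?: string;
  elapsedMs: number;
};

// Hypothetical helper (not provided by the SDK): returns the data on
// success and throws with the reported error message otherwise.
function unwrap<T>(result: ApiResult<T>): T {
  if (result.status !== "success" || result.data === null) {
    throw new Error(result.error ?? "request failed without an error message");
  }
  return result.data;
}

// Hand-written results standing in for real SDK responses.
const ok: ApiResult<{ title: string }> = {
  status: "success",
  data: { title: "Example Domain" },
  elapsedMs: 120,
};
const failed: ApiResult<{ title: string }> = {
  status: "error",
  data: null,
  error: "timeout",
  elapsedMs: 30000,
};

console.log(unwrap(ok).title);

try {
  unwrap(failed);
} catch (err) {
  console.error("request failed:", (err as Error).message);
}
```

Checking `result.status` inline, as in the Quick Start, works just as well; a helper like this only pays off when many calls share the same failure handling.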

API

scrape

Scrape a webpage into one or more formats in a single request (markdown, html, screenshot, json, and others; see the list below).

const res = await sgai.scrape({
  url: "https://example.com",
  formats: [
    { type: "markdown", mode: "reader" },
    { type: "screenshot", fullPage: true, width: 1440, height: 900 },
    { type: "json", prompt: "Extract product info" },
  ],
  contentType: "text/html",        // optional, auto-detected
  fetchConfig: {                   // optional
    mode: "js",                    // "auto" | "fast" | "js"
    stealth: true,
    timeout: 30000,
    wait: 2000,
    scrolls: 3,
    headers: { "Accept-Language": "en" },
    cookies: { session: "abc" },
    country: "us",
  },
});

Formats:

markdown — Clean markdown (modes: normal, reader, prune)
html — Raw HTML (modes: normal, reader, prune)
links — All links on the page
images — All image URLs
summary — AI-generated summary
json — Structured extraction with prompt/schema
branding — Brand colors, typography, logos
screenshot — Page screenshot (fullPage, width, height, quality)
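The format entries above can be modelled as a discriminated union on `type`, which lets the compiler reject options that do not belong to a format. This is a sketch assembled from the list and the example above, not the SDK's own type definitions: the names `ContentMode` and `FormatEntry` are hypothetical, and treating every option as optional is an assumption.

```typescript
// Modes listed for the markdown and html formats.
type ContentMode = "normal" | "reader" | "prune";

// Hypothetical union of the documented format entries.
type FormatEntry =
  | { type: "markdown" | "html"; mode?: ContentMode }
  | { type: "links" | "images" | "summary" | "branding" }
  | { type: "json"; prompt?: string; schema?: Record<string, unknown> }
  | {
      type: "screenshot";
      fullPage?: boolean;
      width?: number;
      height?: number;
      quality?: number;
    };

// The same three formats requested in the scrape example above.
const formats: FormatEntry[] = [
  { type: "markdown", mode: "reader" },
  { type: "screenshot", fullPage: true, width: 1440, height: 900 },
  { type: "json", prompt: "Extract product info" },
];

// List which formats are being requested.
console.log(formats.map((f) => f.type).join(", "));
```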

extract

Extract structured data from a URL, HTML, or markdown using AI.

const res = await sgai.extract({
  url: "https://example.com",
  prompt: "Extract product names and prices",
  schema: { /* JSON schema */ },   // optional
  mode: "reader",                  // optional
  fetchConfig: { /* ... */ },      // optional
});
// Or pass html/markdown directly instead of url
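The `schema` option is shown only as a placeholder above. A minimal sketch of what a JSON Schema for the "product names and prices" prompt might look like follows; the field names (`products`, `name`, `price`) are illustrative, and which JSON Schema keywords the API honours is an assumption to check against the API documentation.

```typescript
// Illustrative JSON Schema for a list of products.
const productSchema = {
  type: "object",
  properties: {
    products: {
      type: "array",
      items: {
        type: "object",
        properties: {
          name: { type: "string" },
          price: { type: "number" },
        },
        required: ["name", "price"],
      },
    },
  },
  required: ["products"],
} as const;

// Matching TypeScript type, kept in sync with the schema by hand.
type ProductList = { products: { name: string; price: number }[] };

// A value that satisfies both the schema and the type.
const sample: ProductList = {
  products: [{ name: "Widget", price: 9.99 }],
};

console.log(JSON.stringify(productSchema, null, 2));
console.log(sample.products[0].name);
```

The object would be passed as `schema: productSchema` in the `extract` call above.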

search

Search the web and optionally extract structured data.

const res = await sgai.search({
  query: "best programming languages 2024",
  numResults: 5,                   // 1-20, default 3
  format: "markdown",              // "markdown" | "html"
  prompt: "Extract key points",    // optional, for AI extraction
  schema: { /* ... */ },           // optional
  timeRange: "past_week",          // optional
  locationGeoCode: "us",           // optional
  fetchConfig: { /* ... */ },      // optional
});

crawl

Crawl a website and its linked pages.

// Start a crawl
const start = await sgai.crawl.start({
  url: "https://example.com",
  formats: [{ type: "markdown" }],
  maxPages: 50,
  maxDepth: 2,
  maxLinksPerPage: 10,
  includePatterns: ["/blog/*"],
  excludePatterns: ["/admin/*"],
  fetchConfig: { /* ... */ },
});

// The crawl ID is only present when the start call succeeded
const id = start.data?.id;
if (!id) throw new Error(start.error ?? "crawl failed to start");

// Check status
const status = await sgai.crawl.get(id);

// Fetch paginated pages with resolved scrape results
const pages = await sgai.crawl.pages(id, {
  cursor: 0,
  limit: 50,
});

// Control
await sgai.crawl.stop(id);
await sgai.crawl.resume(id);
await sgai.crawl.delete(id);
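A crawl runs asynchronously, so callers typically check its status repeatedly until it finishes. The sketch below shows one generic way to do that. `pollUntil` is a hypothetical helper rather than an SDK function, and the `"running"` and `"completed"` status strings in the simulated job are made up for the demo; the real status field and its values should be taken from the API documentation.

```typescript
// Generic polling helper; nothing here is specific to the SDK.
async function pollUntil<T>(
  fetchState: () => Promise<T>,
  isDone: (state: T) => boolean,
  { intervalMs = 2000, maxAttempts = 30 } = {},
): Promise<T> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const state = await fetchState();
    if (isDone(state)) return state;
    // Wait before the next check, but not after the final attempt.
    if (attempt < maxAttempts) {
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`not done after ${maxAttempts} attempts`);
}

// Simulated job so the sketch runs without network access.
// The status strings are invented for this demo.
let checks = 0;
const fakeGet = async () => ({
  status: ++checks < 3 ? "running" : "completed",
});

pollUntil(fakeGet, (s) => s.status === "completed", { intervalMs: 10 }).then(
  (s) => console.log(`finished after ${checks} checks:`, s.status),
);
```

With the SDK, `fetchState` would wrap `sgai.crawl.get(...)` for the crawl's ID and `isDone` would inspect whatever status field the response carries.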

monitor

Monitor a webpage for changes on a schedule.

// Create a monitor
const mon = await sgai.monitor.create({
  url: "https://example.com",
  name: "Price Monitor",
  interval: "0 * * * *",           // cron expression
  formats: [{ type: "markdown" }],
  webhookUrl: "https://...",       // optional
  fetchConfig: { /* ... */ },
});

// Manage monitors
await sgai.monitor.list();
await sgai.monitor.get(cronId);
await sgai.monitor.update(cronId, { interval: "0 */6 * * *" });
await sgai.monitor.pause(cronId);
await sgai.monitor.resume(cronId);
await sgai.monitor.delete(cronId);
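The `interval` values above are standard five-field cron expressions: minute, hour, day of month, month, day of week. As a reading aid, the sketch below splits an expression into named fields. `describeCron` is a hypothetical helper, not part of the SDK, and whether the service accepts other cron dialects (for example a sixth seconds field) is not stated here.

```typescript
// The five fields of a standard cron expression, in order.
interface CronFields {
  minute: string;
  hour: string;
  dayOfMonth: string;
  month: string;
  dayOfWeek: string;
}

// Hypothetical helper: names the fields without interpreting them.
function describeCron(expression: string): CronFields {
  const parts = expression.trim().split(/\s+/);
  if (parts.length !== 5) {
    throw new Error(`expected 5 fields, got ${parts.length}`);
  }
  const [minute, hour, dayOfMonth, month, dayOfWeek] = parts;
  return { minute, hour, dayOfMonth, month, dayOfWeek };
}

// "0 * * * *": minute 0 of every hour, i.e. once an hour on the hour.
console.log(describeCron("0 * * * *"));

// "0 */6 * * *": minute 0 of every sixth hour (00:00, 06:00, 12:00, 18:00).
console.log(describeCron("0 */6 * * *"));
```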

history

Fetch request history.

const list = await sgai.history.list({
  service: "scrape",               // optional filter
  page: 1,
  limit: 20,
});

const entry = await sgai.history.get("request-id");
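Since `history.list` takes `page` and `limit`, reading the full history means requesting pages in turn. A minimal sketch of that loop follows. `collectAll` is a hypothetical helper, and it assumes pages are numbered from 1 (as in the call above) and that a page shorter than `limit` marks the end; the response shape of `history.list` is not documented here, so the page-fetching callback is left to the caller.

```typescript
// Hypothetical helper: gathers items page by page until a short page.
async function collectAll<T>(
  fetchPage: (page: number, limit: number) => Promise<T[]>,
  limit = 20,
  maxPages = 10,
): Promise<T[]> {
  const all: T[] = [];
  for (let page = 1; page <= maxPages; page++) {
    const items = await fetchPage(page, limit);
    all.push(...items);
    // Assumption: fewer items than requested means this was the last page.
    if (items.length < limit) break;
  }
  return all;
}

// In-memory stand-in for the API so the sketch runs offline.
const fakeEntries = Array.from({ length: 45 }, (_, i) => `request-${i + 1}`);
const fakeFetch = async (page: number, limit: number) =>
  fakeEntries.slice((page - 1) * limit, page * limit);

collectAll(fakeFetch).then((entries) =>
  console.log(`collected ${entries.length} entries`),
);
```

With the SDK, the callback would call `sgai.history.list({ page, limit })` and return the array of entries from the response. The `maxPages` cap is a safety limit so a long history cannot trigger an unbounded number of requests.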

credits / healthy

const credits = await sgai.credits();
// { remaining: 1000, used: 500, plan: "pro", jobs: { crawl: {...}, monitor: {...} } }

const health = await sgai.healthy();
// { status: "ok", uptime: 12345 }

Examples

Service	Example	Description
scrape	`scrape_basic.ts`	Basic markdown scraping
scrape	`scrape_multi_format.ts`	Multiple formats (markdown, links, images, screenshot, summary)
scrape	`scrape_json_extraction.ts`	Structured JSON extraction with schema
scrape	`scrape_pdf.ts`	PDF document parsing with OCR metadata
scrape	`scrape_with_fetchconfig.ts`	JS rendering, stealth mode, scrolling
extract	`extract_basic.ts`	AI data extraction from URL
extract	`extract_with_schema.ts`	Extraction with JSON schema
search	`search_basic.ts`	Web search with results
search	`search_with_extraction.ts`	Search + AI extraction
crawl	`crawl_basic.ts`	Start and monitor a crawl
crawl	`crawl_with_formats.ts`	Crawl with screenshots and patterns
monitor	`monitor_basic.ts`	Create a page monitor
monitor	`monitor_with_webhook.ts`	Monitor with webhook notifications
utilities	`credits.ts`	Check account credits and limits
utilities	`health.ts`	API health check
utilities	`history.ts`	Request history

Environment Variables

Variable	Description	Default
`SGAI_API_KEY`	Your ScrapeGraphAI API key	—
`SGAI_API_URL`	Override API base URL	`https://v2-api.scrapegraphai.com/api`
`SGAI_DEBUG`	Enable debug logging (`"1"`)	off
`SGAI_TIMEOUT`	Request timeout in seconds	`120`
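To make the defaults in the table concrete, the sketch below resolves the four variables from an environment-like object. It illustrates the documented defaults only and is not the SDK's internal code: `resolveConfig`, `ResolvedConfig`, and the conversion of the timeout to milliseconds are assumptions of this example.

```typescript
type Env = Record<string, string | undefined>;

interface ResolvedConfig {
  apiKey: string;
  baseUrl: string;
  debug: boolean;
  timeoutMs: number;
}

// Hypothetical helper applying the defaults from the table above.
function resolveConfig(env: Env): ResolvedConfig {
  const apiKey = env.SGAI_API_KEY;
  if (!apiKey) throw new Error("SGAI_API_KEY is not set");

  // SGAI_TIMEOUT is given in seconds; default is 120.
  const timeoutSeconds = Number(env.SGAI_TIMEOUT ?? "120");
  if (!Number.isFinite(timeoutSeconds) || timeoutSeconds <= 0) {
    throw new Error("SGAI_TIMEOUT must be a positive number of seconds");
  }

  return {
    apiKey,
    baseUrl: env.SGAI_API_URL ?? "https://v2-api.scrapegraphai.com/api",
    debug: env.SGAI_DEBUG === "1", // only "1" enables debug logging
    timeoutMs: timeoutSeconds * 1000,
  };
}

// Example with a placeholder key and a 30-second timeout.
const config = resolveConfig({ SGAI_API_KEY: "sgai-placeholder", SGAI_TIMEOUT: "30" });
console.log(config.baseUrl, config.timeoutMs, config.debug);
```

In a Node or Bun process, `process.env` can be passed as the argument.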

Development

bun install
bun run test              # unit tests
bun run test:integration  # live API tests (requires SGAI_API_KEY)
bun run build             # tsup → dist/
bun run check             # tsc --noEmit + biome

License

MIT - ScrapeGraphAI
