Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -123,6 +123,7 @@ git clone git@github.com:jackwener/opencli.git && cd opencli && npm install && n
| **tieba** | `hot` `posts` `search` `read` |
| **twitter** | `trending` `search` `timeline` `bookmarks` `post` `download` `profile` `article` `like` `likes` `notifications` `reply` `reply-dm` `thread` `follow` `unfollow` `followers` `following` `block` `unblock` `bookmark` `unbookmark` `delete` `hide-reply` `accept` |
| **reddit** | `hot` `frontpage` `popular` `search` `subreddit` `user` `user-posts` `user-comments` `read` `save` `saved` `subscribe` `upvote` `upvoted` `comment` |
| **amazon** | `bestsellers` `search` `product` `offer` `discussion` |
| **notebooklm** | `status` `list` `open` `select` `current` `get` `metadata` `source-list` `source-get` `source-fulltext` `source-guide` `history` `note-list` `notes-list` `notes-get` `summary` |
| **spotify** | `auth` `status` `play` `pause` `next` `prev` `volume` `search` `queue` `shuffle` `repeat` |

Expand Down
1 change: 1 addition & 0 deletions README.zh-CN.md
Original file line number Diff line number Diff line change
Expand Up @@ -176,6 +176,7 @@ npm install -g @jackwener/opencli@latest
| **douban** | `search` `top250` `subject` `photos` `download` `marks` `reviews` `movie-hot` `book-hot` | 浏览器 |
| **facebook** | `feed` `profile` `search` `friends` `groups` `events` `notifications` `memories` `add-friend` `join-group` | 浏览器 |
| **google** | `news` `search` `suggest` `trends` | 公开 |
| **amazon** | `bestsellers` `search` `product` `offer` `discussion` | 浏览器 |
| **spotify** | `auth` `status` `play` `pause` `next` `prev` `volume` `search` `queue` `shuffle` `repeat` | OAuth API |
| **notebooklm** | `status` `list` `open` `select` `current` `get` `metadata` `source-list` `source-get` `source-fulltext` `source-guide` `history` `note-list` `notes-list` `notes-get` `summary` | 浏览器 |
| **36kr** | `news` `hot` `search` `article` | 公开 / 浏览器 |
Expand Down
1 change: 1 addition & 0 deletions docs/.vitepress/config.mts
Original file line number Diff line number Diff line change
Expand Up @@ -72,6 +72,7 @@ export default defineConfig({
{ text: 'Band', link: '/adapters/browser/band' },
{ text: 'Chaoxing', link: '/adapters/browser/chaoxing' },
{ text: 'Grok', link: '/adapters/browser/grok' },
{ text: 'Amazon', link: '/adapters/browser/amazon' },
{ text: 'NotebookLM', link: '/adapters/browser/notebooklm' },
{ text: 'WeRead', link: '/adapters/browser/weread' },
{ text: 'Douban', link: '/adapters/browser/douban' },
Expand Down
53 changes: 53 additions & 0 deletions docs/adapters/browser/amazon.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,53 @@
# Amazon

**Mode**: 🔐 Browser · **Domain**: `amazon.com`

## Commands

| Command | Description |
|---------|-------------|
| `opencli amazon bestsellers [<best-sellers-url>]` | Read Amazon Best Sellers pages for ranked candidate discovery |
| `opencli amazon search "<query>"` | Read Amazon search results for coarse filtering |
| `opencli amazon product <asin-or-url>` | Read a product page with title, price, rating, breadcrumbs, and bullets |
| `opencli amazon offer <asin-or-url>` | Read seller / fulfillment / buy-box facts from the product page |
| `opencli amazon discussion <asin-or-url>` | Read review summary and sample customer reviews |

## Usage Examples

```bash
# Root Best Sellers page
opencli amazon bestsellers https://www.amazon.com/Best-Sellers/zgbs --limit 10 -f json

# Category-specific Best Sellers page
opencli amazon bestsellers "<category-best-sellers-url>" --limit 50 -f json

# Search products
opencli amazon search "desk shelf organizer" --limit 20 -f json

# Validate one product
opencli amazon product B0FJS72893 -f json

# Validate seller / offer facts
opencli amazon offer B0FJS72893 -f json

# Read review summary + samples
opencli amazon discussion B0FJS72893 --limit 5 -f json
```

## Prerequisites

- Chrome running with an active `amazon.com` session in the shared profile
- [Browser Bridge extension](/guide/browser-bridge) installed

## Notes

- This adapter only returns fields visible on public Amazon pages.
- `bestsellers` and `search` are for candidate discovery; `product`, `offer`, and `discussion` are the validation surfaces.
- `offer` is the right surface for `sold_by`, `ships_from`, and Amazon-retail exclusion.
- `discussion` may return review data even when Q&A is absent. Missing Q&A is a normal outcome, not an error.

## Troubleshooting

- If Amazon shows a robot-check page, clear it in Chrome and retry.
- If CDP is attached to the wrong tab, retry with `OPENCLI_CDP_TARGET=amazon.com`.
- Avoid running multiple Amazon browser commands in parallel against the same shared Chrome target.
1 change: 1 addition & 0 deletions docs/adapters/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -43,6 +43,7 @@ Run `opencli list` for the live registry.
| **[tiktok](./browser/tiktok)** | `explore` `search` `profile` `user` `following` `follow` `unfollow` `like` `unlike` `comment` `save` `unsave` `live` `notifications` `friends` | 🔐 Browser |
| **[google](./browser/google)** | `news` `search` `suggest` `trends` | 🌐 / 🔐 |
| **[jd](./browser/jd)** | `item` | 🔐 Browser |
| **[amazon](./browser/amazon)** | `bestsellers` `search` `product` `offer` `discussion` | 🔐 Browser |
| **[web](./browser/web)** | `read` | 🔐 Browser |
| **[weixin](./browser/weixin)** | `download` | 🔐 Browser |
| **[36kr](./browser/36kr)** | `news` `hot` `search` `article` | 🌐 / 🔐 |
Expand Down
22 changes: 22 additions & 0 deletions src/clis/amazon/bestsellers.test.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
import { describe, expect, it } from 'vitest';
import { __test__ } from './bestsellers.js';

describe('amazon bestsellers normalization', () => {
it('normalizes bestseller cards and infers review counts from card text', () => {
const result = __test__.normalizeBestsellerCandidate({
asin: 'B0DR31GC3D',
title: '',
href: 'https://www.amazon.com/NUTIKAS-Shelves-Desktop-Orgnizer-Shlef/dp/B0DR31GC3D/ref=zg_bs',
price_text: '$25.92',
rating_text: '4.3 out of 5 stars',
review_count_text: '',
card_text: 'Desk Shelves Desktop Organizer Shlef\n4.3 out of 5 stars\n435\n$25.92',
}, 2, 'Amazon Best Sellers: Best Desktop & Off-Surface Shelves', 'https://www.amazon.com/example');

expect(result.rank).toBe(2);
expect(result.asin).toBe('B0DR31GC3D');
expect(result.title).toBe('Desk Shelves Desktop Organizer Shlef');
expect(result.review_count).toBe(435);
expect(result.list_title).toBe('Amazon Best Sellers: Best Desktop & Off-Surface Shelves');
});
});
180 changes: 180 additions & 0 deletions src/clis/amazon/bestsellers.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,180 @@
import { CommandExecutionError } from '../../errors.js';
import { cli, Strategy } from '../../registry.js';
import type { IPage } from '../../types.js';
import {
buildProvenance,
cleanText,
extractAsin,
extractReviewCountFromCardText,
firstMeaningfulLine,
normalizeProductUrl,
parsePriceText,
parseRatingValue,
parseReviewCount,
resolveBestsellersUrl,
uniqueNonEmpty,
assertUsableState,
gotoAndReadState,
} from './shared.js';

interface BestsellersPagePayload {
href?: string;
title?: string;
list_title?: string;
cards?: Array<{
rank_text?: string | null;
asin?: string | null;
title?: string | null;
href?: string | null;
price_text?: string | null;
rating_text?: string | null;
review_count_text?: string | null;
card_text?: string | null;
}>;
page_links?: string[];
}

function normalizeBestsellerCandidate(
candidate: NonNullable<BestsellersPagePayload['cards']>[number],
rank: number,
listTitle: string | null,
sourceUrl: string,
): Record<string, unknown> {
const productUrl = normalizeProductUrl(candidate.href);
const asin = extractAsin(candidate.asin ?? '') ?? extractAsin(productUrl ?? '') ?? null;
const title = cleanText(candidate.title) || firstMeaningfulLine(candidate.card_text);
const price = parsePriceText(cleanText(candidate.price_text) || candidate.card_text);
const ratingText = cleanText(candidate.rating_text) || null;
const reviewCountText = cleanText(candidate.review_count_text)
|| extractReviewCountFromCardText(candidate.card_text)
|| null;
const provenance = buildProvenance(sourceUrl);

return {
rank,
asin,
title: title || null,
product_url: productUrl,
list_title: listTitle,
...provenance,
price_text: price.price_text,
price_value: price.price_value,
currency: price.currency,
rating_text: ratingText,
rating_value: parseRatingValue(ratingText),
review_count_text: reviewCountText,
review_count: parseReviewCount(reviewCountText),
};
}

async function readBestsellersPage(page: IPage, url: string): Promise<BestsellersPagePayload> {
const state = await gotoAndReadState(page, url, 2500, 'bestsellers');
assertUsableState(state, 'bestsellers');

return await page.evaluate(`
(() => ({
href: window.location.href,
title: document.title || '',
list_title:
document.querySelector('#zg_banner_text')?.textContent
|| document.querySelector('h1')?.textContent
|| '',
cards: Array.from(document.querySelectorAll('.p13n-sc-uncoverable-faceout'))
.map((card) => ({
rank_text:
card.querySelector('.zg-bdg-text')?.textContent
|| card.querySelector('[class*="rank"]')?.textContent
|| '',
asin: card.id || '',
title:
card.querySelector('[class*="line-clamp"]')?.textContent
|| card.querySelector('img')?.getAttribute('alt')
|| '',
href: card.querySelector('a[href*="/dp/"]')?.href || '',
price_text: card.querySelector('.a-price .a-offscreen')?.textContent || '',
rating_text: card.querySelector('[aria-label*="out of 5 stars"]')?.getAttribute('aria-label') || '',
review_count_text:
card.querySelector('a[href*="#customerReviews"]')?.textContent
|| card.querySelector('.a-size-small')?.textContent
|| '',
card_text: card.innerText || '',
})),
page_links: Array.from(document.querySelectorAll('li.a-normal a, li.a-selected a'))
.map((anchor) => anchor.href || '')
.filter((href) => /\\/zgbs\\//.test(href) && /(?:[?&]pg=|ref=zg_bs_pg_)/.test(href)),
}))()
`) as BestsellersPagePayload;
}

cli({
site: 'amazon',
name: 'bestsellers',
description: 'Amazon Best Sellers pages for category candidate discovery',
domain: 'amazon.com',
strategy: Strategy.COOKIE,
navigateBefore: false,
args: [
{
name: 'input',
positional: true,
help: 'Best sellers URL or /zgbs path. Omit to use the root Best Sellers page.',
},
{
name: 'limit',
type: 'int',
default: 100,
help: 'Maximum number of ranked items to return (default 100)',
},
],
columns: ['rank', 'asin', 'title', 'price_text', 'rating_value', 'review_count'],
func: async (page, kwargs) => {
const limit = Math.max(1, Number(kwargs.limit) || 100);
const initialUrl = resolveBestsellersUrl(typeof kwargs.input === 'string' ? kwargs.input : undefined);

const queue = [initialUrl];
const visited = new Set<string>();
const seenAsins = new Set<string>();
const results: Record<string, unknown>[] = [];
let listTitle: string | null = null;

while (queue.length > 0 && results.length < limit) {
const nextUrl = queue.shift()!;
if (visited.has(nextUrl)) continue;
visited.add(nextUrl);

const payload = await readBestsellersPage(page, nextUrl);
const sourceUrl = cleanText(payload.href) || nextUrl;
listTitle = cleanText(payload.list_title) || cleanText(payload.title) || listTitle;
const cards = payload.cards ?? [];

for (const card of cards) {
const normalized = normalizeBestsellerCandidate(card, results.length + 1, listTitle, sourceUrl);
const asin = cleanText(String(normalized.asin ?? ''));
if (!asin || seenAsins.has(asin)) continue;
seenAsins.add(asin);
results.push(normalized);
if (results.length >= limit) break;
}

const pageLinks = uniqueNonEmpty(payload.page_links ?? []);
for (const href of pageLinks) {
if (!visited.has(href) && !queue.includes(href)) {
queue.push(href);
}
}
}

if (results.length === 0) {
throw new CommandExecutionError(
'amazon bestsellers did not expose any ranked items',
'Open the same best sellers page in Chrome, verify it is a real Amazon ranking page, and retry.',
);
}

return results.slice(0, limit);
},
});

export const __test__ = {
normalizeBestsellerCandidate,
};
38 changes: 38 additions & 0 deletions src/clis/amazon/discussion.test.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,38 @@
import { describe, expect, it } from 'vitest';
import { __test__ } from './discussion.js';

describe('amazon discussion normalization', () => {
it('normalizes review summary and sample reviews', () => {
const result = __test__.normalizeDiscussionPayload({
href: 'https://www.amazon.com/product-reviews/B0FJS72893',
average_rating_text: '3.9 out of 5',
total_review_count_text: '27 global ratings',
qa_links: [],
review_samples: [
{
title: '5.0 out of 5 stars Great value and quality',
rating_text: '5.0 out of 5 stars',
author: 'GTreader2',
date_text: 'Reviewed in the United States on February 21, 2026',
body: 'Small but mighty.',
verified: true,
},
],
});

expect(result.asin).toBe('B0FJS72893');
expect(result.average_rating_value).toBe(3.9);
expect(result.total_review_count).toBe(27);
expect(result.review_samples).toEqual([
{
title: 'Great value and quality',
rating_text: '5.0 out of 5 stars',
rating_value: 5,
author: 'GTreader2',
date_text: 'Reviewed in the United States on February 21, 2026',
body: 'Small but mighty.',
verified_purchase: true,
},
]);
});
});
Loading