CLI that resolves URLs from a sitemap XML, a single URL, or a crawl, then captures full-page desktop and mobile screenshots with Playwright, with optional Markdown, metadata, and crawl graph export.
npm install
npx playwright install chromium
npm run build
node dist/cli.js --sitemap https://example.com/sitemap.xml --output ./shots --max 10
Or capture a single URL:
node dist/cli.js --url https://example.com/ --output ./shots
Or crawl internal pages from a starting URL:
node dist/cli.js --crawl https://example.com/ --max 25 --output ./shots
Generate screenshots and Markdown together:
node dist/cli.js --url https://example.com/ --output ./shots --markdown
Generate Markdown without screenshots:
node dist/cli.js --crawl https://example.com/ --depth 1 --output ./shots --shots false --markdown
Write metadata into Markdown frontmatter:
node dist/cli.js --url https://example.com/ --output ./shots --markdown --meta md
Write metadata to a sidecar JSON file:
node dist/cli.js --url https://example.com/ --output ./shots --markdown false --meta json
Disable screenshots explicitly:
node dist/cli.js --sitemap https://example.com/sitemap.xml --output ./shots --shots false --markdown
Skip the confirmation prompt with --yes:
node dist/cli.js --sitemap https://example.com/sitemap.xml --output ./shots --max 10 --yes
The CLI prints the resolved URLs before prompting for confirmation. Output files are saved under:
<output>/<domain>/<YYYY-MM-DD>/<slug>-desktop.jpg
<output>/<domain>/<YYYY-MM-DD>/<slug>-mobile.jpg
<output>/<domain>/<YYYY-MM-DD>/<slug>.md
<output>/<domain>/<YYYY-MM-DD>/<slug>.meta.json
<output>/<domain>/<YYYY-MM-DD>/site-graph.json
<output>/<domain>/<YYYY-MM-DD>/site-graph.md
If the date folder already exists for that domain, the CLI creates <YYYY-MM-DD>-1, then -2, and so on. The root path / is saved as homepage-desktop.jpg and homepage-mobile.jpg.
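The naming rules above can be sketched roughly as follows. This is an illustration only, not the CLI's actual code: the helper names (pickDateFolder, slugFor, screenshotPaths) are hypothetical, <domain> is assumed to be the URL hostname, and the slug rule for non-root paths (joining path segments with "-") is an assumption, since only the homepage case is documented.

```typescript
// Hypothetical helpers; the CLI's real implementation may differ.

/** Pick the first free folder name: base, then base-1, base-2, and so on. */
function pickDateFolder(base: string, existing: ReadonlySet<string>): string {
  if (!existing.has(base)) return base;
  let n = 1;
  while (existing.has(`${base}-${n}`)) n += 1;
  return `${base}-${n}`;
}

/**
 * Documented rule: the root path "/" becomes "homepage".
 * Assumed rule: other paths join their segments with "-".
 */
function slugFor(url: string): string {
  const segments = new URL(url).pathname.split("/").filter(Boolean);
  return segments.length === 0 ? "homepage" : segments.join("-");
}

/** Build the desktop and mobile screenshot paths for one URL. */
function screenshotPaths(output: string, url: string, dateFolder: string): string[] {
  const domain = new URL(url).hostname; // assumption: <domain> is the hostname
  const slug = slugFor(url);
  return ["desktop", "mobile"].map(
    (variant) => `${output}/${domain}/${dateFolder}/${slug}-${variant}.jpg`
  );
}
```

For example, under these assumptions, a second run on the same day would land in a folder with the -1 suffix, and https://example.com/ would produce homepage-desktop.jpg and homepage-mobile.jpg inside it.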
For --crawl, the seed URL is depth 0, its direct internal links are depth 1, and the crawl stays on the same hostname only. The default crawl depth is 4. Crawl runs always write site-graph.json and site-graph.md. If you also pass --sitemap, the graph report detects orphaned sitemap pages. If you do not pass --sitemap, the CLI will try <origin>/sitemap.xml automatically and use it when available.
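The depth rule can be pictured as a breadth-first traversal. The sketch below works on an in-memory link map rather than real pages, and every name in it (LinkMap, crawlDepths, orphanedSitemapPages) is hypothetical. It also assumes that an "orphaned" sitemap page means one listed in the sitemap but never reached by the crawl; the CLI may define this differently.

```typescript
// Hypothetical sketch: page URL -> links found on that page.
type LinkMap = Record<string, string[]>;

/** Breadth-first walk: the seed is depth 0, its direct internal links are depth 1. */
function crawlDepths(seed: string, links: LinkMap, maxDepth = 4): Map<string, number> {
  const host = new URL(seed).hostname;
  const depths = new Map<string, number>([[seed, 0]]);
  const queue: string[] = [seed];
  while (queue.length > 0) {
    const current = queue.shift()!;
    const depth = depths.get(current)!;
    if (depth >= maxDepth) continue; // do not follow links past the depth limit
    for (const next of links[current] ?? []) {
      if (new URL(next).hostname !== host) continue; // stay on the same hostname
      if (depths.has(next)) continue; // already discovered at a shallower depth
      depths.set(next, depth + 1);
      queue.push(next);
    }
  }
  return depths;
}

/** Assumed definition: sitemap URLs that the crawl never reached. */
function orphanedSitemapPages(sitemapUrls: string[], depths: Map<string, number>): string[] {
  return sitemapUrls.filter((url) => !depths.has(url));
}
```

Note that in this sketch a page deeper than the limit would also be reported as orphaned, so a real implementation might want to distinguish "not linked anywhere" from "not reached within the depth limit".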
If --output is omitted, the CLI writes into ./results.