Our review
A tool to scrape Snowflake documentation pages into Markdown files with caching and configurable crawling depth.
Strengths
- Simple setup with automatic dependency installation
- SQLite cache reduces repeated requests and speeds up updates
- Flexible configuration for base path and spider depth
- Structured Markdown output with frontmatter metadata
Limitations
- Only works on docs.snowflake.com
- May download a large number of pages if depth is set high
- Cache expiration defaults to 7 days and has no command-line option (the generated scraper_config.yaml does list an expiration_days setting)
Use it when you need local copies of Snowflake documentation sections for offline reference or LLM context.
Avoid it for scraping other websites or when you need real-time updates.
Security analysis
Safe. The skill is a documentation scraper that accesses a trusted domain (docs.snowflake.com) and writes to a local directory. It does not execute downloaded content, expose secrets, or perform destructive actions. The first-time setup uses a Python script that installs a tool via standard package managers, which is a common pattern and not inherently risky.
No concerns found
Examples
- Scrape the Snowflake documentation migration guide with default settings and save it to ./migration-docs
- Use doc-scraper to scrape the Snowflake SQL reference section at /en/sql-reference/ with spider depth 2, output to ./sql-docs
- Run a dry run of doc-scraper for the base path /en/sql-reference/ to see which URLs will be scraped without writing files.
name: doc-scraper
description: Generic web scraper for extracting and organizing Snowflake documentation with intelligent caching and configurable spider depth. Scrapes any section of docs.snowflake.com controlled by --base-path.
Snowflake Documentation Scraper
Scrapes docs.snowflake.com sections to Markdown with SQLite caching (7-day expiration).
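The caching behaviour described above can be pictured with a minimal sketch using Python's standard sqlite3 module. This is not doc-scraper's actual schema or code: the table name `pages`, its columns, and the helper names `open_cache`, `get_fresh`, and `store` are all hypothetical, chosen only to illustrate a 7-day freshness check.

```python
import sqlite3
import time

# 7-day expiration, matching the figure stated above.
EXPIRATION_SECONDS = 7 * 24 * 60 * 60


def open_cache(path=":memory:"):
    """Open a cache database with one row per URL (hypothetical schema)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pages ("
        "url TEXT PRIMARY KEY, body TEXT NOT NULL, fetched_at REAL NOT NULL)"
    )
    return conn


def get_fresh(conn, url, now=None):
    """Return the cached body if it is within the expiration window, else None."""
    now = time.time() if now is None else now
    row = conn.execute(
        "SELECT body, fetched_at FROM pages WHERE url = ?", (url,)
    ).fetchone()
    if row is None or now - row[1] > EXPIRATION_SECONDS:
        return None  # missing or stale: the caller would re-fetch the page
    return row[0]


def store(conn, url, body, now=None):
    """Insert a cache entry, or refresh it if the URL is already present."""
    now = time.time() if now is None else now
    conn.execute(
        "INSERT OR REPLACE INTO pages (url, body, fetched_at) VALUES (?, ?, ?)",
        (url, body, now),
    )
    conn.commit()
```

A scraper built this way would call `get_fresh` before each request and only hit the network when it returns `None`, which is how a cache of this kind cuts repeated requests.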
Usage
First time setup (auto-installs uv and doc-scraper):
python3 .claude/skills/doc-scraper/scripts/doc_scraper.py
Subsequent runs:
doc-scraper --output-dir=./snowflake-docs
doc-scraper --output-dir=./snowflake-docs --base-path="/en/sql-reference/"
doc-scraper --output-dir=./snowflake-docs --spider-depth=2
Command Options
| Option | Default | Description |
| ---------------- | ----------------- | ------------------------------------- |
| --output-dir | Required | Output directory for scraped docs |
| --base-path | /en/migrations/ | URL section to scrape |
| --spider-depth | 1 | Link depth: 0 = seed pages only, 1 = plus pages they link to, 2 = plus second-level links |
| --limit | None | Cap URLs (for testing) |
| --dry-run | - | Preview without writing |
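To make the interaction between --spider-depth and --limit concrete, here is a small sketch of a depth-limited breadth-first crawl over an in-memory link graph. It assumes the spider behaves like a standard BFS, which the source does not confirm; the function `crawl_order`, the `LINKS` graph, and the page paths in it are hypothetical.

```python
from collections import deque


def crawl_order(seeds, links, max_depth, limit=None):
    """Return URLs in breadth-first order, following links up to max_depth hops."""
    seen = set()
    order = []
    queue = deque((url, 0) for url in seeds)
    while queue:
        url, depth = queue.popleft()
        if url in seen:
            continue
        if limit is not None and len(order) >= limit:
            break  # cap reached, comparable to --limit
        seen.add(url)
        order.append(url)
        if depth < max_depth:
            # Only expand a page's links while below the depth budget.
            for nxt in links.get(url, []):
                if nxt not in seen:
                    queue.append((nxt, depth + 1))
    return order


# Hypothetical link graph: the seed links to a and b, and a links to c.
LINKS = {
    "/en/migrations/": ["/en/migrations/a", "/en/migrations/b"],
    "/en/migrations/a": ["/en/migrations/c"],
}

print(crawl_order(["/en/migrations/"], LINKS, max_depth=0))  # seeds only
print(crawl_order(["/en/migrations/"], LINKS, max_depth=1))  # seeds plus direct links
```

Because each extra level can multiply the number of pages, a preview with --dry-run and a small --limit is a cheap way to size a crawl before raising the depth.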
Output
output-dir/
├── SKILL.md # Auto-generated index
├── scraper_config.yaml # Editable config (auto-created)
├── .cache/ # SQLite cache (auto-managed)
└── en/migrations/*.md # Scraped pages with frontmatter
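As an illustration of how a layout like this could be produced, the sketch below maps a page URL to a relative Markdown path and prepends a frontmatter block. The frontmatter fields `title` and `source_url`, and the helpers `output_path` and `render_page`, are assumptions made for the example; the source does not say which metadata doc-scraper actually writes.

```python
import json
from pathlib import PurePosixPath
from urllib.parse import urlparse


def output_path(url):
    """Mirror the URL path as a relative .md path under the output directory."""
    path = urlparse(url).path.strip("/")
    if not path:
        path = "index"  # fallback name for the site root (an assumption)
    return PurePosixPath(path + ".md")


def render_page(url, title, body):
    """Build Markdown text with a frontmatter header (hypothetical fields)."""
    lines = [
        "---",
        # json.dumps yields a double-quoted string, which is also valid YAML.
        f"title: {json.dumps(title)}",
        f"source_url: {url}",
        "---",
        "",
        body,
    ]
    return "\n".join(lines) + "\n"


page_url = "https://docs.snowflake.com/en/migrations/overview"
print(output_path(page_url))
print(render_page(page_url, "Overview", "Page body goes here."))
```

Keeping the source URL in the frontmatter lets a reader, or an LLM consuming the files, trace each page back to its origin.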
Configuration
Auto-created at {output-dir}/scraper_config.yaml:
rate_limiting:
  max_concurrent_threads: 4
spider:
  max_pages: 1000
  allowed_paths: ["/en/"]
scraped_pages:
  expiration_days: 7
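One way to read the spider settings is as a filter applied to candidate URLs before they are fetched. The sketch below assumes `allowed_paths` is a list of path prefixes and `max_pages` is a hard cap, which is a plausible interpretation rather than documented behaviour. `CONFIG` is a plain dictionary mirroring the YAML values, and `is_allowed` and `select_urls` are hypothetical helpers.

```python
from urllib.parse import urlparse

# Mirrors the spider section of scraper_config.yaml shown above.
CONFIG = {"spider": {"max_pages": 1000, "allowed_paths": ["/en/"]}}

# The tool targets docs.snowflake.com only, per the description.
ALLOWED_HOST = "docs.snowflake.com"


def is_allowed(url, config=CONFIG):
    """Accept a URL only if it is on the allowed host and under an allowed path prefix."""
    parts = urlparse(url)
    if parts.hostname != ALLOWED_HOST:
        return False
    prefixes = config["spider"]["allowed_paths"]
    return any(parts.path.startswith(prefix) for prefix in prefixes)


def select_urls(candidates, config=CONFIG):
    """Filter candidate URLs, then truncate to the max_pages cap."""
    allowed = [url for url in candidates if is_allowed(url, config)]
    return allowed[: config["spider"]["max_pages"]]
```

Under this reading, narrowing `allowed_paths` (for example to a single section prefix) or lowering `max_pages` would be the config-level counterparts to lowering --spider-depth.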
Troubleshooting
| Issue | Solution |
| ---------------- | ------------------------------------- |
| Too many pages | Lower --spider-depth or edit config |
| Missing pages | Increase --spider-depth |
| Cache corruption | Delete {output-dir}/.cache/ (rare) |