Our take
Automates the browser with scripts that handle inspection, screenshots, form filling, and data extraction, all against persistent page state.
Strengths
- Uses ARIA snapshots to interact with elements without complex CSS selectors.
- Keeps page state between actions, which enables multi-step workflows.
- Provides screenshots for immediate visual feedback.
Limitations
- Requires starting a separate server for the browser.
- Code inside page.evaluate() must be plain JavaScript, not TypeScript.
- Depends on the Otto library for the connection, so it is not a universal solution.
Use it when you need to visually inspect an interface, automate user interactions, or extract data from a frontend with persistent state.
Avoid it for simple static scraping tasks, or when the interface changes frequently and ARIA selectors become unstable.
Security analysis
Caution: The skill provides a browser automation toolkit that requires Node process execution and directory creation. These tools are used legitimately for starting a server, running scripts, and managing screenshots. No destructive or exfiltrating commands are instructed, but the allowed tools are powerful and could be misused if combined with malicious instructions. The 'caution' rating reflects these legitimate but potent capabilities.
- Uses Bash(node *) to execute arbitrary Node.js scripts, which can perform filesystem access and network requests.
- Includes Bash(mkdir *) for directory creation, which is powerful but necessary for the skill's workflow.
Examples
- Navigate to http://localhost:3000, take a screenshot, and describe the page structure using an ARIA snapshot.
- Fill the email field with 'test@example.com' and password with 'secret', then click the login button and verify the dashboard page loads.
- Extract all rows from the pricing table on the page and save them as a JSON file.
name: browser
description: Browser automation with persistent page state for navigation, screenshots, forms, data extraction, and testing. Use when inspecting UI, taking screenshots, filling forms, extracting page data, verifying frontend behavior, or automating browser workflows.
argument-hint: [url | explore | verify | extract]
model: opus
allowed-tools: Read, Bash(node *), Bash(mkdir *)
Argument: $ARGUMENTS
| Command | Behavior |
|---------|----------|
| {url} | Navigate to URL, capture screenshot and ARIA snapshot |
| explore | Interactive exploration - navigate, inspect, understand UI |
| verify {description} | Verify specific UI behavior or state |
| extract {description} | Extract specific data from the frontend |
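The command table above can be read as a small dispatch rule. The sketch below is one possible way to turn the raw argument into a structured command. The `Command` type and `parseArgument` function are hypothetical names introduced for illustration; the skill itself does not define a parser.

```typescript
// Hypothetical types: the skill only documents the command table, not a parser.
type Command =
  | { kind: "navigate"; url: string }
  | { kind: "explore" }
  | { kind: "verify"; description: string }
  | { kind: "extract"; description: string };

function parseArgument(raw: string): Command {
  const arg = raw.trim();
  if (arg === "explore") return { kind: "explore" };

  // "verify ..." and "extract ..." carry a free-text description.
  const match = /^(verify|extract)\s+(.+)$/s.exec(arg);
  if (match) {
    const description = match[2].trim();
    return match[1] === "verify"
      ? { kind: "verify", description }
      : { kind: "extract", description };
  }

  // Anything that parses as an http(s) URL is treated as a navigation target.
  try {
    const url = new URL(arg);
    if (url.protocol === "http:" || url.protocol === "https:") {
      return { kind: "navigate", url: url.href };
    }
  } catch {
    // Not a URL; fall through to the error below.
  }
  throw new Error(`Unrecognized argument: ${JSON.stringify(arg)}`);
}
```

Note that `URL.href` normalizes the address, so `http://localhost:3000` comes back as `http://localhost:3000/`.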
Choosing Your Approach
- Local/source-available sites: Read source code first to write selectors directly
- Unknown layouts: Use getAISnapshot() for element discovery and selectSnapshotRef() for interactions
- Visual feedback: Take screenshots to observe results
Setup
Start the browser server before running scripts:
node skills/otto/lib/browser/server.js &
Wait for the "Ready" message, then connect:
import { connect, waitForPageLoad } from 'skills/otto/lib/browser/client.js'
const client = await connect({ headless: true })
Writing Scripts
Run scripts using npx tsx with heredocs for inline execution.
Key Principles:
- Small scripts doing one action each
- Evaluate state at completion
- Use descriptive page names
- Call await client.disconnect() to exit (pages persist)
- Use plain JavaScript in page.evaluate() (no TypeScript syntax)
Workflow Loop
- Write script performing one action
- Run and observe output
- Evaluate results and current state
- Decide: complete or need another script?
- Repeat until task complete
Navigate & Capture
Determine the dev server URL from package.json scripts, running processes, or project config.
const page = await client.page('main')
await page.goto(url) // e.g., http://localhost:5173
await waitForPageLoad(page)
// Screenshot
await page.screenshot({ path: '.otto/screenshots/page.png' })
// ARIA snapshot
const snapshot = await client.getAISnapshot('main')
console.log(snapshot)
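As a rough illustration of deriving the dev server URL from package.json scripts, the sketch below inspects the `dev` or `start` script for an explicit port flag and otherwise falls back to a tool default. The `guessDevServerUrl` helper is hypothetical and not part of the skill. The fallback ports (5173 for Vite, 3000 for Next.js) are those tools' usual defaults and are assumptions about the project setup.

```typescript
// Hypothetical helper: guesses a local dev server URL from package.json scripts.
interface PackageJson {
  scripts?: Record<string, string>;
}

function guessDevServerUrl(pkg: PackageJson): string | undefined {
  const script = pkg.scripts?.dev ?? pkg.scripts?.start;
  if (!script) return undefined;

  // An explicit port flag wins: "--port 4000", "--port=4000", or "-p 4000".
  const explicit = /(?:--port[=\s]+|-p\s+)(\d{2,5})\b/.exec(script);
  if (explicit) return `http://localhost:${explicit[1]}`;

  // Otherwise fall back to the tool's usual default port (an assumption).
  if (/\bvite\b/.test(script)) return "http://localhost:5173";
  if (/\bnext\b/.test(script)) return "http://localhost:3000";
  return undefined;
}
```

When the helper returns `undefined`, the remaining options named above (running processes or project config) still apply.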
Interact
// Click by ref
const btn = await client.selectSnapshotRef('main', 'e3')
await btn.click()
// Fill input by ref
const input = await client.selectSnapshotRef('main', 'e5')
await input.fill('user@example.com')
// Re-capture after interaction
await waitForPageLoad(page)
const newSnapshot = await client.getAISnapshot('main')
Waiting
await waitForPageLoad(page)
await page.waitForSelector('.results')
await page.waitForURL('**/success')
No TypeScript in Browser Context
Code in page.evaluate() runs in browser context without TypeScript support. Use plain JavaScript only—type annotations break at runtime.
// ✓ Correct
await page.evaluate(() => {
const items = document.querySelectorAll('.item')
return items.length
})
// ✗ Wrong - TypeScript syntax fails
await page.evaluate(() => {
const items: NodeListOf<Element> = document.querySelectorAll('.item')
return items.length
})
Cleanup
await client.disconnect()
After completing the workflow, remove screenshots:
rm -rf .otto/screenshots
ARIA Snapshot Format
- banner:
- link "Home" [ref=e1]
- main:
- heading "Welcome" [ref=e2]
- form:
- textbox "Email" [ref=e3]
- button "Submit" [disabled] [ref=e4]
Use [ref=eN] values with selectSnapshotRef() to interact.
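To pick a ref programmatically instead of reading it off the snapshot by eye, the snapshot text can be scanned line by line. The sketch below assumes the snapshot is a string in the format shown above. `parseSnapshotRefs` and `SnapshotElement` are hypothetical names, the indentation in the sample is assumed, and names containing escaped quotes are not handled.

```typescript
// Hypothetical shape for one interactive element found in a snapshot.
interface SnapshotElement {
  role: string;
  name: string;
  ref: string;
  disabled: boolean;
}

function parseSnapshotRefs(snapshot: string): SnapshotElement[] {
  const elements: SnapshotElement[] = [];
  for (const line of snapshot.split("\n")) {
    // Matches lines such as: - button "Submit" [disabled] [ref=e4]
    const match = /^\s*-\s+(\w+)\s+"([^"]*)"(.*)\[ref=(e\d+)\]/.exec(line);
    if (!match) continue; // Container lines like "- main:" carry no ref.
    const [, role, name, attributes, ref] = match;
    elements.push({ role, name, ref, disabled: attributes.includes("[disabled]") });
  }
  return elements;
}

// The sample snapshot from above (indentation assumed).
const sample = [
  "- banner:",
  '  - link "Home" [ref=e1]',
  "- main:",
  '  - heading "Welcome" [ref=e2]',
  "  - form:",
  '    - textbox "Email" [ref=e3]',
  '    - button "Submit" [disabled] [ref=e4]',
].join("\n");

const elements = parseSnapshotRefs(sample);
const emailRef = elements.find((e) => e.role === "textbox" && e.name === "Email")?.ref;
console.log(emailRef); // prints "e3"
```

The resulting ref string is what would be passed as the second argument to selectSnapshotRef().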
Error Recovery
Page state persists after failures. Use screenshots and state inspection to evaluate the current conditions before taking the next action.
// After an error, reconnect and inspect the persisted page
import { connect } from 'skills/otto/lib/browser/client.js'
const client = await connect({ headless: true })
const snapshot = await client.getAISnapshot('main')
console.log(snapshot)
const page = await client.page('main')
await page.screenshot({ path: '.otto/screenshots/debug.png' })
Scraping Data
For large datasets, intercept and replay network requests rather than scrolling the DOM. See references/scraping.md for the complete guide covering request capture, schema discovery, and paginated API replay.
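The referenced guide is not reproduced here, so the following is only a minimal sketch of what paginated replay could look like. It assumes a cursor-based API. The `ResultPage` shape and the `collectAllPages` helper are hypothetical, and the actual request is injected as `fetchPage` so the loop does not depend on any particular endpoint.

```typescript
// Hypothetical shape of one page of API results; real endpoints will differ.
interface ResultPage<T> {
  items: T[];
  nextCursor?: string;
}

// fetchPage is injected so the loop can wrap whatever request was captured.
async function collectAllPages<T>(
  fetchPage: (cursor?: string) => Promise<ResultPage<T>>,
  maxPages = 100, // Safety cap in case the API never stops returning cursors.
): Promise<T[]> {
  const all: T[] = [];
  let cursor: string | undefined;
  for (let i = 0; i < maxPages; i++) {
    const page = await fetchPage(cursor);
    all.push(...page.items);
    if (!page.nextCursor) break; // No cursor means the last page was reached.
    cursor = page.nextCursor;
  }
  return all;
}
```

In practice `fetchPage` would replay the captured request with the cursor substituted in. Offset-based or page-number APIs would need a different stop condition.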