Our take
Generates operational runbooks for services, procedures, or incident response, with detailed steps, troubleshooting guides, and escalation paths, by analyzing the source code and infrastructure.
Strengths
- Produces concrete procedures with commands checked against the codebase
- Maps dependencies and their impact when they fail
- Provides troubleshooting guides and escalation paths
- Relies on a multi-track investigation (code, infrastructure, best practices)
Limitations
- Requires read access to the code and infrastructure
- May not cover every rare or undocumented scenario
- Generated commands depend on the current configuration, which may change
Use this skill when you need to create or standardize operational documentation for an existing service, a maintenance procedure, or an incident response plan.
Avoid it for purely theoretical services, or when the source code and infrastructure are not accessible, because the commands and dependencies cannot be verified.
Security analysis
Safe: The skill only uses the Read, Glob, Grep, and Write tools. It does not execute any commands or network operations. It generates documentation with example commands, but these are not run by the skill. No data exfiltration or destructive actions are possible.
No concerns detected
Examples
- Generate a runbook for the payment-service, covering deployment, scaling, and common failure scenarios.
- Create a runbook for PostgreSQL failover procedure in production, including pre-checks, steps, and rollback.
- Build an incident response runbook for high latency in the API gateway, including diagnosis, mitigation, and escalation.

name: runbook
description: Generate operational runbooks for services, procedures, or incident response with step-by-step procedures, troubleshooting guides, and escalation paths
license: MIT
compatibility:
  - runtime:any
allowed-tools:
  - Read
  - Glob
  - Grep
  - Write
metadata:
  author: thoreinstein
  version: 1.0.0
Runbook
Generate operational runbooks for services, procedures, or incident response. Investigates the codebase and infrastructure to produce accurate, actionable procedures.
When to Use
- Creating operational documentation for a service
- Documenting deployment, scaling, or maintenance procedures
- Building incident response playbooks
- Standardizing operational procedures across teams
Input
- Topic: Service name, operation type, or incident scenario
- Scope: deployment, scaling, failover, maintenance, troubleshooting
- Optional: Specific scenarios to cover
Investigation Strategy
Launch parallel investigation tracks to gather comprehensive information:
Track 1: Codebase Exploration
- Identify service entry points and configuration
- Find health check endpoints
- Map dependencies (databases, caches, external services)
- Locate logging and metrics instrumentation
- Find existing scripts or automation
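As an illustration of the health-check step in this track, the search could be sketched with plain `grep`. The helper name `find_health_endpoints` and the list of route names are assumptions made for this example; the skill itself works through its Grep tool rather than a shell.

```bash
# Hypothetical helper: list source lines that mention common health-check routes.
# The route names in the pattern are assumptions; adjust them to the service.
find_health_endpoints() {
  local root="${1:-.}"
  # -r: recurse, -n: line numbers, -I: skip binary files, -E: extended regex.
  # The trailing group keeps "/health" from matching longer paths like "/healthcare".
  grep -rnIE --exclude-dir=.git \
    '/(healthz?|readyz?|livez?)([^[:alnum:]_]|$)' "$root" || true
}
```

Calling `find_health_endpoints ./src` prints one `file:line:text` entry per match, which can then be reviewed by hand before an endpoint is written into the runbook.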
Track 2: Infrastructure Analysis
- Review deployment manifests (Kubernetes, Terraform, etc.)
- Identify scaling configuration
- Map service dependencies
- Find monitoring and alerting setup
- Review backup and recovery procedures
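The scaling step above could look roughly like the following, assuming the manifests are Kubernetes YAML files and that autoscaling is declared through a HorizontalPodAutoscaler. The function name `find_scaling_config` is hypothetical.

```bash
# Hypothetical helper: print each YAML file that declares a
# HorizontalPodAutoscaler, followed by its minReplicas/maxReplicas lines.
find_scaling_config() {
  local root="${1:-.}"
  # -l lists only the names of files that contain an HPA definition.
  grep -rlE --include='*.yaml' --include='*.yml' \
    'kind:[[:space:]]*HorizontalPodAutoscaler' "$root" |
  while IFS= read -r file; do
    echo "$file"
    # Show the replica bounds with line numbers; tolerate files without them.
    grep -nE '(min|max)Replicas:' "$file" || true
  done
}
```

The output gives the file path and the replica bounds, which is enough to fill in a "Scaling" line in the investigation notes.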
Track 3: External Research
- Find operational best practices for the service type
- Research common failure modes
- Identify industry-standard procedures
Output
Generate the runbook document using the template at references/templates/runbook.md.
The runbook should include:
- Service overview and architecture
- Dependencies with failure impact
- Step-by-step procedures with actual commands
- Troubleshooting guides for common issues
- Escalation paths and contacts
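A generated runbook can be checked against this list with a small script. The sketch below assumes each required part appears as a `## ` heading with the names shown; those names are guesses based on the example later in this document, not taken from references/templates/runbook.md.

```bash
# Hypothetical check: report which expected top-level sections a runbook lacks.
# Section names are assumptions, not the contents of the real template.
missing_sections() {
  local file="$1" section status=0
  for section in "Overview" "Dependencies" "Procedures" "Troubleshooting" "Escalation"; do
    # -q: quiet, -x: whole-line match, -F: fixed string (no regex).
    if ! grep -qxF "## $section" "$file"; then
      echo "missing: $section"
      status=1
    fi
  done
  return "$status"
}
```

The function returns a non-zero status when any section is absent, so it can gate a review step, for example `missing_sections payment-service-runbook.md || echo "runbook incomplete"`.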
Behavior
- Parse topic to identify service and operation scope
- Launch parallel investigation tracks
- Extract configuration, endpoints, and dependencies from codebase
- Identify common operations and failure modes
- Generate step-by-step procedures with actual commands
- Document troubleshooting steps and escalation paths
Constraints
- Accuracy: All commands must be verified against actual codebase/infrastructure
- Actionable: Every procedure must have concrete, executable steps
- Complete: Include prerequisites, verification, and rollback for each procedure
- Maintainable: Note dependencies that may change and require updates
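To make the "Complete" constraint concrete, here is a minimal sketch of one procedure that carries its own prerequisite, verification, and rollback. The function name, the 120-second timeout, and the overridable `KUBECTL` variable are assumptions for illustration; the `kubectl` subcommands used are `apply`, `rollout status`, and `rollout undo`.

```bash
# Sketch of a procedure with prerequisite, verification, and rollback steps.
# KUBECTL is overridable (e.g. KUBECTL=echo) so the flow can be rehearsed
# without touching a cluster. Names and the timeout are placeholders.
KUBECTL="${KUBECTL:-kubectl}"

deploy_with_rollback() {
  local deployment="$1" manifest="$2"

  # Prerequisite: the manifest must exist before anything is applied.
  [ -f "$manifest" ] || { echo "prerequisite failed: $manifest not found" >&2; return 2; }

  # Step: apply the new manifest.
  $KUBECTL apply -f "$manifest" || return 1

  # Verification: wait for the rollout; roll back if it does not complete.
  if ! $KUBECTL rollout status "deployment/$deployment" --timeout=120s; then
    echo "rollout failed, rolling back" >&2
    $KUBECTL rollout undo "deployment/$deployment"
    return 1
  fi
}
```

Running `KUBECTL=echo deploy_with_rollback payment-service k8s/payment-service/deployment.yaml` prints the commands instead of executing them, which is one way to review a procedure before it is written into a runbook.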
Example
Input: "Generate runbook for the payment-service"
Investigation:
- Found deployment at k8s/payment-service/
- Found health endpoints: /health, /ready
- Dependencies: PostgreSQL (critical), Redis (cache), Stripe API
- Scaling: HPA configured, min 3, max 10 replicas
- Alerts: Prometheus rules in monitoring/
Generated Runbook: payment-service-runbook.md
## Overview
- Service: payment-service
- Owner: payments-team
- Criticality: P1
## Dependencies
| Dependency | Type | Criticality | Failure Impact |
|------------|------|-------------|----------------|
| PostgreSQL | Database | Critical | Full outage |
| Redis | Cache | High | Degraded latency |
| Stripe API | External | Critical | Payment failures |
## Procedures
### Deployment
1. Verify no active transactions
```bash
# No -it: the output is piped to grep, so no interactive TTY is needed.
kubectl exec payment-service-0 -- curl -s localhost:8080/metrics | grep active_transactions
```
2. Apply new deployment
```bash
kubectl apply -f k8s/payment-service/deployment.yaml
```
3. Monitor rollout
```bash
kubectl rollout status deployment/payment-service
```
### Scaling
```bash
kubectl scale deployment payment-service --replicas=5
```
## Troubleshooting
### High Latency
Symptoms: p99 latency > 500ms
Diagnosis:
```bash
kubectl top pods -l app=payment-service
kubectl logs -l app=payment-service --tail=100 | grep -i slow
```
Resolution: Check Redis connection, scale if CPU > 80%
Begin by identifying the service or operation to document and launching investigation tracks.