Our review
This skill researches the OpenRouter API to integrate Grok models, including vision capabilities, for evaluating generated images in the SFUMATO project.
Strengths
- Comprehensive coverage of OpenRouter APIs (authentication, endpoints, request formats)
- Precise identification of Grok models supporting vision, critical for judge_service
- Documentation of base64 image formats for vision calls
- Error handling and retry strategy adapted to common HTTP status codes
Limitations
- Depends on freshness of web search results (models or APIs may change)
- Provides a blueprint, not a ready-to-use implementation
- Does not cover alternative providers or models
Use it when implementing or debugging OpenRouter integration for text and vision calls with Grok.
Avoid it if you are using a different API provider, or if your Grok calls do not need vision capabilities.
Security analysis
Safe. This skill instructs only research activities using WebSearch, WebFetch, and Read tools to gather API documentation. No destructive, exfiltrating, or obfuscated actions are present. No executable commands or dangerous operations are suggested.
No concerns found
Examples
- Research the OpenRouter API to find which Grok models support vision/image input for evaluating generated images in our judge service.
- Research OpenRouter error handling patterns: how to handle 429, 402, and 503 errors, and what retry strategy to use for async httpx calls.
- Find the correct message format for sending base64 images to OpenRouter's vision API for Grok models in a chat completion request.
name: openrouter-research
description: Research OpenRouter API docs, available Grok model IDs, vision capability for the judge service, and integration patterns. Use when implementing openrouter_tool.py, when checking which Grok model supports vision/image input for judge_service.py, when OpenRouter returns unexpected errors, or when verifying model availability and context limits.
tools: WebSearch, WebFetch, Read
OpenRouter Research
Research current OpenRouter API specifications and Grok model availability for SFUMATO.
Context
All text LLM calls in SFUMATO go through app/tools/openrouter_tool.py:
- prompt_service.generate_prompt(): Grok, text-only input
- prompt_service.revise_prompt(): Grok, text-only input
- judge_service.evaluate(): Grok with vision (must analyze generated image)
ENV vars: OPENROUTER_API_KEY, OPENROUTER_BASE_URL=https://openrouter.ai/api/v1
Client: httpx.AsyncClient — OpenAI-compatible API
Read existing file first if it exists:
- app/tools/openrouter_tool.py
- config.py
Step 1: OpenRouter API Fundamentals
- Fetch: https://openrouter.ai/docs/api-reference/overview
- Fetch: https://openrouter.ai/docs/requests
- Search: OpenRouter API python httpx async chat completions 2025
Capture:
- Base URL: https://openrouter.ai/api/v1
- Auth header format (Authorization: Bearer vs API-Key)
- Required request headers: HTTP-Referer, X-Title
- Chat completions endpoint: POST /v1/chat/completions
- Request body schema: model, messages, temperature, max_tokens, stream
- Response schema: choices[0].message.content
- Error response format and status codes (429, 402, 503, 500)
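The items in this capture list can be sketched as a pair of pure helpers: one that assembles the request and one that reads the response. This is a minimal sketch, assuming the OpenAI-compatible shape named above; the function names `build_chat_request` and `extract_content` and the default `temperature` and `max_tokens` values are hypothetical, and the header and schema details should be confirmed against the fetched docs.

```python
BASE_URL = "https://openrouter.ai/api/v1"  # from the Context section


def build_chat_request(api_key, model, messages, temperature=0.7, max_tokens=1024):
    """Return (url, headers, body) for a non-streaming chat completion.

    The defaults for temperature and max_tokens are illustrative assumptions.
    """
    url = f"{BASE_URL}/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = {
        "model": model,
        "messages": messages,
        "temperature": temperature,
        "max_tokens": max_tokens,
        "stream": False,
    }
    return url, headers, body


def extract_content(response_json):
    """Pull the assistant text out of an OpenAI-style response dict."""
    choices = response_json.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["message"]["content"]
```

Keeping these two steps free of network code makes the request and response shapes easy to check before the async client is wired in.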
Step 2: Available Grok Models
- Fetch: https://openrouter.ai/models?q=grok
- Search: OpenRouter xAI Grok models list 2025
- Fetch: https://openrouter.ai/x-ai if available
For each available Grok model, capture:
- Full model ID string (e.g. x-ai/grok-beta, x-ai/grok-vision-beta)
- Context window size (tokens)
- Supports image input (vision)? Critical for judge_service
- Cost per 1M input/output tokens
Determine the best model for each use case:
- prompt_gen: text-only, large context
- prompt_revise: text-only
- judge: MUST support vision/image input
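Once the per-model facts are captured, the selection rule above can be expressed as a small filter. The sketch below assumes each captured record is a dict with `id`, `context_length`, and an `architecture.input_modalities` list; that record shape is an assumption to verify against the model listing, and the helper names are hypothetical. It picks the vision-capable model with the largest context for the judge and the largest-context model overall for prompt generation, ignoring cost.

```python
def supports_vision(model):
    """True if the record lists "image" among its input modalities."""
    modalities = (model.get("architecture") or {}).get("input_modalities", [])
    return "image" in modalities


def choose_models(models):
    """Return model IDs for the judge and prompt_gen use cases, or None."""
    def context(m):
        return m.get("context_length", 0)

    vision = [m for m in models if supports_vision(m)]
    judge = max(vision, key=context)["id"] if vision else None
    prompt_gen = max(models, key=context)["id"] if models else None
    return {"judge": judge, "prompt_gen": prompt_gen}
```

A `None` judge entry means no captured model accepts images, which should block the judge_service work rather than fall back to a text-only model.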
Step 3: Image Input Format for Vision Models
Since judge_service.py sends a generated image to Grok for evaluation:
- Search: OpenRouter vision model image input base64 format 2025
- Fetch: https://openrouter.ai/docs/features/vision
- Search: OpenAI-compatible vision API image_url content type format
Capture:
- Message content format for image: {"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,..."}} vs URL-based image reference
- Max image size limits
- Supported formats (JPEG, PNG, WebP)
- Whether OpenRouter passes base64 to xAI directly or requires URL
Step 4: Error Handling and Retry Strategy
- Search: OpenRouter API error handling retry 429 503 2025
- Fetch: https://openrouter.ai/docs/api-reference/errors if available
Capture:
- 429 rate limit: retry-after header? fixed backoff interval?
- 402 payment/balance: raise immediately, no retry
- 503 model unavailable: retry with backoff
- Recommended timeout values for text calls vs vision calls
- Max retries pattern
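The retry rules in this list can be sketched as two small decision helpers. This is only an illustration under stated assumptions: exponential backoff with a 1-second base and a 30-second cap is an assumed schedule, the helper names are hypothetical, and whether OpenRouter actually sends a `Retry-After` header is one of the open questions this step is meant to answer.

```python
RETRYABLE = {429, 503}  # retry with backoff, per the list above


def should_retry(status, attempt, max_retries=2):
    """True if this status is retryable and retries remain (attempt is 0-based)."""
    return status in RETRYABLE and attempt < max_retries


def retry_delay(attempt, retry_after=None, base=1.0, cap=30.0):
    """Seconds to wait before the next try.

    Honors a numeric Retry-After value when one is given; otherwise falls
    back to exponential backoff (base * 2**attempt), capped at `cap`.
    """
    if retry_after is not None:
        try:
            return max(0.0, min(float(retry_after), cap))
        except ValueError:
            pass  # not a number of seconds (e.g. an HTTP date): use backoff
    return min(base * (2 ** attempt), cap)
```

A 402 never passes `should_retry`, so the caller raises immediately, matching the rule above.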
Output Format
Section 1: openrouter_tool.py Implementation Blueprint
# Async httpx client structure
BASE_URL = "https://openrouter.ai/api/v1"
# Required headers
headers = {
    "Authorization": f"Bearer {api_key}",
    "HTTP-Referer": "...",
    "X-Title": "SFUMATO",
    "Content-Type": "application/json",
}
# chat_completion(model, messages, temperature, max_tokens) -> str
# vision_completion(model, text_messages_plus_image_message) -> str
# Retry pattern for 429/503 (show how many retries, backoff)
# Error handling: which codes to retry vs raise immediately
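One possible shape for `chat_completion` is sketched below. It is not the project's implementation: it assumes `client` behaves like an `httpx.AsyncClient` created with `base_url=BASE_URL` and the headers above, and the `OpenRouterError` class, the `backoff_base` parameter, and the default values are hypothetical. The retry counts and backoff schedule are placeholders until the research in Step 4 settles them.

```python
import asyncio


class OpenRouterError(RuntimeError):
    """Raised on a non-retryable status or when retries are exhausted."""


async def chat_completion(client, model, messages, temperature=0.7,
                          max_tokens=1024, timeout=30, max_retries=2,
                          backoff_base=1.0):
    """POST to /chat/completions and return the assistant text.

    `client` is expected to act like an httpx.AsyncClient configured with
    the OpenRouter base URL and auth headers.
    """
    payload = {
        "model": model,
        "messages": messages,
        "temperature": temperature,
        "max_tokens": max_tokens,
    }
    for attempt in range(max_retries + 1):
        response = await client.post("/chat/completions", json=payload, timeout=timeout)
        if response.status_code == 200:
            return response.json()["choices"][0]["message"]["content"]
        if response.status_code in (429, 503) and attempt < max_retries:
            # Assumed schedule: backoff_base * 1, 2, 4, ... seconds.
            await asyncio.sleep(backoff_base * (2 ** attempt))
            continue
        # 402 and other statuses, or retries exhausted: fail without retrying.
        raise OpenRouterError(f"OpenRouter returned HTTP {response.status_code}")
```

Passing the client in, rather than creating it inside the function, lets the text and vision wrappers share one connection pool and makes the retry path testable with a stub.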
Section 2: Model Selection Table
| Use Case | Recommended Model ID | Context | Vision | Notes |
|----------|---------------------|---------|--------|-------|
| prompt_gen | x-ai/grok-... | N | No | |
| prompt_revise | x-ai/grok-... | N | No | |
| judge | x-ai/grok-... | N | YES | Required |
Section 3: Image Message Format
Exact Python dict structure to use when sending image to judge:
{
    "role": "user",
    "content": [
        {"type": "text", "text": "...judge prompt..."},
        {"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,..."}}
    ]
}
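A small helper can produce this dict from raw image bytes. The sketch below uses only the standard library; the function name `build_judge_message` and the `mime_type` parameter are hypothetical, and the JPEG default simply mirrors the structure above.

```python
import base64


def build_judge_message(image_bytes, judge_prompt, mime_type="image/jpeg"):
    """Build a user message carrying the judge prompt and one inline image."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": judge_prompt},
            {
                "type": "image_url",
                "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
            },
        ],
    }
```

The caller would read the image file in binary mode and pass the bytes in, which keeps file access out of the message-building step.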
Section 4: Recommended Constants for config.py
OPENROUTER_BASE_URL = "https://openrouter.ai/api/v1"
OPENROUTER_MODEL_PROMPT_GEN = "x-ai/grok-..."
OPENROUTER_MODEL_PROMPT_REVISE = "x-ai/grok-..."
OPENROUTER_MODEL_JUDGE = "x-ai/grok-vision-..." # must support vision
OPENROUTER_TIMEOUT_TEXT = 30
OPENROUTER_TIMEOUT_VISION = 45
OPENROUTER_MAX_RETRIES = 2
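If these constants are loaded from the environment rather than hard-coded, a settings loader along the following lines is one option. It is a sketch under assumptions: only `OPENROUTER_API_KEY` and `OPENROUTER_BASE_URL` are named as ENV vars in the Context section, so reading the timeouts and retry count from the environment is hypothetical, as are the `OpenRouterSettings` and `load_settings` names. Model IDs are left out because they depend on the research results.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class OpenRouterSettings:
    api_key: str
    base_url: str
    timeout_text: float
    timeout_vision: float
    max_retries: int


def load_settings(env=None):
    """Read settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    api_key = env.get("OPENROUTER_API_KEY")
    if not api_key:
        raise RuntimeError("OPENROUTER_API_KEY is not set")
    return OpenRouterSettings(
        api_key=api_key,
        base_url=env.get("OPENROUTER_BASE_URL", "https://openrouter.ai/api/v1"),
        timeout_text=float(env.get("OPENROUTER_TIMEOUT_TEXT", 30)),
        timeout_vision=float(env.get("OPENROUTER_TIMEOUT_VISION", 45)),
        max_retries=int(env.get("OPENROUTER_MAX_RETRIES", 2)),
    )
```

Failing fast on a missing key surfaces configuration problems at startup instead of as a 401 on the first call.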
Notes
- Never log API key, response.text for large vision calls, or base64 image data
- Image for judge: read from data/sessions/<id>/iter_<n>.jpg, encode as base64
- The judge call is the only one that uses vision; wrap it separately in vision_completion()
- HTTP-Referer should be "http://localhost:5000" for local dev
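The logging rule in the first note can be backed by a redaction step applied before anything reaches a log line. The sketch below is a hypothetical helper, not part of the skill: it masks bearer tokens and inline base64 image data with regular expressions, and the patterns are assumptions that should be tightened to match the real key format.

```python
import re

# Inline image data URLs: keep the MIME prefix, drop the payload.
_DATA_URL = re.compile(r"(data:image/[\w.+-]+;base64,)[A-Za-z0-9+/=]+")
# Authorization header values of the form "Bearer <token>".
_BEARER = re.compile(r"Bearer\s+[A-Za-z0-9._\-]+")


def redact(text):
    """Return `text` with API tokens and base64 image payloads masked."""
    text = _DATA_URL.sub(r"\1<redacted>", text)
    return _BEARER.sub("Bearer <redacted>", text)
```

Calling `redact()` on request and response summaries before logging keeps both the key and multi-megabyte image payloads out of log files.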