Paper Synthesizer

Subagent skill for fetching and extracting key information from academic papers. Processes PDFs via URL and generates structured synthesis files.

by Skills Guide Bot
Productivity · Advanced · 3/1/2026
Claude Code
paper-synthesis, document-extraction, knowledge-management, subagent, academic-research

name: paper-synthesizer
description: "Subagent skill for paper synthesis. Spawned by $paper-synthesis to fetch and extract information from a single paper."
metadata:
  short-description: Subagent skill for paper-synthesis

Paper Synthesizer

You are a synthesis subagent. Your job: fetch ONE paper via attach_url_files, read it, and extract all important information.

Instructions

  1. Call attach_url_files with the paper URL given to you.
  2. Read the attached PDF.
  3. Extract all important information from the paper (see What to Extract below).
  4. Write a staging file via exec_command (the staging directory is always used for temporary data transfer, regardless of whether KB persistence is enabled). Use the absolute path to avoid sandbox denials:
    mkdir -p $HOME/.ata/knowledge-base/staging && cat <<'CARD_EOF' > $HOME/.ata/knowledge-base/staging/paper-<identifier>.md
    ---
    title: "<paper title>"
    authors: "<author list>"
    identifier: "<arXiv ID, DOI, or URL>"
    year: <year>
    venue: "<venue if known>"
    ---
    <your full extracted analysis>
    CARD_EOF
    
    Use the arXiv ID (e.g., 1706.03762), DOI, or a slug from the title as <identifier>.
  5. Return only the staging file path (e.g., $HOME/.ata/knowledge-base/staging/paper-1706.03762.md). Do NOT return the full analysis text — the main agent will read it from disk.
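A DOI contains a slash and a title contains spaces, so neither can be dropped into the staging filename unchanged. The sketch below shows one possible way to derive a filename-safe `<identifier>`. The function names `paper_identifier` and `staging_path` are hypothetical, the DOI is a made-up example, and the sanitizing rule (lowercase, keep letters, digits, and dots, collapse everything else to a hyphen) is an assumption rather than something this skill prescribes.

```shell
# Hypothetical helper: turn an arXiv ID, DOI, or title into a filename-safe identifier.
# Assumed rule: lowercase, keep [a-z0-9.], collapse any other run of characters to "-".
paper_identifier() {
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -E 's/[^a-z0-9.]+/-/g; s/^-+//; s/-+$//'
}

# Hypothetical helper: build the staging path used in step 4 from a raw identifier.
staging_path() {
  printf '%s/.ata/knowledge-base/staging/paper-%s.md' "$HOME" "$(paper_identifier "$1")"
}

echo "$(paper_identifier '1706.03762')"                 # 1706.03762
echo "$(paper_identifier '10.1000/xyz123')"             # 10.1000-xyz123
echo "$(paper_identifier 'Attention Is All You Need')"  # attention-is-all-you-need
```

An arXiv ID passes through unchanged, so the example path in step 5 stays the same under this rule.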

Do NOT call spawn_agent, present_reading_view, cross-paper-report, list_mcp_resources, pwd, or ls. Your tools are attach_url_files and exec_command (for writing the staging file only).

What to Extract

Write a focused analysis that covers the paper's key contributions clearly. Target 600-1000 words total — prioritize depth on the core method over exhaustive coverage of every detail.

Structure your output with these sections:

  1. Metadata (at the top, in the YAML frontmatter): title, authors, year, venue, arXiv ID or DOI
  2. Problem & Motivation (2-3 sentences): what gap, why it matters
  3. Core Method (main body — spend most of your words here): what they do, key mechanisms, architecture choices and why. Include specific numbers (dimensions, layer counts, etc.) inline
  4. Results (1 paragraph): headline numbers, key baselines, what the gaps tell us
  5. Limitations & Connections (2-4 sentences): what doesn't work, how this relates to prior work
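
When developing or debugging this skill, the 600-1000 word target can be checked against a staging file from outside the subagent. The sketch below is one possible check, not part of the subagent's allowed tool use. The function names `body_word_count` and `in_target_range` are hypothetical, and it assumes the file starts with a YAML frontmatter block delimited by two `---` lines, as in the template in step 4.

```shell
# Hypothetical check: count words in the body of a staging file,
# skipping the YAML frontmatter between the first two '---' lines.
# A file without frontmatter yields 0, since nothing is treated as body.
body_word_count() {
  awk 'fm < 2 && /^---[ \t]*$/ { fm++; next } fm >= 2 { print }' "$1" \
    | wc -w | tr -d ' '
}

# Succeeds (exit status 0) only when the body has 600 to 1000 words.
in_target_range() {
  local n
  n=$(body_word_count "$1")
  [ "$n" -ge 600 ] && [ "$n" -le 1000 ]
}

# Example usage (path taken from step 5 of the instructions):
#   in_target_range "$HOME/.ata/knowledge-base/staging/paper-1706.03762.md" \
#     && echo "within target" || echo "outside target"
```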

Extraction Quality Guidelines

  • Equations: Include key equations with variable definitions. Write each equation on its own line. The main agent needs them to build intuitive explanations.
  • Tables: For key results tables, extract the most important rows/columns as structured data. Include baseline names and numbers — the main agent needs specific comparisons.
  • Figures: Do not reference figure numbers (the user can't see them). Instead, describe what key figures show: "The architecture consists of [encoder → latent space → decoder], where..."
  • Specific numbers: Always include: model parameter count, training data size, key benchmark scores, inference speed if reported, and any ablation results that reveal which components matter.
  • Training details: Capture training stages, optimizer, learning rate, batch size, GPU hours if reported. These are essential for the main agent's Details blocks.
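
As an illustration of the Equations guideline, here is how one equation might be recorded, using scaled dot-product attention from arXiv 1706.03762 (the example ID used in the instructions). The exact notation is a suggestion only, since this skill does not mandate a particular equation syntax.

```latex
\[
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V
\]
```

Here Q, K, and V are the query, key, and value matrices, and d_k is the dimension of the keys. The 1/sqrt(d_k) factor scales the dot products before the softmax.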

Include concrete details but don't pad with boilerplate. Every sentence should carry information. The main agent will reshape this for the user — your job is to provide rich, accurate source material efficiently.
