Kanji Index Maintenance

Guide to maintaining the kanji index: ID assignment, updates, and troubleshooting. Covers the directory structure and common tasks.

Spar Skills Guide Bot
Data & AI · Intermediate
4002/06/2026
Claude Code
#kanji #index-maintenance #japanese-dictionary #data-automation

Our take

This skill provides guidelines for maintaining a kanji index that links dictionary entries to kanji characters, including ID assignment, directory structure, and troubleshooting.

Strengths

  • Structured, consistent ID format
  • Clear rebuild commands for different scenarios
  • Detailed troubleshooting steps for common problems
  • Systematic process for adding new kanji

Limitations

  • Requires manual knowledge of the on'yomi and kun'yomi readings for each kanji
  • Depends on a specific directory structure that may not carry over to other projects
  • Does not automatically handle kanji variants (old forms, simplified forms)

When to use it

Use this skill when you add new entries containing previously unseen kanji to a Japanese dictionary, or when you need to rebuild the kanji index.

When to avoid it

Avoid this skill when working with non-kanji scripts (hiragana, katakana) or when an automatic kanji assignment system is available.

Security analysis

Safe
Quality score: 92/100

The skill is a documentation guide for maintaining a kanji index feature in a dictionary project. It includes only legitimate build commands (python3 scripts) and no destructive, exfiltrating, or obfuscated instructions. There is no risk of running unsafe code.

No issues detected

Examples

Assign ID to new kanji
I have added entries with the new kanji '新'. Follow the kanji-index maintenance guidelines to assign it a kanji ID and update the index. First run --check-new, then determine readings and gloss for '新', update kanji_list.json, and rebuild.
Rebuild all kanji JSON
Run the rebuild-all command for the kanji index following the kanji-index skill.
Troubleshoot missing kanji page
The page for kanji '大' is missing. Use the troubleshooting steps from the kanji-index skill to diagnose and fix the issue.

name: kanji-index
description: Guidelines for maintaining the kanji index feature. Covers kanji ID assignment, index updates, and troubleshooting.

Kanji Index Maintenance

The kanji index allows users to click on any kanji in a dictionary headword to find all other entries containing that same kanji.

How It Works

  1. Headword kanji are linked to kanji index pages
  2. Kanji index pages list all entries containing that kanji
  3. Entry lists are sorted by reading (hiragana order)
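
The three steps above can be sketched in a few lines of Python. This is only an illustration, not the project's actual build code: the function name `build_kanji_index` and the two extra sample entries (`00010_hito`, `00030_ningen`) are hypothetical, and kanji are detected with a simple check against the basic CJK Unified Ideographs block. Sorting uses plain code-point order, which approximates gojuon order for hiragana readings.

```python
import re
from collections import defaultdict

# Basic CJK Unified Ideographs block; an assumption about what counts as "kanji"
KANJI_RE = re.compile(r"[\u4e00-\u9fff]")


def build_kanji_index(entries):
    """Map each kanji to the entries whose headword contains it, sorted by reading."""
    index = defaultdict(list)
    for entry in entries:
        # set() so an entry that repeats a kanji is listed only once
        for kanji in set(KANJI_RE.findall(entry["headword"])):
            index[kanji].append(entry)
    for kanji in index:
        # Code-point order of hiragana approximates gojuon order
        index[kanji].sort(key=lambda e: e["reading"])
    return dict(index)


entries = [
    {"id": "00010_hito", "headword": "{人|ひと}", "reading": "ひと"},
    {"id": "00030_ningen", "headword": "{人|にん}{間|げん}", "reading": "にんげん"},
    {"id": "01234_akunin", "headword": "{悪|あく}{人|にん}", "reading": "あくにん"},
]
index = build_kanji_index(entries)
print([e["id"] for e in index["人"]])
# ['01234_akunin', '00030_ningen', '00010_hito']
```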

Directory Structure

kanji/
├── kanji_list.json       # Master list: kanji → kanji_id mapping
├── kanji_extracted.json  # Temporary: extracted kanji needing IDs
├── 00001_jin_hito_person.json  # Entry list for 人
├── 00002_nichi_hi_day.json     # Entry list for 日
└── ...

docs/kanji/
├── 00001_jin_hito_person.html  # HTML page for 人
├── 00002_nichi_hi_day.html     # HTML page for 日
└── ...

Kanji ID Format

Format: {5-digit}_{onyomi}_{kunyomi}_{gloss}

  • 5-digit: Sequential number (00001, 00002, ...)
  • onyomi: Most common on'yomi in romaji (or "none")
  • kunyomi: Most common kun'yomi in romaji without okurigana (or "none")
  • gloss: Single English word for primary meaning

Examples

| Kanji | Kanji ID |
|-------|----------|
| 人 | 00001_jin_hito_person |
| 日 | 00002_nichi_hi_day |
| 大 | 00003_dai_oo_big |
| 畑 | 00004_none_hatake_field |
| 茶 | 00005_cha_none_tea |
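
As a rough illustration of the format, the helpers below compose and split a kanji ID. The names `build_kanji_id` and `parse_kanji_id` are hypothetical and are not part of the build scripts; the sketch assumes the gloss is a single word, as the format requires.

```python
def build_kanji_id(number, onyomi, kunyomi, gloss):
    """Compose {5-digit}_{onyomi}_{kunyomi}_{gloss}; a missing reading becomes "none"."""
    onyomi = onyomi or "none"
    kunyomi = kunyomi or "none"
    return f"{number:05d}_{onyomi}_{kunyomi}_{gloss}"


def parse_kanji_id(kanji_id):
    """Split a kanji ID back into its four parts."""
    number, onyomi, kunyomi, gloss = kanji_id.split("_", 3)
    return {"number": int(number), "onyomi": onyomi, "kunyomi": kunyomi, "gloss": gloss}


print(build_kanji_id(4, None, "hatake", "field"))  # 00004_none_hatake_field
print(parse_kanji_id("00005_cha_none_tea"))
```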

Romaji Rules

  • Long vowels: "ou" not "ō" (e.g., 高 → "kou")
  • Voiced: "ga", "za", "da", "ba" (e.g., 学 → "gaku")
  • No okurigana in kun'yomi (e.g., 高い → "taka", not "takai")
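
Some of these rules can be checked mechanically. The sketch below uses a hypothetical helper, `reading_problems`, that flags macrons and non-ASCII or uppercase characters in a reading. It cannot detect leftover okurigana (for example "takai" instead of "taka"), since that requires knowing the word itself.

```python
import re
import unicodedata

ASCII_READING = re.compile(r"[a-z]+")


def reading_problems(reading):
    """Return a list of rule violations for one romaji reading."""
    problems = []
    if reading == "none":  # placeholder for a missing reading
        return problems
    # NFD decomposition turns "ō" into "o" + combining macron (U+0304)
    if "\u0304" in unicodedata.normalize("NFD", reading):
        problems.append('macron found: write long vowels out, e.g. "kou"')
    if not ASCII_READING.fullmatch(reading):
        problems.append("use lowercase ASCII letters only")
    return problems


print(reading_problems("kou"))  # []
print(reading_problems("kō"))   # two problems: macron and non-ASCII
```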

Assigning New Kanji IDs

When new entries introduce kanji not in kanji_list.json:

  1. Detect new kanji:

    python3 build/update_kanji_index.py --check-new
    
  2. Assign readings and gloss using your knowledge:

    • Most common on'yomi
    • Most common kun'yomi (without okurigana)
    • Single-word English gloss
  3. Update kanji_list.json, adding the new entry under the top-level "kanji" key:

    {
      "新": {
        "kanji_id": "00123_shin_atara_new",
        "onyomi": "shin",
        "kunyomi": "atara",
        "gloss": "new"
      }
    }
    
  4. Rebuild:

    python3 build/build_flat.py
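
Step 3 can also be scripted. The following is a minimal sketch, assuming the file layout shown under File Formats and assuming the next sequential number is simply the highest existing number plus one. The helpers `next_kanji_number` and `add_kanji` are hypothetical. The sketch also refreshes `total_kanji`, although the build scripts may recompute that field themselves.

```python
def next_kanji_number(kanji_map):
    """Highest existing 5-digit prefix plus one (1 for an empty list)."""
    numbers = [int(info["kanji_id"].split("_", 1)[0]) for info in kanji_map.values()]
    return max(numbers, default=0) + 1


def add_kanji(kanji_list, kanji, onyomi, kunyomi, gloss):
    """Add one kanji to the parsed kanji_list.json structure and return its new ID."""
    kanji_map = kanji_list["kanji"]
    if kanji in kanji_map:
        raise ValueError(f"{kanji} already has ID {kanji_map[kanji]['kanji_id']}")
    number = next_kanji_number(kanji_map)
    kanji_map[kanji] = {
        "kanji_id": f"{number:05d}_{onyomi}_{kunyomi}_{gloss}",
        "onyomi": onyomi,
        "kunyomi": kunyomi,
        "gloss": gloss,
    }
    # Assumption: total_kanji mirrors the number of listed kanji
    kanji_list.setdefault("metadata", {})["total_kanji"] = len(kanji_map)
    return kanji_map[kanji]["kanji_id"]


kanji_list = {
    "metadata": {"total_kanji": 2},
    "kanji": {
        "人": {"kanji_id": "00001_jin_hito_person", "onyomi": "jin", "kunyomi": "hito", "gloss": "person"},
        "日": {"kanji_id": "00002_nichi_hi_day", "onyomi": "nichi", "kunyomi": "hi", "gloss": "day"},
    },
}
print(add_kanji(kanji_list, "新", "shin", "atara", "new"))  # 00003_shin_atara_new
```

To apply this to the real file, load it with `json.load`, call `add_kanji`, and write it back with `json.dump(..., ensure_ascii=False, indent=2)` so the kanji keys stay readable.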
    

Common Tasks

Check for New Kanji

python3 build/update_kanji_index.py --check-new

Rebuild All Kanji JSON Files

python3 build/update_kanji_index.py --rebuild-all

Rebuild Kanji HTML Pages

python3 build/build_kanji_html.py

Full Site Build (includes kanji)

python3 build/build_flat.py

Troubleshooting

"Warning: X kanji need IDs assigned"

New kanji were found in entries. Assign IDs manually:

  1. Run --check-new to see the full list
  2. For each kanji, determine on'yomi, kun'yomi, gloss
  3. Add to kanji/kanji_list.json
  4. Rebuild

Missing kanji index page

Check that:

  1. Kanji is in kanji/kanji_list.json
  2. JSON file exists: kanji/{kanji_id}.json
  3. HTML page has been rebuilt; if not, run python3 build/build_kanji_html.py
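
These checks can be run in order with a small script. This is a sketch only: `diagnose_missing_page` is a hypothetical helper, and it assumes the directory layout described under Directory Structure, with `root` pointing at the project root.

```python
import json
from pathlib import Path


def diagnose_missing_page(kanji, root="."):
    """Walk the checklist and report the first failing step, or "ok"."""
    root = Path(root)
    list_path = root / "kanji" / "kanji_list.json"
    data = json.loads(list_path.read_text(encoding="utf-8"))
    info = data.get("kanji", {}).get(kanji)
    if info is None:
        return f"{kanji} is not in kanji/kanji_list.json: assign an ID first"
    kanji_id = info["kanji_id"]
    if not (root / "kanji" / f"{kanji_id}.json").exists():
        return f"kanji/{kanji_id}.json is missing: run update_kanji_index.py --rebuild-all"
    if not (root / "docs" / "kanji" / f"{kanji_id}.html").exists():
        return f"docs/kanji/{kanji_id}.html is missing: run build_kanji_html.py"
    return "ok"
```

From the project root, `print(diagnose_missing_page("大"))` would report which of the three steps fails first.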

Kanji link not appearing in headword

Check that:

  1. Kanji is in kanji/kanji_list.json
  2. Entry HTML was rebuilt after kanji was added

Entry count wrong on kanji page

Rebuild the kanji JSON files, then regenerate the HTML pages:

python3 build/update_kanji_index.py --rebuild-all
python3 build/build_kanji_html.py
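
After rebuilding, a quick consistency check can confirm that each file's declared count matches its entry list. The sketch below assumes `entry_count` is meant to equal the length of `entries`, as the example under File Formats suggests. `find_count_mismatches` is a hypothetical helper, and it only checks internal consistency, not whether the entry list itself is stale.

```python
import json
from pathlib import Path


def find_count_mismatches(kanji_dir="kanji"):
    """Return (file name, declared count, actual count) for each inconsistent file."""
    mismatches = []
    # Per-kanji files start with the 5-digit number; this skips kanji_list.json
    for path in sorted(Path(kanji_dir).glob("[0-9]*_*.json")):
        data = json.loads(path.read_text(encoding="utf-8"))
        declared = data["metadata"]["entry_count"]
        actual = len(data["entries"])
        if declared != actual:
            mismatches.append((path.name, declared, actual))
    return mismatches
```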

File Formats

kanji_list.json

{
  "metadata": {
    "description": "Index mapping kanji characters to their kanji index IDs",
    "generated": "2026-01-22T10:30:00Z",
    "total_kanji": 1500
  },
  "kanji": {
    "人": {
      "kanji_id": "00001_jin_hito_person",
      "onyomi": "jin",
      "kunyomi": "hito",
      "gloss": "person"
    }
  }
}
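
Because each kanji ID repeats the `onyomi`, `kunyomi`, and `gloss` fields, hand edits can drift out of sync. The sketch below shows one way to catch that, assuming the structure above. `check_kanji_list` is a hypothetical helper, and the sketch assumes `total_kanji` should equal the number of listed kanji.

```python
def check_kanji_list(kanji_list):
    """Return a list of human-readable inconsistencies in a parsed kanji_list.json."""
    problems = []
    kanji_map = kanji_list["kanji"]
    declared = kanji_list["metadata"].get("total_kanji")
    if declared != len(kanji_map):
        problems.append(f"total_kanji is {declared}, but {len(kanji_map)} kanji are listed")
    seen = {}  # 5-digit prefix -> kanji that first used it
    for kanji, info in kanji_map.items():
        number, _, rest = info["kanji_id"].partition("_")
        expected = f"{info['onyomi']}_{info['kunyomi']}_{info['gloss']}"
        if rest != expected:
            problems.append(f"{kanji}: kanji_id suffix '{rest}' does not match fields '{expected}'")
        if number in seen:
            problems.append(f"{kanji}: number {number} is already used by {seen[number]}")
        seen[number] = kanji
    return problems


sample = {
    "metadata": {"total_kanji": 1},
    "kanji": {
        "人": {"kanji_id": "00001_jin_hito_person", "onyomi": "jin", "kunyomi": "hito", "gloss": "person"},
    },
}
print(check_kanji_list(sample))  # []
```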

Individual kanji JSON

{
  "metadata": {
    "kanji": "人",
    "kanji_id": "00001_jin_hito_person",
    "onyomi": "jin",
    "kunyomi": "hito",
    "gloss": "person",
    "entry_count": 245,
    "generated": "2026-01-22T10:30:00Z"
  },
  "entries": [
    {
      "id": "01234_akunin",
      "headword": "{悪|あく}{人|にん}",
      "reading": "あくにん",
      "gloss": "villain, bad person"
    }
  ]
}

Design Decisions

Why invisible links?

  • Preserves clean headword appearance
  • Users discover feature through tooltip
  • No visual clutter

Why romaji in kanji IDs?

  • ASCII-safe file names
  • Human-readable
  • Easy to search and sort

Why sort by reading?

  • Natural Japanese ordering (gojuon)
  • Consistent with how dictionaries organize entries
  • Helps users find related words