Kanji Index Maintenance


Provides guidelines for maintaining the kanji index feature in a dictionary. Covers assigning kanji IDs using a format of sequential number, on'yomi, kun'yomi, and gloss, as well as rebuilding index files and troubleshooting missing links or incorrect entry counts.

By Skills Guide Bot
Development, Intermediate
1206/2/2026
Claude Code, Cursor, Windsurf, Copilot, Codex
#kanji #index #dictionary #maintenance #japanese

Our review

This skill provides guidelines for maintaining a kanji index feature, including assigning IDs to new kanji, updating JSON files, and troubleshooting common issues.

Strengths

  • Provides clear step-by-step procedures for adding new kanji to the index.
  • Includes detailed troubleshooting for common problems like missing pages or incorrect entry counts.
  • Defines consistent naming conventions for kanji IDs (onyomi, kunyomi, gloss).
  • Partially automates detection of new kanji via Python scripts.

Limitations

  • Requires knowledge of Japanese on'yomi and kun'yomi readings to assign IDs.
  • Assumes a specific directory structure and build scripts specific to the project.
  • Does not cover automatic updating of individual entry files when a kanji is added.

When to use it

Use this skill when you need to add new kanji to a dictionary index or fix issues with existing kanji index pages.

When not to use it

Do not use this skill for general dictionary editing or for tasks unrelated to the kanji index feature.

Security analysis

Safe
Quality score: 95/100

The skill is purely documentation for maintaining a kanji index in a dictionary project. It describes running local Python scripts for data processing but contains no destructive commands, exfiltration, or obfuscation. No external network access or dangerous system operations are involved.

No concerns found

Examples

Add a new kanji to the index
I have a new kanji '学' that appears in dictionary entries. Assign it a kanji ID based on the guidelines (onyomi: 'gaku', kunyomi: 'mana', gloss: 'study') and update the kanji_list.json file.

Check for missing kanji
Run the check for new kanji and list any kanji that need IDs assigned.

Rebuild all kanji pages
Rebuild all kanji JSON files and HTML pages after adding several new kanji to the index.

name: kanji-index
description: Guidelines for maintaining the kanji index feature. Covers kanji ID assignment, index updates, and troubleshooting.

Kanji Index Maintenance

The kanji index allows users to click on any kanji in a dictionary headword to find all other entries containing that same kanji.

How It Works

  1. Headword kanji are linked to kanji index pages
  2. Kanji index pages list all entries containing that kanji
  3. Entry lists are sorted by reading (hiragana order)
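
The three steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not the project's build code: the `build_kanji_index` helper, the sample entries, and the choice of the basic CJK Unified Ideographs block as the definition of "kanji" are all assumptions. Sorting by code point approximates hiragana order for plain hiragana readings.

```python
import re
from collections import defaultdict

# Assumption: "kanji" means the basic CJK Unified Ideographs block.
KANJI_RE = re.compile(r"[\u4e00-\u9fff]")

def build_kanji_index(entries):
    """Map each kanji to the entries whose headword contains it, sorted by reading."""
    index = defaultdict(list)
    for entry in entries:
        # set() so an entry that repeats a kanji is listed once for that kanji
        for kanji in set(KANJI_RE.findall(entry["headword"])):
            index[kanji].append(entry)
    for kanji in index:
        # Code-point order of hiragana approximates gojuon order
        index[kanji].sort(key=lambda e: e["reading"])
    return dict(index)

# Hypothetical sample entries
entries = [
    {"headword": "人々", "reading": "ひとびと"},
    {"headword": "悪人", "reading": "あくにん"},
    {"headword": "日本人", "reading": "にほんじん"},
]
index = build_kanji_index(entries)
print([e["reading"] for e in index["人"]])  # ['あくにん', 'にほんじん', 'ひとびと']
```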

Directory Structure

kanji/
├── kanji_list.json       # Master list: kanji → kanji_id mapping
├── kanji_extracted.json  # Temporary: extracted kanji needing IDs
├── 00001_jin_hito_person.json  # Entry list for 人
├── 00002_nichi_hi_day.json     # Entry list for 日
└── ...

docs/kanji/
├── 00001_jin_hito_person.html  # HTML page for 人
├── 00002_nichi_hi_day.html     # HTML page for 日
└── ...

Kanji ID Format

Format: {5-digit}_{onyomi}_{kunyomi}_{gloss}

  • 5-digit: Sequential number (00001, 00002, ...)
  • onyomi: Most common on'yomi in romaji (or "none")
  • kunyomi: Most common kun'yomi in romaji without okurigana (or "none")
  • gloss: Single English word for primary meaning

Examples

| Kanji | Kanji ID |
|-------|----------|
| 人 | 00001_jin_hito_person |
| 日 | 00002_nichi_hi_day |
| 大 | 00003_dai_oo_big |
| 畑 | 00004_none_hatake_field |
| 茶 | 00005_cha_none_tea |
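
As a minimal sketch of the format, the following helpers build and split an ID. The function names are hypothetical and not part of the project's scripts. Splitting assumes that no part contains an underscore, which holds for every example above.

```python
def format_kanji_id(number, onyomi, kunyomi, gloss):
    """Build an ID such as 00001_jin_hito_person; a missing reading becomes "none"."""
    return f"{number:05d}_{onyomi or 'none'}_{kunyomi or 'none'}_{gloss}"

def parse_kanji_id(kanji_id):
    """Split an ID into (number, onyomi, kunyomi, gloss); assumes no underscores inside a part."""
    number, onyomi, kunyomi, gloss = kanji_id.split("_")
    return int(number), onyomi, kunyomi, gloss

print(format_kanji_id(4, None, "hatake", "field"))  # 00004_none_hatake_field
print(parse_kanji_id("00005_cha_none_tea"))         # (5, 'cha', 'none', 'tea')
```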

Romaji Rules

  • Long vowels: "ou" not "ō" (e.g., 高 → "kou")
  • Voiced: "ga", "za", "da", "ba" (e.g., 学 → "gaku")
  • No okurigana in kun'yomi (e.g., 高い → "taka", not "takai")
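
Part of these rules can be checked mechanically. The sketch below uses a hypothetical `romaji_problems` helper to flag any ID part that is not plain lowercase ASCII letters, which catches macrons such as "ō", capitals, and multi-word glosses. It cannot tell whether okurigana was left in a kun'yomi; that still needs a human check.

```python
import re

# Lowercase ASCII letters only: rules out macrons, capitals, spaces, and digits.
ROMAJI_RE = re.compile(r"[a-z]+")

def romaji_problems(onyomi, kunyomi, gloss):
    """Return a list of messages for parts that break the ASCII-lowercase rule."""
    problems = []
    for name, value in (("onyomi", onyomi), ("kunyomi", kunyomi), ("gloss", gloss)):
        if not ROMAJI_RE.fullmatch(value):
            problems.append(f"{name} {value!r} is not lowercase ASCII letters")
    return problems

print(romaji_problems("kou", "taka", "high"))  # []
print(romaji_problems("kō", "taka", "high"))   # one problem, for onyomi
```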

Assigning New Kanji IDs

When new entries introduce kanji not in kanji_list.json:

  1. Detect new kanji:

    python3 build/update_kanji_index.py --check-new
    
  2. Assign readings and gloss using your knowledge:

    • Most common on'yomi
    • Most common kun'yomi (without okurigana)
    • Single-word English gloss
  3. Update kanji_list.json, adding the new entry under the top-level "kanji" key (see File Formats):

    {
      "新": {
        "kanji_id": "00123_shin_atara_new",
        "onyomi": "shin",
        "kunyomi": "atara",
        "gloss": "new"
      }
    }
    
  4. Rebuild:

    python3 build/build_flat.py
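
Step 3 can be sketched as follows. The helper names are hypothetical, and the sketch assumes that new numbers continue from the highest number already in use. Here `kanji_map` stands for the mapping stored under the "kanji" key of kanji_list.json (see File Formats).

```python
def next_kanji_number(kanji_map):
    """Return one more than the highest sequential number in use (1 if empty)."""
    numbers = [int(info["kanji_id"].split("_")[0]) for info in kanji_map.values()]
    return max(numbers, default=0) + 1

def add_kanji(kanji_map, kanji, onyomi, kunyomi, gloss):
    """Add a new kanji in place and return its ID; a kanji that already has an ID is rejected."""
    if kanji in kanji_map:
        raise ValueError(f"{kanji} already has ID {kanji_map[kanji]['kanji_id']}")
    kanji_id = f"{next_kanji_number(kanji_map):05d}_{onyomi}_{kunyomi}_{gloss}"
    kanji_map[kanji] = {
        "kanji_id": kanji_id,
        "onyomi": onyomi,
        "kunyomi": kunyomi,
        "gloss": gloss,
    }
    return kanji_id

kanji_map = {
    "人": {"kanji_id": "00001_jin_hito_person", "onyomi": "jin", "kunyomi": "hito", "gloss": "person"},
    "日": {"kanji_id": "00002_nichi_hi_day", "onyomi": "nichi", "kunyomi": "hi", "gloss": "day"},
}
print(add_kanji(kanji_map, "新", "shin", "atara", "new"))  # 00003_shin_atara_new
```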
    

Common Tasks

Check for New Kanji

python3 build/update_kanji_index.py --check-new

Rebuild All Kanji JSON Files

python3 build/update_kanji_index.py --rebuild-all

Rebuild Kanji HTML Pages

python3 build/build_kanji_html.py

Full Site Build (includes kanji)

python3 build/build_flat.py

Troubleshooting

"Warning: X kanji need IDs assigned"

New kanji were found in entries. Assign IDs manually:

  1. Run python3 build/update_kanji_index.py --check-new to see the full list
  2. For each kanji, determine on'yomi, kun'yomi, gloss
  3. Add to kanji/kanji_list.json
  4. Rebuild
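
For intuition, the detection behind this warning amounts to a set difference between the kanji seen in headwords and the kanji already listed. The sketch below is not the logic of update_kanji_index.py, which is not shown here. The `find_new_kanji` name and the basic CJK block as the definition of "kanji" are assumptions.

```python
import re

# Assumption: "kanji" means the basic CJK Unified Ideographs block.
KANJI_RE = re.compile(r"[\u4e00-\u9fff]")

def find_new_kanji(headwords, kanji_map):
    """Return kanji that occur in headwords but have no entry in kanji_map."""
    seen = set()
    for headword in headwords:
        seen.update(KANJI_RE.findall(headword))
    return sorted(seen - set(kanji_map))

headwords = ["{悪|あく}{人|にん}", "{新|あたら}しい"]
kanji_map = {"人": {"kanji_id": "00001_jin_hito_person"}}
print(find_new_kanji(headwords, kanji_map))  # ['悪', '新']
```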

Missing kanji index page

Check that:

  1. Kanji is in kanji/kanji_list.json
  2. JSON file exists: kanji/{kanji_id}.json
  3. HTML page exists: docs/kanji/{kanji_id}.html (if not, run python3 build/build_kanji_html.py)
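
These file checks can be scripted. The sketch below assumes the layout from Directory Structure (kanji/ and docs/kanji/ under one project root) and the kanji_list.json shape from File Formats. `missing_kanji_files` is a hypothetical helper, demonstrated against a temporary directory rather than a real checkout.

```python
import json
import tempfile
from pathlib import Path

def missing_kanji_files(root):
    """Return (kanji, relative path) pairs for expected JSON/HTML files that do not exist."""
    root = Path(root)
    with open(root / "kanji" / "kanji_list.json", encoding="utf-8") as f:
        kanji_map = json.load(f)["kanji"]
    missing = []
    for kanji, info in kanji_map.items():
        expected = (
            root / "kanji" / f"{info['kanji_id']}.json",
            root / "docs" / "kanji" / f"{info['kanji_id']}.html",
        )
        for path in expected:
            if not path.exists():
                missing.append((kanji, path.relative_to(root).as_posix()))
    return missing

# Demo: the JSON file exists but the HTML page has not been built yet.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "kanji").mkdir()
    (root / "docs" / "kanji").mkdir(parents=True)
    kanji_list = {"kanji": {"人": {"kanji_id": "00001_jin_hito_person"}}}
    (root / "kanji" / "kanji_list.json").write_text(json.dumps(kanji_list), encoding="utf-8")
    (root / "kanji" / "00001_jin_hito_person.json").write_text("{}", encoding="utf-8")
    missing = missing_kanji_files(root)

print(missing)  # [('人', 'docs/kanji/00001_jin_hito_person.html')]
```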

Kanji link not appearing in headword

Check that:

  1. Kanji is in kanji/kanji_list.json
  2. Entry HTML was rebuilt after kanji was added

Entry count wrong on kanji page

Rebuild the kanji JSON files, then regenerate the HTML pages:

python3 build/update_kanji_index.py --rebuild-all
python3 build/build_kanji_html.py
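
To confirm whether a stored count is stale before or after rebuilding, the recorded entry_count can be compared with the actual number of entries. This sketch uses the individual kanji JSON shape from File Formats. `entry_count_mismatch` is a hypothetical helper, and the sample data is deliberately inconsistent.

```python
def entry_count_mismatch(kanji_data):
    """Return (recorded, actual) if entry_count disagrees with the entries list, else None."""
    recorded = kanji_data["metadata"]["entry_count"]
    actual = len(kanji_data["entries"])
    return None if recorded == actual else (recorded, actual)

# Deliberately inconsistent sample: 245 recorded, one entry present.
stale = {
    "metadata": {"kanji": "人", "entry_count": 245},
    "entries": [{"id": "01234_akunin", "reading": "あくにん"}],
}
print(entry_count_mismatch(stale))  # (245, 1)
```

In practice the input would come from json.load on a file such as kanji/00001_jin_hito_person.json.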

File Formats

kanji_list.json

{
  "metadata": {
    "description": "Index mapping kanji characters to their kanji index IDs",
    "generated": "2026-01-22T10:30:00Z",
    "total_kanji": 1500
  },
  "kanji": {
    "人": {
      "kanji_id": "00001_jin_hito_person",
      "onyomi": "jin",
      "kunyomi": "hito",
      "gloss": "person"
    }
  }
}
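
A small consistency check over this structure might look like the sketch below. It rests on three assumptions that the source does not state: total_kanji equals the number of listed kanji, each kanji_id ends with its own onyomi, kunyomi, and gloss fields, and sequential numbers are never reused. `kanji_list_problems` is a hypothetical helper.

```python
def kanji_list_problems(data):
    """Return human-readable problems found in a parsed kanji_list.json."""
    problems = []
    kanji_map = data["kanji"]
    total = data["metadata"]["total_kanji"]
    if total != len(kanji_map):
        problems.append(f"total_kanji is {total} but {len(kanji_map)} kanji are listed")
    owner_by_number = {}
    for kanji, info in kanji_map.items():
        number, _, rest = info["kanji_id"].partition("_")
        expected = f"{info['onyomi']}_{info['kunyomi']}_{info['gloss']}"
        if rest != expected:
            problems.append(f"{kanji}: ID {info['kanji_id']} does not match its fields")
        if number in owner_by_number:
            problems.append(f"{kanji}: number {number} already used by {owner_by_number[number]}")
        else:
            owner_by_number[number] = kanji
    return problems

# Sample with a reused sequential number
data = {
    "metadata": {"total_kanji": 2},
    "kanji": {
        "人": {"kanji_id": "00001_jin_hito_person", "onyomi": "jin", "kunyomi": "hito", "gloss": "person"},
        "日": {"kanji_id": "00001_nichi_hi_day", "onyomi": "nichi", "kunyomi": "hi", "gloss": "day"},
    },
}
print(kanji_list_problems(data))  # ['日: number 00001 already used by 人']
```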

Individual kanji JSON

{
  "metadata": {
    "kanji": "人",
    "kanji_id": "00001_jin_hito_person",
    "onyomi": "jin",
    "kunyomi": "hito",
    "gloss": "person",
    "entry_count": 245,
    "generated": "2026-01-22T10:30:00Z"
  },
  "entries": [
    {
      "id": "01234_akunin",
      "headword": "{悪|あく}{人|にん}",
      "reading": "あくにん",
      "gloss": "villain, bad person"
    }
  ]
}
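
The headword field appears to use a {base|reading} markup for furigana. That reading of the format is inferred from the single example above, so treat the following as a sketch under that assumption. The hypothetical `split_headword` helper recovers the plain headword and its reading, passing text outside braces through unchanged.

```python
import re

# Assumed markup: {base|reading}, with no nesting.
RUBY_RE = re.compile(r"\{([^|{}]+)\|([^|{}]+)\}")

def split_headword(headword):
    """Return (plain text, reading) for markup such as {悪|あく}{人|にん}."""
    plain = RUBY_RE.sub(r"\1", headword)    # keep the base text of each group
    reading = RUBY_RE.sub(r"\2", headword)  # keep the reading of each group
    return plain, reading

print(split_headword("{悪|あく}{人|にん}"))  # ('悪人', 'あくにん')
print(split_headword("{新|あたら}しい"))     # ('新しい', 'あたらしい')
```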

Design Decisions

Why invisible links?

  • Preserves clean headword appearance
  • Users discover feature through tooltip
  • No visual clutter

Why romaji in kanji IDs?

  • ASCII-safe file names
  • Human-readable
  • Easy to search and sort

Why sort by reading?

  • Natural Japanese ordering (gojuon)
  • Consistent with how dictionaries organize entries
  • Helps users find related words