Kanji Index Maintenance

VerifiedSafe

Guidelines for maintaining the kanji index with ID assignment, updates, and troubleshooting.

Sby Skills Guide Bot
DevelopmentIntermediate
306/2/2026
Claude CodeCursor
#kanji-index#dictionary#japanese#index-maintenance#site-building

Recommended for

Our review

Guidelines for maintaining a kanji index feature in a Japanese dictionary, including ID assignment, index updates, and troubleshooting.

Strengths

  • Clear step-by-step procedures
  • Concrete examples of kanji IDs
  • Diagnostic tools (--check-new) and rebuild scripts
  • Well-documented file formats

Limitations

  • Requires knowledge of on'yomi and kun'yomi readings
  • Manual ID assignment
  • Can be tedious for many new kanji
When to use it

When you need to add new kanji to the index or rebuild the index after changes.

When not to use it

If you use an automated kanji management system or are only dealing with entries containing no new kanji.

Security analysis

Safe
Quality score87/100

The skill contains only maintenance guidelines for a local dictionary project, using pre-existing build scripts and no network operations, destructive commands, or data exfiltration.

No concerns found

Examples

Add new kanji
I need to add the kanji '新' to the kanji index. Assign a kanji ID based on the existing convention (onyomi: shin, kunyomi: atara, gloss: new). Update the kanji_list.json and rebuild the site.
Check for new kanji
Run the check for new kanji in the dictionary entries and report any that need IDs assigned.
Rebuild all kanji pages
Rebuild all kanji JSON files and HTML pages after updating the entry list. Use the rebuild-all and build-kanji-html commands.

name: kanji-index description: Guidelines for maintaining the kanji index feature. Covers kanji ID assignment, index updates, and troubleshooting.

Kanji Index Maintenance

The kanji index allows users to click on any kanji in a dictionary headword to find all other entries containing that same kanji.

How It Works

  1. Headword kanji are linked to kanji index pages
  2. Kanji index pages list all entries containing that kanji
  3. Entry lists are sorted by reading (hiragana order)

Directory Structure

kanji/
├── kanji_list.json       # Master list: kanji → kanji_id mapping
├── kanji_extracted.json  # Temporary: extracted kanji needing IDs
├── 00001_jin_hito_person.json  # Entry list for 人
├── 00002_nichi_hi_day.json     # Entry list for 日
└── ...

docs/kanji/
├── 00001_jin_hito_person.html  # HTML page for 人
├── 00002_nichi_hi_day.html     # HTML page for 日
└── ...

Kanji ID Format

Format: {5-digit}_{onyomi}_{kunyomi}_{gloss}

  • 5-digit: Sequential number (00001, 00002, ...)
  • onyomi: Most common on'yomi in romaji (or "none")
  • kunyomi: Most common kun'yomi in romaji without okurigana (or "none")
  • gloss: Single English word for primary meaning

Examples

| Kanji | Kanji ID | |-------|----------| | 人 | 00001_jin_hito_person | | 日 | 00002_nichi_hi_day | | 大 | 00003_dai_oo_big | | 畑 | 00004_none_hatake_field | | 茶 | 00005_cha_none_tea |

Romaji Rules

  • Long vowels: "ou" not "ō" (e.g., 高 → "kou")
  • Voiced: "ga", "za", "da", "ba" (e.g., 学 → "gaku")
  • No okurigana in kun'yomi (e.g., 高い → "taka", not "takai")

Assigning New Kanji IDs

When new entries introduce kanji not in kanji_list.json:

  1. Detect new kanji:

    python3 build/update_kanji_index.py --check-new
    
  2. Assign readings and gloss using your knowledge:

    • Most common on'yomi
    • Most common kun'yomi (without okurigana)
    • Single-word English gloss
  3. Update kanji_list.json:

    {
      "新": {
        "kanji_id": "00123_shin_atara_new",
        "onyomi": "shin",
        "kunyomi": "atara",
        "gloss": "new"
      }
    }
    
  4. Rebuild:

    python3 build/build_flat.py
    

Common Tasks

Check for New Kanji

python3 build/update_kanji_index.py --check-new

Rebuild All Kanji JSON Files

python3 build/update_kanji_index.py --rebuild-all

Rebuild Kanji HTML Pages

python3 build/build_kanji_html.py

Full Site Build (includes kanji)

python3 build/build_flat.py

Troubleshooting

"Warning: X kanji need IDs assigned"

New kanji were found in entries. Assign IDs manually:

  1. Run --check-new to see the full list
  2. For each kanji, determine on'yomi, kun'yomi, gloss
  3. Add to kanji/kanji_list.json
  4. Rebuild

Missing kanji index page

Check that:

  1. Kanji is in kanji/kanji_list.json
  2. JSON file exists: kanji/{kanji_id}.json
  3. Run python3 build/build_kanji_html.py

Kanji link not appearing in headword

Check that:

  1. Kanji is in kanji/kanji_list.json
  2. Entry HTML was rebuilt after kanji was added

Entry count wrong on kanji page

Rebuild the kanji JSON file:

python3 build/update_kanji_index.py --rebuild-all
python3 build/build_kanji_html.py

File Formats

kanji_list.json

{
  "metadata": {
    "description": "Index mapping kanji characters to their kanji index IDs",
    "generated": "2026-01-22T10:30:00Z",
    "total_kanji": 1500
  },
  "kanji": {
    "人": {
      "kanji_id": "00001_jin_hito_person",
      "onyomi": "jin",
      "kunyomi": "hito",
      "gloss": "person"
    }
  }
}

Individual kanji JSON

{
  "metadata": {
    "kanji": "人",
    "kanji_id": "00001_jin_hito_person",
    "onyomi": "jin",
    "kunyomi": "hito",
    "gloss": "person",
    "entry_count": 245,
    "generated": "2026-01-22T10:30:00Z"
  },
  "entries": [
    {
      "id": "01234_akunin",
      "headword": "{悪|あく}{人|にん}",
      "reading": "あくにん",
      "gloss": "villain, bad person"
    }
  ]
}

Design Decisions

Why invisible links?

  • Preserves clean headword appearance
  • Users discover feature through tooltip
  • No visual clutter

Why romaji in kanji IDs?

  • ASCII-safe file names
  • Human-readable
  • Easy to search and sort

Why sort by reading?

  • Natural Japanese ordering (gojuon)
  • Consistent with how dictionaries organize entries
  • Helps users find related words
Related skills