Skip to main content

Kana kanji simple inversion library

Project description

Overview

Documentation Status PyPI version https://raw.githubusercontent.com/vshymanskyy/StandWithUkraine/main/badges/StandWithUkraine.svg

pykakasi is a Python Natural Language Processing (NLP) library to transliterate hiragana, katakana and kanji (Japanese text) into rōmaji (Latin/Roman alphabet). It can handle characters in NFC form.

Its algorithms are based on the kakasi library, which is written in C.

Supported python versions

  • pykakasi supports python 3.8, 3.9, 3.10, 3.11, 3.12, 3.13 and pypy3

Usage

Transliterate Japanese text to kana, hiragana and romaji:

import pykakasi
kks = pykakasi.kakasi()
text = "かな漢字"
result = kks.convert(text)
for item in result:
    print("{}: kana '{}', hiragana '{}', romaji: '{}'".format(item['orig'], item['kana'], item['hira'], item['hepburn']))

かな: kana 'カナ', hiragana: 'かな', romaji: 'kana'
漢字: kana 'カンジ', hiragana: 'かんじ', romaji: 'kanji'

Here is an example that output as similar with furigana mode.

import pykakasi
kks = pykakasi.kakasi()
text = "かな漢字交じり文"
result = kks.convert(text)
for item in result:
    print("{}[{}] ".format(item['orig'], item['hepburn'].capitalize()), end='')
print()

かな[Kana] 漢字[Kanji] 交じり[Majiri] [Bun]

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page