chinormfilter

Filters synonym files written in Lucene format, removing entries that duplicate what Sudachi normalization already does. Mainly useful when migrating to the Sudachi analyzer.

Usage

$ chinormfilter tests/test.txt -o out.txt

The result is shown below: the input synonyms first, then the output after filtering.

レナリドミド,レナリドマイド
リンゴ => 林檎
飲む,呑む
tlc => tlc,全肺気量
リンたんぱく質,リン蛋白質,リンタンパク質

↓ filter

レナリドミド,レナリドマイド
tlc => tlc,全肺気量
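
The filtering shown above can be sketched roughly as follows. This is an illustrative sketch, not chinormfilter's actual implementation: it assumes a line is redundant when every term on it normalizes to the same form, and it replaces Sudachi with a small hand-written lookup table (`TOY_NORMALIZATION`, hypothetical) so the example runs without any dictionary installed. The function names are likewise made up for illustration.

```python
import re

# Hypothetical stand-in for Sudachi normalization. These mappings are written
# by hand for this example; a real run would ask Sudachi for normalized forms.
TOY_NORMALIZATION = {
    "リンゴ": "林檎",
    "呑む": "飲む",
    "リンたんぱく質": "リン蛋白質",
    "リンタンパク質": "リン蛋白質",
}


def normalize(term):
    """Return the toy normalized form of a term (unchanged if unknown)."""
    return TOY_NORMALIZATION.get(term, term)


def split_terms(line):
    """Return every term on a synonym line, ignoring the '=>' direction."""
    return [t.strip() for t in re.split(r",|=>", line) if t.strip()]


def is_redundant(line, normalize=normalize):
    """Assumed rule: a line is redundant if all terms share one normalized form."""
    forms = {normalize(term) for term in split_terms(line)}
    return len(forms) <= 1


def filter_synonyms(lines, normalize=normalize):
    """Keep blank and comment lines as they are; drop redundant synonym lines."""
    kept = []
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            kept.append(line)
        elif not is_redundant(stripped, normalize):
            kept.append(line)
    return kept


if __name__ == "__main__":
    sample = [
        "レナリドミド,レナリドマイド",
        "リンゴ => 林檎",
        "飲む,呑む",
        "tlc => tlc,全肺気量",
        "リンたんぱく質,リン蛋白質,リンタンパク質",
    ]
    for line in filter_synonyms(sample):
        print(line)
```

Passing `normalize` as a parameter keeps the rule independent of the dictionary, so the toy table can be swapped for a Sudachi-backed function.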

Specify system dict

$ chinormfilter tests/test.txt -s full -o out.txt

Use custom dict

Specify the dictionary through a sudachi.json settings file:

$ chinormfilter tests/test.txt -s sudachi.json -o out.txt
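
As a rough illustration of what such a settings file might contain, the sketch below writes a minimal sudachi.json. It assumes the `userDict` key used in SudachiPy settings files (a list of user dictionary paths); the helper name `write_sudachi_config` and the path `user.dic` are hypothetical. Depending on the SudachiPy version, the file may need more keys than this, and the sketch has not been verified against chinormfilter itself.

```python
import json


def write_sudachi_config(path, user_dict_paths):
    """Write a minimal Sudachi settings file listing user dictionaries.

    Assumption: the settings format accepts a "userDict" list of paths.
    """
    config = {"userDict": list(user_dict_paths)}
    with open(path, "w", encoding="utf-8") as f:
        json.dump(config, f, ensure_ascii=False, indent=2)
    return config


if __name__ == "__main__":
    # "user.dic" is a placeholder for an already built user dictionary.
    write_sudachi_config("sudachi.json", ["user.dic"])
```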

TODO

  • custom dict test

