Installation with pip3

pip3 install --verbose subtitle_analyzer
python -m spacy download en_core_web_trf
python -m spacy download es_dep_news_trf
python -m spacy download de_dep_news_trf

Usage

Please refer to the API docs.

Executable usage

The x2cdict dependency requires the environment variables DICT_DB_HOST and DEEPL_AUTH_KEY. Set both before running the commands below.
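A quick pre-flight check can catch a missing variable before a long analysis starts. The following is a minimal sketch, not part of the package: the two variable names come from the note above, while the helper name missing_env_vars is hypothetical.

```python
import os

# Variable names are taken from the note above.
REQUIRED_VARS = ("DICT_DB_HOST", "DEEPL_AUTH_KEY")


def missing_env_vars(environ=None):
    """Return the required variables that are unset or empty (hypothetical helper)."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_env_vars()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
else:
    print("Environment looks ready.")
```

Passing a plain dict instead of os.environ makes the check easy to exercise in tests.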

  • Write an ASS file with vocabulary information
sta_vocab --srtfile movie.srt --lang en --assfile en_vocab.ass --external False
  • Write an ASS file with phrase information
sta_phrase --srtfile movie.srt --lang en --assfile en_phrase.ass --external False
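When the commands above need to be launched from a script, the argument list can be assembled in Python. This is only a sketch: the flags mirror the two commands shown above, the helper name build_sta_command is hypothetical, and the actual call to subprocess.run is left commented out because it needs the package installed and the environment variables set.

```python
import subprocess  # only needed for the commented-out call below


def build_sta_command(tool, srtfile, lang, assfile, external=False):
    """Build the argv list for sta_vocab or sta_phrase (hypothetical helper)."""
    return [
        tool,
        "--srtfile", srtfile,
        "--lang", lang,
        "--assfile", assfile,
        "--external", str(external),  # passed as the text "True" or "False"
    ]


cmd = build_sta_command("sta_vocab", "movie.srt", "en", "en_vocab.ass")
print(" ".join(cmd))
# subprocess.run(cmd, check=True)  # uncomment to actually run the tool
```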

Package usage

from subtitlecore import Subtitle
from subtitle_analyzer import VocabAnalyzer, PhraseAnalyzer
from subtitle_analyzer import VocabASSWriter, PhraseASSWriter
import json

def subtitle_vocab(srtfile, lang, assfile, external):
  """Sentenize an SRT file, look up its vocabulary, and optionally write an ASS file."""

  # Progress is reported as one JSON object per line on stdout.
  phase = {"step": 1, "msg": "Start sentenizing"}
  print(json.dumps(phase), flush=True)

  # Split the subtitle text into sentences and echo each one.
  sf = Subtitle(srtfile, lang)
  sens = sf.sentenize()
  for e in sens:
    print(e)

  phase = {"step": 2, "msg": "Finish sentenizing"}
  print(json.dumps(phase), flush=True)

  # Look up vocabulary per line; only the first 20 entries are reported.
  analyzer = VocabAnalyzer(lang)
  exs = analyzer.get_line_vocabs(sens, external)
  shown = exs[:20]

  phase = {"step": 3, "msg": "Finish vocabs dictionary lookup", "vocabs": shown}
  print(json.dumps(phase), flush=True)

  # Write the ASS file only when an output path was given.
  if assfile:
    ass_writer = VocabASSWriter(srtfile)
    ass_writer.write(exs, assfile, {"animation": False})

    phase = {"step": 4, "msg": "Finish ass saving"}
    print(json.dumps(phase), flush=True)

def subtitle_phrase(srtfile, lang, assfile, external):
  """Sentenize an SRT file, look up its phrases, and optionally write an ASS file."""

  # Progress is reported as one JSON object per line on stdout.
  phase = {"step": 1, "msg": "Start sentenizing"}
  print(json.dumps(phase), flush=True)

  # Split the subtitle text into sentences and echo each one.
  sf = Subtitle(srtfile, lang)
  sens = sf.sentenize()
  for e in sens:
    print(e)

  phase = {"step": 2, "msg": "Finish sentenizing"}
  print(json.dumps(phase), flush=True)

  # Look up phrases per line; only the first 10 entries are reported.
  analyzer = PhraseAnalyzer(lang)
  exs = analyzer.get_line_phrases(sens, external)

  phase = {"step": 3, "msg": "Finish phrases dictionary lookup", "vocabs": exs[:10]}
  print(json.dumps(phase), flush=True)

  # Write the ASS file only when an output path was given.
  if assfile:
    ass_writer = PhraseASSWriter(srtfile)
    ass_writer.write(exs, assfile, {"animation": False})

    phase = {"step": 4, "msg": "Finish ass saving"}
    print(json.dumps(phase), flush=True)
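
Both functions print one JSON progress object per step, interleaved with the plain sentence lines. A caller that captures this output may want to separate the two. The sketch below shows one possible way, assuming each progress record sits on its own line with "step" and "msg" keys as in the code above; the helper name parse_phases and the sample lines are made up for illustration.

```python
import json


def parse_phases(lines):
    """Collect JSON progress objects and skip any other output lines (hypothetical helper)."""
    phases = []
    for line in lines:
        try:
            obj = json.loads(line)
        except json.JSONDecodeError:
            continue  # e.g. a printed sentence, not a progress record
        if isinstance(obj, dict) and "step" in obj and "msg" in obj:
            phases.append(obj)
    return phases


# Illustrative captured output, not real program output.
sample = [
    '{"step": 1, "msg": "Start sentenizing"}',
    "Some sentence printed between phases.",
    '{"step": 2, "msg": "Finish sentenizing"}',
]
for phase in parse_phases(sample):
    print(phase["step"], phase["msg"])
```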

Development

Clone project

git clone https://github.com/qishe-nlp/subtitle-analyzer.git

Install poetry

Install dependencies

poetry update

Test

poetry run pytest -rP

This runs the tests under tests/*.

Execute

poetry run sta_vocab --help
poetry run sta_phrase --help

Create sphinx docs

poetry shell
cd apidocs
sphinx-apidoc -f -o source ../subtitle_analyzer
make html
python -m http.server -d build/html

Host docs on github pages

cp -rf apidocs/build/html/* docs/

Build

  • Change the version in both pyproject.toml and subtitle_analyzer/__init__.py
  • Build the Python package with poetry build
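
Because the version lives in two files, it is easy to bump one and forget the other. The following is a rough sketch of a consistency check. It assumes that pyproject.toml carries a line of the form version = "x.y.z" and that __init__.py defines a __version__ string; the second point is an assumption, since the variable name is not stated here. The helper names are hypothetical.

```python
import re

# Assumed line formats; adjust the patterns if the files differ.
PYPROJECT_PATTERN = r'^version\s*=\s*"([^"]+)"'
INIT_PATTERN = r'^__version__\s*=\s*["\']([^"\']+)["\']'


def read_version(text, pattern):
    """Return the first version string matching pattern, or None."""
    match = re.search(pattern, text, flags=re.MULTILINE)
    return match.group(1) if match else None


def versions_match(pyproject_text, init_text):
    """True only when both files declare the same version."""
    a = read_version(pyproject_text, PYPROJECT_PATTERN)
    b = read_version(init_text, INIT_PATTERN)
    return a is not None and a == b


# Example with inline text; in practice, read the two files with pathlib.Path.read_text().
print(versions_match('version = "0.1.23"\n', '__version__ = "0.1.23"\n'))
```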

Git commit and push

Publish from local dev env

  • Configure the PyPI test repository (named test) and its credentials in poetry; refer to the poetry documentation
  • Publish to the PyPI test repository with poetry publish -r test

Publish through CI

git tag [x.x.x]
git push origin master
