Performs extractive, hierarchical summarization of a corpus of documents.

Project description

Structured and Interactive Summarization

Features

  • Preprocessing tools.

Credits

This package was created with Cookiecutter and the francois-durand/package_helper_2 project template.

History

0.X.X (2021-XX-XX): TODO

  • Run experiments and comparison on the flat summarizer

  • Start converting the Wikipedia Animals dataset

  • Start converting hierarchical summary notebook

0.2.0 (2021-03-14): Flat summarizer

  • Main update: fully-functional flat summarizer!

    • Fully customisable;

    • Fully documented.

  • Two tutorials

    • Building a Gismo for the Covid dataset;

    • Flat summarizer on the Covid dataset.

  • Change of paradigm: start from the notebook and build the module cell by cell.

  • Consequences: all non-converted modules from 0.1.2 are moved to the pit. They will be restored during the notebook transformation.

  • Sentence splitter heavily optimized using lesser-known NLTK features.

0.1.2 (2021-03-03): the pit

  • Batch import of remaining modules in a temporary submodule (the pit). The pit will be dispatched afterwards.

  • Fix import issues (e.g. spaCy/neuralcoref version incompatibility, Qt5, sknetwork…).

  • Submodule gismo_wrapper is on death row (it may never leave the pit).

  • Embedding_idf OK

  • Building summary: summarize and make_tree have been updated to work.

  • Lots of cleaning remains (separating covid/generic, unified pre-proc and source convention, …).

  • neuralcoref is removed for the moment, as it does not build on GitHub.

0.1.1 (2021-02-23): data_loader

  • Finish import / transformation of the data_loader module.

0.1.0 (2021-02-23): First release

  • First release on PyPI.

  • Preprocessing submodule deployed

Download files

Source Distribution

sisu-0.2.0.tar.gz (54.2 kB)

Uploaded Source

Built Distribution

sisu-0.2.0-py2.py3-none-any.whl (53.4 kB)

Uploaded Python 2 Python 3
