Project description

flex-nlp

The flex-nlp package is a set of tools and utilities for working with Natural Language Processing (NLP) datasets and models. It extends the FLEXible framework and is designed to be used with it.

flex-nlp provides the following tools for working with NLP datasets:

ss_triplet_input_adapter: A Semantic Textual Similarity (STS) dataset adapter for the TripletQQP dataset and datasets with a similar structure.
default_data_collator_classification: The default data collator for the classification task.
basic_collate_pad_sequence_classification: A basic data collator for the classification task. It pads the sequences to the length of the longest sequence in the batch and puts the batch dimension first.
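
The padding behaviour described for basic_collate_pad_sequence_classification can be illustrated with a minimal pure-Python sketch. This is not the flex-nlp implementation: the function name `pad_batch`, the `(token_ids, label)` sample layout, and the padding value 0 are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def pad_batch(
    batch: Sequence[Tuple[List[int], int]], pad_value: int = 0
) -> Tuple[List[List[int]], List[int], List[int]]:
    """Pad token-id sequences to the longest sequence in the batch.

    Hypothetical helper, not part of flex-nlp. Returns (padded, labels,
    lengths), where `padded` is batch-first: one row per sample.
    """
    sequences = [list(tokens) for tokens, _ in batch]
    labels = [label for _, label in batch]
    lengths = [len(seq) for seq in sequences]
    max_len = max(lengths, default=0)
    # Right-pad every sequence so that all rows share the same length.
    padded = [seq + [pad_value] * (max_len - len(seq)) for seq in sequences]
    return padded, labels, lengths


# Three samples of different lengths, each with a class label.
batch = [([5, 8, 2], 1), ([7], 0), ([3, 4], 1)]
padded, labels, lengths = pad_batch(batch)
print(padded)  # [[5, 8, 2], [7, 0, 0], [3, 4, 0]]
```

Returning the original lengths alongside the padded rows is a design choice of this sketch; it lets a model ignore the padded positions later.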

We also provide an aggregator for neural networks, clip_avg. Alongside it, we use some of the aggregators available in the FLEXible framework.

`Aggregator`	`Description`	`Citation`
clip_avg	A federated aggregator that clips the weights received from the clients, averaging only those that surpass a selected threshold.	Reviewing Federated Learning Aggregation Algorithms; Strategies, Contributions, Limitations and Future Perspectives
fedavg	A federated aggregator that computes the mean of the weights received from the clients.	Communication-Efficient Learning of Deep Networks from Decentralized Data
weighted_avg	Similar to fedavg, a federated aggregator that assigns a weight to each client so that some clients count more than others in the average.	Communication-Efficient Learning of Deep Networks from Decentralized Data
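
The two averaging rules in the table can be sketched in plain Python as follows. This is only an illustration of the arithmetic, not the FLEXible or flex-nlp API: the function names `mean_aggregate` and `weighted_mean_aggregate` are hypothetical, and each client's model is assumed to be flattened into a list of floats.

```python
from typing import List, Sequence


def mean_aggregate(client_weights: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise mean of the client weight vectors (the fedavg idea)."""
    if not client_weights:
        raise ValueError("at least one client is required")
    n_clients = len(client_weights)
    # zip(*...) groups the i-th parameter of every client together.
    return [sum(values) / n_clients for values in zip(*client_weights)]


def weighted_mean_aggregate(
    client_weights: Sequence[Sequence[float]], importances: Sequence[float]
) -> List[float]:
    """Element-wise weighted mean; a larger importance gives a client more say."""
    if len(client_weights) != len(importances):
        raise ValueError("one importance value per client is required")
    total = sum(importances)
    if total == 0:
        raise ValueError("importances must not sum to zero")
    return [
        sum(w * a for w, a in zip(values, importances)) / total
        for values in zip(*client_weights)
    ]


clients = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(mean_aggregate(clients))                      # [3.0, 4.0]
print(weighted_mean_aggregate(clients, [1, 1, 2]))  # [3.5, 4.5]
```

With equal importances the weighted rule reduces to the plain mean, which is why the table describes weighted_avg as similar to fedavg.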

Tutorials

To get started with flex-nlp, check the notebooks available in the repository. The following table details the tasks, models, and datasets used in the notebooks:

`Task`	`Model`	`Dataset`
Sentiment Analysis (SA)	BiGRU	IMDb
Question Answering (QA)	DistilBERT	SQuAD
Semantic Textual Similarity (STS)	DistilRoberta	QQP-Triplets

Installation

We recommend Anaconda/Miniconda as the package manager. The following table lists the corresponding flex and flex-nlp versions and the supported Python versions.

`flex`	`flex-nlp`	Python
`main` / `nightly`	`main` / `nightly`	`>=3.8`, `<=3.11`
`v0.6.0`	`v0.1.0`	`>=3.8`, `<=3.11`
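
Before installing, the Python constraint from the table can be checked with a short script. This is a minimal sketch: the helper name `is_supported` is hypothetical, and `<=3.11` is assumed to cover every 3.11.x patch release.

```python
import sys
from typing import Tuple

# Bounds taken from the compatibility table above.
MIN_PYTHON = (3, 8)
MAX_PYTHON = (3, 11)


def is_supported(version: Tuple[int, int]) -> bool:
    """Return True if a (major, minor) version lies in the supported range."""
    return MIN_PYTHON <= version <= MAX_PYTHON


current = sys.version_info[:2]
print(f"Python {current[0]}.{current[1]} supported: {is_supported(current)}")
```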

To install the package, you can use the following commands:

Using pip:

pip install flexnlp

Download the repository and install it locally:

git clone git@github.com:FLEXible-FL/flex-nlp.git
cd flex-nlp
pip install -e .

Citation

If you use this package, please cite the following paper:

TODO: Add citation

Release history

This version

0.1.0

Mar 14, 2024

Download files

Source Distribution

flexnlp-0.1.0.tar.gz (20.1 kB)

Uploaded Mar 14, 2024 Source

Built Distribution

flexnlp-0.1.0-py3-none-any.whl (23.7 kB)

Uploaded Mar 14, 2024 Python 3

Hashes for flexnlp-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`fc38244f3a6def70a22b229e127aac601ca19a00355f7f84a29704854ae153c1`
MD5	`075ac66b64c7430ebe0f1db0ad5e23a2`
BLAKE2b-256	`3b17c88b1eb89718c7bccd5ddceeaf0476c8929b499d8e4ddd56362094e25686`

Hashes for flexnlp-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`1f0abd84235bb22d4671aafb3cd6b350ba0644fc4e2bbe37421ec609d7a818e8`
MD5	`3a8a13adcae1ae6ec5264e546696734b`
BLAKE2b-256	`b18025a16dda8025e7910f714ea162532b9fd2da9a2e3c4b8b04f21c41ca72d0`