Project description

Helpful Review Predictor

The Helpful Review Predictor is a Python package that predicts the helpfulness of reviews using machine learning techniques. It takes textual reviews as input and provides a binary classification indicating whether the review is likely to be helpful or not. A prediction of 1 indicates a helpful review, while a prediction of 0 indicates a review that is not helpful.

The model's training process and methodology are documented in an academic research paper. To stay updated on the latest developments and to access the paper once it is published, follow my LinkedIn profile: Mojtaba Maleki.

Dataset

The data used for training the model is sourced from the Amazon Electronics Reviews dataset available on Kaggle. This 5-core dataset (a subset in which every remaining user and product has at least five reviews) contains product reviews from the Electronics category on Amazon from May 1996 to July 2014, totaling 1,689,188 entries.

The dataset is provided by Julian McAuley, UCSD, and is available here.

Features

Preprocesses textual reviews: lowercasing, punctuation removal, contraction expansion, and lemmatization.
Utilizes TF-IDF vectorization to convert text data into numerical feature vectors.
Addresses class imbalance using Random Over Sampling.
Supports training and evaluation of multiple classifiers, including Gaussian Naive Bayes, Logistic Regression, and Decision Trees.
Performs hyperparameter tuning using Grid Search and Stratified K-Fold Cross Validation.
Provides visualization tools for comparing different classifiers and evaluating model performance.
Saves the best model and TF-IDF vectorizer for future use.
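
The steps listed above can be sketched end to end on toy data. This is a minimal illustration, not the package's actual training code: the contraction map, the helper names `preprocess` and `random_oversample`, the toy reviews, and the parameter grid are all assumptions made for the example, lemmatization is left out, and only Logistic Regression is tuned.

```python
import re
import string

import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import make_pipeline

# Tiny illustrative contraction map (hypothetical; a real one would be much larger).
CONTRACTIONS = {"don't": "do not", "it's": "it is", "can't": "cannot", "i'm": "i am"}


def preprocess(text):
    """Lowercase, expand a few contractions, strip punctuation (no lemmatization)."""
    text = text.lower()
    for short, full in CONTRACTIONS.items():
        text = text.replace(short, full)
    text = text.translate(str.maketrans("", "", string.punctuation))
    return re.sub(r"\s+", " ", text).strip()


def random_oversample(texts, labels, seed=0):
    """Duplicate random minority-class rows until every class matches the largest."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    classes, counts = np.unique(labels, return_counts=True)
    keep = []
    for cls, count in zip(classes, counts):
        idx = np.flatnonzero(labels == cls)
        extra = rng.choice(idx, size=counts.max() - count, replace=True)
        keep.extend(idx.tolist() + extra.tolist())
    return [texts[i] for i in keep], labels[keep]


# Toy, made-up reviews: 1 = helpful, 0 = not helpful.
reviews = [
    "Battery lasts two days and it's easy to set up.",
    "The cable is 6 ft long and charges my phone quickly.",
    "Sound is clear, but the ear cups don't fit large heads.",
    "Works with my 2012 laptop; driver install took five minutes.",
    "Screen is bright and the stand is sturdy.",
    "Remote range is about 10 meters through one wall.",
    "Bad!!!",
    "I'm not happy.",
]
helpful = [1, 1, 1, 1, 1, 1, 0, 0]

texts, y = random_oversample([preprocess(r) for r in reviews], helpful)

# TF-IDF features feed a classifier; C is tuned with stratified 3-fold grid search.
search = GridSearchCV(
    make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000)),
    param_grid={"logisticregression__C": [0.1, 1.0, 10.0]},
    cv=StratifiedKFold(n_splits=3, shuffle=True, random_state=0),
    scoring="f1",
)
search.fit(texts, y)
prediction = int(search.predict([preprocess("Great battery and it's easy to use.")])[0])
```

One caveat about this sketch: oversampling before cross-validation lets copies of the same review land in both the training and validation folds, which tends to inflate the scores. Resampling inside each training fold avoids that; the imbalanced-learn library provides a `RandomOverSampler` and a pipeline class intended for this, though whether the package uses them is not stated here.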

Installation

You can install the Helpful Review Predictor package using pip:

pip install helpful-review-predictor

Usage

from helpfulReviewPredictor import PredictHelpfulness

string_input = "Your input string here"
predictor = PredictHelpfulness(string_input)
result = predictor.get_result()
print(result)  # Output: 1 for Helpful, 0 for Not Helpful
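
To score several reviews, the same two calls can be wrapped in a small helper. The sketch below assumes that each `PredictHelpfulness` instance scores exactly one string, as in the snippet above. `score_reviews` and `WordCountStub` are hypothetical names introduced for illustration; the stub only mimics the interface so the example runs without the package installed, and its rule has nothing to do with the real model.

```python
def score_reviews(reviews, predictor_cls):
    """Map each review to its 0/1 label, building one predictor per review."""
    return {text: predictor_cls(text).get_result() for text in reviews}


class WordCountStub:
    """Hypothetical stand-in with the same interface; NOT the package's model."""

    def __init__(self, text):
        self.text = text

    def get_result(self):
        # Placeholder rule for the demo only: 5 or more words counts as helpful.
        return 1 if len(self.text.split()) >= 5 else 0


reviews = [
    "Battery lasts two full days on one charge.",
    "Bad.",
]
labels = score_reviews(reviews, WordCountStub)
for text, label in labels.items():
    print(label, text)

# With the package installed, pass the real class instead:
# from helpfulReviewPredictor import PredictHelpfulness
# labels = score_reviews(reviews, PredictHelpfulness)
```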

Requirements

joblib
numpy
scikit-learn (provides TfidfVectorizer in sklearn.feature_extraction.text)
scipy

Project details

Development Status
- 5 - Production/Stable
Intended Audience
License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language

This version

Feb 16, 2024

Download files

Source Distribution

helpful_review_predictor-6.tar.gz (2.1 MB)

Uploaded Feb 16, 2024 Source

Built Distribution

helpful_review_predictor-6-py3-none-any.whl (2.2 MB)

Uploaded Feb 16, 2024 Python 3

Hashes for helpful_review_predictor-6.tar.gz
Algorithm	Hash digest
SHA256	`0058c3d26b13f47d33568b5131952d975a9309a5f2098b12f0e55d49f34da412`
MD5	`2ffacc1c7f0e32c5fba25df01eb13bfd`
BLAKE2b-256	`b3b2becda502e6d6e0bf9826e8dbc873a0ef8d662dafdddb817f427f6f848257`

Hashes for helpful_review_predictor-6-py3-none-any.whl
Algorithm	Hash digest
SHA256	`686c88a53d1ef225b493efccb55a10127ca14c4b52c2eb5059c37475b8aaedb9`
MD5	`abd1f64c567b0c9513adb03b1914bd63`
BLAKE2b-256	`aa9bb3569d1cda5e47faad9dd7277dd0a14a90ecf65e9f25bf412a94c8a8abe6`
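
A downloaded file can be checked against the SHA256 digests listed above using Python's standard `hashlib` module. This is a minimal sketch; the helper names `sha256_of` and `matches` are made up for the example, and the file path in the usage comment assumes the archive sits in the current directory.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """True if the file's SHA256 digest equals the published hex string."""
    return sha256_of(path) == expected_hex.strip().lower()


# Usage, assuming the source archive was downloaded to the current directory:
# matches("helpful_review_predictor-6.tar.gz",
#         "0058c3d26b13f47d33568b5131952d975a9309a5f2098b12f0e55d49f34da412")
```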