A CLI-based dataset preprocessing tool for machine learning tasks. Features include data exploration, null value handling, one-hot encoding, and feature scaling, and download the modified dataset effortlessly.
Project description
WizCraft - CLI-Based Dataset Preprocessing Tool
WizCraft is a cutting-edge Command Line Interface (CLI) tool developed to simplify the process of dataset preprocessing for machine learning tasks. It aims to provide a seamless and efficient experience for data scientists of all levels, facilitating the preparation of data for various machine-learning applications.
Table of Contents
Features
- Load and preprocess your dataset effortlessly through a Command Line Interface (CLI).
- View dataset statistics, null value counts, and perform data imputation.
- Encode categorical variables using one-hot encoding.
- Normalize and standardize numerical features for better model performance.
- Download the preprocessed dataset with your desired modifications.
Getting Started
Installation
- Run the pip command:
pip install wiz-craft
- To use the module, use the commands:
from wizcraft.preprocess import Preprocess wiz_obj = Preprocess() wiz_obj.start()
- Follow the on-screen prompts to load your dataset, select target variables, and perform preprocessing tasks.
Features Available
Data Description
- View statistics and properties of numeric columns.
- Explore unique values and statistics of categorical columns.
- Display a snapshot of the dataset.
Handle Null Values
- Show NULL value counts in each column.
- Remove specific columns or fill NULL values with mean, median, or mode, or even using KNN technique.
Encode Categorical Values
- Identify and list categorical columns.
- Perform one-hot encoding on categorical columns.
Feature Scaling
- Normalize the data in a column using Min-Max scaling or Standard Scaler.
Save Preprocessed Dataset
- Download the modified dataset with applied preprocessing steps.