A tool for comparing two Pandas DataFrame objects

dfcompy

Description

dfcompy is a Python package for comparing two Pandas DataFrame objects. It identifies rows that were inserted, deleted, or updated between the two DataFrames, as well as rows that are unchanged, which makes it useful in data analysis and data cleaning.

Installation

Install dfcompy using pip:

pip install dfcompy

Usage

import pandas as pd
from dfcompy import DataFrameComparator

# Create example DataFrames
# ... [example DataFrame creation]

# Create a DataFrameComparator instance
comparator = DataFrameComparator(df1, df2, on=['ID'], subset=['Name', 'Age'])

# Detect deleted rows
print("Deleted Rows:")
print(comparator.rows_deleted())

# Detect inserted rows
print("\nInserted Rows:")
print(comparator.rows_inserted())

# Detect updated rows
print("\nUpdated Rows:")
print(comparator.rows_before_update())

# Detect unchanged rows
print("\nUnchanged Rows:")
print(comparator.rows_in_common())
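
To make the four categories concrete, here is a minimal sketch that classifies rows using plain pandas only. It is not dfcompy's implementation. The sample data and the names key, compared, and the *_ids lists are hypothetical. The sketch also assumes that on names the key columns and subset names the columns checked for changes, which the example above suggests but does not state.

```python
import pandas as pd

# Hypothetical sample data; column names mirror the usage example.
df1 = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "Name": ["Ann", "Bob", "Cid", "Dee"],
    "Age": [30, 25, 41, 19],
})
df2 = pd.DataFrame({
    "ID": [2, 3, 4, 5],
    "Name": ["Bob", "Cid", "Dee", "Eve"],
    "Age": [25, 42, 19, 33],
})

key = ["ID"]               # assumed role of `on`
compared = ["Name", "Age"]  # assumed role of `subset`

# Outer merge on the key; the _merge column records which side each key came from.
merged = df1.merge(
    df2, on=key, how="outer", suffixes=("_old", "_new"), indicator=True
)

# Keys found only in df1 were deleted; keys found only in df2 were inserted.
deleted_ids = merged.loc[merged["_merge"] == "left_only", "ID"].tolist()
inserted_ids = merged.loc[merged["_merge"] == "right_only", "ID"].tolist()

# A key present on both sides counts as updated if any compared column differs.
# Note: NaN != NaN, so missing values in compared columns would be flagged as changes.
both = merged[merged["_merge"] == "both"]
changed = pd.Series(False, index=both.index)
for col in compared:
    changed |= both[f"{col}_old"] != both[f"{col}_new"]

updated_ids = both.loc[changed, "ID"].tolist()
unchanged_ids = both.loc[~changed, "ID"].tolist()

print("deleted:", deleted_ids)
print("inserted:", inserted_ids)
print("updated:", updated_ids)
print("unchanged:", unchanged_ids)
```

With this sample data, ID 1 is deleted, ID 5 is inserted, ID 3 is updated (Age changes from 41 to 42), and IDs 2 and 4 are unchanged. The same df1 and df2 could be passed to DataFrameComparator in place of the elided creation step.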

Contributing

Contributions are welcome! For major changes, please open an issue first to discuss what you would like to change.

License

MIT

