A versatile Python package for data extraction from JSON-like structures with user-defined format keys, enhanced with synonym retrieval capabilities.
Project description
Complex Parser
Complex Parser is a powerful Python package designed to streamline the process of data extraction from JSON-like structures while also enriching the extracted data with synonym retrieval capabilities. Whether you're working with complex nested JSON data or simple dictionaries, this package provides a flexible and intuitive solution for extracting specific data elements based on user-defined format keys, all while expanding the semantic richness of your data through synonym retrieval.
Features
Data Extraction
- Structured Data Extraction: Extract specific data elements from nested JSON-like structures based on user-specified format keys.
- Customizable Format Keys: Define format keys to precisely target the data elements you need, making it adaptable to a wide range of data structures.
- Efficient Data Parsing: Utilizes efficient algorithms to parse through the data and extract relevant information with minimal computational overhead.
- Thread Based chuncking: Utilises threads to quickly sort through larger data sets.
Synonym Retrieval
- Semantic Enrichment: Enhance the semantic richness of your data by retrieving synonyms for key terms using both WordNet and custom synonym lists.
- Flexible Synonym Loading: Load additional synonyms from custom lists to expand the synonym pool for specific terms, allowing for fine-tuned control over synonym retrieval.
Ease of Use
- Simple Integration: Integrate seamlessly into your Python projects with an intuitive interface and straightforward usage.
- Comprehensive Documentation: Detailed documentation and examples provided for easy reference and quick integration into your projects.
Installation
You can install Complex Parser via pip:
pip install complex-parser
Usage:
Here's a simple example demonstrating how to use the package
from complex_parser import extract_data
# Example data
data = {
"people":[
{
"name": "John",
"age": 30,
"address": {
"road": "123 Main St",
"city": "Anytown"
}
},
{
"name": "Joshua",
"age": 3100,
"address": {
"road": "657 Loud St",
"city": "Basictown",
"landmark": "Town Square"
}
},
{
"name": "John",
"age": 30,
"location": {
"road": "8474 Main St",
"city": "None"
}
},
{
"fullname": "Job",
"age": 27,
"destination": {
"road": "8474 John's St",
"city": "London"
}
},
{
"unknown": "Job",
"age": 27,
"destination": {
"road": "8474 John's St",
"city": "London"
}
}
]
}
format_keys = ["name", "address"]
load_lists= {
"address":[
"location"
],
"name": [
"fullname"
]
}
# Extract data with specified format keys
extracted_data = extract_data(data=data, format_keys=format_keys,load_lists=load_lists)
print(extracted_data)
results:
[{'name': 'John', 'age': 30, 'address': {'road': '123 Main St', 'city': 'Anytown'}}, {'name': 'Joshua', 'age': 3100, 'address': {'road': '657 Loud St', 'city': 'Basictown', 'landmark': 'Town Square'}}, {'name': 'John', 'age': 30, 'location': {'road': '8474 Main St', 'city': 'None'}}, {'fullname': 'Job', 'age': 27, 'destination': {'road': "8474 John's St", 'city': 'London'}}]
License:
This project is licensed under the Mozilla Public License Version 2.0 - see the LICENSE file for details.
Contributing
Contributions are welcome! Please feel free to submit bug reports, feature requests, or pull requests on the GitHub repository.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Hashes for complex_parser-0.0.2-py3-none-any.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | 0099cf21db3f920093dede21b04d0ddb318a86d3e54ffa4ed4424cb9499bd17e |
|
MD5 | 950a144939e2009ff8e3100f141d89b9 |
|
BLAKE2b-256 | 25d4aabadd38b4987116e5beeb5d183bfbc6264f9db0f5d74f03060f42e79c6f |