Convenience functions to work with pandas triple dataframes 🐼🐼🐼

Convenience functions for pandas dataframes containing triples. Fun fact: a group of pandas (e.g. three) is commonly referred to as an embarrassment.

This library's main goal is to make commonly used functions easily available when exploring triples stored in pandas dataframes. It is not meant to be an efficient graph analysis library.

Usage

You can use a variety of convenience functions. Let's create some simple example triples:

>>> import pandas as pd
>>> rel = pd.DataFrame([("e1","rel1","e2"), ("e3", "rel2", "e1")], columns=["head","relation","tail"])
>>> attr = pd.DataFrame([("e1","attr1","lorem ipsum"), ("e2","attr2","dolor")], columns=["head","relation","tail"])

Search in attribute triples:

>>> from embarrassment import search
>>> search(attr, "lorem ipsum")
  head relation         tail
0   e1    attr1  lorem ipsum
>>> search(attr, "lorem", method="substring")
  head relation         tail
0   e1    attr1  lorem ipsum
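
For intuition, the two search modes shown above can be approximated in plain pandas. This is only a sketch, not the library's implementation: the helper name search_sketch, the method name "exact", and the choice to search only the tail column are assumptions made for illustration.

```python
import pandas as pd

attr = pd.DataFrame(
    [("e1", "attr1", "lorem ipsum"), ("e2", "attr2", "dolor")],
    columns=["head", "relation", "tail"],
)


def search_sketch(df, query, method="exact"):
    """Hypothetical stand-in for search; not the library's code."""
    if method == "exact":
        # Keep rows whose tail value equals the query exactly
        mask = df["tail"] == query
    elif method == "substring":
        # regex=False treats the query as a literal string;
        # na=False makes missing values count as non-matches
        mask = df["tail"].str.contains(query, regex=False, na=False)
    else:
        raise ValueError(f"unknown method: {method}")
    return df[mask]


exact_hits = search_sketch(attr, "lorem ipsum")
substring_hits = search_sketch(attr, "lorem", method="substring")
```

On the example triples both calls return the single row for e1, while an exact search for "lorem" alone returns an empty dataframe.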

Select triples with a specific relation:

>>> from embarrassment import select_rel
>>> select_rel(rel, "rel1")
  head relation tail
0   e1     rel1   e2

Perform operations on the immediate neighbor(s) of an entity, e.g. get the attribute triples:

>>> from embarrassment import neighbor_attr_triples
>>> neighbor_attr_triples(rel, attr, "e1")
  head relation   tail
1   e2    attr2  dolor
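
To illustrate what "attribute triples of the immediate neighbors" means, here is a rough plain-pandas sketch. The helper neighbor_attr_sketch is hypothetical and only mirrors the behavior shown above on the toy data; the library may compute this differently.

```python
import pandas as pd

rel = pd.DataFrame(
    [("e1", "rel1", "e2"), ("e3", "rel2", "e1")],
    columns=["head", "relation", "tail"],
)
attr = pd.DataFrame(
    [("e1", "attr1", "lorem ipsum"), ("e2", "attr2", "dolor")],
    columns=["head", "relation", "tail"],
)


def neighbor_attr_sketch(rel_df, attr_df, entity):
    """Hypothetical helper: attribute triples of an entity's direct neighbors."""
    # Entities the given entity points to (out-links)
    out_neighbors = rel_df.loc[rel_df["head"] == entity, "tail"]
    # Entities that point to the given entity (in-links)
    in_neighbors = rel_df.loc[rel_df["tail"] == entity, "head"]
    neighbors = set(out_neighbors) | set(in_neighbors)
    # Keep attribute triples whose subject is one of the neighbors
    return attr_df[attr_df["head"].isin(neighbors)]


result = neighbor_attr_sketch(rel, attr, "e1")
```

Here e1 has the neighbors e2 and e3, but only e2 has an attribute triple, so the result contains just that row.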

Or just get the relation triples:

>>> from embarrassment import neighbor_rel_triples
>>> neighbor_rel_triples(rel, "e1")
  head relation tail
1   e3     rel2   e1
0   e1     rel1   e2

By default you get in- and out-links, but you can specify a direction:

>>> neighbor_rel_triples(rel, "e1", in_out_both="in")
  head relation tail
1   e3     rel2   e1
>>> neighbor_rel_triples(rel, "e1", in_out_both="out")
  head relation tail
0   e1     rel1   e2
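
The direction filter can be pictured as two boolean masks over the head and tail columns. The following is a minimal sketch in plain pandas; neighbor_rel_sketch and its direction parameter are illustrative names, not the library's API, and the row order of the result may differ from the library's output.

```python
import pandas as pd

rel = pd.DataFrame(
    [("e1", "rel1", "e2"), ("e3", "rel2", "e1")],
    columns=["head", "relation", "tail"],
)


def neighbor_rel_sketch(df, entity, direction="both"):
    """Hypothetical helper: relation triples touching an entity."""
    incoming = df["tail"] == entity  # entity is the target (in-links)
    outgoing = df["head"] == entity  # entity is the source (out-links)
    if direction == "in":
        return df[incoming]
    if direction == "out":
        return df[outgoing]
    if direction == "both":
        return df[incoming | outgoing]
    raise ValueError(f"unknown direction: {direction}")


in_links = neighbor_rel_sketch(rel, "e1", direction="in")
out_links = neighbor_rel_sketch(rel, "e1", direction="out")
both_links = neighbor_rel_sketch(rel, "e1")
```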

Using pandas' pipe method you can chain operations. Let's see a more elaborate example by loading a dataset from sylloge:

>>> from sylloge import MovieGraphBenchmark
>>> from embarrassment import clean, neighbor_attr_triples, search, select_rel
>>> ds = MovieGraphBenchmark()
>>> # clean attribute triples
>>> cleaned_attr = clean(ds.attr_triples_left)
>>> # find uri of James Tolkan
>>> jt = search(cleaned_attr, query="James Tolkan")["head"].iloc[0]
>>> # get neighbor triples
>>> # and select triples with title and show values
>>> title_rel = "https://www.scads.de/movieBenchmark/ontology/title"
>>> neighbor_attr_triples(ds.rel_triples_left, cleaned_attr, jt).pipe(
...     select_rel, rel=title_rel
... )["tail"]
12234    A Nero Wolfe Mystery
12282           Door to Death
12440          Die Like a Dog
12461        The Next Witness
Name: tail, dtype: object
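
If DataFrame.pipe is unfamiliar: df.pipe(func, **kwargs) simply calls func(df, **kwargs), which keeps a chain readable from left to right. The sketch below shows this on toy data without needing the dataset download; select_rel_sketch is a hypothetical stand-in for select_rel, written only so the example is self-contained.

```python
import pandas as pd

attr = pd.DataFrame(
    [("e1", "attr1", "lorem ipsum"), ("e2", "attr2", "dolor")],
    columns=["head", "relation", "tail"],
)


def select_rel_sketch(df, rel):
    """Hypothetical stand-in: keep triples with the given relation."""
    return df[df["relation"] == rel]


# Chained form: the dataframe is passed as the first argument
piped = attr.pipe(select_rel_sketch, rel="attr1")["tail"]

# Equivalent nested call
nested = select_rel_sketch(attr, rel="attr1")["tail"]
```

Both forms produce the same Series; the piped form reads better once several steps are chained.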

Installation

You can install embarrassment via pip:

pip install embarrassment
