snapquery: Introduce Named Queries and Named Query Middleware to Wikidata

Project description

snapQuery

Just query Wikidata by the name of the query ...

snapquery cats

is all you need

This style of querying Wikidata and other SPARQL services is independent of endpoint and query details, which makes your queries future-proof. No worries about Blazegraph being replaced, the graph being split, or timeouts haunting you. snapquery introduces named queries and named query middleware to Wikidata and other SPARQL endpoints.
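The idea of a named query can be illustrated with a minimal sketch. The classes below are hypothetical and are not the snapquery API: `NamedQuery`, `NamedQueryRegistry`, the endpoint name `mirror`, and its placeholder URL are assumptions made for illustration only. The point is that callers refer to a query by name, while the SPARQL text and the endpoint URL are looked up separately and can change without affecting the caller.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class NamedQuery:
    """A SPARQL query stored under a short, stable name."""
    name: str
    sparql: str


@dataclass
class NamedQueryRegistry:
    """Hypothetical in-memory registry; not the snapquery API."""
    endpoints: dict = field(default_factory=dict)  # endpoint name -> URL
    queries: dict = field(default_factory=dict)    # query name -> NamedQuery

    def add_endpoint(self, name: str, url: str) -> None:
        self.endpoints[name] = url

    def add_query(self, query: NamedQuery) -> None:
        self.queries[query.name] = query

    def resolve(self, query_name: str, endpoint_name: str) -> tuple:
        """Return (endpoint URL, SPARQL text) for a named query."""
        if query_name not in self.queries:
            raise KeyError(f"unknown query: {query_name}")
        if endpoint_name not in self.endpoints:
            raise KeyError(f"unknown endpoint: {endpoint_name}")
        return self.endpoints[endpoint_name], self.queries[query_name].sparql


registry = NamedQueryRegistry()
registry.add_endpoint("wikidata", "https://query.wikidata.org/sparql")
registry.add_endpoint("mirror", "https://example.org/sparql")  # placeholder URL
registry.add_query(NamedQuery(
    name="cats",
    sparql=(
        "PREFIX wd: <http://www.wikidata.org/entity/>\n"
        "PREFIX wdt: <http://www.wikidata.org/prop/direct/>\n"
        "SELECT ?item WHERE { ?item wdt:P31 wd:Q146 }"
    ),
))

# The caller only knows the name "cats"; endpoint and query text are resolved.
url, sparql = registry.resolve("cats", "mirror")
print(url)
```

Switching from one backend to another then only means passing a different endpoint name; the query name used by the caller stays the same.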

snapquery is a tool that simplifies the process of previewing, annotating, rating, commenting, running, and exploring Wikidata queries across different SPARQL backends. It enhances user experience by storing query results and allowing easy comparison across various backends and over time.

This tool is designed to assist users in curating and collaborating on queries, ensuring their continued functionality over time. Developers and data consumers can access data conveniently through APIs, streamlining their workflow.

Demos

Background

In the Wikimedia ecosystem, we now boast several SPARQL engines and backends housing the complete Wikidata graph for querying purposes.

Over recent years, WDQS has encountered escalating timeouts as the graph expands. Users desire alternative endpoints without grappling with disparities among SPARQL engines and their impact on query construction.

Recognizing the needs of Wikidata data consumers, we aim to establish a system that simplifies:

Discovering pre-existing queries
Facilitating easy forking, sharing, rating, and monitoring of queries
Executing queries (or their variations) on diverse endpoints
Comparing query results over time and/or across multiple endpoints
Cultivating a collaborative community for query construction
Ensuring the reliability of query results
Providing alerts if a query no longer yields the expected results
Developing tools that access data from dependable, middleware-cached queries via an accessible API, eliminating delays for downstream users.

Features

Planned

These are the planned features

support for naming queries (#9) ✅
support for sharing queries (unique identifier) (#2) ✅
query multiple backends simultaneously and repeatedly
stores queries and adaptations needed for different backends
support user login
support for spam protection
support for rating queries
support for commenting on queries
support for detecting when a query returns different results between different backends
support for query states: reliable, needs investigation, needs verification
support for autodetecting when a query returns fewer results than before (change in underlying data/model in Wikidata) -> needs investigation
support for users marking queries as reliable
support for seeing a state history per query
support for storing query results so you don’t have to wait
support for adding metadata to queries
- add main subject (QID) to query
- author -> id of author in the system
- forked from x
has REST API for data consumers e.g. LLM developers who want to present user-verified queries and data to users to increase reliability
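The planned detection features (different results between backends, fewer results than before) could be sketched as follows. This is an assumption-laden illustration, not snapquery's implementation: the function names `detect_issues` and `next_state` are hypothetical, the backend names are examples, and only result counts are compared, not the result rows themselves. Mapping a backend disagreement to "needs investigation" is also an assumption; the list above only states that mapping for a drop in result count.

```python
from typing import Dict, List, Optional

NEEDS_INVESTIGATION = "needs investigation"


def detect_issues(previous_count: Optional[int],
                  counts_by_backend: Dict[str, int]) -> List[str]:
    """Compare result counts across backends and against the last stored run."""
    issues = []
    # Backends that return a different number of rows for the same query.
    if len(set(counts_by_backend.values())) > 1:
        issues.append("backends disagree on result count")
    # Any backend returning fewer rows than the previously stored run.
    if previous_count is not None:
        for backend, count in sorted(counts_by_backend.items()):
            if count < previous_count:
                issues.append(
                    f"{backend} returned fewer results than before "
                    f"({count} < {previous_count})"
                )
    return issues


def next_state(current_state: str, issues: List[str]) -> str:
    """Flag the query for investigation if any issue was found (assumed policy)."""
    return NEEDS_INVESTIGATION if issues else current_state


issues = detect_issues(150, {"wdqs": 150, "qlever": 120})
print(issues)
print(next_state("reliable", issues))
```

A query with no detected issues keeps its current state, so a state such as "reliable" is only ever set by users and never by this check.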

User stories

as a user I want to know in advance if the query is returning what I expect
as a user I want to find all the bands in Wikidata without having to know how it is modeled
as a user I want to pay someone to help me get the information from Wikidata that I need
as a user I want to know how a query performed in the past so I can trust that the underlying model is stable and I get the expected results
as a user I want to comment on a query
as a user I want to read comments from others on a query so I get an idea of how reliable it is
as a user I want to rate a query with 1-5 stars
as a user I want to get information from multiple SPARQL engines at the same time
as a user I don’t want to wait for a fresh query to finish and just get the information from the latest time a query succeeded.
as a user I want a list of queries in the middleware
as a user I want to sort the list based on the rating of queries
as a user I want to annotate a query with a name
as a user I want to annotate a query with a Wikidata item as main subject
as a user I want to see a list of queries that are tagged with a certain topic (Wikidata item)
as a user I want an API to get information from the middleware about queries
as a user I want an API endpoint /list that gives me all queries with the main subject=Qxxxx
as a user I want the system to warn me and annotate a query that no longer returns the data I expect, i.e. if a query suddenly starts returning no results or fewer results
as a user I want to see a state on a query
as a user I want to log in using GitHub to avoid the hassle of creating another account
as a user I want to log in using GitLab to avoid the hassle of creating another account
as a user I want to log in using Facebook to avoid the hassle of creating another account
as a user I want to know how many backends a query is working on, so I get an overview
as a user I want to get query results immediately if possible so I don’t have to wait
as a user I want to import a query by copying and pasting a URL from WDQS
as a user I want to run a query on multiple backends with one click
as a user I want to fork a query and build on it
as a user I want an email if a query I'm watching needs investigation
as a user I want settings to control whether I get email notifications or not for all queries I'm watching
as a user I want to watch a query
as a user I want to see the history of actions of other users
as a user I want to know who created a query
as a user I want a setting to get email about new comments on queries I'm watching
as a user I want a setting to get emails about new queries every day, week, month
as a user I want to star a query
as a user I want to browse queries and sort by number of stars
as a user I want to see who starred a query
as a user I want to see a list of my notifications
as a Wikidata contributor I want to be able to override the “bad query” state
as a Wikidata user I want to be able to log in using my WMF credentials to avoid the hassle of creating another account
as a Wikidata user I want to link my current account to my WMF account so others can find me by username
as a developer I want to expose the data in API endpoints
as an LLM developer I want to consume the queries and use them to improve my LLM so it can suggest KNOWN GOOD queries to users
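Several of the stories above (listing queries by main subject, sorting by rating) amount to filtering and ordering a catalogue of query records. The following is a minimal sketch under stated assumptions: `QueryRecord` and `list_queries` are hypothetical names, the catalogue entries are invented sample data, and nothing here reflects how snapquery stores queries or implements its /list endpoint.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import List, Optional


@dataclass
class QueryRecord:
    """Hypothetical catalogue entry for a named query."""
    name: str
    main_subject: str                       # Wikidata QID, e.g. "Q146"
    ratings: List[int] = field(default_factory=list)  # 1-5 stars

    @property
    def average_rating(self) -> float:
        return mean(self.ratings) if self.ratings else 0.0


def list_queries(records: List[QueryRecord],
                 main_subject: Optional[str] = None,
                 sort_by_rating: bool = False) -> List[QueryRecord]:
    """Filter by main subject and optionally sort by average rating, best first."""
    selected = [
        r for r in records
        if main_subject is None or r.main_subject == main_subject
    ]
    if sort_by_rating:
        selected.sort(key=lambda r: r.average_rating, reverse=True)
    return selected


# Invented sample data for illustration.
catalogue = [
    QueryRecord("cats", "Q146", [5, 4]),
    QueryRecord("cat-breeds", "Q146", [5, 5]),
    QueryRecord("humans", "Q5", [3]),
]

names = [r.name for r in list_queries(catalogue, main_subject="Q146",
                                      sort_by_rating=True)]
print(names)
```

Unrated queries get an average of 0.0 here so that they sort last; a real system might prefer to treat them separately.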

Project details

Release history

0.0.10: May 28, 2024 (this version)
0.0.9: May 12, 2024
0.0.8: May 10, 2024
0.0.7: May 7, 2024
0.0.6: May 6, 2024
0.0.5: May 5, 2024
0.0.4: May 4, 2024
0.0.3: May 4, 2024
0.0.2: May 3, 2024
0.0.1: May 3, 2024

Download files

Source Distribution

snapquery-0.0.10.tar.gz (207.9 kB)

Uploaded May 28, 2024 Source

Built Distribution

snapquery-0.0.10-py3-none-any.whl (207.2 kB)

Uploaded May 28, 2024 Python 3

Hashes for snapquery-0.0.10.tar.gz
Algorithm	Hash digest
SHA256	`5df339e86f101c5e30155ad81f965b80920084cfdb9b3ffdf93018cf9bdba9ac`
MD5	`521aac8e6ae1393980152c342c1b0f44`
BLAKE2b-256	`e71c9c48081431778490767371105119953934228b98fc64a7244fab8b364267`

Hashes for snapquery-0.0.10-py3-none-any.whl
Algorithm	Hash digest
SHA256	`fbfc1fe7e4d95b0abbf5bd9e9de76e1351134bb87b3a1914b3a76c38056fdf17`
MD5	`e5bf083f9d7ab87b79c46336da5931ad`
BLAKE2b-256	`ff6b38049859199e6f44953d353860516575c34f4f45fe046fb04f527c34bccc`