Significance Analysis for HPO-algorithms performing on multiple benchmarks

These details have not been verified by PyPI

View statistics for this project via Libraries.io, or by using our public dataset on Google BigQuery

Project description

Significance Analysis

This package analyses datasets of results from different hyperparameter optimization (HPO) algorithms evaluated on multiple benchmarks.

Note

As indicated by the v0.x.x version number, Significance Analysis is early-stage code and its APIs might change in the future.

Documentation

Please have a look at our example. The dataset should have the following format:

system_id (algorithm name)	input_id (benchmark name)	metric (mean/estimate)	bin_id (budget/training round, optional)
Algorithm1	Benchmark1	x.xxx	1
Algorithm1	Benchmark1	x.xxx	2
Algorithm1	Benchmark2	x.xxx	1
...	...	...	...
Algorithm2	Benchmark2	x.xxx	2

In this dataset, two algorithms are each run on two benchmarks for two iterations. The variable names (system_id, input_id, ...) can be customized, but they have to be consistent throughout the dataset, e.g. not "mean" for one benchmark and "estimate" for another. The conduct_analysis function is then called with the dataset and the variable names as parameters. Optionally, the dataset can be binned according to a fourth variable (bin_id), in which case the analysis is conducted on each bin separately. To do this, provide the name of the bin_id variable and, if desired, the exact bins and bin labels. Otherwise, a bin is created for each unique value.
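To make the expected layout concrete, here is a minimal sketch that builds a dataset in this long format with pandas and illustrates the two binning options. The metric values are made up for illustration, the column name budget_bin is hypothetical, and the binning shown uses plain pandas to convey the idea; it is not taken from the package's own implementation.

```python
import pandas as pd

# Placeholder results (values invented for illustration only):
# 2 algorithms x 2 benchmarks x 2 budgets, one row per observation.
rows = [
    ("Algorithm1", "Benchmark1", 0.812, 1),
    ("Algorithm1", "Benchmark1", 0.845, 2),
    ("Algorithm1", "Benchmark2", 0.701, 1),
    ("Algorithm1", "Benchmark2", 0.733, 2),
    ("Algorithm2", "Benchmark1", 0.790, 1),
    ("Algorithm2", "Benchmark1", 0.851, 2),
    ("Algorithm2", "Benchmark2", 0.688, 1),
    ("Algorithm2", "Benchmark2", 0.742, 2),
]
data = pd.DataFrame(rows, columns=["system_id", "input_id", "metric", "bin_id"])

# Option 1: one bin per unique bin_id value (the default described above).
per_value = {bin_value: group for bin_value, group in data.groupby("bin_id")}

# Option 2: explicit bin edges with labels, sketched here with pandas.cut.
# Edges [0, 1, 2] give the intervals (0, 1] and (1, 2].
data["budget_bin"] = pd.cut(data["bin_id"], bins=[0, 1, 2], labels=["low", "high"])

print(data)
```

Each entry of per_value holds the rows for a single budget, which is the subset an analysis per bin would operate on.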

Installation

Using R (>= 4.0.0), install the packages Matrix, emmeans, lmerTest and lme4.

Using pip

pip install significance-analysis

Usage

1. Generate data from HPO algorithms on benchmarks and save it in the format described above.
2. Call the conduct_analysis function on the dataset, specifying the variable names.

In code, the usage pattern can look like this:

import pandas as pd
from significance_analysis import conduct_analysis

# 1. Generate/import dataset
data = pd.read_csv("./significance_analysis_example/exampleDataset.csv")

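# 2. Analyse dataset, passing the column names used in the example dataset
#    (here "mean" holds the metric, "acquisition" the algorithm and "benchmark" the benchmark)
conduct_analysis(data, "mean", "acquisition", "benchmark")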

For more details and features please have a look at our example.
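Before calling conduct_analysis, it can help to confirm that the dataset matches the format described in the Documentation section. The helper below is a hypothetical sketch and not part of significance_analysis; the name check_dataset and the specific checks are assumptions made for illustration, using the column names from the usage example.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def check_dataset(data, metric, system_id, input_id, bin_id=None):
    """Hypothetical pre-flight check; not part of significance_analysis."""
    required = [metric, system_id, input_id]
    if bin_id is not None:
        required.append(bin_id)

    # Every variable name passed to the analysis must exist as a column.
    missing = [col for col in required if col not in data.columns]
    if missing:
        raise ValueError(f"Missing columns: {missing}")

    # The metric column must hold numbers.
    if not is_numeric_dtype(data[metric]):
        raise ValueError(f"Column '{metric}' must be numeric")

    # Reject rows with missing values in any required column.
    if data[required].isna().any().any():
        raise ValueError("Dataset contains missing values in required columns")

    return data[required]


# Small made-up frame using the column names from the usage example.
example = pd.DataFrame(
    {
        "mean": [0.81, 0.79],
        "acquisition": ["Algorithm1", "Algorithm2"],
        "benchmark": ["Benchmark1", "Benchmark1"],
    }
)
checked = check_dataset(example, "mean", "acquisition", "benchmark")
print(checked.columns.tolist())
```

A check like this fails early with a readable message, for example when one file names the metric column "mean" and another names it "estimate".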

Project details

Release history

This version

0.1.11

Oct 6, 2023

0.1.10

Aug 19, 2023

0.1.9

Aug 9, 2023

0.1.8

Aug 8, 2023

0.1.7

Apr 8, 2023

0.1.6

Apr 7, 2023

0.1.5

Mar 27, 2023

0.1.4

Mar 27, 2023

0.1.2

Mar 14, 2023

0.1.1

Mar 14, 2023

0.1.0

Mar 14, 2023

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

significance_analysis-0.1.11.tar.gz (9.8 kB)

Uploaded Oct 6, 2023 Source

Built Distribution

significance_analysis-0.1.11-py3-none-any.whl (8.5 kB)

Uploaded Oct 6, 2023 Python 3

Hashes for significance_analysis-0.1.11.tar.gz
Algorithm	Hash digest
SHA256	`045edefad21b913e2d4a8e57b8bbf43b3a04236368cc42466a8e1119c7d63a40`
MD5	`e27d317ca3787243f012ec35e8b8ad82`
BLAKE2b-256	`bca2c035ec747ffc49ec005266599b57822540597806f1a7ef6f2312a4a81e05`

Hashes for significance_analysis-0.1.11-py3-none-any.whl
Algorithm	Hash digest
SHA256	`fd99c929b333755f0602d8dde95b0178f79295c9d76fc3b324855257bc225a56`
MD5	`781413bbef074a75babe56ac450a56b2`
BLAKE2b-256	`3b6ebef44c4e43806fc9d87fefa3e8164ebc854f328b50aea9d4f2261d2ed3f3`