Automated machine learning framework for time series analysis

Fedot.Ind is an automated machine learning framework designed to solve industrial problems involving time series forecasting, classification, and regression. It is built on the AutoML framework FEDOT and uses its functionality to build and tune pipelines.

Installation

Fedot.Ind is available on PyPI and can be installed via pip:

pip install fedot_ind

To install the latest version from the main branch:

git clone https://github.com/aimclub/Fedot.Industrial.git
cd Fedot.Industrial
pip install -r requirements.txt
# Optional: run the test suite to verify the installation
pytest -s test/
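
After installing, a quick sanity check is to ask Python which version of the distribution it can see. The sketch below uses only the standard library. The helper name installed_version is illustrative, and it assumes the distribution is registered under the same name used in the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("fedot_ind")
    print(found if found is not None else "fedot_ind is not installed")
```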

How to Use

Fedot.Ind provides a high-level API that allows you to use its capabilities in a simple way. The API can be used for classification, regression, and time series forecasting problems, as well as for anomaly detection.

To use the API, follow these steps:

1. Import the FedotIndustrial class:

from fedot_ind.api.main import FedotIndustrial

2. Initialize the FedotIndustrial object and define the type of modeling task. It provides a fit/predict interface:

  • FedotIndustrial.fit() runs feature extraction and pipeline optimization, then returns the resulting composite pipeline;

  • FedotIndustrial.predict() predicts target values for the given input data using an already fitted pipeline;

  • FedotIndustrial.get_metrics() estimates the quality of predictions using selected metrics.

NumPy arrays or pandas DataFrames can be used as input data. In the example below, train_data and test_data are each a pair (x, y), where x is a pandas.DataFrame of features and y is a numpy.ndarray of targets:

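from fedot_ind.api.main import FedotIndustrial
from fedot_ind.tools.loader import DataLoader  # import path assumed; verify against your installed version

dataset_name = 'Epilepsy'

# Configure the run: task type, metric, time budget, and parallelism.
# logging_level=20 corresponds to logging.INFO.
industrial = FedotIndustrial(problem='classification',
                             metric='f1',
                             timeout=5,
                             n_jobs=2,
                             logging_level=20)

# Load the train and test splits of the dataset
train_data, test_data = DataLoader(dataset_name=dataset_name).load_data()

# Run feature extraction and pipeline optimization
model = industrial.fit(train_data)

# Predict labels and class probabilities, then score against the test targets
labels = industrial.predict(test_data)
probs = industrial.predict_proba(test_data)
metrics = industrial.get_metrics(target=test_data[1],
                                 rounding_order=3,
                                 metric_names=['f1', 'accuracy', 'precision', 'roc_auc'])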
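
For orientation, several of the metric names passed to get_metrics() have standard textbook definitions. The sketch below computes accuracy, precision, and F1 for the binary case in plain Python. It is not the library's implementation: binary_metrics and its positive argument are hypothetical names, and multiclass averaging (which a dataset with more than two classes would need) is left out for brevity.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int],
                   y_pred: Sequence[int],
                   positive: int = 1) -> Dict[str, float]:
    """Compute accuracy, precision and F1 for binary labels, rounded to 3 places."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0

    return {
        "accuracy": round(accuracy, 3),
        "precision": round(precision, 3),
        "f1": round(f1, 3),
    }


if __name__ == "__main__":
    print(binary_metrics([1, 0, 1, 1, 0, 1], [1, 0, 0, 1, 1, 1]))
```

The rounding to three places mirrors the rounding_order=3 argument in the example above.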

More information about the API is available in the documentation section.

Documentation and examples

The comprehensive documentation is available on readthedocs.

Useful tutorials and examples can be found in the examples folder.

Topic                        Example
---------------------------  ---------------------------------
Time series classification   Basic_TSC and Advanced_TSC
Time series regression       Basic_TSR, Advanced_TSR, Multi-TS
Forecasting                  SSA example
Anomaly detection            Coming soon
Model ensemble               Notebook

Benchmarking

Univariate time series classification

Benchmarking was performed on 112 of the 144 datasets in the UCR archive.

Algorithm         Top-1  Top-3  Top-5  Top-Half
----------------  -----  -----  -----  --------
Fedot_Industrial   17.0   23.0   26.0        38
HC2                16.0   55.0   77.0        88
FreshPRINCE        15.0   22.0   32.0        48
InceptionT         14.0   32.0   54.0        69
Hydra-MR           13.0   48.0   69.0        77
RDST                7.0   21.0   50.0        73
RSTSF               6.0   19.0   35.0        65
WEASEL_D            4.0   20.0   36.0        59
TS-CHIEF            3.0   11.0   21.0        30
HIVE-COTE v1.0      2.0    9.0   18.0        27
PF                  2.0    9.0   27.0        40
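
The table does not spell out how the Top-k columns were computed. One plausible reading is that each column counts the datasets on which an algorithm ranked within the top k by score. The sketch below implements that reading in plain Python. The function name top_k_counts, the tie handling, and the sample scores are all assumptions made for illustration, not the benchmark's actual protocol or data.

```python
from typing import Dict, List


def top_k_counts(scores: Dict[str, List[float]], k: int) -> Dict[str, int]:
    """Count, per algorithm, the datasets where it ranks within the top k.

    scores maps an algorithm name to its per-dataset scores (higher is better);
    all lists must have the same length.
    """
    algorithms = list(scores)
    n_datasets = len(scores[algorithms[0]])
    counts = {name: 0 for name in algorithms}
    for d in range(n_datasets):
        column = [scores[name][d] for name in algorithms]
        for name in algorithms:
            value = scores[name][d]
            # Rank is 1 plus the number of strictly better scores, so ties share a rank.
            rank = 1 + sum(1 for other in column if other > value)
            if rank <= k:
                counts[name] += 1
    return counts


if __name__ == "__main__":
    toy_scores = {
        "A": [0.90, 0.80, 0.70],
        "B": [0.80, 0.85, 0.70],
        "C": [0.70, 0.60, 0.90],
    }
    print(top_k_counts(toy_scores, k=1))
    print(top_k_counts(toy_scores, k=2))
```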

Multivariate time series classification

Benchmarking was performed on the following datasets: BasicMotions, Cricket, LSST, FingerMovements, HandMovementDirection, NATOPS, PenDigits, RacketSports, Heartbeat, AtrialFibrillation, SelfRegulationSCP2

Algorithm         Mean Rank
----------------  ---------
HC2                   5.038
ROCKET                6.481
Arsenal               7.615
Fedot_Industrial      7.712
DrCIF                 7.712
CIF                   8.519
MUSE                  8.700
HC1                   9.212
TDE                   9.731
ResNet               10.346
mrseql               10.625
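
Mean rank is usually obtained by ranking the algorithms on each dataset (1 is best) and averaging those ranks across datasets, with tied algorithms sharing the average of the positions they occupy. The sketch below shows that computation in plain Python. It is an illustration under those assumptions, with a hypothetical function name and made-up scores, and it does not claim to reproduce the exact procedure behind the table.

```python
from typing import Dict, List


def mean_ranks(scores: Dict[str, List[float]]) -> Dict[str, float]:
    """Average each algorithm's per-dataset rank (1 = best, higher score is better)."""
    algorithms = list(scores)
    n_datasets = len(scores[algorithms[0]])
    totals = {name: 0.0 for name in algorithms}
    for d in range(n_datasets):
        column = [scores[name][d] for name in algorithms]
        for name in algorithms:
            value = scores[name][d]
            better = sum(1 for other in column if other > value)
            tied = sum(1 for other in column if other == value)  # includes itself
            # Tied entries occupy positions better+1 .. better+tied; use their average.
            totals[name] += better + (tied + 1) / 2
    return {name: totals[name] / n_datasets for name in algorithms}


if __name__ == "__main__":
    toy_scores = {
        "A": [0.90, 0.80, 0.70],
        "B": [0.80, 0.85, 0.70],
        "C": [0.70, 0.60, 0.90],
    }
    for name, rank in sorted(mean_ranks(toy_scores).items(), key=lambda item: item[1]):
        print(f"{name}: {rank:.3f}")
```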

Time series regression

Benchmarking was performed on the following datasets: HouseholdPowerConsumption1, AppliancesEnergy, HouseholdPowerConsumption2, IEEEPPG, FloodModeling1, BeijingPM25Quality, BenzeneConcentration, FloodModeling3, BeijingPM10Quality, FloodModeling2, AustraliaRainfall

Algorithm         Mean Rank
----------------  ---------
FreshPRINCE           6.014
DrCIF                 6.786
Fedot_Industrial      8.114
InceptionT            8.957
RotF                  9.414
RIST                  9.786
TSF                   9.929
RandF                10.286
MultiROCKET          10.557
ResNet               11.171
SingleInception      11.571

Real world cases

Building energy consumption

The dataset is available on Kaggle.

A full notebook with the solution accompanies this case.

The challenge is to develop accurate counterfactual models that estimate energy consumption savings post-retrofit. Leveraging a dataset comprising three years of hourly meter readings from over a thousand buildings, the goal is to predict energy consumption (in kWh). Key predictors include air temperature, dew temperature, wind direction, and wind speed.

[Figures: building target; building results]

Results:

Algorithm         RMSE_average
----------------  ------------
FPCR                   455.941
Grid-SVR               464.389
FPCR-Bs                465.844
5NN-DTW                469.378
CNN                    484.637
Fedot.Industrial       486.398
RDST                   527.927
RandF                  527.343
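
The column name RMSE_average suggests a root mean squared error averaged over several evaluation runs or resamples, although the source does not say how the runs were formed. A minimal sketch of that calculation, with hypothetical function names and toy numbers:

```python
import math
from typing import Sequence, Tuple


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error between two equally long sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = ((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(sum(squared_errors) / len(y_true))


def average_rmse(runs: Sequence[Tuple[Sequence[float], Sequence[float]]]) -> float:
    """Average the RMSE over several (y_true, y_pred) evaluation runs."""
    if len(runs) == 0:
        raise ValueError("at least one run is required")
    return sum(rmse(y_true, y_pred) for y_true, y_pred in runs) / len(runs)


if __name__ == "__main__":
    toy_runs = [
        ([0.0, 0.0, 0.0, 0.0], [2.0, 2.0, 2.0, 2.0]),  # RMSE is 2.0
        ([1.0, 1.0], [1.0, 1.0]),                      # RMSE is 0.0
    ]
    print(average_rmse(toy_runs))
```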

Permanent magnet synchronous motor (PMSM) rotor temperature

The dataset is available on Kaggle.

A full notebook with the solution accompanies this case.

This dataset focuses on predicting the maximum recorded rotor temperature of a permanent magnet synchronous motor (PMSM) during 30-second intervals. The data, sampled at 2 Hz, includes sensor readings such as ambient temperature, coolant temperatures, d and q components of voltage, and current. These readings are aggregated into 6-dimensional time series of length 60, representing 30 seconds.

The challenge is to develop a predictive model using the provided predictors to accurately estimate the maximum rotor temperature, crucial for monitoring the motor’s performance and ensuring optimal operating conditions.
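
The windowing described above follows from the arithmetic 2 Hz × 30 s = 60 samples per window. The sketch below shows one way such samples could be built from raw sensor streams: each sample is a set of channel slices of length 60, and the target is the maximum rotor temperature within the same window. The function name make_windows and the choice of non-overlapping windows are assumptions for illustration, since the source does not describe how the published dataset was cut.

```python
from typing import List, Tuple

SAMPLE_RATE_HZ = 2
WINDOW_SECONDS = 30
WINDOW_LENGTH = SAMPLE_RATE_HZ * WINDOW_SECONDS  # 60 samples per window

Window = Tuple[List[List[float]], float]


def make_windows(channels: List[List[float]],
                 rotor_temp: List[float],
                 window_length: int = WINDOW_LENGTH) -> List[Window]:
    """Cut aligned sensor channels into non-overlapping windows.

    Returns (features, target) pairs, where features holds one slice per channel
    and target is the maximum rotor temperature inside the window.
    """
    n_samples = len(rotor_temp)
    if any(len(channel) != n_samples for channel in channels):
        raise ValueError("all channels must be as long as rotor_temp")

    windows: List[Window] = []
    # Trailing samples that do not fill a complete window are dropped.
    for start in range(0, n_samples - window_length + 1, window_length):
        stop = start + window_length
        features = [channel[start:stop] for channel in channels]
        target = max(rotor_temp[start:stop])
        windows.append((features, target))
    return windows


if __name__ == "__main__":
    n = 130
    toy_channels = [[float(i + c) for i in range(n)] for c in range(6)]
    toy_rotor_temp = [float(i) for i in range(n)]
    samples = make_windows(toy_channels, toy_rotor_temp)
    print(len(samples), len(samples[0][0]), len(samples[0][0][0]))
```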

[Figure: rotor temperature solution]

Results:

Algorithm         RMSE_average
----------------  ------------
Fedot.Industrial      1.158612
FreshPRINCE           1.490442
RIST                  1.501047
RotF                  1.559385
DrCIF                 1.594442
TSF                   1.684828


R&D plans

– Expansion of anomaly detection model list.

– Development of new time series forecasting models.

– Implementation of explainability module (Issue)

Citation

Here we will provide a list of citations for the project as soon as the articles are published.

@article{REVIN2023110483,
title = {Automated machine learning approach for time series classification pipelines using evolutionary optimisation},
journal = {Knowledge-Based Systems},
pages = {110483},
year = {2023},
issn = {0950-7051},
doi = {https://doi.org/10.1016/j.knosys.2023.110483},
url = {https://www.sciencedirect.com/science/article/pii/S0950705123002332},
author = {Ilia Revin and Vadim A. Potemkin and Nikita R. Balabanov and Nikolay O. Nikitin}
}
