MAT-classification: Analysis and Classification methods for Multiple Aspect Trajectory Data Mining

Project description

MAT-classification: Analysis and Classification methods for Multiple Aspect Trajectory Data Mining [MAT-Tools Framework]


[Publication] [citation.bib] [GitHub] [PyPi]

This package offers a tool to support the user in the task of classifying multiple aspect trajectories. It is integrated into a single framework for multiple aspect trajectories and, more generally, for multidimensional sequence data mining methods.

Created in December 2023. Copyright (C) 2023. License: GPL version 3 or later (see the LICENSE file).

Installation

Install directly from the PyPI repository, or download from GitHub (Python >= 3.7 required).

    pip3 install mat-classification
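
To confirm that the install succeeded, one option is to query the installed distribution from Python. This is a minimal sketch using only the standard library; the helper name `installed_version` is introduced here for illustration and is not part of the package. Note that `importlib.metadata` requires Python 3.8 or later, which is stricter than the package's own Python >= 3.7 requirement.

```python
from importlib import metadata  # standard library, Python >= 3.8


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Prints the version string if the package is installed, otherwise None.
print(installed_version("mat-classification"))
```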

Getting Started

To learn how to use this package, see MAT-classification-Tutorial.ipynb (or the HTML version, MAT-classification-Tutorial.html).

Available Classifiers (TODO update):

  • MLP (Movelet): Multilayer Perceptron (MLP) with movelet features. The models were implemented in Python with Keras, using a fully connected hidden layer of 100 units, a Dropout layer with a dropout rate of 0.5, a learning rate of 10⁻³, and a softmax activation function in the output layer. Adam optimization is used to minimize the categorical cross-entropy loss, with a batch size of 200 and a total of 200 epochs per training run. [REFERENCE]
  • RF (Movelet): Random Forest (RF) with movelet features, consisting of an ensemble of 300 decision trees. The models were implemented in Python with Keras. [REFERENCE]
  • SVM (Movelet): Support Vector Machine (SVM) with movelet features. The models were implemented in Python with Keras, using a linear kernel; all other settings are the defaults. [REFERENCE]
  • POI-S: Frequency-based method to extract features from trajectory datasets (TF-IDF approach). The method runs on one dimension at a time (or more if concatenated). The models were implemented in Python with Keras. [REFERENCE]
  • MARC: Uses word embeddings for trajectory classification. It encapsulates all trajectory dimensions (space, time, and semantics) and uses them as input to a neural network classifier. The spatial dimension is encoded with geohash and combined with the other dimensions. The models were implemented in Python with Keras. [REFERENCE]
  • TRF: Random Forest for trajectory data (TRF). The optimal set of hyperparameters for each model is found with grid search, varying the number of trees (ne), the maximum number of features to consider at every split (mf), the maximum number of levels in a tree (md), the minimum number of samples required to split a node (mss), the minimum number of samples required at each leaf node (msl), and the method of selecting samples for training each tree (bs). [REFERENCE]
  • XGBoost: The optimal set of hyperparameters for each model is found with grid search, varying the number of estimators (ne), the maximum depth of a tree (md), the learning rate (lr), the gamma (gm), the fraction of observations randomly sampled for each tree (ss), the subsample ratio of columns when constructing each tree (cst), and the regularization parameters (l1) and (l2). [REFERENCE]
  • BITULER: The optimal set of hyperparameters for each model is found with grid search, keeping the batch size at 64 and the learning rate at 0.001 while varying the units (un) of the recurrent layer, the embedding size of each attribute (es), and the dropout (dp). [REFERENCE]
  • TULVAE: The optimal set of hyperparameters for each model is found with grid search, keeping the batch size at 64 and the learning rate at 0.001 while varying the units (un) of the recurrent layer, the embedding size of each attribute (es), the dropout (dp), and the latent variable (z). [REFERENCE]
  • DEEPEST: DeepeST employs a Recurrent Neural Network (RNN), either LSTM or Bidirectional LSTM (BLSTM). The optimal set of hyperparameters for each model is found with grid search, keeping the batch size at 64 and the learning rate at 0.001 while varying the units (un) of the recurrent layer, the embedding size of each attribute (es), and the dropout (dp). [REFERENCE]
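
Several of the classifiers above are tuned with grid search, which simply evaluates every combination of candidate hyperparameter values and keeps the best one. The sketch below illustrates the idea with the TRF abbreviations (ne, mf, md, mss, msl, bs). It is not the package's implementation: the candidate values, the helper names `iter_grid` and `grid_search`, and the `toy_score` function are all assumptions made for illustration. In a real run, the scorer would train a model and return a validation metric.

```python
from itertools import product

# Hypothetical search space: the keys follow the TRF description,
# but every candidate value below is an illustrative assumption.
SEARCH_SPACE = {
    "ne": [100, 300],        # number of trees
    "mf": ["sqrt", "log2"],  # max features considered at each split
    "md": [10, 20],          # max levels in a tree
    "mss": [2, 5],           # min samples required to split a node
    "msl": [1, 2],           # min samples required at a leaf node
    "bs": [True, False],     # bootstrap sampling on/off
}


def iter_grid(space):
    """Yield one dict per combination of hyperparameter values."""
    names = list(space)
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))


def grid_search(space, evaluate):
    """Return (best_params, best_score) for the highest-scoring combination."""
    best_params, best_score = None, float("-inf")
    for params in iter_grid(space):
        score = evaluate(params)
        if score > best_score:  # ties keep the first combination found
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(params):
    """Placeholder scorer; a real run would train a model and validate it."""
    return params["ne"] / 1000 - abs(params["md"] - 20) / 100 - params["mss"] / 100


best_params, best_score = grid_search(SEARCH_SPACE, toy_score)
print(len(list(iter_grid(SEARCH_SPACE))), best_params)
```

The number of combinations grows multiplicatively with each added hyperparameter (here 2 values for each of 6 parameters gives 64 runs), which is why fixing some settings, such as the batch size and learning rate in the recurrent models above, keeps the search tractable.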

Available Scripts (TODO update):

Installing the package also installs the following Python scripts for use from the system command line:

  • MAT-TC.py: Script to run classifiers on trajectory datasets. For details, run: MAT-TC.py --help;
  • MAT-MC.py: Script to run movelet-based classifiers on trajectory datasets. For details, run: MAT-MC.py --help;
  • POIS-TC.py: Script to run POI-F/POI-S classifiers on the methods' feature matrix. For details, run: POIS-TC.py --help;
  • MARC.py: Script to run the MARC classifier on trajectory datasets. For details, run: MARC.py --help.

One script runs the POI-F/POI-S method:

  • POIS.py: Script to run the POI-F/POI-S feature extraction methods (poi, npoi, and wnpoi). For details, run: POIS.py --help.
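
To give a feel for frequency-based feature extraction on a single trajectory dimension, here is a minimal TF-IDF sketch in plain Python. It does not reproduce the poi, npoi, or wnpoi methods or the POIS.py script; the toy trajectories, the "POI category" dimension, and the function name `tfidf_features` are assumptions made for illustration only.

```python
import math
from collections import Counter

# Toy data (made up): each trajectory is the sequence of values of ONE
# dimension, here a hypothetical "POI category" attribute.
trajectories = {
    "t1": ["home", "work", "gym", "home"],
    "t2": ["home", "work", "work", "restaurant"],
    "t3": ["gym", "gym", "park"],
}


def tfidf_features(trajs):
    """Return (vocabulary, {trajectory_id: feature_vector}) using plain TF-IDF."""
    vocab = sorted({value for seq in trajs.values() for value in seq})
    n = len(trajs)
    # Document frequency: in how many trajectories each value appears.
    df = Counter(value for seq in trajs.values() for value in set(seq))
    idf = {value: math.log(n / df[value]) for value in vocab}
    features = {}
    for tid, seq in trajs.items():
        counts = Counter(seq)  # missing values count as 0
        features[tid] = [counts[v] / len(seq) * idf[v] for v in vocab]
    return vocab, features


vocab, features = tfidf_features(trajectories)
print(vocab)
print(features["t3"])
```

Each trajectory becomes one row of a feature matrix with one column per distinct value of the chosen dimension, which is the kind of input a standard classifier can consume.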

Citing

If you use matclassification, please cite the following paper (this package was split off from the automatize release):

Portela, Tarlis Tortelli; Bogorny, Vania; Bernasconi, Anna; Renso, Chiara. AutoMATise: Multiple Aspect Trajectory Data Mining Tool Library. 2022 23rd IEEE International Conference on Mobile Data Management (MDM), 2022, pp. 282-285, doi: 10.1109/MDM55031.2022.00060.

Bibtex:

@inproceedings{Portela2022automatise,
    title={AutoMATise: Multiple Aspect Trajectory Data Mining Tool Library},
    author={Portela, Tarlis Tortelli and Bogorny, Vania and Bernasconi, Anna and Renso, Chiara},
    booktitle = {2022 23rd IEEE International Conference on Mobile Data Management (MDM)},
    volume={},
    number={},
    address = {Online},
    year={2022},
    pages = {282--285},
    doi={10.1109/MDM55031.2022.00060}
}

Collaborate with us

Any contribution is welcome. This is an active project; if you would like to include your code, feel free to fork the project, open an issue, and contact us.

Feel free to contribute in any form, such as scientific publications referencing this package, teaching material and workshop videos.

Related packages

This package is part of the MAT-Tools Framework for Multiple Aspect Trajectory Data Mining. See the guide project:

  • mat-tools: Reference guide for MAT-Tools Framework repositories

And others:

Change Log

This package is under construction; see CHANGELOG.md.
