Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge branch 'master' of github.com:arnedb/tsfuse
Showing 1 changed file with 91 additions and 10 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,25 +1,106 @@ | ||
![tests](https://github.com/arnedb/tsfuse/workflows/tests/badge.svg) | ||
<h1 align="center">TSFuse</h1> | ||
|
||
# TSFuse | ||
<p align="center">Python package for automatically constructing features from multiple time series</p> | ||
|
||
Python package for automatically constructing features from multi-view time series data. | ||
<p align="center"> | ||
<a href="https://badge.fury.io/py/tsfuse"> | ||
<img alt="PyPI" src="https://badge.fury.io/py/tsfuse.svg"> | ||
</a> | ||
<a href="https://github.com/arnedb/tsfuse/actions/workflows/tests.yml"> | ||
<img alt="tests" src="https://github.com/arnedb/tsfuse/workflows/tests/badge.svg" /> | ||
</a> | ||
</p> | ||
|
||
## Installation | ||
<hr> | ||
|
||
TSFuse requires Python 3 and is available on PyPI: | ||
## Installation | ||
|
||
Install the latest release using pip: | ||
|
||
pip install tsfuse | ||
|
||
Alternatively, you can install the latest, unreleased version from GitHub: | ||
## Quickstart | ||
|
||
The example below shows the basic usage of TSFuse. | ||
|
||
### Data format | ||
|
||
The input of TSFuse is a dataset where each instance is a window that consists of multiple time series and a label. | ||
|
||
#### Time series | ||
|
||
Time series are represented using a dictionary where each entry represents a univariate or multivariate time series. As an example, let's create a dictionary with two univariate time series: | ||
|
||
```python
from pandas import DataFrame
from tsfuse.data import Collection
X = {
    "x1": Collection(DataFrame({
        "id": [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3],
        "time": [0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2],
        "data": [1, 2, 3, 1, 2, 3, 3, 2, 1, 3, 2, 1],
    })),
    "x2": Collection(DataFrame({
        "id": [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3],
        "time": [0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2],
        "data": [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3],
    })),
}
```
|
||
The two univariate time series are named `x1` and `x2`, and each series is represented as a `Collection` object. Each `Collection` is initialized with a DataFrame that has three columns: | ||
|
||
pip install git+https://github.com/arnedb/tsfuse#egg=tsfuse | ||
- `id` which is the identifier of each instance, i.e., each window, | ||
- `time` which contains the time stamps, | ||
- `data` which contains the time series data itself. | ||
|
||
For multivariate time series data, there can be multiple columns similar to the `data` column. For example, the data of a tri-axial accelerometer would have three columns `x`, `y`, `z` instead of `data` as it simultaneously measures the `x`, `y`, `z` acceleration. | ||
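
The multivariate layout described above can be sketched as follows. This is a minimal illustration, not part of TSFuse: the helper `to_long_format`, the series name `acc`, and the sample values are assumptions made for the example. It only builds the column dictionary (`id`, `time`, `x`, `y`, `z`); wrapping it in a `DataFrame` and a `Collection` is assumed to work the same way as in the univariate example.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical helper (not part of TSFuse): flatten per-window accelerometer
# samples into the long-format columns `id`, `time`, `x`, `y`, `z`.
def to_long_format(
    windows: Sequence[Sequence[Tuple[float, float, float]]]
) -> Dict[str, List]:
    columns: Dict[str, List] = {"id": [], "time": [], "x": [], "y": [], "z": []}
    for window_id, samples in enumerate(windows):
        for t, (x, y, z) in enumerate(samples):
            columns["id"].append(window_id)  # one id per window
            columns["time"].append(t)        # time stamp within the window
            columns["x"].append(x)
            columns["y"].append(y)
            columns["z"].append(z)
    return columns

# Two windows of three samples each (made-up values).
windows = [
    [(0.0, 0.1, 9.8), (0.1, 0.1, 9.7), (0.2, 0.0, 9.8)],
    [(1.0, 0.5, 9.0), (0.9, 0.4, 9.1), (0.8, 0.6, 9.2)],
]
acc_columns = to_long_format(windows)

# Assumed usage, mirroring the univariate example:
# X["acc"] = Collection(DataFrame(acc_columns))
```

Each window contributes one row per time stamp, so all five columns have the same length and the three axes stay aligned on `id` and `time`.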
|
||
#### Labels | ||
|
||
There should be one target value for each window, so we create a `Series` where the index contains all unique `id` values of the time series data and the data consists of the labels: | ||
|
||
```python
from pandas import Series
y = Series(index=[0, 1, 2, 3], data=[0, 0, 1, 1])
```
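
Because every window needs exactly one label, it can help to check that the `id` values of each series match the label index before constructing features. The sketch below is a hypothetical helper, not a TSFuse function; `check_window_ids` and its error message are assumptions made for illustration.

```python
from typing import Dict, Hashable, Iterable

# Hypothetical helper (not part of TSFuse): verify that every series covers
# exactly the same window ids as the labels.
def check_window_ids(
    series_ids: Dict[str, Iterable[Hashable]],
    label_ids: Iterable[Hashable],
) -> None:
    expected = set(label_ids)
    for name, ids in series_ids.items():
        found = set(ids)
        if found != expected:
            missing = sorted(expected - found)  # labeled ids with no rows in this series
            extra = sorted(found - expected)    # series ids that have no label
            raise ValueError(
                f"{name}: labels without data {missing}, data without labels {extra}"
            )

ids = [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3]
check_window_ids({"x1": ids, "x2": ids}, [0, 1, 2, 3])  # passes silently
```

The helper accepts any iterables, so with pandas it could be given the `id` column of each DataFrame (before wrapping it in `Collection`) and `y.index`.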
|
||
### Feature construction | ||
|
||
To construct features, TSFuse provides a `construct` function which takes time series data `X` and target data `y` as input, and returns a `DataFrame` where each column corresponds to a feature. In addition, this function can return a computation graph which contains all transformation steps required to compute the features for new data: | ||
|
||
```python
from tsfuse import construct
features, graph = construct(X, y, return_graph=True)
```
|
||
To apply this computation graph to new data, simply call `transform` with a time series dictionary `X` as input: | ||
|
||
```python
features = graph.transform(X)
```
|
||
## Documentation | ||
|
||
The documentation is available at [https://arnedb.github.io/tsfuse/](https://arnedb.github.io/tsfuse/). | ||
|
||
## Paper | ||
## Citing TSFuse | ||
|
||
If you use TSFuse for a scientific publication, please consider citing this paper: | ||
|
||
To learn more about TSFuse's feature construction method, read the following paper: | ||
> De Brabandere, A., Op De Beéck, T., Hendrickx, K., Meert, W., & Davis, J. [TSFuse: automated feature construction for multiple time series data](https://doi.org/10.1007/s10994-021-06096-2). *Machine Learning* (2022) | ||
> De Brabandere, A., Op De Beéck, T., Hendrickx, K., Meert, W., & Davis, J. (2022). [TSFuse: Automated feature construction for multiple time series data](https://link.springer.com/article/10.1007/s10994-021-06096-2). Machine Learning. | ||
```bibtex
@article{tsfuse,
  author = {De Brabandere, Arne
    and Op De Be{\'e}ck, Tim
    and Hendrickx, Kilian
    and Meert, Wannes
    and Davis, Jesse},
  title = {TSFuse: automated feature construction for multiple time series data},
  journal = {Machine Learning},
  year = {2022},
  doi = {10.1007/s10994-021-06096-2},
  url = {https://doi.org/10.1007/s10994-021-06096-2}
}
```