Add Prophet algorithm in augurs-prophet crate #118

Merged: 68 commits on Oct 10, 2024
Changes from 63 commits
8b9365a
WIP - add Prophet algorithm in augurs-prophet crate
sd2k Oct 3, 2024
3dad4c8
Clean up some APIs, add comments to public methods and modules
sd2k Oct 3, 2024
5162862
Rename float error
sd2k Oct 3, 2024
4f64a11
Export OptProphetOptions
sd2k Oct 3, 2024
cdb22fa
Use strongly typed OptimizedParams
sd2k Oct 3, 2024
3aa29f3
Add README
sd2k Oct 3, 2024
5b5c1ed
Add .bacon-locations to gitignore
sd2k Oct 4, 2024
84f68db
Don't panic on predictions, return an error instead
sd2k Oct 4, 2024
606cd08
Rename seasonal indicators to seasonality conditions
sd2k Oct 4, 2024
5ee30da
Update predict method to take a struct, too
sd2k Oct 4, 2024
c530ceb
Add test for growth_init from Prophet; fix bug
sd2k Oct 4, 2024
fca7053
Add test for minmax growth init
sd2k Oct 4, 2024
0c8c4a8
Refactor and add a couple more tests
sd2k Oct 4, 2024
02270ce
Finish adding trend and piecewise methods and tests
sd2k Oct 7, 2024
05f3f39
Fix fourier_series, add tests
sd2k Oct 7, 2024
bce2173
Get simple point predictions working
sd2k Oct 7, 2024
6371dc9
Add doc-test bacon job, remove broken doc-test examples for now
sd2k Oct 7, 2024
ae5d5ef
Fix linting, do some small refactoring
sd2k Oct 7, 2024
efd6a6d
Get uncertainty working for MAP/MLE
sd2k Oct 8, 2024
d1ea0b8
Add some doc comments; rename extra_regressors to regressors
sd2k Oct 8, 2024
1b7d756
Split private Prophet impls across submodules to make it easier to na…
sd2k Oct 8, 2024
3c138ad
Use workspace dependency for rand
sd2k Oct 8, 2024
93a11dd
Include ds in predictions
sd2k Oct 8, 2024
ca34f37
Use i64 instead of u64 to represent timestamps
sd2k Oct 8, 2024
68c13f2
Add 'prophet' feature to augurs wrapper crate
sd2k Oct 8, 2024
ca8e49b
Handle possible invalid value in Laplace sampling
sd2k Oct 8, 2024
9557d64
Mention just/bacon in README; add new crate
sd2k Oct 8, 2024
afa2812
Add LICENSE files
sd2k Oct 8, 2024
afc9dc9
Improve error message & handling for PositiveFloat
sd2k Oct 8, 2024
cb6ffb6
Remove old NotImplemented error variant
sd2k Oct 8, 2024
2954e7b
Use FromPrimitive in nanmean
sd2k Oct 8, 2024
44637e8
Remove as_any method from Optimizer trait in tests
sd2k Oct 8, 2024
dfdbe47
Don't set trend to beta in mock optimizer
sd2k Oct 8, 2024
c03a256
Make sigma_obs parameter a PositiveFloat
sd2k Oct 8, 2024
5a81b86
Use a type parameter for the optimizer
sd2k Oct 8, 2024
d830efc
Add errors section to TrainingData methods
sd2k Oct 8, 2024
04cc2ed
Return NaN error instead of Infinite error when checking regressors
sd2k Oct 8, 2024
d73af57
Validate names aren't already present in seasonality
sd2k Oct 8, 2024
5344f1f
Add simple tests for lower/upper bounds
sd2k Oct 8, 2024
32188d0
Add newtype for interval width parameter
sd2k Oct 8, 2024
7ee90f8
Add link in README
sd2k Oct 8, 2024
6effe8f
Add test for nanmean
sd2k Oct 8, 2024
21e4e99
Correctly return NaNValue error from PredictionData::with_regressors
sd2k Oct 8, 2024
776c834
Make 'bacon' a link in README
sd2k Oct 8, 2024
59d69d9
Fix std dev calculation in regression standardization
sd2k Oct 8, 2024
ce7a0de
Handle holiday features, and refactor to remove string typing
sd2k Oct 9, 2024
3454254
Remove old TODO
sd2k Oct 9, 2024
a9faeba
Split 'custom' columns into separate hashmaps
sd2k Oct 9, 2024
c24d605
Calculate predictions for holidays, seasonalities and regressor terms…
sd2k Oct 9, 2024
4330913
Validate length of lower/upper window
sd2k Oct 9, 2024
6836f1e
Use match instead of if/else in Modes::insert
sd2k Oct 9, 2024
42bf289
Set jacobian = true if MAP was requested
sd2k Oct 9, 2024
938d272
Add test for constant regressor, fix logic for adding regressor scales
sd2k Oct 9, 2024
ad5b01d
Add basic test for fit/predict with a holiday
sd2k Oct 9, 2024
ac19559
Add test for custom holiday priors, fix handling of defaults
sd2k Oct 9, 2024
1e6b1fb
Add test for conditional custom seasonalities
sd2k Oct 9, 2024
90d29cf
Use statrs crate rather than copy/pasted code
sd2k Oct 9, 2024
6c4aec1
Merge branch 'main' into prophet
sd2k Oct 10, 2024
467a531
Refactor predict_features
sd2k Oct 10, 2024
bbb9fcc
Comment out EstimationMode::Mcmc, for now
sd2k Oct 10, 2024
9fd92a9
Mark Error and EstimationMode as non_exhaustive
sd2k Oct 10, 2024
d106f8b
Derive default for some additional options
sd2k Oct 10, 2024
37bb836
Improve detail on 'Scaling' error variant
sd2k Oct 10, 2024
7f139d2
Clarify error docs in PositiveFloat
sd2k Oct 10, 2024
628b34c
Use a single contiguous Vec for X matrix
sd2k Oct 10, 2024
8de6e90
Derive bytemuck::Pod and bytemuck::Zeroable for PositiveFloat
sd2k Oct 10, 2024
6f2ca8a
Change Optimizer to take parameters by reference
sd2k Oct 10, 2024
cf53b65
Enable method chaining with add_regressor and add_seasonality
sd2k Oct 10, 2024
1 change: 1 addition & 0 deletions .gitignore
@@ -1,3 +1,4 @@
/target
/Cargo.lock
.bacon-locations
.vscode
6 changes: 6 additions & 0 deletions Cargo.toml
@@ -31,13 +31,18 @@ augurs-ets = { version = "0.3.1", path = "crates/augurs-ets" }
augurs-forecaster = { path = "crates/augurs-forecaster" }
augurs-mstl = { version = "0.3.1", path = "crates/augurs-mstl" }
augurs-outlier = { version = "0.3.1", path = "crates/augurs-outlier" }
augurs-prophet = { version = "0.3.1", path = "crates/augurs-prophet" }
augurs-seasons = { version = "0.3.1", path = "crates/augurs-seasons" }
augurs-testing = { path = "crates/augurs-testing" }

chrono = "0.4.38"
distrs = "0.2.1"
itertools = "0.13.0"
num-traits = "0.2.19"
rand = "0.8.5"
roots = "0.0.8"
serde = { version = "1.0.166", features = ["derive"] }
statrs = "0.17.1"
thiserror = "1.0.40"
tinyvec = "1.6.0"
tracing = "0.1.37"
@@ -46,6 +51,7 @@ assert_approx_eq = "1.1.0"
criterion = "0.5.1"
iai = "0.1.1"
pprof = { version = "0.13.0", features = ["criterion", "frame-pointer", "prost-codec"] }
pretty_assertions = "1.4.1"

# See https://nnethercote.github.io/perf-book/build-configuration.html
# for more information on why we're using these settings.
11 changes: 11 additions & 0 deletions README.md
@@ -29,11 +29,19 @@ APIs are subject to change, and functionality may not be fully implemented.
| [`augurs-ets`][] | Automatic exponential smoothing models | alpha - non-seasonal models working and tested against statsforecast |
| [`augurs-mstl`][] | Multiple Seasonal Trend Decomposition using LOESS (MSTL) | beta - working and tested against R |
| [`augurs-outlier`][] | Outlier detection for time series | alpha |
| [`augurs-prophet`][] | The Prophet time series forecasting algorithm | alpha |
| [`augurs-seasons`][] | Seasonality detection using periodograms | alpha - working and tested against Python in limited scenarios |
| [`augurs-testing`][] | Testing data and, eventually, evaluation harness for implementations | alpha - just data right now |
| [`augurs-js`][] | WASM bindings to augurs | alpha |
| [`pyaugurs`][] | Python bindings to augurs | alpha |

## Developing

This project uses [`just`] as a command runner; this will need to be installed separately.
See the [`justfile`](./justfile) for more information.

Some of the tasks require [`bacon`], which will also need to be installed separately.

## Releasing

Releases are made using `release-plz`: a PR should be automatically created for each release, and merging will perform the release and publish automatically.
@@ -73,6 +81,9 @@ Licensed under the Apache License, Version 2.0 `<http://www.apache.org/licenses/
[`augurs-mstl`]: https://crates.io/crates/augurs-mstl
[`augurs-js`]: https://crates.io/crates/augurs-js
[`augurs-outlier`]: https://crates.io/crates/augurs-outlier
[`augurs-prophet`]: https://crates.io/crates/augurs-prophet
[`augurs-seasons`]: https://crates.io/crates/augurs-seasons
[`augurs-testing`]: https://crates.io/crates/augurs-testing
[`pyaugurs`]: https://crates.io/crates/pyaugurs
[`just`]: https://just.systems/man/en/
[`bacon`]: https://dystroy.org/bacon
113 changes: 113 additions & 0 deletions bacon.toml
@@ -0,0 +1,113 @@
# This is a configuration file for the bacon tool
#
# Bacon repository: https://github.com/Canop/bacon
# Complete help on configuration: https://dystroy.org/bacon/config/
# You can also check bacon's own bacon.toml file
# as an example: https://github.com/Canop/bacon/blob/main/bacon.toml

default_job = "clippy"
summary = true

[jobs.check]
command = ["cargo", "check", "--color", "always"]
need_stdout = false

[jobs.check-all]
command = ["cargo", "check", "--all-targets", "--color", "always"]
need_stdout = false

# Run clippy on the default target
[jobs.clippy]
command = [
"cargo", "clippy",
"--color", "always",
]
need_stdout = false

# Run clippy on all targets
# To disable some lints, you may change the job this way:
# [jobs.clippy-all]
# command = [
# "cargo", "clippy",
# "--all-targets",
# "--color", "always",
# "--",
# "-A", "clippy::bool_to_int_with_if",
# "-A", "clippy::collapsible_if",
# "-A", "clippy::derive_partial_eq_without_eq",
# ]
# need_stdout = false
[jobs.clippy-all]
command = [
"cargo", "clippy",
"--all-targets",
"--color", "always",
]
need_stdout = false

# This job lets you run
# - all tests: bacon test
# - a specific test: bacon test -- config::test_default_files
# - the tests of a package: bacon test -- -- -p config
[jobs.test]
command = [
"cargo", "nextest", "--color", "always",
"run", "--all-features", "--workspace",
]
need_stdout = true
analyzer = "nextest"

[jobs.doc-test]
command = [
"cargo", "test", "--doc", "--color", "always",
"--all-features", "--workspace",
"--exclude", "augurs-js",
"--exclude", "pyaugurs",
]
need_stdout = true

[jobs.doc]
command = ["cargo", "doc", "--color", "always", "--no-deps"]
need_stdout = false

# If the doc compiles, then it opens in your browser and bacon switches
# to the previous job
[jobs.doc-open]
command = ["cargo", "doc", "--color", "always", "--no-deps", "--open"]
need_stdout = false
on_success = "back" # so that we don't open the browser at each change

# You can run your application and have the result displayed in bacon,
# *if* it makes sense for this crate.
# Don't forget the `--color always` part or the errors won't be
# properly parsed.
# If your program never stops (eg a server), you may set `background`
# to false to have the cargo run output immediately displayed instead
# of waiting for program's end.
[jobs.run]
command = [
"cargo", "run",
"--color", "always",
# put launch parameters for your program behind a `--` separator
]
need_stdout = true
allow_warnings = true
background = true

# This parameterized job runs the example of your choice, as soon
# as the code compiles.
# Call it as
# bacon ex -- my-example
[jobs.ex]
command = ["cargo", "run", "--color", "always", "--example"]
need_stdout = true
allow_warnings = true

# You may define here keybindings that would be specific to
# a project, for example a shortcut to launch a specific job.
# Shortcuts to internal functions (scrolling, toggling, etc.)
# should go in your personal global prefs.toml file instead.
[keybindings]
# alt-m = "job:my-job"
c = "job:clippy-all" # comment this to have 'c' run clippy on only the default target
d = "job:doc-test"
2 changes: 1 addition & 1 deletion crates/augurs-ets/Cargo.toml
@@ -16,7 +16,7 @@ distrs.workspace = true
itertools.workspace = true
lstsq = "0.6.0"
nalgebra = "0.33.0"
rand = "0.8.5"
rand.workspace = true
rand_distr = "0.4.3"
roots.workspace = true
serde = { workspace = true, optional = true, features = ["derive"] }
2 changes: 1 addition & 1 deletion crates/augurs-outlier/Cargo.toml
@@ -16,7 +16,7 @@ bench = false
itertools.workspace = true
rayon = { version = "1.10.0", optional = true }
roots.workspace = true
rand = "0.8.5"
rand.workspace = true
rustc-hash = "2.0.0"
rv = { version = "0.17.0", default-features = false }
serde = { workspace = true, features = ["derive"], optional = true }
21 changes: 21 additions & 0 deletions crates/augurs-prophet/Cargo.toml
@@ -0,0 +1,21 @@
[package]
name = "augurs-prophet"
license.workspace = true
authors.workspace = true
documentation.workspace = true
repository.workspace = true
version.workspace = true
edition.workspace = true
keywords.workspace = true

[dependencies]
itertools.workspace = true
num-traits.workspace = true
rand.workspace = true
statrs.workspace = true
thiserror.workspace = true

[dev-dependencies]
augurs-testing.workspace = true
chrono.workspace = true
pretty_assertions.workspace = true
1 change: 1 addition & 0 deletions crates/augurs-prophet/LICENSE-APACHE
1 change: 1 addition & 0 deletions crates/augurs-prophet/LICENSE-MIT
72 changes: 72 additions & 0 deletions crates/augurs-prophet/README.md
@@ -0,0 +1,72 @@
# Prophet: forecasting at scale

`augurs-prophet` contains an implementation of the [Prophet]
time series forecasting library.

This crate aims to be low-dependency to enable it to run in as
many places as possible. With that said, we need to talk about
optimizers…

## Optimizers

The original Prophet library uses [Stan] to handle optimization and MCMC sampling.
Stan is a platform for statistical modeling which can perform Bayesian statistical
inference as well as maximum likelihood estimation using optimizers such as L-BFGS.
However, it is written in C++ and has non-trivial dependencies, which makes it
difficult to interface with from Rust (or, indeed, Python).

`augurs-prophet` (similar to the Python library) abstracts optimization
and sampling implementations behind the `Optimizer` and `Sampler` traits.
Implementations of these traits are yet to be written, but I have a few ideas:

### `cmdstan`

This is the approach now taken by the Python implementation, which uses
the `cmdstanpy` package and compiles the Stan program into a standalone
binary on installation. It then executes that binary during the fitting
stage to perform optimization or sampling, passing the data and
parameters between Stan and Python using files on the filesystem.
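
As a rough illustration of the file-based hand-off described above, the sketch below builds (but does not run) the command a Rust wrapper might use to invoke a compiled Stan model. The function name `optimize_command`, the binary name, and the file paths are hypothetical, and the argument layout (`optimize algorithm=lbfgs data file=… output file=…`) is an assumption about the CmdStan command line that should be checked against the CmdStan version in use.

```rust
use std::path::Path;
use std::process::Command;

/// Build a command that asks a compiled Stan model binary to run L-BFGS
/// optimization, reading its inputs from `data` and writing results to `output`.
///
/// Assumption: the argument order mirrors the CmdStan CLI; verify before use.
fn optimize_command(model_binary: &Path, data: &Path, output: &Path) -> Command {
    let mut cmd = Command::new(model_binary);
    cmd.arg("optimize")
        .arg("algorithm=lbfgs")
        .arg("data")
        .arg(format!("file={}", data.display()))
        .arg("output")
        .arg(format!("file={}", output.display()));
    cmd
}

fn main() {
    // Hypothetical paths; a real caller would first serialize the training
    // data to `data` and parse `output` after the process exits.
    let cmd = optimize_command(
        Path::new("./prophet_model"),
        Path::new("/tmp/prophet-data.json"),
        Path::new("/tmp/prophet-output.csv"),
    );
    let args: Vec<String> = cmd
        .get_args()
        .map(|a| a.to_string_lossy().into_owned())
        .collect();
    println!("{} {}", cmd.get_program().to_string_lossy(), args.join(" "));
}
```

Calling `.status()` or `.output()` on the returned `Command` would actually spawn the process, which is the step that depends on a filesystem and a process model.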

This works fine if you're operating in a desktop or server environment,
but poses issues when running in more esoteric environments such as
WebAssembly.

### `libstan`

We could choose to write a `libstan` crate which uses [`cxx`][cxx] to
interface directly with the C++ library generated by Stan. Since the
model code is constant (unless we upgrade the version of `stanc` used to
generate it), we could also write a small amount of C++ to make it
possible for us to pass data directly to it from Rust.

In theory this should work OK for any target which Stan can compile to.
The problem I've noticed is that Stan isn't particularly careful about
which headers it imports, so even when compiling just the `model.hpp` library
you end up importing a bunch of I/O and filesystem related headers
which aren't available when using standard WASM.

Perhaps we could clean Stan up so it didn't import those things? We should
be able to target most environments in that case.

### WASM Components

For WASM, we could abstract the C++ side of things behind a
[WASM component] which exposes an `optimize` interface,
and create a second Prophet component which imports that
interface to implement the `Optimizer` trait of this crate.

### A reimplementation of Stan

We could re-implement Stan in a new Rust crate and use that
here. This is likely to be by far the largest amount of work!
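
Whichever option is pursued, the model only needs something that satisfies the `Optimizer` trait. The sketch below shows the general shape of that abstraction with a model that is generic over its optimizer. Everything in it is a simplified assumption for illustration: the `Data`, `OptimizedParams`, `Model` and `MeanOptimizer` types, their fields, and the `optimize` signature are hypothetical and are not the crate's actual API.

```rust
/// Inputs handed to the optimizer (hypothetical, heavily simplified).
struct Data {
    y: Vec<f64>,
}

/// Parameters returned by the optimizer (hypothetical, heavily simplified).
#[derive(Debug, Clone, PartialEq)]
struct OptimizedParams {
    /// Constant offset of the fitted series.
    m: f64,
    /// Observation noise.
    sigma_obs: f64,
}

/// Anything that can turn data into fitted parameters.
trait Optimizer {
    fn optimize(&self, data: &Data) -> Result<OptimizedParams, String>;
}

/// Toy backend: the maximum likelihood estimate for a constant-mean
/// Gaussian model (sample mean and population standard deviation).
struct MeanOptimizer;

impl Optimizer for MeanOptimizer {
    fn optimize(&self, data: &Data) -> Result<OptimizedParams, String> {
        if data.y.is_empty() {
            return Err("no observations".to_string());
        }
        let n = data.y.len() as f64;
        let m = data.y.iter().sum::<f64>() / n;
        let var = data.y.iter().map(|v| (v - m).powi(2)).sum::<f64>() / n;
        Ok(OptimizedParams { m, sigma_obs: var.sqrt() })
    }
}

/// A model that is generic over its optimizer backend.
struct Model<O> {
    optimizer: O,
    params: Option<OptimizedParams>,
}

impl<O: Optimizer> Model<O> {
    fn new(optimizer: O) -> Self {
        Model { optimizer, params: None }
    }

    /// Delegate the fitting step to whichever backend was supplied.
    fn fit(&mut self, data: &Data) -> Result<(), String> {
        self.params = Some(self.optimizer.optimize(data)?);
        Ok(())
    }
}

fn main() {
    let mut model = Model::new(MeanOptimizer);
    model
        .fit(&Data { y: vec![1.0, 2.0, 3.0] })
        .expect("fit should succeed");
    let params = model.params.as_ref().expect("params are set after fit");
    println!("m = {}, sigma_obs = {}", params.m, params.sigma_obs);
}
```

With this shape, a `cmdstan`, `libstan` or WASM component backend would each be another `impl Optimizer`, and tests can substitute a cheap mock without touching the model code.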

## Credits

This implementation is based heavily on the original [Prophet] Python
package. Some changes have been made to make the APIs more idiomatic
Rust or to take advantage of the type system.
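
One way to "take advantage of the type system" is a validating newtype, so that a parameter which must be positive cannot be constructed with an invalid value. The sketch below is a guess at the general pattern only: the name `PositiveFloat` appears in this PR's commit messages, but the error type, its variants, and the conversions shown here are hypothetical and may not match the crate.

```rust
use std::convert::TryFrom;

/// A float that is guaranteed to be finite and strictly positive.
#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
struct PositiveFloat(f64);

/// Reasons a raw `f64` can be rejected (hypothetical variants).
#[derive(Debug, Clone, Copy, PartialEq)]
enum TryFromFloatError {
    NaN,
    Infinite,
    NotPositive,
}

impl TryFrom<f64> for PositiveFloat {
    type Error = TryFromFloatError;

    /// Validate once at the boundary; later code can rely on the invariant.
    fn try_from(value: f64) -> Result<Self, Self::Error> {
        if value.is_nan() {
            Err(TryFromFloatError::NaN)
        } else if value.is_infinite() {
            Err(TryFromFloatError::Infinite)
        } else if value <= 0.0 {
            Err(TryFromFloatError::NotPositive)
        } else {
            Ok(PositiveFloat(value))
        }
    }
}

impl From<PositiveFloat> for f64 {
    fn from(value: PositiveFloat) -> f64 {
        value.0
    }
}

fn main() {
    let sigma_obs = PositiveFloat::try_from(0.5).expect("0.5 is positive");
    println!("sigma_obs = {}", f64::from(sigma_obs));

    // Invalid inputs are rejected with a specific reason instead of a panic.
    println!("{:?}", PositiveFloat::try_from(-1.0));
    println!("{:?}", PositiveFloat::try_from(f64::NAN));
    println!("{:?}", PositiveFloat::try_from(f64::INFINITY));
}
```

The design choice is that validation happens once, when the value is created, rather than being repeated (or forgotten) at every use site.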

[Prophet]: https://facebook.github.io/prophet/
[Stan]: https://mc-stan.org/
[cxx]: https://cxx.rs/
[WASM component]: https://component-model.bytecodealliance.org/