Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add power transformation logic to forecaster transforms #185

Merged
merged 34 commits into from
Dec 11, 2024
Merged
Show file tree
Hide file tree
Changes from 16 commits
Commits
Show all changes
34 commits
Select commit Hold shift + click to select a range
4b0d100
add transformation crate and boxcox
edwardcqian Dec 5, 2024
0bcedb4
move power transform logic to forecasting and make transform pub
edwardcqian Dec 5, 2024
4408e2d
formatting update
edwardcqian Dec 5, 2024
834cf61
assert that box cox is strictly positive
edwardcqian Dec 5, 2024
a6e8d56
order dependencies
edwardcqian Dec 6, 2024
6ae5e95
check that data is not empty in boxcoxllf
edwardcqian Dec 6, 2024
d922b47
handle error in optimize lambda
edwardcqian Dec 6, 2024
8a61a50
specify lambda in BoxCox
edwardcqian Dec 6, 2024
4e43431
panic -> assert
edwardcqian Dec 6, 2024
5f7f8cc
add power transformation
edwardcqian Dec 6, 2024
c89d42e
change box cox problem from vec tor array
edwardcqian Dec 6, 2024
efc36c1
change visibilities
edwardcqian Dec 6, 2024
c3a908d
boxcox -> box_cox
edwardcqian Dec 6, 2024
ebfd3c7
fix format and naming
edwardcqian Dec 6, 2024
8d4d2d7
update tests
edwardcqian Dec 9, 2024
214bee9
update scope
edwardcqian Dec 9, 2024
3d5ea97
update optimize_lambda to handle errors
edwardcqian Dec 9, 2024
f5fa4c9
panic if no optimal lambda found
edwardcqian Dec 9, 2024
4460c58
add test for power transformation
edwardcqian Dec 9, 2024
0d92206
assert -> error for box_cox
edwardcqian Dec 9, 2024
b164377
Merge remote-tracking branch 'origin/main' into add-transformation-cr…
edwardcqian Dec 9, 2024
7f4a719
added yeo_johnson transformation + corresponding power logic
edwardcqian Dec 9, 2024
eeecd63
linting fix
edwardcqian Dec 9, 2024
f8ae9b0
linting updare
edwardcqian Dec 9, 2024
2d7c724
fmt update
edwardcqian Dec 9, 2024
434ab52
complete comment for power_transform
edwardcqian Dec 10, 2024
75bf73a
revert scope changes
edwardcqian Dec 10, 2024
934e431
remove duplicated logic
edwardcqian Dec 10, 2024
6e82356
fix yeo johnson implementation and add tests
edwardcqian Dec 10, 2024
0a8ff4e
use explicit import instead of *
edwardcqian Dec 10, 2024
6b7644c
change power_transform to return error
edwardcqian Dec 10, 2024
575242d
extract default parameters
edwardcqian Dec 10, 2024
2e431ae
error checking for inverse box cox
edwardcqian Dec 10, 2024
c1c9303
Reformat doc comments for some methods
sd2k Dec 11, 2024
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 2 additions & 1 deletion crates/augurs-forecaster/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -13,12 +13,13 @@ description = "A high-level API for the augurs forecasting library."
bench = false

[dependencies]
argmin = "0.10.0"
augurs-core.workspace = true
itertools.workspace = true
thiserror.workspace = true

[dev-dependencies]
augurs.workspace = true
augurs = { workspace = true, features = ["mstl", "ets", "forecaster"]}
augurs-testing.workspace = true

[lints]
Expand Down
1 change: 1 addition & 0 deletions crates/augurs-forecaster/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@
mod data;
mod error;
mod forecaster;
mod power_transforms;
pub mod transforms;

pub use data::Data;
Expand Down
72 changes: 72 additions & 0 deletions crates/augurs-forecaster/src/power_transforms.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,72 @@
use crate::transforms::box_cox;
use argmin::core::*;
use argmin::solver::brent::BrentOpt;

fn box_cox_log_likelihood(data: &[f64], lambda: f64) -> f64 {
let n = data.len() as f64;
assert!(n > 0.0, "Data must not be empty");
let transformed_data: Vec<f64> = data.iter().map(|&x| box_cox(x, lambda)).collect();
let mean_transformed: f64 = transformed_data.iter().copied().sum::<f64>() / n;
let variance: f64 = transformed_data
.iter()
.map(|&x| (x - mean_transformed).powi(2))
.sum::<f64>()
/ n;
edwardcqian marked this conversation as resolved.
Show resolved Hide resolved

// Avoid log(0) by ensuring variance is positive
let log_likelihood =
-0.5 * n * variance.ln() + (lambda - 1.0) * data.iter().map(|&x| x.ln()).sum::<f64>();
sd2k marked this conversation as resolved.
Show resolved Hide resolved
log_likelihood
}

#[derive(Clone)]
struct BoxCoxProblem<'a> {
data: &'a [f64],
}

impl CostFunction for BoxCoxProblem<'_> {
type Param = f64;
type Output = f64;

// The goal is to minimize the negative log-likelihood
fn cost(&self, lambda: &Self::Param) -> Result<Self::Output, Error> {
Ok(-box_cox_log_likelihood(&self.data, *lambda))
}
}
edwardcqian marked this conversation as resolved.
Show resolved Hide resolved

/// Optimize the lambda parameter for the Box-Cox transformation
pub (crate) fn optimize_lambda(data: &[f64]) -> f64 {
let cost = BoxCoxProblem { data: data };
let init_param = 0.5;
let solver = BrentOpt::new(-2.0, 2.0);
sd2k marked this conversation as resolved.
Show resolved Hide resolved

let result = Executor::new(cost, solver)
.configure(|state| state.param(init_param).max_iters(100))
.run();

match result {
Ok(result) => result.state().best_param.unwrap(),
Err(error) => panic!("Optimization failed: {}", error),
}
}

#[cfg(test)]
mod test {
use super::*;
use augurs_testing::assert_approx_eq;

#[test]
fn correct_optimal_lambda() {
let data = &[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7];
let got = optimize_lambda(data);
assert_approx_eq!(got, 0.7123778635679304);
}

#[test]
fn test_boxcox_llf() {
let data = &[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7];
let lambda = 1.0;
let got = box_cox_log_likelihood(data, lambda);
assert_approx_eq!(got, 11.266065387038703);
}
}
150 changes: 147 additions & 3 deletions crates/augurs-forecaster/src/transforms.rs
Original file line number Diff line number Diff line change
Expand Up @@ -7,14 +7,22 @@ use augurs_core::{
Forecast,
};

use crate::power_transforms::optimize_lambda;

/// Transforms and Transform implementations.
///
/// The `Transforms` struct is a collection of `Transform` instances that can be applied to a time series.
/// The `Transform` enum represents a single transformation that can be applied to a time series.
#[derive(Debug, Default)]
pub(crate) struct Transforms(Vec<Transform>);

impl Transforms {
/// create a new `Transforms` instance with the given transforms.
pub(crate) fn new(transforms: Vec<Transform>) -> Self {
Self(transforms)
}

/// Apply the transformations to the given time series.
pub(crate) fn transform<'a, T>(&'a self, input: T) -> Box<dyn Iterator<Item = f64> + 'a>
where
T: Iterator<Item = f64> + 'a,
Expand All @@ -24,6 +32,7 @@ impl Transforms {
.fold(Box::new(input) as _, |y, t| t.transform(y))
}

/// Apply the inverse transformations to the given forecast.
pub(crate) fn inverse_transform(&self, forecast: Forecast) -> Forecast {
self.0
.iter()
Expand All @@ -46,6 +55,13 @@ pub enum Transform {
Logit,
/// Log transform.
Log,
/// Box-Cox transform.
BoxCox {
/// The lambda parameter for the Box-Cox transformation.
/// If lambda == 0, the transformation is equivalent to the natural logarithm.
/// Otherwise, the transformation is (x^lambda - 1) / lambda.
lambda: f64,
},
}

impl Transform {
Expand Down Expand Up @@ -81,7 +97,28 @@ impl Transform {
Self::Log
}

pub(crate) fn transform<'a, T>(&'a self, input: T) -> Box<dyn Iterator<Item = f64> + 'a>
/// Create a new Box-Cox transform.
///
/// This transform applies the Box-Cox transformation to each item.
/// The Box-Cox transformation is defined as:
/// - if lambda == 0: x.ln()
/// - otherwise: (x^lambda - 1) / lambda
pub fn boxcox(lambda: f64) -> Self {
edwardcqian marked this conversation as resolved.
Show resolved Hide resolved
Self::BoxCox { lambda }
}

/// Create the power transform that optimizes the lambda parameter for the Box-Cox transformation.
///
/// This transform applies the Power transformation to each item.
/// The Power transformation is defined as:
///
edwardcqian marked this conversation as resolved.
Show resolved Hide resolved
pub fn power_transform(data: &[f64]) -> Self {
let lambda = optimize_lambda(data);
Self::BoxCox { lambda }
}
edwardcqian marked this conversation as resolved.
Show resolved Hide resolved

/// Apply the transformation to the given time series.
pub fn transform<'a, T>(&'a self, input: T) -> Box<dyn Iterator<Item = f64> + 'a>
sd2k marked this conversation as resolved.
Show resolved Hide resolved
where
T: Iterator<Item = f64> + 'a,
{
Expand All @@ -90,10 +127,12 @@ impl Transform {
Self::MinMaxScaler(params) => Box::new(input.min_max_scale(params.clone())),
Self::Logit => Box::new(input.logit()),
Self::Log => Box::new(input.log()),
Self::BoxCox { lambda } => Box::new(input.box_cox(*lambda)),
}
}

pub(crate) fn inverse_transform<'a, T>(&'a self, input: T) -> Box<dyn Iterator<Item = f64> + 'a>
/// Apply the inverse transformation to the given time series.
pub fn inverse_transform<'a, T>(&'a self, input: T) -> Box<dyn Iterator<Item = f64> + 'a>
sd2k marked this conversation as resolved.
Show resolved Hide resolved
where
T: Iterator<Item = f64> + 'a,
{
Expand All @@ -102,10 +141,12 @@ impl Transform {
Self::MinMaxScaler(params) => Box::new(input.inverse_min_max_scale(params.clone())),
Self::Logit => Box::new(input.logistic()),
Self::Log => Box::new(input.exp()),
Self::BoxCox { lambda } => Box::new(input.inverse_box_cox(*lambda)),
}
}

pub(crate) fn inverse_transform_forecast(&self, mut f: Forecast) -> Forecast {
/// Apply the inverse transformations to the given forecast.
pub fn inverse_transform_forecast(&self, mut f: Forecast) -> Forecast {
f.point = self.inverse_transform(f.point.into_iter()).collect();
if let Some(mut intervals) = f.intervals.take() {
intervals.lower = self
Expand Down Expand Up @@ -371,6 +412,91 @@ trait ExpExt: Iterator<Item = f64> {

impl<T> ExpExt for T where T: Iterator<Item = f64> {}

/// Returns the Box-Cox transformation of the given value.
/// Assumes x > 0.
pub fn box_cox(x: f64, lambda: f64) -> f64 {
assert!(
x > 0.0,
"Input x must be positive for Box-Cox transformation."
);
if lambda == 0.0 {
x.ln()
} else {
(x.powf(lambda) - 1.0) / lambda
}
}
edwardcqian marked this conversation as resolved.
Show resolved Hide resolved

/// An iterator adapter that applies the Box-Cox transformation to each item.
#[derive(Clone, Debug)]
struct BoxCox<T> {
inner: T,
lambda: f64,
}

impl<T> Iterator for BoxCox<T>
where
T: Iterator<Item = f64>,
{
type Item = f64;
fn next(&mut self) -> Option<Self::Item> {
self.inner.next().map(|x| box_cox(x, self.lambda))
}
}

trait BoxCoxExt: Iterator<Item = f64> {
fn box_cox(self, lambda: f64) -> BoxCox<Self>
where
Self: Sized,
{
BoxCox {
inner: self,
lambda,
}
}
}

impl<T> BoxCoxExt for T where T: Iterator<Item = f64> {}

/// Returns the inverse Box-Cox transformation of the given value.
fn inverse_box_cox(y: f64, lambda: f64) -> f64 {
if lambda == 0.0 {
y.exp()
} else {
(y * lambda + 1.0).powf(1.0 / lambda)
}
}
Copy link
Contributor

@coderabbitai coderabbitai bot Dec 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Add domain validation to prevent invalid computations.

The function should validate that y * lambda + 1.0 is positive when computing the power function.

fn inverse_box_cox(y: f64, lambda: f64) -> Result<f64, &'static str> {
    if lambda == 0.0 {
        Ok(y.exp())
    } else {
        let value = y * lambda + 1.0;
        if value <= 0.0 {
            Err("Invalid domain for inverse Box-Cox transformation")
        } else {
            Ok(value.powf(1.0 / lambda))
        }
    }
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@edwardcqian should we be doing this? 🤔

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't think we will reasonably get non-pos value here but doesn't hurt.


/// An iterator adapter that applies the inverse Box-Cox transformation to each item.
#[derive(Clone, Debug)]
struct InverseBoxCox<T> {
inner: T,
lambda: f64,
}

impl<T> Iterator for InverseBoxCox<T>
where
T: Iterator<Item = f64>,
{
type Item = f64;
fn next(&mut self) -> Option<Self::Item> {
self.inner.next().map(|y| inverse_box_cox(y, self.lambda))
}
}

trait InverseBoxCoxExt: Iterator<Item = f64> {
fn inverse_box_cox(self, lambda: f64) -> InverseBoxCox<Self>
where
Self: Sized,
{
InverseBoxCox {
inner: self,
lambda,
}
}
}

impl<T> InverseBoxCoxExt for T where T: Iterator<Item = f64> {}

#[cfg(test)]
mod test {
use augurs_testing::{assert_all_close, assert_approx_eq};
Expand Down Expand Up @@ -502,4 +628,22 @@ mod test {
assert_approx_eq!(params.scaled_min, 0.0);
assert_approx_eq!(params.scaled_max, 1.0);
}

#[test]
fn box_cox_test() {
let data = vec![1.0, 2.0, 3.0];
let lambda = 0.5;
let expected = vec![0.0, 0.8284271247461903, 1.4641016151377544];
let actual: Vec<_> = data.into_iter().box_cox(lambda).collect();
assert_all_close(&expected, &actual);
}

#[test]
fn inverse_box_cox_test() {
let data = vec![0.0, 0.5_f64.ln(), 1.0_f64.ln()];
let lambda = 0.5;
let expected = vec![1.0, 0.426966072919605, 1.0];
let actual: Vec<_> = data.into_iter().inverse_box_cox(lambda).collect();
assert_all_close(&expected, &actual);
}
Comment on lines +779 to +795
Copy link
Contributor

@coderabbitai coderabbitai bot Dec 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add tests for Yeo-Johnson transformation

Consider adding tests for Yeo-Johnson transformation similar to the Box-Cox tests, including:

  1. Basic transformation test
  2. Inverse transformation test
  3. Tests with negative values (unique to Yeo-Johnson)
#[test]
fn yeo_johnson_test() {
    let data = vec![-1.0, 0.0, 1.0];
    let lambda = 0.5;
    let expected = vec![-0.7071067811865476, 0.0, 0.8284271247461903];
    let actual: Vec<_> = data.into_iter().yeo_johnson(lambda).collect();
    assert_all_close(&expected, &actual);
}

#[test]
fn inverse_yeo_johnson_test() {
    let data = vec![-0.7071067811865476, 0.0, 0.8284271247461903];
    let lambda = 0.5;
    let expected = vec![-1.0, 0.0, 1.0];
    let actual: Vec<_> = data.into_iter().inverse_yeo_johnson(lambda).collect();
    assert_all_close(&expected, &actual);
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

coderabbit saved me here. After adding tests I realize the inverse yeo_johnson implementation wasn't correct

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@edwardcqian, glad to hear that adding the tests helped identify and fix the issue with the inverse Yeo-Johnson implementation!


If you found this review helpful, would you consider giving us a shout-out on X?
Thank you for using CodeRabbit!

}
Loading