Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add 'transforms' JS crate and include in augurs JS bindings #195

Merged
merged 5 commits into from
Dec 12, 2024

Conversation

sd2k
Copy link
Collaborator

@sd2k sd2k commented Dec 11, 2024

This provides a way to access the power transform functions from JS.

I'd like to add the Forecaster functionality to the Prophet bindings
somehow but that limits the Prophet APIs to only return yhat and
the bounds. That's probably enough for many cases but it's not ideal.

This is a quick workaround to get the transform functionality into the
JS package really.

Summary by CodeRabbit

  • New Features

    • Introduced a new augurs-transforms-js package for JavaScript bindings related to data transformations.
    • Added PowerTransform structure for applying power transformations, including methods for transformation and inverse transformation.
  • Dependency Updates

    • Added argmin as a workspace dependency.
    • Updated several dependencies to use workspace-level management.
  • Bug Fixes

    • Improved accessibility of transformation methods in the Transform enum.
    • Corrected author's email in the package metadata.
  • Documentation

    • Updated package.json.tmpl to include a new export for transforms.
  • Tests

    • Added a test suite for the PowerTransform class to validate its functionality.

Copy link
Contributor

coderabbitai bot commented Dec 11, 2024

Walkthrough

The changes in this pull request involve modifications to several Cargo.toml files across multiple packages, introducing a new dependency argmin and transitioning existing dependencies to a workspace-level management approach. The Transform enum in the augurs-forecaster package has had its methods made publicly accessible. Additionally, a new module for JavaScript bindings has been created, encapsulating power transformation logic. A test suite for this new functionality has also been added. Other minor adjustments include updates to build configurations and package exports.

Changes

File Path Change Summary
Cargo.toml - Clarified exclude comments, fixed author's email bracket, updated package version to "0.7.0".
- Added new dependency argmin = "0.10.0".
crates/augurs-forecaster/Cargo.toml - Changed argmin to workspace declaration.
- Updated other dependencies to workspace declarations.
- Updated lints section to workspace level.
crates/augurs-forecaster/src/transforms.rs - Changed visibility of transform and inverse_transform methods from pub(crate) to pub.
js/augurs-transforms-js/Cargo.toml - Introduced new package with metadata.
- Declared several workspace dependencies.
- Updated wasm-pack optimization settings.
js/augurs-transforms-js/src/lib.rs - Added PowerTransform struct and methods for transformation and inverse transformation.
- Introduced PowerTransformOptions struct for input data.
js/justfile - Updated build target to include transforms module.
js/package.json.tmpl - Added new export for ./transforms.
js/testpkg/transforms.test.ts - Introduced a test suite for the PowerTransform class, validating its functionality.

Possibly related PRs

🐇 In the code, we hop and play,
With transforms brightening the way.
New dependencies join our crew,
Power transforms, oh so true!
Tests to guard our code so tight,
Hopping forward, all feels right! 🌟


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Experiment)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@sd2k sd2k requested a review from edwardcqian December 11, 2024 17:33
@sd2k sd2k marked this pull request as ready for review December 12, 2024 08:27
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (7)
js/augurs-transforms-js/src/lib.rs (2)

1-1: Typographical error: 'Javascript' should be 'JavaScript'

In the module documentation, 'Javascript' should be correctly capitalized as 'JavaScript'.


33-38: Consider handling empty data in PowerTransform::new

Adding a check for empty opts.data can improve error handling. If the input data is empty, returning a meaningful error prevents potential runtime issues.

js/testpkg/transforms.test.ts (2)

1-13: Consider adding JSDoc documentation for the test suite.

The initialization code is well-structured, but adding a brief JSDoc comment describing the test suite's purpose would improve maintainability.

+/**
+ * Test suite for PowerTransform class from @bsull/augurs/transforms.
+ * Validates both transform and inverse transform operations.
+ */
 import { webcrypto } from 'node:crypto'

50-62: Add edge case tests for the PowerTransform.

While the basic functionality is tested, consider adding tests for:

  • Empty arrays
  • Arrays with negative values
  • Arrays with zero values
  • Arrays with NaN/Infinity
Cargo.toml (2)

47-47: Fix formatting of argmin dependency.

There's a spacing issue in the argmin dependency declaration.

-argmin= "0.10.0"
+argmin = "0.10.0"

Line range hint 15-17: Fix author email format.

The author email is missing a closing bracket.

-  "Ben Sully <[email protected]",
+  "Ben Sully <[email protected]>",
crates/augurs-forecaster/src/transforms.rs (1)

Line range hint 154-182: Consider adding documentation for public API.

Since these methods are now part of the public API, they should have comprehensive documentation including examples.

+    /// Applies the transformation to the given time series.
+    /// 
+    /// # Arguments
+    /// 
+    /// * `input` - An iterator over f64 values to transform
+    /// 
+    /// # Returns
+    /// 
+    /// A boxed iterator over the transformed values
+    /// 
+    /// # Examples
+    /// 
+    /// ```
+    /// use augurs_forecaster::transforms::Transform;
+    /// 
+    /// let data = vec![1.0, 2.0, 3.0];
+    /// let transform = Transform::log();
+    /// let transformed: Vec<_> = transform.transform(data.into_iter()).collect();
+    /// ```
     pub fn transform<'a, T>(&'a self, input: T) -> Box<dyn Iterator<Item = f64> + 'a>
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5f4fae4 and 7cd0929.

📒 Files selected for processing (8)
  • Cargo.toml (1 hunks)
  • crates/augurs-forecaster/Cargo.toml (1 hunks)
  • crates/augurs-forecaster/src/transforms.rs (2 hunks)
  • js/augurs-transforms-js/Cargo.toml (1 hunks)
  • js/augurs-transforms-js/src/lib.rs (1 hunks)
  • js/justfile (1 hunks)
  • js/package.json.tmpl (1 hunks)
  • js/testpkg/transforms.test.ts (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • js/augurs-transforms-js/Cargo.toml
🔇 Additional comments (5)
js/testpkg/transforms.test.ts (1)

27-47: LGTM! Well-implemented custom matcher.

The custom matcher toAllBeCloseTo is well-implemented with clear error messages and appropriate tolerance handling.

crates/augurs-forecaster/src/transforms.rs (1)

Line range hint 154-167: Verify the impact of increased visibility.

The transform and inverse_transform methods have been made public to support JS bindings. The implementation looks solid with proper error handling.

Also applies to: 169-182

✅ Verification successful

Based on the search results, I can generate the final response as I have enough information to verify the impact of the visibility changes:

Visibility change is safe and aligned with intended usage

The increased visibility of transform and inverse_transform methods is intentionally supporting the JS bindings (js/augurs-transforms-js/src/lib.rs) while maintaining proper encapsulation. All existing usages are within the crate or through the intended public JS interface. The implementation includes proper error handling and type safety.

  • ✓ Used internally in crates/augurs-forecaster/src/forecaster.rs
  • ✓ Exposed correctly through JS bindings in js/augurs-transforms-js/src/lib.rs
  • ✓ No unexpected usages or implementations found in the codebase
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for any existing usages of these methods that might be affected
# by the visibility change

# Search for any direct usage of these methods
rg -A 2 "transform\(" --type rust

# Search for any implementations or traits using these methods
ast-grep --pattern 'impl $_ for $_ {
  $$$
  transform($$$) {
    $$$
  }
  $$$
}'

Length of output: 4693

crates/augurs-forecaster/Cargo.toml (1)

16-19: LGTM! Workspace dependency management looks good.

The transition to workspace-level dependency management is a good practice that will help maintain consistency across the project.

Let's verify workspace dependency consistency:

✅ Verification successful

Let me verify the workspace dependencies more specifically by checking the root Cargo.toml for workspace dependencies and comparing with the specific crate's usage.


Workspace dependency configuration is correctly implemented

The verification confirms that:

  • The dependencies (argmin, augurs-core, itertools, thiserror) are properly defined in the root workspace's dependencies section
  • The augurs-forecaster crate correctly references these dependencies using the .workspace = true syntax
  • The versions in the workspace match the intended dependency management structure
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for any inconsistent dependency versions across workspace

# Check if all workspace members use workspace dependencies consistently
rg "^\w+\s*=\s*\"[^\"]+\"" -g "Cargo.toml" -A 1

Length of output: 10941


Script:

#!/bin/bash
# Check workspace dependencies in root Cargo.toml
echo "=== Checking root Cargo.toml workspace dependencies ==="
rg "^\[workspace.dependencies\]" -A 50 Cargo.toml

# Check augurs-forecaster's specific dependency declarations
echo -e "\n=== Checking augurs-forecaster's dependencies ==="
cat crates/augurs-forecaster/Cargo.toml

Length of output: 2823

js/package.json.tmpl (1)

29-30: LGTM! Export configuration follows established patterns.

The new transforms export is properly configured and maintains alphabetical ordering with other exports.

Let's verify the export path consistency:

✅ Verification successful

Let me verify the content of the justfile to ensure we have all the necessary build targets.


Export configuration is correctly aligned with build targets

All exported JS files in package.json.tmpl have corresponding build targets in the justfile. The transforms module is properly included in both the exports and the build targets list, maintaining consistency with the established pattern.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify that all exported JS files have corresponding build targets

# Extract export paths and check against build targets
jq -r '.exports | to_entries[] | .value' js/package.json.tmpl | sed 's/^\.\///' | while read -r file; do
    echo "Checking $file..."
    if ! grep -q "$(basename "$file" .js)" js/justfile; then
        echo "Warning: $file might be missing a build target"
    fi
done

Length of output: 1350


Script:

#!/bin/bash
# Check the content of justfile for build targets
cat js/justfile

Length of output: 1077

js/justfile (1)

10-11: LGTM! Build target properly integrated.

The transforms module is correctly added to the build process, maintaining consistency with other modules.

Let's verify the build target configuration:

✅ Verification successful

Build target properly integrated and verified

All build targets in the justfile have corresponding directories in the codebase:

  • The transforms module is correctly added to the build process
  • Directory js/augurs-transforms-js exists
  • The structure is consistent with other modules (changepoint, clustering, core, etc.)
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify build target directory structure

# Check if all build targets have corresponding directories
while read -r target; do
    target_dir="js/augurs-${target}-js"
    if [ ! -d "$target_dir" ]; then
        echo "Warning: Missing directory for target: $target"
    fi
done < <(grep "build-inner" js/justfile | sed -n 's/.*"\\([^"]*\\)".*/\\1/p')

Length of output: 179


Script:

#!/bin/bash
# Let's try a simpler approach to verify the build targets and directory structure

# First, let's see the content of justfile
echo "=== Content of justfile ==="
cat js/justfile

# Then check for the corresponding directories
echo -e "\n=== Checking directories ==="
fd -t d "augurs-.*-js$" js/

Length of output: 1461

sd2k added 5 commits December 12, 2024 08:37
This provides a way to access the power transform functions from JS.

I'd like to add the Forecaster functionality to the Prophet bindings
somehow but that limits the Prophet APIs to only return yhat and
the bounds. That's probably enough for many cases but it's not ideal.

This is a quick workaround to get the transform functionality into the
JS package really.
This is required otherwise the WASM build assumes some features are
being provided in the environment, which isn't true.
@sd2k sd2k force-pushed the js-power-transform branch from 7cd0929 to 80cfe32 Compare December 12, 2024 08:37
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (4)
crates/augurs-forecaster/src/transforms.rs (4)

154-167: Enhance documentation with error handling examples

While the documentation is good, consider adding examples that demonstrate:

  • Error handling scenarios (e.g., invalid inputs)
  • Edge cases (e.g., NaN, infinity)
  • Return type behavior in error conditions
 /// # Example
 ///
 /// ```
 /// use augurs_forecaster::transforms::Transform;
 ///
 /// let data = vec![1.0, 2.0, 3.0];
 /// let transform = Transform::log();
 /// let transformed: Vec<_> = transform.transform(data.into_iter()).collect();
+/// 
+/// // Error handling example
+/// let invalid_data = vec![-1.0];  // Invalid for log transform
+/// let result: Vec<_> = transform.transform(invalid_data.into_iter()).collect();
+/// assert!(result[0].is_nan());  // Returns NaN for invalid inputs
 /// ```

Also applies to: 183-196


Line range hint 449-457: Standardize error handling across transformations

The error handling approach is inconsistent between the direct functions and their iterator implementations. While the functions return Result, the iterators silently convert errors to NaN. This could lead to hard-to-debug issues.

Consider either:

  1. Making all transformations return Result consistently, or
  2. Adding error reporting capabilities to the iterator implementations
 impl<T> Iterator for BoxCox<T>
 where
     T: Iterator<Item = f64>,
 {
     type Item = f64;
     fn next(&mut self) -> Option<Self::Item> {
-        self.inner
-            .next()
-            .map(|x| box_cox(x, self.lambda).unwrap_or(f64::NAN))
+        self.inner.next().map(|x| {
+            match box_cox(x, self.lambda) {
+                Ok(val) => val,
+                Err(_) => {
+                    // Log error or use a custom error handling strategy
+                    f64::NAN
+                }
+            }
+        })
     }
 }

Also applies to: 516-524


Line range hint 293-308: Optimize MinMaxScale performance

The current implementation recalculates scaling factors for each item. Consider pre-calculating these values to improve performance.

 struct MinMaxScale<T> {
     inner: T,
     params: MinMaxScaleParams,
+    scale_factor: f64,
+    offset: f64,
 }

 impl<T> Iterator for MinMaxScale<T>
 where
     T: Iterator<Item = f64>,
 {
     type Item = f64;
     fn next(&mut self) -> Option<Self::Item> {
-        let Self {
-            params:
-                MinMaxScaleParams {
-                    data_min,
-                    data_max,
-                    scaled_min,
-                    scaled_max,
-                },
-            ..
-        } = self;
-        self.inner.next().map(|x| {
-            *scaled_min + ((x - *data_min) * (*scaled_max - *scaled_min)) / (*data_max - *data_min)
-        })
+        self.inner.next().map(|x| self.offset + (x * self.scale_factor))
     }
 }

 trait MinMaxScaleExt: Iterator<Item = f64> {
     fn min_max_scale(self, params: MinMaxScaleParams) -> MinMaxScale<Self>
     where
         Self: Sized,
     {
+        let scale_factor = (params.scaled_max - params.scaled_min) / (params.data_max - params.data_min);
+        let offset = params.scaled_min - (params.data_min * scale_factor);
         MinMaxScale {
             inner: self,
             params,
+            scale_factor,
+            offset,
         }
     }
 }

Line range hint 601-700: Enhance test coverage for edge cases

While the test coverage is good, consider adding tests for:

  • NaN handling
  • Infinity handling
  • Error conditions (e.g., invalid lambda values)
 #[cfg(test)]
 mod test {
+    #[test]
+    fn test_box_cox_edge_cases() {
+        let data = vec![f64::NAN, f64::INFINITY, f64::NEG_INFINITY];
+        let lambda = 0.5;
+        let results: Vec<_> = data.into_iter().box_cox(lambda).collect();
+        assert!(results.iter().all(|x| x.is_nan()));
+    }
+
+    #[test]
+    fn test_yeo_johnson_invalid_lambda() {
+        let data = vec![1.0];
+        let lambda = f64::NAN;
+        let results: Vec<_> = data.into_iter().yeo_johnson(lambda).collect();
+        assert!(results.iter().all(|x| x.is_nan()));
+    }
 }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7cd0929 and 80cfe32.

📒 Files selected for processing (8)
  • Cargo.toml (2 hunks)
  • crates/augurs-forecaster/Cargo.toml (1 hunks)
  • crates/augurs-forecaster/src/transforms.rs (2 hunks)
  • js/augurs-transforms-js/Cargo.toml (1 hunks)
  • js/augurs-transforms-js/src/lib.rs (1 hunks)
  • js/justfile (1 hunks)
  • js/package.json.tmpl (1 hunks)
  • js/testpkg/transforms.test.ts (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (7)
  • js/package.json.tmpl
  • js/justfile
  • crates/augurs-forecaster/Cargo.toml
  • Cargo.toml
  • js/augurs-transforms-js/Cargo.toml
  • js/augurs-transforms-js/src/lib.rs
  • js/testpkg/transforms.test.ts
🔇 Additional comments (1)
crates/augurs-forecaster/src/transforms.rs (1)

Line range hint 1-700: Overall implementation looks good!

The transformations are well-implemented with proper mathematical formulas, good test coverage, and clear documentation. The suggested improvements are mostly optimizations and enhancements rather than critical issues.

@sd2k sd2k enabled auto-merge (squash) December 12, 2024 08:45
@sd2k sd2k merged commit 954bd91 into main Dec 12, 2024
24 checks passed
@sd2k sd2k deleted the js-power-transform branch December 12, 2024 08:49
@sd2k sd2k mentioned this pull request Dec 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants