Commit: Release 0.3 (#235)

Mec-iS authored Nov 8, 2022
1 parent aab3817 commit 161d249
Showing 30 changed files with 131 additions and 101 deletions.
11 changes: 11 additions & 0 deletions .github/CONTRIBUTING.md
@@ -26,6 +26,17 @@ Take a look at the conventions established by existing code:
* Every module should provide comprehensive tests at the end, in its `mod tests {}` sub-module. These tests can be gated behind configuration flags to support the WebAssembly target.
* Run `cargo doc --no-deps --open` and read the generated documentation in the browser to be sure that your changes are reflected in the documentation and new code is documented.

#### digging deeper
* a nice overview of the codebase is given by the [static analyzer](https://mozilla.github.io/rust-code-analysis/metrics.html):
```
$ cargo install rust-code-analysis-cli
# print metrics for every module
$ rust-code-analysis-cli -m -O json -o . -p src/ --pr
# print full AST for a module
$ rust-code-analysis-cli -p src/algorithm/neighbour/fastpair.rs --ls 22 --le 213 -d > ast.txt
```
* find more information about what happens in your binary with [`twiggy`](https://rustwasm.github.io/twiggy/install.html). This needs a compiled binary, so create a brief `fn main() {}` using `smartcore` and then point `twiggy` at that file.
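The bullet above can be sketched end-to-end. The crate name, layout, and target below are illustrative assumptions (a throwaway probe crate built for `wasm32-unknown-unknown`), not part of this repository:

```
// src/main.rs of a hypothetical probe crate that depends on smartcore,
// built only so that twiggy has a binary to analyze
use smartcore::linalg::basic::matrix::DenseMatrix;

fn main() {
    // touch some smartcore code so it is not stripped from the binary
    let x = DenseMatrix::from_2d_array(&[&[1.0, 2.0], &[3.0, 4.0]]);
    println!("{:?}", x);
}
```

Then, for example, `cargo build --target wasm32-unknown-unknown --release` followed by `twiggy top target/wasm32-unknown-unknown/release/probe.wasm` shows which items dominate code size.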

## Issue Report Process

1. Go to the project's issues.
5 changes: 4 additions & 1 deletion .github/DEVELOPERS.md
@@ -1,4 +1,7 @@
# Smartcore: Introduction to modules
# smartcore: Introduction to modules

Important source of information:
* [Rust API guidelines](https://rust-lang.github.io/api-guidelines/about.html)

## Walkthrough: traits system and basic structures

4 changes: 3 additions & 1 deletion .gitignore
@@ -26,4 +26,6 @@ src.dot
out.svg

FlameGraph/
out.stacks
out.stacks
*.json
*.txt
27 changes: 17 additions & 10 deletions CHANGELOG.md
@@ -4,22 +4,29 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]
## [0.3.0] - 2022-11-09

## Added
- Seeds to multiple algorithms that depend on random number generation.
- Added feature `js` to use WASM in browser
- Drop `nalgebra-bindings` feature
- Complete refactoring with *extensive API changes* that includes:
- WARNING: Breaking changes!
- Complete refactoring with **extensive API changes** that includes:
* moving to a new traits system, less structs more traits
* adapting all the modules to the new traits system
* moving towards Rust 2021, in particular the use of `dyn` and `as_ref`
* reorganization of the code base, trying to eliminate duplicates
* moving to Rust 2021, use of object-safe traits and `as_ref`
* reorganization of the code base, eliminate duplicates
- implements `readers` (needs "serde" feature) for reading/writing CSV files, extensible to other formats
- default feature is now Wasm-/Wasi-first

## BREAKING CHANGE
- Added a new parameter to `train_test_split` to define the seed.
## Changed
- WARNING: Breaking changes!
- Seeds to multiple algorithms that depend on random number generation
- Added a new parameter to `train_test_split` to define the seed
- changed use of "serde" feature
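For illustration, the seed parameter mentioned above might be used like this (a sketch against the assumed 0.3 signature of `train_test_split`; not code taken from this diff):

```
use smartcore::linalg::basic::matrix::DenseMatrix;
use smartcore::model_selection::train_test_split;

fn main() {
    let x = DenseMatrix::from_2d_array(&[&[1., 2.], &[3., 4.], &[5., 6.], &[7., 8.]]);
    let y: Vec<i32> = vec![0, 0, 1, 1];
    // the trailing Option<u64> is the new seed argument:
    // Some(n) makes the shuffled split reproducible, None keeps it random
    let (_x_train, _x_test, _y_train, _y_test) =
        train_test_split(&x, &y, 0.25, true, Some(42));
}
```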

## Dropped
- WARNING: Breaking changes!
- Drop `nalgebra-bindings` feature, only `ndarray` as supported library

## [0.2.1] - 2022-05-10
## [0.2.1] - 2021-05-10

## Added
- L2 regularization penalty to the Logistic Regression
30 changes: 18 additions & 12 deletions Cargo.toml
@@ -1,16 +1,23 @@
[package]
name = "smartcore"
description = "The most advanced machine learning library in rust."
description = "Machine Learning in Rust."
homepage = "https://smartcorelib.org"
version = "0.4.0"
authors = ["SmartCore Developers"]
version = "0.3.0"
authors = ["smartcore Developers"]
edition = "2021"
license = "Apache-2.0"
documentation = "https://docs.rs/smartcore"
repository = "https://github.com/smartcorelib/smartcore"
readme = "README.md"
keywords = ["machine-learning", "statistical", "ai", "optimization", "linear-algebra"]
categories = ["science"]
exclude = [
".github",
".gitignore",
"smartcore.iml",
"smartcore.svg",
"tests/"
]

[dependencies]
approx = "0.5.1"
@@ -19,32 +26,31 @@ ndarray = { version = "0.15", optional = true }
num-traits = "0.2.12"
num = "0.4"
rand = { version = "0.8.5", default-features = false, features = ["small_rng"] }
getrandom = "*"
rand_distr = { version = "0.4", optional = true }
serde = { version = "1", features = ["derive"], optional = true }

[features]
default = ["serde", "datasets"]
default = []
serde = ["dep:serde"]
ndarray-bindings = ["dep:ndarray"]
datasets = ["dep:rand_distr", "std"]
std = ["rand/std_rng", "rand/std"]
# wasm32 only
datasets = ["dep:rand_distr", "std_rand", "serde"]
std_rand = ["rand/std_rng", "rand/std"]
# used by wasm32-unknown-unknown for in-browser usage
js = ["getrandom/js"]
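Because `default = []` now, downstream crates must opt back into features explicitly; a hypothetical consumer manifest could look like:

```
# Cargo.toml of a downstream crate (illustrative, not from this repo):
# re-enable the features that were on by default before 0.3
[dependencies]
smartcore = { version = "0.3", features = ["serde", "datasets"] }
```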

[target.'cfg(target_arch = "wasm32")'.dependencies]
getrandom = { version = "0.2", optional = true }

[target.'cfg(all(target_arch = "wasm32", not(target_os = "wasi")))'.dev-dependencies]
wasm-bindgen-test = "0.3"

[dev-dependencies]
itertools = "*"
criterion = { version = "0.4", default-features = false }
serde_json = "1.0"
bincode = "1.3.1"

[target.'cfg(all(target_arch = "wasm32", not(target_os = "wasi")))'.dev-dependencies]
wasm-bindgen-test = "0.3"

[workspace]
resolver = "2"

[profile.test]
debug = 1
2 changes: 1 addition & 1 deletion LICENSE
@@ -186,7 +186,7 @@
same "printed page" as the copyright notice for easier
identification within third-party archives.

Copyright 2019-present at SmartCore developers (smartcorelib.org)
Copyright 2019-present at smartcore developers (smartcorelib.org)

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
4 changes: 2 additions & 2 deletions README.md
@@ -1,6 +1,6 @@
<p align="center">
<a href="https://smartcorelib.org">
<img src="smartcore.svg" width="450" alt="SmartCore">
<img src="smartcore.svg" width="450" alt="smartcore">
</a>
</p>
<p align = "center">
@@ -18,4 +18,4 @@
-----
[![CI](https://github.com/smartcorelib/smartcore/actions/workflows/ci.yml/badge.svg)](https://github.com/smartcorelib/smartcore/actions/workflows/ci.yml)

To start getting familiar with the new Smartcore v0.5 API, there is now available a [**Jupyter Notebook environment repository**](https://github.com/smartcorelib/smartcore-jupyter). Please see instructions there, contributions welcome see [CONTRIBUTING](.github/CONTRIBUTING.md).
To start getting familiar with the new smartcore v0.5 API, a [**Jupyter Notebook environment repository**](https://github.com/smartcorelib/smartcore-jupyter) is now available. Please see the instructions there; contributions are welcome, see [CONTRIBUTING](.github/CONTRIBUTING.md).
2 changes: 1 addition & 1 deletion smartcore.svg
10 changes: 5 additions & 5 deletions src/algorithm/neighbour/cover_tree.rs
@@ -64,7 +64,7 @@ struct Node {
max_dist: f64,
parent_dist: f64,
children: Vec<Node>,
scale: i64,
_scale: i64,
}

#[derive(Debug)]
@@ -84,7 +84,7 @@ impl<T: Debug + PartialEq, D: Distance<T>> CoverTree<T, D> {
max_dist: 0f64,
parent_dist: 0f64,
children: Vec::new(),
scale: 0,
_scale: 0,
};
let mut tree = CoverTree {
base,
@@ -245,7 +245,7 @@ impl<T: Debug + PartialEq, D: Distance<T>> CoverTree<T, D> {
max_dist: 0f64,
parent_dist: 0f64,
children: Vec::new(),
scale: 100,
_scale: 100,
}
}

@@ -306,7 +306,7 @@ impl<T: Debug + PartialEq, D: Distance<T>> CoverTree<T, D> {
max_dist: 0f64,
parent_dist: 0f64,
children,
scale: 100,
_scale: 100,
}
} else {
let mut far: Vec<DistanceSet> = Vec::new();
@@ -375,7 +375,7 @@ impl<T: Debug + PartialEq, D: Distance<T>> CoverTree<T, D> {
max_dist: self.max(consumed_set),
parent_dist: 0f64,
children,
scale: (top_scale - max_scale),
_scale: (top_scale - max_scale),
}
}
}
8 changes: 4 additions & 4 deletions src/cluster/kmeans.rs
@@ -11,7 +11,7 @@
//! these re-calculated centroids becoming the new centers of their respective clusters. Next all instances of the training set are re-assigned to their closest cluster again.
//! This iterative process continues until convergence is achieved and the clusters are considered settled.
//!
//! Initial choice of K data points is very important and has big effect on performance of the algorithm. SmartCore uses k-means++ algorithm to initialize cluster centers.
//! Initial choice of K data points is very important and has a big effect on the performance of the algorithm. `smartcore` uses the k-means++ algorithm to initialize cluster centers.
//!
//! Example:
//!
@@ -74,7 +74,7 @@ pub struct KMeans<TX: Number, TY: Number, X: Array2<TX>, Y: Array1<TY>> {
k: usize,
_y: Vec<usize>,
size: Vec<usize>,
distortion: f64,
_distortion: f64,
centroids: Vec<Vec<f64>>,
_phantom_tx: PhantomData<TX>,
_phantom_ty: PhantomData<TY>,
@@ -313,7 +313,7 @@ impl<TX: Number, TY: Number, X: Array2<TX>, Y: Array1<TY>> KMeans<TX, TY, X, Y>
k: parameters.k,
_y: y,
size,
distortion,
_distortion: distortion,
centroids,
_phantom_tx: PhantomData,
_phantom_ty: PhantomData,
@@ -470,7 +470,7 @@ mod tests {
wasm_bindgen_test::wasm_bindgen_test
)]
#[test]
fn fit_predict_iris() {
fn fit_predict() {
let x = DenseMatrix::from_2d_array(&[
&[5.1, 3.5, 1.4, 0.2],
&[4.9, 3.0, 1.4, 0.2],
2 changes: 1 addition & 1 deletion src/dataset/mod.rs
@@ -1,6 +1,6 @@
//! Datasets
//!
//! In this module you will find small datasets that are used in SmartCore mostly for demonstration purposes.
//! In this module you will find small datasets that are used in `smartcore` mostly for demonstration purposes.
pub mod boston;
pub mod breast_cancer;
pub mod diabetes;
2 changes: 1 addition & 1 deletion src/ensemble/mod.rs
@@ -7,7 +7,7 @@
//! set and then aggregate their individual predictions to form a final prediction. In classification setting the overall prediction is the most commonly
//! occurring majority class among the individual predictions.
//!
//! In SmartCore you will find implementation of RandomForest - a popular averaging algorithms based on randomized [decision trees](../tree/index.html).
//! In `smartcore` you will find an implementation of RandomForest - a popular averaging algorithm based on randomized [decision trees](../tree/index.html).
//! Random forests provide an improvement over bagged trees by way of a small tweak that decorrelates the trees. As in bagging, we build a number of
//! decision trees on bootstrapped training samples. But when building these decision trees, each time a split in a tree is considered,
//! a random sample of _m_ predictors is chosen as split candidates from the full set of _p_ predictors.
5 changes: 1 addition & 4 deletions src/ensemble/random_forest_classifier.rs
@@ -104,7 +104,6 @@ pub struct RandomForestClassifier<
X: Array2<TX>,
Y: Array1<TY>,
> {
parameters: Option<RandomForestClassifierParameters>,
trees: Option<Vec<DecisionTreeClassifier<TX, TY, X, Y>>>,
classes: Option<Vec<TY>>,
samples: Option<Vec<Vec<bool>>>,
@@ -198,7 +197,6 @@ impl<TX: Number + FloatNumber + PartialOrd, TY: Number + Ord, X: Array2<TX>, Y:
{
fn new() -> Self {
Self {
parameters: Option::None,
trees: Option::None,
classes: Option::None,
samples: Option::None,
@@ -501,7 +499,6 @@ impl<TX: FloatNumber + PartialOrd, TY: Number + Ord, X: Array2<TX>, Y: Array1<TY
}

Ok(RandomForestClassifier {
parameters: Some(parameters),
trees: Some(trees),
classes: Some(classes),
samples: maybe_all_samples,
@@ -637,7 +634,7 @@ mod tests {
wasm_bindgen_test::wasm_bindgen_test
)]
#[test]
fn fit_predict_iris() {
fn fit_predict() {
let x = DenseMatrix::from_2d_array(&[
&[5.1, 3.5, 1.4, 0.2],
&[4.9, 3.0, 1.4, 0.2],
3 changes: 0 additions & 3 deletions src/ensemble/random_forest_regressor.rs
@@ -98,7 +98,6 @@ pub struct RandomForestRegressor<
X: Array2<TX>,
Y: Array1<TY>,
> {
parameters: Option<RandomForestRegressorParameters>,
trees: Option<Vec<DecisionTreeRegressor<TX, TY, X, Y>>>,
samples: Option<Vec<Vec<bool>>>,
}
@@ -177,7 +176,6 @@ impl<TX: Number + FloatNumber + PartialOrd, TY: Number, X: Array2<TX>, Y: Array1
{
fn new() -> Self {
Self {
parameters: Option::None,
trees: Option::None,
samples: Option::None,
}
@@ -434,7 +432,6 @@ impl<TX: Number + FloatNumber + PartialOrd, TY: Number, X: Array2<TX>, Y: Array1
}

Ok(RandomForestRegressor {
parameters: Some(parameters),
trees: Some(trees),
samples: maybe_all_samples,
})
