Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prediction of GaussianNB panics in some conditions #245

Open
1 task done
chriamue opened this issue Nov 20, 2022 · 4 comments · May be fixed by #246
Open
1 task done

Prediction of GaussianNB panics in some conditions #245

chriamue opened this issue Nov 20, 2022 · 4 comments · May be fixed by #246

Comments

@chriamue
Copy link

I'm submitting a

  • bug report.

Current Behaviour:

Prediction of GaussianNB fails when trained on very small data.

Expected Behaviour:

Assertion of Y may fail but should not panic.

Steps to reproduce:

    #[test]
    fn run_gaussian_naive_bayes_with_few_samples() {
        let x = DenseMatrix::<f64>::from_2d_array(&[
            &[-1., -1.],
            &[-2., -1.],
            &[-3., -2.],
            &[1., 1.],
        ]);
        let y: Vec<u32> = vec![1, 1, 1, 2];

        let gnb = GaussianNB::fit(&x, &y, Default::default()).unwrap();

        assert_eq!(gnb.classes(), &[1, 2]);
        let y_hat = gnb.predict(&x).unwrap();
        assert_eq!(y_hat, y);
    }

Snapshot:

thread 'classifier::bayes::tests::test_detector' panicked at 'called `Option::unwrap()` on a `None` value', /Users/chriamue/.cargo/registry/src/github.com-1ecc6299db9ec823/smartcore-0.3.0/src/naive_bayes/mod.rs:109:67

https://github.com/smartcorelib/smartcore/blob/development/src/naive_bayes/mod.rs#L109

.map(|(class_index, class)| {
    (
        class,
        self.distribution.log_likelihood(class_index, &row)
            + self.distribution.prior(class_index).ln(),
    )
})
.max_by(|(_, p1), (_, p2)| p1.partial_cmp(p2).unwrap())

p1 or p2 is NaN so partial_cmp is None which panics on unwrap().

Do you want to work on this issue?

A solution would be to check the log_likelihood for NaN and return 0, but I am not sure if this is mathematically right.
https://github.com/smartcorelib/smartcore/blob/development/src/naive_bayes/gaussian.rs#L77

    fn log_likelihood<'a>(&self, class_index: usize, j: &'a Box<dyn ArrayView1<X> + 'a>) -> f64 {
        let mut likelihood = 0f64;
        for feature in 0..j.shape() {
            let value = X::to_f64(j.get(feature)).unwrap();
            let mean = self.theta[class_index][feature];
            let variance = self.var[class_index][feature];
            likelihood += self.calculate_log_probability(value, mean, variance);
        }
        if likelihood.is_nan() {
            0f64
        } else {
            likelihood
        }
    }

Another solution is to check for NaN in predict function:
https://github.com/smartcorelib/smartcore/blob/development/src/naive_bayes/mod.rs#L102

.map(|(class_index, class)| {self.distribution.prior(class_index).ln());
  let mut log_likelihood = self.distribution.log_likelihood(class_index, &row);
  if log_likelihood.is_nan() {
      log_likelihood = 0.0
  }
  (
      class,
      log_likelihood
          + self.distribution.prior(class_index).ln(),
  )
})
@Mec-iS
Copy link
Collaborator

Mec-iS commented Nov 20, 2022

thanks for finding this 👍

@Mec-iS
Copy link
Collaborator

Mec-iS commented Nov 21, 2022

I think the right approach would be to return an error if log_likehood is NaN. Let me check.

@Mec-iS Mec-iS linked a pull request Nov 21, 2022 that will close this issue
@grinapo
Copy link

grinapo commented Nov 19, 2024

I got that as well. Handling it any way would be preferred to panic, or so I believe. :-)

@morenol
Copy link
Collaborator

morenol commented Nov 26, 2024

must be solved by #274

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants