Problems with anomaly detection #938

Open
zrosin opened this issue Jun 25, 2021 · 6 comments

@zrosin

zrosin commented Jun 25, 2021

I'm having trouble performing anomaly detection in Python. I'm using the hotgym example and am struggling to detect anomalies. I posted on the HTM forum earlier and think this is worthy of an issue here.

First, I want to point out that the anomaly likelihood class isn't actually used in the example, even though it seems to work correctly. It's already in the code, just never used. Anyway, I'm pretty sure I'm getting the expected results from the example, but I was never a fan of this dataset because the actual anomalies are hard to see.
image

In the forum post I mentioned above, I got some useful advice on ways to debug this; the images and questions there may help with context. Still, it seems to me that tm.anomaly is not working correctly: rather than increasing at anomalies, it decreases. Removing the date encoder does seem to fix this, but that obviously removes the temporal context of the data, so it is not a real solution. Here's hotgym running with the date encoding removed on a custom data set. It seems to catch the anomalies at 300 and 400, but the noise prevents the one at 300 from being detected and delays detection of the one at 400.
image
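
For readers following the tm.anomaly discussion: the raw anomaly score in HTM is usually described as the fraction of currently active columns that were not predicted on the previous step, so it should move toward 1 on surprising input and toward 0 on well-predicted input. The sketch below shows that definition in plain Python. The function name and the list-of-indices inputs are illustrative assumptions, not htm.core's API or its actual implementation.

```python
def raw_anomaly_score(active_columns, predicted_columns):
    """Fraction of active columns that were NOT predicted (0.0 to 1.0).

    Both arguments are iterables of column indices (an assumption made
    for this sketch; real implementations work on SDR objects).
    """
    active = set(active_columns)
    if not active:
        # Nothing is active, so there is nothing to be surprised about.
        return 0.0
    predicted = set(predicted_columns)
    unpredicted = active - predicted
    return len(unpredicted) / len(active)


# Every active column was predicted: no surprise.
fully_predicted = raw_anomaly_score([1, 2, 3, 4], [1, 2, 3, 4, 9])
# Half of the active columns were predicted.
half_predicted = raw_anomaly_score([1, 2, 3, 4], [1, 2])
# Nothing was predicted: maximal surprise.
unpredicted = raw_anomaly_score([1, 2, 3, 4], [])
print(fully_predicted, half_predicted, unpredicted)
```

Under this definition a score that falls at a known anomaly means the input at that point was better predicted than the surrounding data, which is why the behaviour described above looks suspicious.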

I wanted to compare this to NuPIC to see whether the results were actually correct and my understanding was wrong, but NuPIC seems to be working properly. Note that the parameters are not identical for the two runs: htm.core is running the hotgym.py parameters, while NuPIC is running these parameters.
image

I'm wondering if I'm doing something critically wrong or if there is an actual issue here. Thanks in advance for the help.

@ctrl-z-9000-times
Collaborator

Hi, I'm able to reproduce this issue. I see what you mean about the example using raw anomaly scores instead of the AnomalyLikelihood.
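
To make the distinction between raw anomaly scores and an anomaly likelihood concrete, here is a deliberately simplified sketch of the general idea: keep a history of raw scores, model it as a Gaussian, and ask how improbable the recent short-term average is. The class name, window sizes, and the 0.5 warm-up value are all assumptions chosen for illustration. This is not the AnomalyLikelihood class from htm.core or NuPIC, and it omits their special cases.

```python
import math
from collections import deque


class SimpleAnomalyLikelihood:
    """Toy likelihood: how unusual is the recent mean raw score
    compared with a longer history of raw scores?"""

    def __init__(self, history=200, short_window=10, min_samples=20):
        self.history = deque(maxlen=history)      # long-term raw scores
        self.recent = deque(maxlen=short_window)  # short-term raw scores
        self.min_samples = min_samples

    def update(self, raw_score):
        self.history.append(raw_score)
        self.recent.append(raw_score)
        if len(self.history) < self.min_samples:
            return 0.5  # not enough data yet to judge either way

        n = len(self.history)
        mean = sum(self.history) / n
        var = sum((x - mean) ** 2 for x in self.history) / n
        std = max(math.sqrt(var), 1e-3)  # floor avoids dividing by ~0

        recent_mean = sum(self.recent) / len(self.recent)
        z = (recent_mean - mean) / std
        # Gaussian upper-tail probability Q(z); likelihood is 1 - Q(z).
        tail = 0.5 * math.erfc(z / math.sqrt(2))
        return 1.0 - tail


detector = SimpleAnomalyLikelihood()

# Quiet period: raw scores wobble around 0.1.
baseline = 0.5
for i in range(100):
    baseline = detector.update(0.05 if i % 2 == 0 else 0.15)

# Burst of high raw scores, as an anomaly would produce.
burst = 0.0
for _ in range(10):
    burst = detector.update(0.9)

print(f"likelihood at end of quiet period: {baseline:.3f}")
print(f"likelihood at end of burst:        {burst:.3f}")
```

Because the likelihood is relative to the recent history of the score, it can flag a change even when the raw score itself is noisy, which is the motivation for using it instead of thresholding the raw score directly.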

@ctrl-z-9000-times
Collaborator

So I took a look further into this issue...

The Python code for the AnomalyLikelihood class is a mess. I'm pretty sure there are a few bugs in there. Also, it contains a bunch of special cases for detecting anomalies in situations where HTM systematically fails to detect them.

So I rewrote the class to work much better. It's on a branch of this repo: git checkout anomaly_likelihood_rewrite.
I changed the API so you'll have to modify your program if you want to use it.
It performs worse on the NAB benchmark, probably because I removed all of the special cases.

The hot gym example now looks like:
hotgym

@zrosin
Author

zrosin commented Jun 29, 2021

Thanks for the quick reply.

I don't believe the source of my issues was the likelihood class. I think it was due to either an error in or a misuse of the anomaly score function.

In this figure you can see the anomaly score (blue) drop at each of the anomalies just after 200, 300, and 400. You can also see it rise around 600-700 despite no change in the data.
image
The anomaly likelihood manages to save the day and marks the first two anomalies anyway.

Out of curiosity, I removed the date component from the data to see if that would have any effect. Surprisingly, it gave a better anomaly score, but the score still shows some funky patterns.
image

@romanma9999

Hi @zrosin
Did you eventually manage to solve this or find out what the problem was?
Was it a bad implementation of the AnomalyLikelihood class, as @ctrl-z-9000-times suggested, or something else?

@zrosin
Author

zrosin commented Sep 28, 2021

@romanma9999 I don't have the expertise to actually find the problem here. The predictions work correctly, so HTM under the hood is running correctly. The anomaly reporting is the only problem; whether it comes from the C++ part of the implementation or from the switch back to Python, I don't know.

I ran the program with NuPIC just to sanity-check myself, and that worked, so I just switched to using that.

If you are looking to fix it, you may want to isolate the problem first: make sure the C++ side is working before trying to change the Python side.
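
One way to approach that isolation, sketched under assumptions: export the per-step anomaly scores from each layer (the C++ side, the Python bindings, NuPIC) as plain lists, then run the same direction check on each one at the positions where anomalies were injected. The helper names and the synthetic scores below are hypothetical and only illustrate the check; they are not taken from htm.core.

```python
from statistics import mean


def score_change_at(scores, anomaly_index, window=20):
    """Mean score in the window after anomaly_index minus the mean in
    the window before it. Positive means the score rose."""
    before = scores[max(0, anomaly_index - window):anomaly_index]
    after = scores[anomaly_index:anomaly_index + window]
    if not before or not after:
        raise ValueError("anomaly_index is too close to the series edge")
    return mean(after) - mean(before)


def check_direction(scores, anomaly_indices, window=20):
    """Map each known anomaly position to its score change."""
    return {i: score_change_at(scores, i, window) for i in anomaly_indices}


# Synthetic stand-in for exported scores: flat at 0.2, with a rise
# at step 300 (expected behaviour) and a drop at step 400 (suspect).
scores = [0.2] * 500
scores[300:320] = [0.8] * 20
scores[400:420] = [0.0] * 20

report = check_direction(scores, [300, 400])
for index, delta in report.items():
    direction = "rises" if delta > 0 else "drops"
    print(f"step {index}: score {direction} by {abs(delta):.2f}")
```

If the scores from one layer rise at the injected anomalies and the scores from the next layer drop, the problem is likely somewhere between those two layers.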

@breznak
Member

breznak commented Sep 30, 2021

@zrosin (sorry, I only read this thread briefly) I wanted to let you know about the recent rewrite of anomaly likelihood by David, #958.
