Implementing circular hypervectors with the Holographic Reduced Representations model #108
I guessed that the problem is related to the inverse function of the vectors, so I ran this test to check the validity of the inverse functions:
The result was:
For "HRR" and "VTB" the test failed.
What these tests check is that binding a vector X to the inverse of a vector Y should generate a vector that is as similar to the identity vector (the binding identity) as X is similar to Y. This is exactly the property of the {inverse, binding, similarity} operations that the circular function relies on to generate the second half of the circle.
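As a minimal sketch of this property (not the torchhd test itself), here is a NumPy check using FHRR-style phasor hypervectors, where the property holds exactly; the function names and the choice of FHRR are my own assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 10_000

def random_fhrr(d):
    # FHRR hypervector: unit-magnitude complex phasors with random phase.
    return np.exp(1j * rng.uniform(-np.pi, np.pi, d))

def bind(a, b):
    # FHRR binding: elementwise multiplication.
    return a * b

def inverse(a):
    # Exact inverse for unit phasors: the complex conjugate.
    return np.conj(a)

def sim(a, b):
    # Cosine-like similarity for complex vectors (real part of the inner product).
    return np.real(np.vdot(b, a)) / (np.linalg.norm(a) * np.linalg.norm(b))

x, y = random_fhrr(d), random_fhrr(d)
identity = np.ones(d)  # binding identity: bind(x, identity) == x

lhs = sim(bind(x, inverse(y)), identity)  # similarity of X (*) inv(Y) to identity
rhs = sim(x, y)                           # similarity of X to Y
# For FHRR these two quantities agree up to floating-point error.
```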
Thank you @milad2073 for your investigative work, this is very valuable. The VTB and HRR models are related, so it makes sense that both would suffer from this problem. It is indeed a good idea to add unit tests for the inverse operation. Based on your experiments, do you have a sense of where/why this problem arises?
It shows the relationship between the similarity of two vectors and how close their division (x bind inv(y)) is to the identity. When this relation is linear, we have:
For example, for the first transition of the second half of the circle, with the notation of the paper, we have:
by assuming:
I'm sorry for bothering you with this extra information; I just wanted to share my thinking steps so that others can check whether I made a mistake. In the end, in my opinion, there is a problem with the nature of the HRR operator(s) that prohibits it from being used in this circular function.
I will think about this. It's been a very long time since I read the paper. I think it will always be possible to construct encodings with the required similarity structure to encode cyclic variables for any VSA, but I am prepared to believe that the method described in the paper may not work with HRR.
These are some very useful insights @milad2073! I might have some time at the end of next week to verify the inverse implementation of VTB.
I was reading "Learning with Holographic Reduced Representations", which introduces a better version of the inverse function for HRR. I replaced torchhd's inverse function with their optimized version (I also changed the binding functions slightly to match the new inverse function), which made the results better.
As you can see, although in the second half the vectors are getting more similar to the first vector, the differences are not equal. I am eager to read Mr. @rgayler's explanation of the reasons for this phenomenon.
@milad2073, waiting on me for useful explanations is probably a pretty risky strategy. The fact that your results improved with the modified inverse function certainly suggests that the inverse may be involved in the explanation. If I recall correctly, in his 1994 thesis Tony Plate mentions exact and approximate inverses for HRR and FHRR and discusses numerical stability. I think he argued that the approximate inverses were generally preferable. So there is historical precedent for believing that there are alternative definitions of the inverse and reasons to choose between them.

(I don't use Python, haven't read the torchhd code, and haven't read the unit tests, so take what I'm about to say as a high-level opinion and treat it accordingly.)

I like @milad2073's figure comparing the accuracy of the inverse for different VSAs. I believe that accurate inversion is central (axiomatic, even) to VSAs, so a purported inverse that is not accurate isn't actually an inverse for VSA purposes. Consequently, on the basis of that figure I would also conclude that the implementation of the inverse for HRR and VTB is either buggy or (less likely) that the original theoretical conception has problems. I also believe that HRR and FHRR are equivalent (because you can losslessly transform between them), so the fact that the FHRR inverse works as expected while the HRR inverse does not again suggests to me that the HRR inverse is broken somehow.

I think this inverse behaviour is more fundamental than the circular encoding, so I suggest the HRR inverse should be sorted out first before investigating the circular encoding any further. (Just to prove how truly paranoid I am: are you certain that the cosine similarity function is correct for HRRs? I would be very surprised if it's wrong, but you could waste a lot of time barking up the wrong tree if it is.)
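To illustrate the exact-vs-approximate distinction Plate draws, here is a hedged NumPy sketch (not torchhd code) comparing the two HRR inverses; the helper names are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 1024
# A standard HRR hypervector: i.i.d. normal elements with variance 1/d.
x = rng.normal(0.0, 1.0 / np.sqrt(d), d)

def bind(a, b):
    # HRR binding is circular convolution, computed via the FFT.
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def exact_inverse(a):
    # Exact inverse: reciprocal of the Fourier coefficients.
    return np.real(np.fft.ifft(1.0 / np.fft.fft(a)))

def approx_inverse(a):
    # Plate's approximate inverse (involution): a_inv[k] = a[-k mod d].
    return np.roll(a[::-1], 1)

identity = np.zeros(d)
identity[0] = 1.0  # the binding identity for circular convolution

def cos(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

cos_exact = cos(bind(x, exact_inverse(x)), identity)    # close to 1.0
cos_approx = cos(bind(x, approx_inverse(x)), identity)  # clearly below 1.0
```

The exact inverse unbinds perfectly but can be numerically unstable when a Fourier coefficient is near zero; the involution is stable but only approximate, which matches Plate's trade-off.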
Thank you, Mr. @rgayler for your insightful response.
This clue was great. Changing the random method of HRR to:
made the HRR inverse function gain the required property. This is because the challenge for the inverse function arose where the absolute value of an element (in Fourier space) was near zero.
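One way such a change could look, assuming it constrains all Fourier magnitudes to one (this is my reconstruction, not necessarily the exact change referred to above):

```python
import numpy as np

rng = np.random.default_rng(2)
d = 1024  # an even dimension is assumed by the symmetry handling below

def random_unitary_hrr(d):
    # Sample random phases, force every Fourier magnitude to one, and impose
    # conjugate symmetry so that the inverse FFT is a real vector.
    f = np.exp(1j * rng.uniform(-np.pi, np.pi, d))
    f[0] = 1.0
    f[d // 2] = 1.0
    f[d // 2 + 1:] = np.conj(f[1:d // 2][::-1])
    return np.real(np.fft.ifft(f))

def bind(a, b):
    # HRR binding: circular convolution via the FFT.
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def approx_inverse(a):
    # The involution a_inv[k] = a[-k mod d].
    return np.roll(a[::-1], 1)

x = random_unitary_hrr(d)
identity = np.zeros(d)
identity[0] = 1.0

# With unit Fourier magnitudes the approximate inverse IS the exact inverse,
# since conj(F) equals 1/F whenever |F| = 1, so unbinding recovers the
# identity (nearly) exactly.
recovered = bind(x, approx_inverse(x))
```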
But, and there is always a "but", a problem remains. I will describe my investigation into its cause here, for everyone who wants to do more research on it.
The level-hypervectors have an interesting feature: if we define L'[i] as:
Note: L'[1] is the vector on which the second half of the circle is based. Then L'[i] has two features:
Level-hypervectors in HRR lack both features. The second feature makes the vectors on the two sides of the circle's diameter orthogonal. Here is a figure that shows the second feature for different VSAs. My best guess is that this happens because, when generating level-hypervectors in HRR, we use the original representation of the vector, while the operations (binding and inverse) are applied in the Fourier representation. If this is the case, then to solve the issue we would have to generate level-hypervectors in the Fourier representation, but that would make HRR exactly FHRR.
@milad2073 you are providing really great insights! As for opening a PR to update the unit tests, feel free to open the PR directly, no need to first open an issue.
@milad2073, I had some time to go over the VTB implementation in Torchhd and it looks to me like it implements exactly what was proposed in Vector-Derived Transformation Binding: An Improved Binding Operation for Deep Symbol-Like Processing in Neural Networks. They do mention in the paper that the inverse operation is "approximate", which is probably what causes the strange behavior that you observed.
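A hedged sketch of my reading of the VTB construction and its transpose-based approximate inverse (the `vtb_matrix` helper and the constants are my assumptions for illustration, not the torchhd API):

```python
import numpy as np

rng = np.random.default_rng(3)
m = 32
d = m * m  # VTB requires the dimension to be a perfect square

def vtb_matrix(y):
    # V_y = sqrt(m) * (I_m kron Y'), where Y' is y reshaped to an m x m matrix.
    return np.sqrt(m) * np.kron(np.eye(m), y.reshape(m, m))

def cos(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

x = rng.normal(0.0, 1.0 / np.sqrt(d), d)
y = rng.normal(0.0, 1.0 / np.sqrt(d), d)

z = vtb_matrix(y) @ x        # binding
x_hat = vtb_matrix(y).T @ z  # unbinding via the transpose (approximate inverse)

# V_y is only approximately orthogonal, so the recovered vector is similar
# to x but clearly not identical to it.
quality = cos(x, x_hat)
```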
The unit tests for circular hypervectors don't currently pass with the HRR model. I am not sure why this is as I expected it to behave very similarly to the FHRR model which does pass the unit tests. The failing test can be found here.
What happens is that in the second half of the circle the similarity with respect to the starting point is not increasing:
The first array shows the signs of the changes, and the second array the changes themselves. These arrays should look something like this instead:
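As a hypothetical illustration of the expected pattern (made-up numbers, not the actual test arrays): the similarity to the starting point should fall monotonically over the first half of the circle and rise monotonically over the second half, so the sign array should be all -1 followed by all +1:

```python
import numpy as np

# Hypothetical similarity profile of a circle of 8 hypervectors, each
# measured against the starting point of the circle.
sims = np.array([1.0, 0.75, 0.5, 0.25, 0.0, 0.25, 0.5, 0.75])

changes = np.diff(sims)   # the change in similarity between neighbours
signs = np.sign(changes)  # expected: -1 in the first half, +1 in the second
```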