Implementing circular hypervectors with the Holographic Reduced Representations model #108
I guessed that the problem is related to the inverse function of the vectors, so I ran this test to check the validity of the inverse functions:
The result was:
For "HRR" and "VTB" the test failed.
What these tests check is that binding a vector X to the inverse of a vector Y should generate a vector that is as similar to the identity vector (the binding identity) as X is similar to Y. This is exactly the property of the {inverse, binding, similarity} operations that the circular function relies on to generate the second half of the circle.
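As a minimal sketch of this property (not the torchhd test itself), here is a NumPy check using FHRR-style phasor hypervectors, where the property holds exactly; the function names and the choice of FHRR are my own assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 10_000

def random_fhrr(d):
    # FHRR hypervector: unit-magnitude complex phasors with random phase.
    return np.exp(1j * rng.uniform(-np.pi, np.pi, d))

def bind(a, b):
    # FHRR binding: elementwise multiplication.
    return a * b

def inverse(a):
    # Exact inverse for unit phasors: the complex conjugate.
    return np.conj(a)

def sim(a, b):
    # Cosine-like similarity for complex vectors (real part of the inner product).
    return np.real(np.vdot(b, a)) / (np.linalg.norm(a) * np.linalg.norm(b))

x, y = random_fhrr(d), random_fhrr(d)
identity = np.ones(d)  # binding identity: bind(x, identity) == x

lhs = sim(bind(x, inverse(y)), identity)  # similarity of X (*) inv(Y) to identity
rhs = sim(x, y)                           # similarity of X to Y
# For FHRR these two quantities agree up to floating-point error.
```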
Thank you @milad2073 for your investigative work, this is very valuable. The VTB and HRR models are related, so it makes sense that both would suffer from this problem. It is indeed a good idea to add unit tests for the inverse operation. Based on your experiments, do you have a sense of where/why this problem arises?
It shows the relationship between the similarity of two vectors and how close their division (x bind inv(y)) is to the identity. When this relation is linear, we have:
For example, for the first transition of the second half of the circle, with the notation of the paper, we have:
by assuming:
I'm sorry for bothering you with this extra information; I just wanted to share my thinking steps so that others can check whether I made a mistake. In the end, in my opinion, there is a problem with the nature of the HRR operator(s) that prohibits it from being used in this circular function.
I will think about this. It's been a very long time since I read the paper. I think it will always be possible to construct encodings with the required similarity structure to encode cyclic variables for any VSA, but I am prepared to believe that the method described in the paper may not work with HRR.
These are some very useful insights @milad2073! I might have some time at the end of next week to verify the inverse implementation of VTB.
I was reading "Learning with Holographic Reduced Representations", which introduces a better version of the inverse function for HRR. I replaced torchhd's inverse function with their optimized version (I also changed the binding functions slightly to match the new inverse function), which made the results better.
As you can see, although in the second half the vectors are getting more similar to the first vector, the differences are not equal. I am eager to read Mr. @rgayler's explanation of the reasons for this phenomenon.
@milad2073, waiting on me for useful explanations is probably a pretty risky strategy. The fact that your results improved with the modified inverse function certainly suggests that the inverse may be involved in the explanation. If I recall correctly, in his 1994 thesis Tony Plate mentions exact and approximate inverses for HRR and FHRR and discusses numerical stability. I think he argued that the approximate inverses were generally preferable. So there is historical precedent for believing that there are alternative definitions of the inverse and reasons to choose between them.

(I don't use Python, haven't read the torchhd code, and haven't read the unit tests, so take what I'm about to say as a high-level opinion and treat it accordingly.)

I like @milad2073's figure comparing the accuracy of the inverse for different VSAs. I believe that accurate inversion is central (axiomatic, even) to VSAs, so a purported inverse that is not accurate isn't actually an inverse for VSA purposes. Consequently, on the basis of that figure I would also conclude that the implementation of the inverse for HRR and VTB is either buggy or (less likely) that the original theoretical conception has problems. I also believe that HRR and FHRR are equivalent (because you can losslessly transform between them), so the fact that the FHRR inverse works as expected while the HRR inverse does not again suggests to me that the HRR inverse is broken somehow.

I think this inverse behaviour is more fundamental than the circular encoding, so I suggest the HRR inverse should be sorted out first before investigating the circular encoding any further. (Just to prove how truly paranoid I am: are you certain that the cosine similarity function is correct for HRRs? I would be very surprised if it's wrong, but you could waste a lot of time barking up the wrong tree if it is.)
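To illustrate the exact-vs-approximate distinction Plate draws, here is a hedged NumPy sketch (not torchhd code) comparing the two HRR inverses; the helper names are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 1024
# A standard HRR hypervector: i.i.d. normal elements with variance 1/d.
x = rng.normal(0.0, 1.0 / np.sqrt(d), d)

def bind(a, b):
    # HRR binding is circular convolution, computed via the FFT.
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def exact_inverse(a):
    # Exact inverse: reciprocal of the Fourier coefficients.
    return np.real(np.fft.ifft(1.0 / np.fft.fft(a)))

def approx_inverse(a):
    # Plate's approximate inverse (involution): a_inv[k] = a[-k mod d].
    return np.roll(a[::-1], 1)

identity = np.zeros(d)
identity[0] = 1.0  # the binding identity for circular convolution

def cos(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

cos_exact = cos(bind(x, exact_inverse(x)), identity)    # close to 1.0
cos_approx = cos(bind(x, approx_inverse(x)), identity)  # clearly below 1.0
```

The exact inverse unbinds perfectly but can be numerically unstable when a Fourier coefficient is near zero; the involution is stable but only approximate, which matches Plate's trade-off.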
Thank you, Mr. @rgayler for your insightful response.
This clue was great. Changing the random method of HRR to:
made the HRR inverse function gain the required property. This is because the challenge for the inverse function arose where the absolute value of an element (in Fourier space) was near zero.
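One way such a change could look, assuming it constrains all Fourier magnitudes to one (this is my reconstruction, not necessarily the exact change referred to above):

```python
import numpy as np

rng = np.random.default_rng(2)
d = 1024  # an even dimension is assumed by the symmetry handling below

def random_unitary_hrr(d):
    # Sample random phases, force every Fourier magnitude to one, and impose
    # conjugate symmetry so that the inverse FFT is a real vector.
    f = np.exp(1j * rng.uniform(-np.pi, np.pi, d))
    f[0] = 1.0
    f[d // 2] = 1.0
    f[d // 2 + 1:] = np.conj(f[1:d // 2][::-1])
    return np.real(np.fft.ifft(f))

def bind(a, b):
    # HRR binding: circular convolution via the FFT.
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def approx_inverse(a):
    # The involution a_inv[k] = a[-k mod d].
    return np.roll(a[::-1], 1)

x = random_unitary_hrr(d)
identity = np.zeros(d)
identity[0] = 1.0

# With unit Fourier magnitudes the approximate inverse IS the exact inverse,
# since conj(F) equals 1/F whenever |F| = 1, so unbinding recovers the
# identity (nearly) exactly.
recovered = bind(x, approx_inverse(x))
```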
But, and there is always a "but", a problem remains. I will describe my investigation into its cause here, for everyone who wants to do more research on it.
The level-hypervectors have an interesting feature: if we define L'[i] as:
Note: L'[1] is the vector on which the second half of the circle is based. Then L'[i] has two features:
Level-hypervectors in HRR lack both features. The second feature makes the vectors on the two sides of the circle's diameter orthogonal. Here is a figure that shows the second feature for different VSAs. My best guess is that this happens because, when generating level-hypervectors in HRR, we use the original representation of the vector, while the operations (binding and inverse) are applied in the Fourier representation. If this is the case, then to solve the issue we would have to generate level-hypervectors in the Fourier representation, but that would make HRR exactly FHRR.
@milad2073 you are providing really great insights! As for opening a PR to update the unit tests, feel free to open the PR directly, no need to first open an issue.
@milad2073, I had some time to go over the VTB implementation in Torchhd and it looks to me like it implements exactly what was proposed in Vector-Derived Transformation Binding: An Improved Binding Operation for Deep Symbol-Like Processing in Neural Networks. They do mention in the paper that the inverse operation is "approximate", which is probably what causes the strange behavior that you observed.
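A hedged sketch of my reading of the VTB construction and its transpose-based approximate inverse (the `vtb_matrix` helper and the constants are my assumptions for illustration, not the torchhd API):

```python
import numpy as np

rng = np.random.default_rng(3)
m = 32
d = m * m  # VTB requires the dimension to be a perfect square

def vtb_matrix(y):
    # V_y = sqrt(m) * (I_m kron Y'), where Y' is y reshaped to an m x m matrix.
    return np.sqrt(m) * np.kron(np.eye(m), y.reshape(m, m))

def cos(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

x = rng.normal(0.0, 1.0 / np.sqrt(d), d)
y = rng.normal(0.0, 1.0 / np.sqrt(d), d)

z = vtb_matrix(y) @ x        # binding
x_hat = vtb_matrix(y).T @ z  # unbinding via the transpose (approximate inverse)

# V_y is only approximately orthogonal, so the recovered vector is similar
# to x but clearly not identical to it.
quality = cos(x, x_hat)
```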
The unit tests for circular hypervectors don't currently pass with the HRR model. I am not sure why this is as I expected it to behave very similarly to the FHRR model which does pass the unit tests. The failing test can be found here.
What happens is that in the second half of the circle the similarity with respect to the starting point is not increasing:
The first array shows the signs of the changes, and the second array the changes themselves. These arrays should look something like this instead:
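As a hypothetical illustration of the expected pattern (made-up numbers, not the actual test arrays): the similarity to the starting point should fall monotonically over the first half of the circle and rise monotonically over the second half, so the sign array should be all -1 followed by all +1:

```python
import numpy as np

# Hypothetical similarity profile of a circle of 8 hypervectors, each
# measured against the starting point of the circle.
sims = np.array([1.0, 0.75, 0.5, 0.25, 0.0, 0.25, 0.5, 0.75])

changes = np.diff(sims)   # the change in similarity between neighbours
signs = np.sign(changes)  # expected: -1 in the first half, +1 in the second
```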