How to align and get the begin and end time of the query in the database? #17

Closed · 980202006 opened this issue Sep 6, 2021 · 4 comments
Labels: question (Further information is requested)

@980202006

If I query for the start and end positions of an audio clip, the top-n results returned by faiss.search may be non-contiguous indices. How can I align them?

@mimbres (Owner) commented Sep 6, 2021

The simplest answer is the matcher (see the detailed example in the next comment).

@mimbres (Owner) commented Sep 6, 2021

@980202006 Here is one example to illustrate the details.
Assume the query is a sequence of 3 contiguous segments.

Reference DB: [z0, z1, z2, z3,...,z10, z11, z12,...z20]
Query: [z1', z2', z3']

The top-N (e.g. N=5) results of faiss.search for each query segment might look like:

I(z1') = [1, 2, 11, 3, 15] # sorted top-5 result indices
D(z1') = [0.1, 0.2, 0.3, 0.4, 0.5] # distances are not required at this stage.

I(z2') = [11, 2, 1, 3, 12]
D(z2') = ...
 
I(z3') = [3, 2, 12, 4, 11]
D(z3') = ...
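
For concreteness, here is a minimal sketch of how such segment-wise top-N results could be obtained with faiss and numpy. The toy embeddings, the 128-d dimensionality, and the flat index type are illustrative assumptions, not the repository's exact pipeline:

import numpy as np
import faiss

# Toy stand-ins for L2-normalized fingerprint embeddings (assumed 128-d).
rng = np.random.default_rng(0)
db_emb = rng.standard_normal((21, 128)).astype(np.float32)   # z0 ... z20
db_emb /= np.linalg.norm(db_emb, axis=1, keepdims=True)
query_emb = (db_emb[1:4] + 0.05 * rng.standard_normal((3, 128))).astype(np.float32)  # z1', z2', z3'

index = faiss.IndexFlatL2(db_emb.shape[1])   # exact L2 search; an IVF-PQ index is queried the same way
index.add(db_emb)

N = 5
D, I = index.search(query_emb, N)   # row k of I and D plays the role of I(z_k') and D(z_k') above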

Here the result [11, 2, 1, 3, 12] for the second segment, I(z2'), implies that the sequence could start from [10, 1, 0, 2, 11]. Based on this idea, we perform offset compensation for each segment-wise search result.

# time-offset compensation
I'(z1') = I(z1')  # [1, 2, 11, 3, 15]
I'(z2') = I(z2') - 1 # [10, 1, 0, 2, 11]
I'(z3') = I(z3') - 2 # [1, 0, 10, 2, 9]
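
In code, this compensation is just a per-row subtraction of each segment's position within the query. A small numpy sketch, reusing the I matrix from the search sketch above:

# I has shape (L, N): row k holds the top-N DB indices for query segment k.
offsets = np.arange(I.shape[0]).reshape(-1, 1)   # [[0], [1], [2]]
I_comp = I - offsets                             # candidate start indices implied by each hit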

We then apply np.unique to remove duplicate candidate start indices:

# unique
c_start_index = unique_set(all_I's) # [0, 1, 2, 3, 9, 10, 11, 15]
# from the `c_start_index`, we get the final candidates, c
c = [[0,1,2], [1,2,3], [2,3,4], [3,4,5], [9,10,11], [10,11,12], [11,12,13], [15,16,17]]
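
A sketch of this deduplication step; negative start indices produced by the subtraction (impossible alignments) and candidates running past the end of the DB can simply be dropped:

L = I_comp.shape[0]                                # query length in segments (3 here)
c_start_index = np.unique(I_comp)                  # sorted unique candidate start indices
c_start_index = c_start_index[c_start_index >= 0]  # discard negative (impossible) starts
c = c_start_index[:, None] + np.arange(L)          # each row is one candidate index sequence
c = c[c[:, -1] < db_emb.shape[0]]                  # drop candidates that overrun the DB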

We calculate the total distance (called a score, but actually a penalty: a smaller value is better) for each candidate sequence:

# d is distance
score(c[0]) = score([0,1,2]) = d(z1', z0) + d(z2', z1) + d(z3', z2)
score(c[1]) = score([1,2,3]) = d(z1', z1) + d(z2', z2) + d(z3', z3)
...
score(c[7]) = score([15,16,17]) = ...

Finally, we take argmin(score); the first and last indices of the winning candidate give the begin and end positions of the query within the reference DB.

In the paper, I described this using the argmax of cosine similarity instead of the argmin of Euclidean distance, but for L2-normalized embeddings they select the same candidate.
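
Putting the scoring and argmin steps together, under the same assumptions as the sketches above (squared L2 distance as the penalty; with normalized embeddings, maximizing cosine similarity selects the same candidate):

# Gather the DB embeddings of every candidate sequence and sum the
# segment-wise distances to the query segments; the smallest total wins.
cand_emb = db_emb[c]                                   # (num_candidates, L, dim)
scores = ((cand_emb - query_emb[None, :, :]) ** 2).sum(axis=(1, 2))
best = c[np.argmin(scores)]                            # e.g. [1, 2, 3] for the example query
begin_seg, end_seg = best[0], best[-1]
# Multiplying begin_seg / end_seg by the segment hop duration gives the
# begin and end times of the query within the reference audio.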

@mhmd-mst commented Jul 31, 2023

This is a great example. I want to ask about evaluating such cases: gt_ids = test_ids + dummy_db_shape[0] in the original code assumes the same number of fingerprints for the DB and the queries, but here that is not the case. So I should find the time at which the query starts and use the index of that start-point fingerprint as the ground truth, right?

Also, this code

fake_recon_index, index_shape = load_memmap_data(
    emb_dummy_dir, 'dummy_db', append_extra_length=query_shape[0], display=False)
fake_recon_index[dummy_db_shape[0]:dummy_db_shape[0] + query_shape[0], :] = db[:, :]

only works if query_shape[0] == db_shape[0]; otherwise I should use db_shape[0], am I right?

@mimbres (Owner) commented Aug 2, 2023

@mhmd-mst

  • If I remember correctly, gt_ids = test_ids + dummy_db_shape[0] was required because our DB was always constructed with dummy_db first, so the test_db indices follow the dummy_db indices (see the sketch below).
    If you set dummy_db as empty (using np.empty()), then simply gt_ids = test_ids.

  • If you don't use dummy_db, then again it is simply fake_recon_index = db. The reason I use on-disk memmap variables like fake_recon_index is to save memory. Imagine if dummy_db were your actual DB with a size of 2 TB!
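
A hedged sketch of the indexing layout being described, following the snippet quoted above (the load_memmap_data call and the shape variables come from that snippet; which length to append depends on what is actually written into the reserved slice):

# Layout of the combined search index: [dummy_db | db], so every
# ground-truth index into db must be shifted past the dummy part.
gt_ids = test_ids + dummy_db_shape[0]   # dummy DB present
# gt_ids = test_ids                     # dummy DB empty (np.empty)

# On-disk memmap: reserve extra rows after dummy_db, then write db into them.
# The reserved length must match the array written below, i.e. db_shape[0];
# append_extra_length=query_shape[0] only works when the two are equal.
fake_recon_index, index_shape = load_memmap_data(
    emb_dummy_dir, 'dummy_db',
    append_extra_length=db_shape[0],
    display=False)
fake_recon_index[dummy_db_shape[0]:dummy_db_shape[0] + db_shape[0], :] = db[:, :]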
