Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug(not a bug maybe): When using static lib, vsag::init() is not always called. #108

Open
Coien-rr opened this issue Oct 31, 2024 · 2 comments
Assignees
Labels
kind/bug Something isn't working

Comments

@Coien-rr
Copy link
Contributor

Describe the bug
A clear and concise description of what the bug is.
When using static lib, if I'm not explicitly calling vsag::init(), vsag may be uninitialized.
I ran into a situation where the distance calculation function was invalid and caused the program to crash
image
if i explicitly calling vsag::init(), vsag can work well, but init log print twice (It's annoying and meaningless
image
But when using shared lib, everything goes fine.

To Reproduce
Codes to reproduce the behavior:
Just Use simple_hnsw.cpp in examples/, But comments the explicitly calling vsag::init() statement

// paste codes here

// Copyright 2024-present the vsag project
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

#include <vsag/vsag.h>

#include <iostream>

int
main(int argc, char** argv) {
    // vsag::init();

    int64_t num_vectors = 10000;
    int64_t dim = 128;

    // prepare ids and vectors
    auto ids = new int64_t[num_vectors];
    auto vectors = new float[dim * num_vectors];

    std::mt19937 rng;
    rng.seed(47);
    std::uniform_real_distribution<> distrib_real;
    for (int64_t i = 0; i < num_vectors; ++i) {
        ids[i] = i;
    }
    for (int64_t i = 0; i < dim * num_vectors; ++i) {
        vectors[i] = distrib_real(rng);
    }

    // create index
    auto hnsw_build_paramesters = R"(
    {
        "dtype": "float32",
        "metric_type": "l2",
        "dim": 128,
        "hnsw": {
            "max_degree": 16,
            "ef_construction": 100
        }
    }
    )";
    auto index = vsag::Factory::CreateIndex("hnsw", hnsw_build_paramesters).value();
    auto base = vsag::Dataset::Make();
    base->NumElements(num_vectors)->Dim(dim)->Ids(ids)->Float32Vectors(vectors)->Owner(false);
    index->Build(base);

    // prepare a query vector
    auto query_vector = new float[dim];  // memory will be released by query the dataset
    for (int64_t i = 0; i < dim; ++i) {
        query_vector[i] = distrib_real(rng);
    }

    // search on the index
    auto hnsw_search_parameters = R"(
    {
        "hnsw": {
            "ef_search": 100
        }
    }
    )";
    int64_t topk = 10;
    auto query = vsag::Dataset::Make();
    query->NumElements(1)->Dim(dim)->Float32Vectors(query_vector)->Owner(true);
    auto result = index->KnnSearch(query, topk, hnsw_search_parameters).value();

    // print the results
    std::cout << "results: " << std::endl;
    for (int64_t i = 0; i < result->GetDim(); ++i) {
        std::cout << result->GetIds()[i] << ": " << result->GetDistances()[i] << std::endl;
    }

    // free memory
    delete[] ids;
    delete[] vectors;

    return 0;
}

build config

add_executable (simple_hnsw simple_hnsw.cpp)
target_link_libraries (simple_hnsw vsag_static)

Environment

  • OS(Arch Linux):
  • vsag version(9195f6955e808b45314bd37db8cbf15b07736ed4):
  • compiler version(GCC 14.2.0):
  • interface(cpp):
  • build mode(debug)

Expected behavior
A clear and concise description of what you expected to happen.
i hope using static vsag lib without explicitly calling vsag::init().
or remove repeat log

Screenshots
If applicable, add screenshots to help explain your problem.

Additional context
Add any other context about the problem here.

@wxyucs wxyucs added the kind/bug Something isn't working label Nov 4, 2024
@wxyucs
Copy link
Collaborator

wxyucs commented Nov 4, 2024

@Coien-rr thanks for your report. I think it's a bug.

VSAG required initialization to switch to a suitable SIMD backend to accelerate distance calculation. It's calling from a static variable now for ease of use:
(https://github.com/alipay/vsag/blob/6f3af4f8b892807e6baee2dc4de5c067213f0e0a/src/vsag.cpp#L61)

However, it seems that the initialization is not always working. And initial this library with magic statics may not be a good choice. I consider that to add an init() and tini() function and make it required that the user call it explicitly in main functions.

@jiaweizone @LHT129 What's your suggestion?

@Coien-rr
Copy link
Contributor Author

Coien-rr commented Nov 4, 2024

@wxyucs Thanks for your explanation!
i fixed reinit VSAG error which introduced in using static lib, and propose a pr #112
Is it necessary to check if VSAG is initialized? since the program crashes without any indication
Like this(the code example is same as which in To Reproduce):

 ./build/examples/cpp/simple_hnsw
zsh: segmentation fault (core dumped)  ./build/examples/cpp/simple_hnsw

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants