
> Guys :) I'm not quite sure about the situation you've encountered. My detailed situation is as follows #80

Open
taras-bl opened this issue Jan 2, 2025 · 0 comments

Guys :) I'm not quite sure about the situation you've encountered. My detailed situation is as follows:

Suggestions

The insert process depends heavily on the LLM and embedding models: the LLM extracts entities and relations, and the embedding model builds the index. This requires a significant amount of computing resources. If you run it locally, a GPU-accelerated model is recommended; CPU-only will be much slower. A model with fewer parameters processes faster, but may also perform worse, so you have to strike a balance. I also noticed that using an external graph DB and vector DB can accelerate the insert process (and the query process as well). We're currently working on how to integrate all of these.
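The GPU-vs-CPU choice for the local embedding/LLM step can be sketched as below. This is a minimal sketch, not LightRAG code; `pick_device` is a hypothetical helper, and the `torch` import is optional so the logic degrades to CPU when PyTorch is absent:

```python
def pick_device(cuda_available: bool) -> str:
    """Return the torch device string to run the embedding model on."""
    return "cuda" if cuda_available else "cpu"

try:
    import torch  # only needed when actually running a model
    device = pick_device(torch.cuda.is_available())
except ImportError:
    device = pick_device(False)  # no torch installed: fall back to CPU

print(f"embedding model would run on: {device}")
```

In a real setup you would then move the torch/HF embedding model with `model.to(device)` before indexing, which is where the large speed difference between GPU and CPU-only runs comes from.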

About my situation

we use Ollama local service to power the framework, and a work station with 8 × Tesla P100 GPU.

Evaluation

Using a fake fairy tale (2k tokens, generated by GPT-4o, so no LLM has seen this story before) to test LightRAG and GraphRAG: the insert process took 2-3 minutes with LightRAG, versus more than 15 minutes with GraphRAG.

Funnily enough, I do have decent specs (an NVIDIA GPU), torch is running, and it detects CUDA as well. But for some reason GPU acceleration doesn't kick in. One difference is that I am using HF models instead of Ollama, as you described.

I peeked at the source code, and it does seem to offload the embedding model to CUDA, yet it still shows 0% GPU usage.
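One quick way to debug "model was sent to CUDA but GPU sits at 0%" is to check where the model's parameters actually live at inference time. This is a sketch, not LightRAG code; `model_devices` is a hypothetical helper that works on anything exposing `named_parameters()`-style pairs:

```python
def model_devices(named_params):
    """Collect the distinct devices a model's parameters live on.

    named_params: an iterable of (name, param) pairs where each param
    has a .device attribute -- e.g. model.named_parameters() for a
    torch / Hugging Face model.
    """
    return sorted({str(p.device) for _, p in named_params})

# Typical usage (assumes torch + a Hugging Face model are installed):
#   model = AutoModel.from_pretrained("...").to("cuda")
#   print(model_devices(model.named_parameters()))
# If this prints ["cpu"], the object you run inference with is not the
# one that was moved to CUDA -- a common cause of 0% GPU utilization.
```

Two other things worth ruling out: the tokenized inputs must be moved to the same device as the model before each forward pass, and on a multi-GPU box the monitoring tool may be watching a different GPU index than the one the model was placed on.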

Originally posted by @PoyBoi in HKUDS/LightRAG#212 (comment)
