Guys :) I'm not quite sure about the situation you've encountered. My detailed situation is as follows.
Suggestions
The insert process depends heavily on the LLM and the embedding model: the LLM extracts entities and relations, and the embedding model builds the index. This takes a significant amount of compute. If you run it locally, a GPU-accelerated model is recommended; on CPU only it will be much slower. A model with fewer parameters will process faster, but be aware that it may also perform worse, so you have to strike a balance. I've also noticed that using an external graph DB and vector DB can speed up the insert process (and the query process as well). We're currently working on integrating all of these.
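For reference, a minimal sketch of the kind of setup I mean, based on the Ollama example in the LightRAG README. The model names, embedding dimension, and working directory are placeholders, and the exact helper names may differ across versions:

```python
from lightrag import LightRAG
from lightrag.llm import ollama_model_complete, ollama_embedding
from lightrag.utils import EmbeddingFunc

rag = LightRAG(
    working_dir="./rag_storage",
    llm_model_func=ollama_model_complete,   # LLM used for entity/relation extraction
    llm_model_name="qwen2.5:7b",            # fewer params = faster insert, possibly worse quality
    embedding_func=EmbeddingFunc(
        embedding_dim=768,
        max_token_size=8192,
        func=lambda texts: ollama_embedding(texts, embed_model="nomic-embed-text"),
    ),
)
```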
About my situation
We use a local Ollama service to power the framework, on a workstation with 8 × Tesla P100 GPUs.
Evaluation
We used a fake fairy tale (2k tokens, generated by GPT-4o, so no LLM has seen this story) to test LightRAG and GraphRAG. The insert process took 2–3 min with LightRAG, while GraphRAG took more than 15 min.
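If you want to reproduce the timing on your own hardware, something like the following works (the file name is just a placeholder, and `rag` is the instance from the sketch above):

```python
import time

with open("fairy_tale.txt", encoding="utf-8") as f:
    story = f.read()

t0 = time.perf_counter()
rag.insert(story)   # entity/relation extraction + embedding/indexing
print(f"Insert took {time.perf_counter() - t0:.1f} s")
```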
Funnily enough, I do have decent specs (an NVIDIA GPU), I have torch running, and it detects CUDA as well. But for some reason it doesn't work with GPU-enabled acceleration. One difference is that I'm using HF models instead of Ollama, as you described.
I peeked at the source code, and it does seem to be offloading the embedding model to CUDA, yet it still shows 0% GPU usage.
Originally posted by @PoyBoi in HKUDS/LightRAG#212 (comment)
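Not sure this is the cause, but one thing worth checking is whether the HF embedding model actually ends up on the GPU. A quick sanity check, assuming the `hf_embedding` setup from the LightRAG README (the model name below is illustrative, not necessarily the one you use):

```python
import torch
from transformers import AutoModel

print(torch.cuda.is_available())                 # should be True

embed_model = AutoModel.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
embed_model = embed_model.to("cuda")             # move the embedding model to the GPU explicitly
print(next(embed_model.parameters()).device)     # should print cuda:0
```

Then watch `nvidia-smi` while `rag.insert()` runs. Note that a small embedding model can finish each batch so quickly that sampled utilization still reads ~0% even when CUDA is actually being used.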