A modular graph-based Retrieval-Augmented Generation (RAG) system

s106916/graphrag

 
 

Ollama Hacked Solution

I am working on integrating Ollama into GraphRAG with the proper configuration. Here is a temporary solution:

Installation

Clone the repository, build the wheel with Poetry, and install it with pip:

git clone https://github.com/s106916/graphrag
cd ./graphrag
poetry build
pip install ./dist/graphrag-0.1.1-py3-none-any.whl
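To confirm that the wheel ended up in the active environment, a quick check along these lines may help. This is a minimal sketch using only the standard library; the helper name installed_version is hypothetical and not part of GraphRAG. The wheel filename above suggests the reported version should be 0.1.1, but that is an assumption based on the filename alone.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str = "graphrag") -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent.

    Hypothetical helper for illustration; not part of GraphRAG itself.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("graphrag is not installed in this environment")
    else:
        print(f"graphrag version: {found}")
```

If the script reports that the package is missing, the most likely cause is that pip and Poetry were run in different virtual environments.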

Configuration

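Edit your settings.yaml file as follows.

In the embeddings: section, set model: to ollama:<your_embedding_model>. For example:

model: ollama:nomic-embed-text

Both the llm: and embeddings: sections also point api_base at the local Ollama server (http://127.0.0.1:11434/v1).

settings.yaml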

encoding_model: cl100k_base
skip_workflows: []
llm:
  api_key: ${GRAPHRAG_API_KEY}
  type: openai_chat # or azure_openai_chat or ollama_chat
  model: mistral
  model_supports_json: true # recommended if this is available for your model.
  # max_tokens: 4000
  # request_timeout: 180.0
  api_base: http://127.0.0.1:11434/v1
  # api_version: '2024-02-01'
  # organization: <organization_id>
  # deployment_name: default
  # tokens_per_minute: 150_000 # set a leaky bucket throttle
  # requests_per_minute: 10_000 # set a leaky bucket throttle
  # max_retries: 10
  # max_retry_wait: 10.0
  # sleep_on_rate_limit_recommendation: true # whether to sleep when azure suggests wait-times
  # concurrent_requests: 5 # the number of parallel inflight requests that may be made
  # temperature: 0 # temperature for sampling
  # top_p: 1 # top-p sampling
  # n: 1 # Number of completions to generate

parallelization:
  stagger: 0.3
  # num_threads: 50 # the number of threads to use for parallel processing

async_mode: threaded # or asyncio

embeddings:
  ## parallelization: override the global parallelization settings for embeddings
  async_mode: threaded # or asyncio
  llm:
    api_key: ${GRAPHRAG_API_KEY}
    type: openai_embedding # or azure_openai_embedding # or ollama_embedding
    model: ollama:nomic-embed-text
    api_base: http://127.0.0.1:11434/v1
    # api_version: '2024-02-01'
    # organization: <organization_id>
    # deployment_name: default-ada
    # tokens_per_minute: 150_000 # set a leaky bucket throttle
    # requests_per_minute: 10_000 # set a leaky bucket throttle
    # max_retries: 10
    # max_retry_wait: 10.0
    # sleep_on_rate_limit_recommendation: true # whether to sleep when azure suggests wait-times
    # concurrent_requests: 25 # the number of parallel inflight requests that may be made
    # batch_size: 16 # the number of documents to send in a single request
    # batch_max_tokens: 8191 # the maximum number of tokens to send in a single request
    # target: required # or optional
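Before running an indexing job, it can be worth checking that the server behind api_base is reachable and that the configured models are available. The sketch below assumes Ollama exposes an OpenAI-style model listing at <api_base>/models that returns a JSON object with a data array of entries carrying an id field; that endpoint and response shape are assumptions about your Ollama version, and the function names are hypothetical.

```python
import json
import urllib.request
from typing import List, Optional

# Same value as api_base in the configuration above.
API_BASE = "http://127.0.0.1:11434/v1"


def parse_model_ids(payload: bytes) -> List[str]:
    """Extract model ids from an OpenAI-style model list response.

    Assumes the shape {"data": [{"id": "..."}, ...]}.
    """
    body = json.loads(payload)
    return [entry["id"] for entry in body.get("data", []) if "id" in entry]


def list_ollama_models(api_base: str = API_BASE,
                       timeout: float = 3.0) -> Optional[List[str]]:
    """Return the model ids reported by the server, or None on any failure."""
    url = api_base.rstrip("/") + "/models"  # assumed OpenAI-compatible route
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return parse_model_ids(response.read())
    except (OSError, ValueError):
        # OSError covers refused connections, timeouts, and HTTP errors;
        # ValueError covers malformed URLs and invalid JSON.
        return None


if __name__ == "__main__":
    models = list_ollama_models()
    if models is None:
        print(f"Could not reach a model listing at {API_BASE}")
    else:
        print("Models reported by the server:", ", ".join(models) or "(none)")
```

Note that the server reports plain model names such as nomic-embed-text; the ollama: prefix in the embeddings section is specific to this configuration and is not expected to appear in the listing.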
