tokens_per_minute: 150_000 # set a leaky bucket throttle
requests_per_minute: 10_000 # set a leaky bucket throttle
max_retries: 10
max_retry_wait: 10.0
sleep_on_rate_limit_recommendation: true # whether to sleep when azure suggests wait-times
concurrent_requests: 1 # the number of parallel inflight requests that may be made
parallelization:
stagger: 0.3
num_threads: 50 # the number of threads to use for parallel processing
async_mode: threaded # or asyncio
embeddings:
## parallelization: override the global parallelization settings for embeddings
async_mode: threaded # or asyncio
llm:
api_key: ${GRAPHRAG_API_KEY}
type: openai_embedding # or azure_openai_embedding
model: mistral
api_base: http://localhost:8000/v1
# api_version: 2024-02-15-preview
# organization: <organization_id>
# deployment_name: <azure_model_deployment_name>
# tokens_per_minute: 150_000 # set a leaky bucket throttle
# requests_per_minute: 10_000 # set a leaky bucket throttle
# max_retries: 10
# max_retry_wait: 10.0
# sleep_on_rate_limit_recommendation: true # whether to sleep when azure suggests wait-times
# concurrent_requests: 25 # the number of parallel inflight requests that may be made
batch_size: 1 # the number of documents to send in a single request
# batch_max_tokens: 8191 # the maximum number of tokens to send in a single request
# target: required # or optional
chunks:
size: 300
overlap: 100
group_by_columns: [id] # by default, we don't allow chunks to cross documents
input:
type: file # or blob
file_type: text # or csv
base_dir: "input"
file_encoding: utf-8
file_pattern: ".*\\.txt$"
04:36:19,87 datashaper.workflow.workflow ERROR Error executing verb "text_embed" in create_final_entities: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (16,) + inhomogeneous part.
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/datashaper/workflow/workflow.py", line 415, in _execute_verb
result = await result
File "/usr/local/lib/python3.10/dist-packages/graphrag/index/verbs/text/embed/text_embed.py", line 105, in text_embed
return await _text_embed_in_memory(
File "/usr/local/lib/python3.10/dist-packages/graphrag/index/verbs/text/embed/text_embed.py", line 130, in _text_embed_in_memory
result = await strategy_exec(texts, callbacks, cache, strategy_args)
File "/usr/local/lib/python3.10/dist-packages/graphrag/index/verbs/text/embed/strategies/openai.py", line 61, in run
embeddings = await _execute(llm, text_batches, ticker, semaphore)
File "/usr/local/lib/python3.10/dist-packages/graphrag/index/verbs/text/embed/strategies/openai.py", line 105, in _execute
results = await asyncio.gather(*futures)
File "/usr/local/lib/python3.10/dist-packages/graphrag/index/verbs/text/embed/strategies/openai.py", line 100, in embed
result = np.array(chunk_embeddings.output)
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (16,) + inhomogeneous part.
04:36:19,92 graphrag.index.reporting.file_workflow_callbacks INFO Error executing verb "text_embed" in create_final_entities: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (16,) + inhomogeneous part. details=None
04:36:19,96 graphrag.index.run ERROR error running workflow create_final_entities
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/graphrag/index/run.py", line 323, in run_pipeline
result = await workflow.run(context, callbacks)
File "/usr/local/lib/python3.10/dist-packages/datashaper/workflow/workflow.py", line 369, in run
timing = await self._execute_verb(node, context, callbacks)
File "/usr/local/lib/python3.10/dist-packages/datashaper/workflow/workflow.py", line 415, in _execute_verb
result = await result
File "/usr/local/lib/python3.10/dist-packages/graphrag/index/verbs/text/embed/text_embed.py", line 105, in text_embed
return await _text_embed_in_memory(
File "/usr/local/lib/python3.10/dist-packages/graphrag/index/verbs/text/embed/text_embed.py", line 130, in _text_embed_in_memory
result = await strategy_exec(texts, callbacks, cache, strategy_args)
File "/usr/local/lib/python3.10/dist-packages/graphrag/index/verbs/text/embed/strategies/openai.py", line 61, in run
embeddings = await _execute(llm, text_batches, ticker, semaphore)
File "/usr/local/lib/python3.10/dist-packages/graphrag/index/verbs/text/embed/strategies/openai.py", line 105, in _execute
results = await asyncio.gather(*futures)
File "/usr/local/lib/python3.10/dist-packages/graphrag/index/verbs/text/embed/strategies/openai.py", line 100, in embed
result = np.array(chunk_embeddings.output)
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (16,) + inhomogeneous part.
04:36:19,97 graphrag.index.reporting.file_workflow_callbacks INFO Error running pipeline! details=None
Additional Information
GraphRAG Version: v0.1.1
Operating System: Ubuntu 20.04
Python Version: 3.10.14
Related Issues: 442
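In your settings.yaml you are using mistral as the embedding model, which is likely what causes the inhomogeneous shape: the error means the embeddings in a batch do not all have the same length, so NumPy cannot stack them into one array. Try a dedicated embedding model instead, for example one from nomic-ai or mixedbread.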
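To make the "inhomogeneous shape" message concrete, here is a minimal sketch of a check on a batch of embedding vectors. The helper name vector_length_counts and the sample batch are hypothetical and are not part of GraphRAG; the batch simply mimics 16 results in which not every vector has the same length, which is the situation np.array rejects in the traceback.

```python
from collections import Counter


def vector_length_counts(vectors):
    """Count how many vectors have each length (hypothetical helper)."""
    return Counter(len(v) for v in vectors)


# Made-up batch of 16 results, mirroring the (16,) in the error message.
# Fourteen vectors have 4 values and two have 7, so no rectangular
# (16, n) array can be built from them.
batch = [[0.0] * 4 for _ in range(14)] + [[0.0] * 7 for _ in range(2)]

counts = vector_length_counts(batch)
is_rectangular = len(counts) == 1

print(dict(counts))    # {4: 14, 7: 2}
print(is_rectangular)  # False
```

If a check like this reports more than one length for the output of your embedding endpoint, the model or server is probably not returning one fixed-size vector per input text.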
When I ran into this issue, I created a repository for serving Hugging Face models from local endpoints with an API similar to OpenAI's. You can find the repo here: https://github.com/rushizirpe/open-llm-server
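Before pointing GraphRAG at any local OpenAI-style server, it may help to confirm that its embeddings response contains vectors of a single dimension. The sketch below assumes the response follows the OpenAI embeddings format (a "data" list whose items each carry an "embedding" list). The function name embedding_dimensions and the sample payload are made up for illustration, and no request is sent.

```python
import json


def embedding_dimensions(response_text):
    """Return the set of vector lengths found in an OpenAI-style embeddings response."""
    payload = json.loads(response_text)
    return {len(item["embedding"]) for item in payload["data"]}


# Hand-written sample payload standing in for a real server response.
sample_response = json.dumps({
    "object": "list",
    "data": [
        {"object": "embedding", "index": 0, "embedding": [0.1, 0.2, 0.3]},
        {"object": "embedding", "index": 1, "embedding": [0.4, 0.5, 0.6]},
    ],
    "model": "example-embedding-model",
})

dims = embedding_dimensions(sample_response)
print(dims)  # {3}
```

A result with exactly one element means every input received a vector of the same size; more than one element would reproduce the ValueError shown in the logs.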
Describe the issue
I was trying to run GraphRAG using llama_cpp and got the following error:
❌ create_final_entities
⠼ GraphRAG Indexer
├── Loading Input (text) - 1 files loaded (0 filtered) ━━━━━━ 100% 0:00:… 0:00:…
├── create_base_text_units
├── create_base_extracted_entities
├── create_summarized_entities
├── create_base_entity_graph
None
⠴ GraphRAG Indexer
├── Loading Input (text) - 1 files loaded (0 filtered) ━━━━━━ 100% 0:00:… 0:00:…
├── create_base_text_units
├── create_base_extracted_entities
├── create_summarized_entities
├── create_base_entity_graph
└── create_final_entities
❌ Errors occurred during the pipeline run, see logs for more details.
Steps to reproduce
Use the settings.yaml file to replicate the issue
GraphRAG Config Used
The settings.yaml is as follows:
Logs and screenshots
Indexing Engine Log file shows this:
Logs.json File shows this:
{"type": "error", "data": "Error executing verb "text_embed" in create_final_entities: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (16,) + inhomogeneous part.", "stack": "Traceback (most recent call last):\n File "/usr/local/lib/python3.10/dist-packages/datashaper/workflow/workflow.py", line 415, in _execute_verb\n result = await result\n File "/usr/local/lib/python3.10/dist-packages/graphrag/index/verbs/text/embed/text_embed.py", line 105, in text_embed\n return await _text_embed_in_memory(\n File "/usr/local/lib/python3.10/dist-packages/graphrag/index/verbs/text/embed/text_embed.py", line 130, in _text_embed_in_memory\n result = await strategy_exec(texts, callbacks, cache, strategy_args)\n File "/usr/local/lib/python3.10/dist-packages/graphrag/index/verbs/text/embed/strategies/openai.py", line 61, in run\n embeddings = await _execute(llm, text_batches, ticker, semaphore)\n File "/usr/local/lib/python3.10/dist-packages/graphrag/index/verbs/text/embed/strategies/openai.py", line 105, in _execute\n results = await asyncio.gather(*futures)\n File "/usr/local/lib/python3.10/dist-packages/graphrag/index/verbs/text/embed/strategies/openai.py", line 100, in embed\n result = np.array(chunk_embeddings.output)\nValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (16,) + inhomogeneous part.\n", "source": "setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (16,) + inhomogeneous part.", "details": null}
{"type": "error", "data": "Error running pipeline!", "stack": "Traceback (most recent call last):\n File "/usr/local/lib/python3.10/dist-packages/graphrag/index/run.py", line 323, in run_pipeline\n result = await workflow.run(context, callbacks)\n File "/usr/local/lib/python3.10/dist-packages/datashaper/workflow/workflow.py", line 369, in run\n timing = await self._execute_verb(node, context, callbacks)\n File "/usr/local/lib/python3.10/dist-packages/datashaper/workflow/workflow.py", line 415, in _execute_verb\n result = await result\n File "/usr/local/lib/python3.10/dist-packages/graphrag/index/verbs/text/embed/text_embed.py", line 105, in text_embed\n return await _text_embed_in_memory(\n File "/usr/local/lib/python3.10/dist-packages/graphrag/index/verbs/text/embed/text_embed.py", line 130, in _text_embed_in_memory\n result = await strategy_exec(texts, callbacks, cache, strategy_args)\n File "/usr/local/lib/python3.10/dist-packages/graphrag/index/verbs/text/embed/strategies/openai.py", line 61, in run\n embeddings = await _execute(llm, text_batches, ticker, semaphore)\n File "/usr/local/lib/python3.10/dist-packages/graphrag/index/verbs/text/embed/strategies/openai.py", line 105, in _execute\n results = await asyncio.gather(*futures)\n File "/usr/local/lib/python3.10/dist-packages/graphrag/index/verbs/text/embed/strategies/openai.py", line 100, in embed\n result = np.array(chunk_embeddings.output)\nValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (16,) + inhomogeneous part.\n", "source": "setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (16,) + inhomogeneous part.", "details": null}