
Performance with Scientific Spec Sheets #656

Open
istranic opened this issue Nov 1, 2024 · 1 comment
Labels
question Further information is requested

Comments


istranic commented Nov 1, 2024

Has anyone tried this tool with scientific spec sheets, like those provided for manufacturing tools, probes, measurement devices, etc.?

Is there anything in the way the models are prompted or the way the data is processed that would make that not work well?

@dosubot dosubot bot added the question Further information is requested label Nov 1, 2024

dosubot bot commented Nov 1, 2024

Hey there, @istranic! I'm a bot here to help you with bug fixes, answer questions, and guide you on becoming a contributor. Let's tackle this issue together while we wait for a human maintainer.

The tool has specific configuration options for processing scientific or technical documents, which can help in handling scientific spec sheets effectively. These include:

  • Parsing Configuration:

    • use_doc_details: Set to true to utilize document details during parsing.
    • chunk_size: Set to 7000 to define the size of each chunk of the document.
    • overlap: Set to 250 to specify the overlap between chunks.
  • Answer Configuration:

    • evidence_k: Set to 8 to determine the number of evidence pieces to consider.
    • answer_max_sources: Set to 3 to limit the maximum number of sources for an answer.
    • max_concurrent_requests: Set to 8 to control the number of concurrent requests.
  • Prompts Configuration:

    • use_json: Set to true to enable JSON format for prompts.
  • LLM Configuration:

    • Various rate limits are specified for different models, such as gpt-4o, gpt-4-turbo, and gpt-3.5-turbo, with limits ranging from 40,000 to 2,000,000 per minute.
  • Embedding Configuration:

    • rate_limit: Set to 1,000,000 per minute.

These configurations are designed to manage resources and performance effectively, which should help in processing scientific spec sheets. However, there is no specific information on limitations or issues with highly technical or domain-specific language in the repository [1][2].
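The options listed above can be gathered into a single settings mapping. Below is a minimal sketch in Python of what such a configuration might look like; the key names and values come from the list above, but the nested-dict layout is illustrative and is not the tool's actual settings API.

```python
# Hypothetical settings mapping mirroring the configuration values listed above.
# Key names (use_doc_details, chunk_size, etc.) come from the bot's answer;
# the surrounding dict structure is an illustration, not the tool's real API.
settings = {
    "parsing": {
        "use_doc_details": True,  # use document details during parsing
        "chunk_size": 7000,       # size of each document chunk
        "overlap": 250,           # overlap between adjacent chunks
    },
    "answer": {
        "evidence_k": 8,               # evidence pieces to consider
        "answer_max_sources": 3,       # max sources cited per answer
        "max_concurrent_requests": 8,  # concurrent request cap
    },
    "prompts": {
        "use_json": True,  # enable JSON format for prompts
    },
    "embedding": {
        "rate_limit": 1_000_000,  # embedding rate limit per minute
    },
}
```

For dense technical documents such as spec sheets, the values most worth experimenting with are `chunk_size` and `overlap` (so tables and parameter lists are not split mid-entry) and `evidence_k` (so enough candidate passages are retrieved from terminology-heavy text).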

To continue talking to Dosu, mention @dosu.
