【New Features】Chunking method available? #3

Kunlun-Zhu · 2024-05-27T02:12:31Z

To the best of my understanding, the retriever only returns the doc ID, without the chunking method used for each document.

I would also suggest supporting API-based models (ChatGPT, Gemini, Claude, etc.) in the generator.

DaoD · 2024-05-27T02:22:26Z

ignorejjj · 2024-05-27T05:51:21Z

The retriever retrieves similar items (including ID and text) from the document corpus. As I understand it, document chunking is applied during corpus construction, so the chunking method does not need to be returned by the retriever.
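To illustrate the point about chunking at corpus-construction time, here is a minimal sketch, not this project's actual code. It assumes a simple word-window chunker; the function names (`chunk_document`, `build_corpus`) and the `id`/`text` field names are hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def chunk_document(doc_id: str, text: str, chunk_size: int = 100,
                   overlap: int = 0) -> List[Dict[str, str]]:
    """Split one document into word-window chunks, each with its own ID."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks: List[Dict[str, str]] = []
    for start in range(0, len(words), step):
        window = words[start:start + chunk_size]
        # Hypothetical ID scheme: "<doc_id>_<chunk index>"
        chunks.append({"id": f"{doc_id}_{len(chunks)}", "text": " ".join(window)})
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the document
    return chunks


def build_corpus(docs: Dict[str, str], chunk_size: int = 100,
                 overlap: int = 0) -> List[Dict[str, str]]:
    """Chunk every document once, up front; the retriever then indexes chunks."""
    corpus: List[Dict[str, str]] = []
    for doc_id, text in docs.items():
        corpus.extend(chunk_document(doc_id, text, chunk_size, overlap))
    return corpus
```

Under this assumption, each retrieved item is already a chunk with its own ID and text, so the retriever has nothing extra to report about how the chunk was produced.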

For the generator, we did not initially implement black-box models because of their limitations (they cannot return logits, and API calls incur costs). For completeness, we plan to support mainstream API-based models, such as ChatGPT, within the next few weeks.
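As a rough sketch of what such a generator could look like (purely an assumption, not the planned implementation), the wrapper below takes any text-in, text-out callable as its backend. The class name `APIGenerator` and its methods are hypothetical, and no real provider API is called here.

```python
from typing import Callable, List


class APIGenerator:
    """Hypothetical wrapper around a black-box, API-served model."""

    def __init__(self, complete: Callable[[str], str]):
        # `complete` is any function that sends one prompt to a hosted model
        # and returns the generated text (supplied by the caller).
        self.complete = complete

    def generate(self, prompts: List[str]) -> List[str]:
        """Return one completion per prompt, in order."""
        return [self.complete(prompt) for prompt in prompts]

    def logits(self, prompts: List[str]):
        # Black-box APIs generally expose text only, so logit-based
        # methods cannot run on this kind of generator.
        raise NotImplementedError("logits are not available from a black-box API")


# Usage with a stand-in backend instead of a real API call:
fake_backend = lambda prompt: prompt.upper()
generator = APIGenerator(fake_backend)
answers = generator.generate(["hello", "world"])
```

Passing the backend in as a callable keeps the sketch independent of any particular provider's client library.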

If I have misunderstood anything, please feel free to make suggestions!

Kunlun-Zhu · 2024-05-27T06:14:10Z

Thanks for the reply, looking forward to new updates.

linchen111 · 2024-05-29T16:04:25Z

Hope I can use this to chunk my HTML, haha.

DaoD added the enhancement New feature or request label May 27, 2024

Kunlun-Zhu closed this as completed May 27, 2024

DaoD reopened this May 28, 2024
