Final touches to HTTP and Python API references #3019

Merged 1 commit on Oct 25, 2024
70 changes: 35 additions & 35 deletions api/http_api_reference.md
@@ -5,7 +5,7 @@

---

:::tip NOTE
:::tip API GROUPING
Dataset Management
:::

@@ -32,7 +32,7 @@ Creates a dataset.
- `"embedding_model"`: `string`
- `"permission"`: `string`
- `"chunk_method"`: `string`
- `"parser_config"`: `Dataset.ParserConfig`
- `"parser_config"`: `object`

#### Request example

@@ -86,11 +86,11 @@ curl --request POST \
- `"laws"`: Laws
- `"presentation"`: Presentation
- `"picture"`: Picture
- `"one"`:One
- `"one"`: One
- `"knowledge_graph"`: Knowledge Graph
- `"email"`: Email

- `"parser_config"`: (*Body parameter*)
- `"parser_config"`: (*Body parameter*), `object`
The configuration settings for the dataset parser. A `ParserConfig` object contains the following attributes:
- `"chunk_token_count"`: Defaults to `128`.
- `"layout_recognize"`: Defaults to `true`.
@@ -237,8 +237,8 @@ curl --request PUT \
- `dataset_id`: (*Path parameter*)
The ID of the dataset to update.
- `"name"`: `string`
The name of the dataset to update.
- `"embedding_model"`: `string` The embedding model name to update.
The revised name of the dataset.
- `"embedding_model"`: `string` The updated embedding model name.
- Ensure that `"chunk_count"` is `0` before updating `"embedding_model"`.
- `"chunk_method"`: `enum<string>` The chunking method for the dataset. Available options:
- `"naive"`: General
@@ -572,7 +572,7 @@ curl --request GET \
Success:

```text
This is a test to verify the file download functionality.
This is a test to verify the file download feature.
```

Failure:
@@ -938,15 +938,15 @@ Lists chunks in a specified document.
### Request

- Method: GET
- URL: `/api/v1/dataset/{dataset_id}/document/{document_id}/chunk?keywords={keywords}&offset={offset}&limit={limit}&id={id}`
- URL: `/api/v1/dataset/{dataset_id}/document/{document_id}/chunk?keywords={keywords}&offset={offset}&limit={limit}&id={chunk_id}`
- Headers:
- `'Authorization: Bearer {YOUR_API_KEY}'`

#### Request example

```bash
curl --request GET \
--url http://{address}/api/v1/dataset/{dataset_id}/document/{document_id}/chunk?keywords={keywords}&offset={offset}&limit={limit}&id={id} \
--url http://{address}/api/v1/dataset/{dataset_id}/document/{document_id}/chunk?keywords={keywords}&offset={offset}&limit={limit}&id={chunk_id} \
--header 'Authorization: Bearer {YOUR_API_KEY}'
```

@@ -956,13 +956,13 @@ curl --request GET \
The associated dataset ID.
- `document_ids`: (*Path parameter*)
The associated document ID.
- `"keywords"`(*Filter parameter*), `string`
- `keywords`(*Filter parameter*), `string`
The keywords used to match chunk content.
- `"offset"`(*Filter parameter*), `string`
- `offset`(*Filter parameter*), `string`
The starting index for the chunks to retrieve. Defaults to `1`.
- `"limit"`(*Filter parameter*), `integer`
- `limit`(*Filter parameter*), `integer`
The maximum number of chunks to retrieve. Defaults to `1024`.
- `"id"`(*Filter parameter*), `string`
- `id`(*Filter parameter*), `string`
The ID of the chunk to retrieve.

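The filter parameters above can be composed into the request URL programmatically. A minimal sketch, assuming the helper name and address placeholder; `urlencode` takes care of escaping the keywords.

```python
# Illustrative sketch: building the list-chunks URL from the documented
# filter parameters, with the documented defaults for offset and limit.
from urllib.parse import urlencode

def build_list_chunks_url(address, dataset_id, document_id,
                          keywords=None, offset=1, limit=1024, chunk_id=None):
    """Return the GET URL for listing chunks (illustrative helper)."""
    params = {"offset": offset, "limit": limit}
    if keywords:
        params["keywords"] = keywords
    if chunk_id:
        params["id"] = chunk_id
    return (f"http://{address}/api/v1/dataset/{dataset_id}"
            f"/document/{document_id}/chunk?{urlencode(params)}")

print(build_list_chunks_url("localhost:9380", "ds1", "doc1", keywords="test"))
```
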
### Response
@@ -1210,21 +1210,21 @@ curl --request POST \

- `"question"`: (*Body parameter*), `string`, *Required*
The user query or query keywords.
- `"dataset_ids"`: (*Body parameter*) `list[string]`, *Required*
The IDs of the datasets to search from.
- `"dataset_ids"`: (*Body parameter*) `list[string]`
The IDs of the datasets to search. If you do not set this argument, ensure that you set `"document_ids"`.
- `"document_ids"`: (*Body parameter*), `list[string]`
The IDs of the documents to search from.
The IDs of the documents to search. Ensure that all selected documents use the same embedding model. Otherwise, an error will occur. If you do not set this argument, ensure that you set `"dataset_ids"`.
- `"offset"`: (*Body parameter*), `integer`
The starting index for the documents to retrieve. Defaults to `1`.
- `"limit"`: (*Body parameter*)
The maximum number of chunks to retrieve. Defaults to `1024`.
- `"similarity_threshold"`: (*Body parameter*)
The minimum similarity score. Defaults to `0.2`.
- `"vector_similarity_weight"`: (*Body parameter*)
- `"vector_similarity_weight"`: (*Body parameter*), `weight`
The weight of vector cosine similarity. Defaults to `0.3`. If x represents the vector cosine similarity, then (1 - x) is the term similarity weight.
- `"top_k"`: (*Body parameter*)
- `"top_k"`: (*Body parameter*), `integer`
The number of chunks engaged in vector cosine computation. Defaults to `1024`.
- `"rerank_id"`: (*Body parameter*)
- `"rerank_id"`: (*Body parameter*), `integer`
The ID of the rerank model.
- `"keyword"`: (*Body parameter*), `boolean`
Indicates whether to enable keyword-based matching:
@@ -1335,7 +1335,7 @@ curl --request POST \
- `"dataset_ids"`: (*Body parameter*), `list[string]`
The IDs of the associated datasets.
- `"llm"`: (*Body parameter*), `object`
The LLM settings for the chat assistant to create. If it is not explicitly set, a dictionary with the following values will be generated as the default. An `llm` object contains the following attributes:
The LLM settings for the chat assistant to create. If it is not explicitly set, a JSON object with the following values will be generated as the default. An `llm` JSON object contains the following attributes:
- `"model_name"`, `string`
The chat model name. If not set, the user's default chat model will be used.
- `"temperature"`: `float`
@@ -1349,7 +1349,7 @@ curl --request POST \
- `"max_token"`: `integer`
The maximum length of the model’s output, measured in the number of tokens (words or pieces of words). Defaults to `512`.
- `"prompt"`: (*Body parameter*), `object`
Instructions for the LLM to follow. A `prompt` object contains the following attributes:
Instructions for the LLM to follow. If it is not explicitly set, a JSON object with the following values will be generated as the default. A `prompt` JSON object contains the following attributes:
- `"similarity_threshold"`: `float` RAGFlow uses a hybrid of weighted keyword similarity and vector cosine similarity during retrieval. This argument sets the threshold for similarities between the user query and chunks. If a similarity score falls below this threshold, the corresponding chunk will be excluded from the results. The default value is `0.2`.
- `"keywords_similarity_weight"`: `float` This argument sets the weight of keyword similarity in the hybrid similarity score with vector cosine similarity or reranking model similarity. By adjusting this weight, you can control the influence of keyword similarity in relation to other similarity measures. The default value is `0.7`.
- `"top_n"`: `int` This argument specifies the number of top chunks with similarity scores above the `similarity_threshold` that are fed to the LLM. The LLM will *only* access these 'top N' chunks. The default value is `8`.
@@ -1467,7 +1467,7 @@ curl --request PUT \
- `chat_id`: (*Path parameter*)
The ID of the chat assistant to update.
- `"name"`: (*Body parameter*), `string`, *Required*
The name of the chat assistant.
The revised name of the chat assistant.
- `"avatar"`: (*Body parameter*), `string`
Base64 encoding of the avatar.
- `"dataset_ids"`: (*Body parameter*), `list[string]`
@@ -1603,19 +1603,19 @@ curl --request GET \

#### Request parameters

- `page`: (*Path parameter*), `integer`
- `page`: (*Filter parameter*), `integer`
Specifies the page on which the chat assistants will be displayed. Defaults to `1`.
- `page_size`: (*Path parameter*), `integer`
- `page_size`: (*Filter parameter*), `integer`
The number of chat assistants on each page. Defaults to `1024`.
- `orderby`: (*Path parameter*), `string`
- `orderby`: (*Filter parameter*), `string`
The attribute by which the results are sorted. Available options:
- `create_time` (default)
- `update_time`
- `"desc"`: (*Path parameter*), `boolean`
- `desc`: (*Filter parameter*), `boolean`
Indicates whether the retrieved chat assistants should be sorted in descending order. Defaults to `true`.
- `id`: (*Path parameter*), `string`
- `id`: (*Filter parameter*), `string`
The ID of the chat assistant to retrieve.
- `name`: (*Path parameter*), `string`
- `name`: (*Filter parameter*), `string`
The name of the chat assistant to retrieve.

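The filter parameters above translate into a simple query string. A hedged sketch follows: the helper name is an assumption, and the lowercase `"true"`/`"false"` encoding of `desc` is an assumption about how the server parses booleans in query strings.

```python
# Illustrative sketch: building the query string for listing chat
# assistants, with the documented defaults for page, page_size and orderby.
from urllib.parse import urlencode

def build_list_chats_query(page=1, page_size=1024, orderby="create_time",
                           desc=True, chat_id=None, name=None):
    """Return the query string for the list-chat-assistants request."""
    params = {
        "page": page,
        "page_size": page_size,
        "orderby": orderby,
        "desc": str(desc).lower(),  # assumed "true"/"false" encoding
    }
    if chat_id:
        params["id"] = chat_id
    if name:
        params["name"] = name
    return urlencode(params)

print(build_list_chats_query(name="assistant_1"))
```
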
### Response
Expand Down Expand Up @@ -1775,7 +1775,7 @@ curl --request PUT \
--header 'Authorization: Bearer {YOUR_API_KEY}' \
--data '
{
"name": "Updated session"
"name": "<REVISED_SESSION_NAME_HERE>"
}'
```

@@ -1786,7 +1786,7 @@ curl --request PUT \
- `session_id`: (*Path parameter*)
The ID of the session to update.
- `"name"`: (*Body Parameter), `string`
The name of the session to update.
The revised name of the session.

### Response

Expand Down Expand Up @@ -1818,7 +1818,7 @@ Lists sessions associated with a specified chat assistant.
### Request

- Method: GET
- URL: `/api/v1/chat/{chat_id}/session?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&name={dataset_name}&id={dataset_id}`
- URL: `/api/v1/chat/{chat_id}/session?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&name={session_name}&id={session_id}`
- Headers:
- `'Authorization: Bearer {YOUR_API_KEY}'`

@@ -1949,7 +1949,7 @@ Failure:

**POST** `/api/v1/chat/{chat_id}/completion`

Asks a question to start a conversation.
Asks a question to start an AI-powered conversation.

### Request

@@ -1972,7 +1972,7 @@ curl --request POST \
--header 'Authorization: Bearer {YOUR_API_KEY}' \
--data-binary '
{
"question": "Hello!",
"question": "What is RAGFlow?",
"stream": true
}'
```
@@ -1982,11 +1982,11 @@ curl --request POST \
- `chat_id`: (*Path parameter*)
The ID of the associated chat assistant.
- `"question"`: (*Body Parameter*), `string` *Required*
The question to start an AI chat.
The question to start an AI-powered conversation.
- `"stream"`: (*Body Parameter*), `boolean`
Indicates whether to output responses in a streaming way:
- `true`: Enable streaming.
- `false`: (Default) Disable streaming.
- `false`: Disable streaming (default).
- `"session_id"`: (*Body Parameter*)
The ID of the session. If it is not provided, a new session will be generated.

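When `"stream"` is `true`, the response body arrives incrementally. The sketch below consumes such a stream, assuming each event is a newline-delimited `data:` JSON line (a common server-sent-events shape) and that a non-dict `data` value marks the end of the stream — verify both assumptions against your server before relying on them.

```python
# Illustrative consumer for a streaming completion response.
import json

def iter_answers(lines):
    """Yield the incremental answer text from streamed 'data:' lines."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue
        event = json.loads(line[len("data:"):])
        data = event.get("data")
        # Only dict payloads carry an answer; other values end the stream.
        if isinstance(data, dict) and "answer" in data:
            yield data["answer"]

# Simulated stream for demonstration only.
stream = [
    'data: {"code": 0, "data": {"answer": "RAGFlow is"}}',
    'data: {"code": 0, "data": {"answer": "RAGFlow is a RAG engine."}}',
    'data: {"code": 0, "data": true}',  # assumed end-of-stream marker
]
for answer in iter_answers(stream):
    print(answer)
```

Note that each event carries the answer accumulated so far, so a client that wants only the delta should print `answer[len(previous):]`, as the Python example at the end of this reference does.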
24 changes: 12 additions & 12 deletions api/python_api_reference.md
@@ -73,7 +73,7 @@ The chunking method of the dataset to create. Available options:
- `"laws"`: Laws
- `"presentation"`: Presentation
- `"picture"`: Picture
- `"one"`:One
- `"one"`: One
- `"knowledge_graph"`: Knowledge Graph
- `"email"`: Email

@@ -210,8 +210,8 @@ Updates configurations for the current dataset.

A dictionary representing the attributes to update, with the following keys:

- `"name"`: `str` The name of the dataset to update.
- `"embedding_model"`: `str` The embedding model name to update.
- `"name"`: `str` The revised name of the dataset.
- `"embedding_model"`: `str` The updated embedding model name.
- Ensure that `"chunk_count"` is `0` before updating `"embedding_model"`.
- `"chunk_method"`: `str` The chunking method for the dataset. Available options:
- `"naive"`: General
@@ -223,7 +223,7 @@ A dictionary representing the attributes to update, with the following keys:
- `"laws"`: Laws
- `"presentation"`: Presentation
- `"picture"`: Picture
- `"one"`:One
- `"one"`: One
- `"knowledge_graph"`: Knowledge Graph
- `"email"`: Email

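The `"chunk_count"` precondition above is easy to enforce in client code. A minimal sketch, assuming a dataset object exposing the documented `chunk_count` attribute and `update()` method; the guard function itself is hypothetical.

```python
# Illustrative guard for the documented rule: chunk_count must be 0
# before the embedding model of a dataset may be changed.
def safe_update_embedding_model(dataset, model_name):
    """Switch embedding models only on an empty dataset."""
    if getattr(dataset, "chunk_count", 0) != 0:
        raise ValueError(
            "chunk_count must be 0 before updating embedding_model")
    dataset.update({"embedding_model": model_name})
```
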
@@ -753,11 +753,11 @@ The user query or query keywords. Defaults to `""`.

#### dataset_ids: `list[str]`, *Required*

The IDs of the datasets to search from.
The IDs of the datasets to search. Defaults to `None`. If you do not set this argument, ensure that you set `document_ids`.

#### document_ids: `list[str]`

The IDs of the documents to search from. Defaults to `None`.
The IDs of the documents to search. Defaults to `None`. You must ensure all selected documents use the same embedding model. Otherwise, an error will occur. If you do not set this argument, ensure that you set `dataset_ids`.

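The contract above — at least one of `dataset_ids` or `document_ids` must be set — can be checked before calling the SDK. The function name below is hypothetical; only the rule it enforces comes from this reference.

```python
# Illustrative pre-flight check mirroring the documented requirement
# that dataset_ids or document_ids (or both) must be provided.
def check_retrieve_scope(dataset_ids=None, document_ids=None):
    """Validate the retrieval scope before calling retrieve()."""
    if not dataset_ids and not document_ids:
        raise ValueError("set at least one of dataset_ids or document_ids")
    return {"dataset_ids": dataset_ids or [],
            "document_ids": document_ids or []}
```
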
#### offset: `int`

Expand Down Expand Up @@ -932,7 +932,7 @@ Updates configurations for the current chat assistant.

A dictionary representing the attributes to update, with the following keys:

- `"name"`: `str` The name of the chat assistant to update.
- `"name"`: `str` The revised name of the chat assistant.
- `"avatar"`: `str` Base64 encoding of the avatar. Defaults to `""`
- `"dataset_ids"`: `list[str]` The datasets to update.
- `"llm"`: `dict` The LLM settings:
@@ -1117,15 +1117,15 @@ session = assistant.create_session()
Session.update(update_message: dict)
```

Updates the current session name.
Updates the current session.

### Parameters

#### update_message: `dict[str, Any]`, *Required*

A dictionary representing the attributes to update, with only one key:

- `"name"`: `str` The name of the session to update.
- `"name"`: `str` The revised name of the session.

### Returns

@@ -1247,7 +1247,7 @@ assistant.delete_sessions(ids=["id_1","id_2"])
Session.ask(question: str, stream: bool = False) -> Optional[Message, iter[Message]]
```

Asks a question to start a conversation.
Asks a question to start an AI-powered conversation.

### Parameters

@@ -1260,7 +1260,7 @@ Indicates whether to output responses in a streaming way:
Indicates whether to output responses in a streaming way:

- `True`: Enable streaming.
- `False`: (Default) Disable streaming.
- `False`: Disable streaming (default).

### Returns

@@ -1324,4 +1324,4 @@ while True:
    for ans in session.ask(question, stream=True):
        print(ans.content[len(cont):], end='', flush=True)
        cont = ans.content
```
```