cloud: Polish availability and limitation of Vector Search #19494
base: release-7.5
Changes from all commits
d68f99d
5d92d7b
a8ad90a
31a0065
80f1ec6
9bf913f
2b73dc1
561426e
5a39337
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -18,9 +18,9 @@ Using vector data types provides the following advantages over using the [`JSON` | |||||
- Dimension enforcement: You can specify a dimension to forbid inserting vectors with different dimensions. | ||||||
- Optimized storage format: Vector data types are optimized for handling vector data, offering better space efficiency and performance compared to `JSON` types. | ||||||
|
||||||
> **Note:** | ||||||
> **Note** | ||||||
> | ||||||
> Vector data types are only available for [TiDB Cloud Serverless](/tidb-cloud/select-cluster-tier.md#tidb-cloud-serverless) clusters. | ||||||
> TiDB Vector Search is only available for TiDB (>= v8.4) and [TiDB Cloud Serverless](/tidb-cloud/select-cluster-tier.md#tidb-cloud-serverless). It is not available for [TiDB Cloud Dedicated](/tidb-cloud/select-cluster-tier.md#tidb-cloud-dedicated). | ||||||
|
||||||
|
||||||
## Syntax | ||||||
|
||||||
|
@@ -231,9 +231,9 @@ Currently, direct casting between Vector and other data types (such as `JSON`) i | |||||
|
||||||
Note that vector data type columns stored in a table cannot be converted to other data types using `ALTER TABLE ... MODIFY COLUMN ...`. | ||||||
|
||||||
## Restrictions | ||||||
## Limitations | ||||||
|
||||||
For restrictions on vector data types, see [Vector search limitations](/tidb-cloud/vector-search-limitations.md) and [Vector index restrictions](/tidb-cloud/vector-search-index.md#restrictions). | ||||||
See [Vector data type limitations](/tidb-cloud/vector-search-limitations.md#vector-data-type-limitations). | ||||||
|
||||||
## MySQL compatibility | ||||||
|
||||||
|
@@ -243,4 +243,4 @@ Vector data types are TiDB specific, and are not supported in MySQL. | |||||
|
||||||
- [Vector Functions and Operators](/tidb-cloud/vector-search-functions-and-operators.md) | ||||||
- [Vector Search Index](/tidb-cloud/vector-search-index.md) | ||||||
- [Improve Vector Search Performance](/tidb-cloud/vector-search-improve-performance.md) | ||||||
- [Improve Vector Search Performance](/tidb-cloud/vector-search-improve-performance.md) |
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -9,25 +9,25 @@ This document lists the functions and operators available for Vector data types. | |||||
|
||||||
> **Note** | ||||||
> | ||||||
> Vector data types and these vector functions are only available for [TiDB Cloud Serverless](/tidb-cloud/select-cluster-tier.md#tidb-cloud-serverless) clusters. | ||||||
> TiDB Vector Search is only available for TiDB (>= v8.4) and [TiDB Cloud Serverless](/tidb-cloud/select-cluster-tier.md#tidb-cloud-serverless). It is not available for [TiDB Cloud Dedicated](/tidb-cloud/select-cluster-tier.md#tidb-cloud-dedicated). | ||||||
|
||||||
## Vector functions | ||||||
|
||||||
The following functions are designed specifically for [Vector data types](/tidb-cloud/vector-search-data-types.md). | ||||||
|
||||||
**Vector distance functions:** | ||||||
|
||||||
| Function Name | Description | | ||||||
| --------------------------------------------------------- | ---------------------------------------------------------------- | | ||||||
| Function Name | Description | | ||||||
| ----------------------------------------------------------- | ---------------------------------------------------------------- | | ||||||
| [`VEC_L2_DISTANCE`](#vec_l2_distance) | Calculates L2 distance (Euclidean distance) between two vectors | | ||||||
| [`VEC_COSINE_DISTANCE`](#vec_cosine_distance) | Calculates the cosine distance between two vectors | | ||||||
| [`VEC_NEGATIVE_INNER_PRODUCT`](#vec_negative_inner_product) | Calculates the negative of the inner product between two vectors | | ||||||
| [`VEC_L1_DISTANCE`](#vec_l1_distance) | Calculates L1 distance (Manhattan distance) between two vectors | | ||||||
|
||||||
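For readers of this diff, the four distance functions in the table above can be sketched in plain Python. This is a minimal illustration of the standard mathematical definitions only, not TiDB's implementation; the function names and the `_check_dims` helper are hypothetical, and cosine distance is assumed here to mean one minus the cosine similarity.

```python
import math


def _check_dims(p, q):
    # Hypothetical helper: distances are only defined for equal dimensions.
    if len(p) != len(q):
        raise ValueError("vectors must have the same dimension")


def l2_distance(p, q):
    # Euclidean distance: square root of the summed squared differences.
    _check_dims(p, q)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def l1_distance(p, q):
    # Manhattan distance: sum of the absolute differences.
    _check_dims(p, q)
    return float(sum(abs(a - b) for a, b in zip(p, q)))


def negative_inner_product(p, q):
    # Negated dot product, so that "smaller is closer" like the other distances.
    _check_dims(p, q)
    return -float(sum(a * b for a, b in zip(p, q)))


def cosine_distance(p, q):
    # Assumed definition: 1 - cosine similarity.
    # Raises ZeroDivisionError if either vector has zero length.
    _check_dims(p, q)
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    return 1.0 - dot / (norm_p * norm_q)
```

How the SQL functions treat edge cases such as zero vectors or `NULL` arguments is not covered by this sketch and should be checked against the function reference.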
**Other vector functions:** | ||||||
|
||||||
| Function Name | Description | | ||||||
| ------------------------------- | --------------------------------------------------- | | ||||||
| Function Name | Description | | ||||||
| --------------------------------- | --------------------------------------------------- | | ||||||
| [`VEC_DIMS`](#vec_dims) | Returns the dimension of a vector | | ||||||
| [`VEC_L2_NORM`](#vec_l2_norm) | Calculates the L2 norm (Euclidean norm) of a vector | | ||||||
| [`VEC_FROM_TEXT`](#vec_from_text) | Converts a string into a vector | | ||||||
|
@@ -48,17 +48,17 @@ For more information about how vector arithmetic works, see [Vector Data Type | | |||||
|
||||||
**Aggregate (GROUP BY) functions:** | ||||||
|
||||||
| Name | Description | | ||||||
| :----------------------- | :----------------------------------------------- | | ||||||
| Name | Description | | ||||||
| :------------------------------------------------------------------------------------------------------------ | :----------------------------------------------- | | ||||||
| [`COUNT()`](https://dev.mysql.com/doc/refman/8.0/en/aggregate-functions.html#function_count) | Return a count of the number of rows returned | | ||||||
| [`COUNT(DISTINCT)`](https://dev.mysql.com/doc/refman/8.0/en/aggregate-functions.html#function_count-distinct) | Return the count of a number of different values | | ||||||
| [`MAX()`](https://dev.mysql.com/doc/refman/8.0/en/aggregate-functions.html#function_max) | Return the maximum value | | ||||||
| [`MIN()`](https://dev.mysql.com/doc/refman/8.0/en/aggregate-functions.html#function_min) | Return the minimum value | | ||||||
|
||||||
**Comparison functions and operators:** | ||||||
|
||||||
| Name | Description | | ||||||
| ---------------------------------------- | ----------------------------------------------------- | | ||||||
| Name | Description | | ||||||
| ------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------- | | ||||||
| [`BETWEEN ... AND ...`](https://dev.mysql.com/doc/refman/8.0/en/comparison-operators.html#operator_between) | Check whether a value is within a range of values | | ||||||
| [`COALESCE()`](https://dev.mysql.com/doc/refman/8.0/en/comparison-operators.html#function_coalesce) | Return the first non-NULL argument | | ||||||
| [`=`](https://dev.mysql.com/doc/refman/8.0/en/comparison-operators.html#operator_equal) | Equal operator | | ||||||
|
@@ -67,8 +67,8 @@ For more information about how vector arithmetic works, see [Vector Data Type | | |||||
| [`>=`](https://dev.mysql.com/doc/refman/8.0/en/comparison-operators.html#operator_greater-than-or-equal) | Greater than or equal operator | | ||||||
| [`GREATEST()`](https://dev.mysql.com/doc/refman/8.0/en/comparison-operators.html#function_greatest) | Return the largest argument | | ||||||
| [`IN()`](https://dev.mysql.com/doc/refman/8.0/en/comparison-operators.html#operator_in) | Check whether a value is within a set of values | | ||||||
| [`IS NULL`](https://dev.mysql.com/doc/refman/8.0/en/comparison-operators.html#operator_is-null) | Test whether a value is `NULL` | | ||||||
| [`ISNULL()`](https://dev.mysql.com/doc/refman/8.0/en/comparison-operators.html#function_isnull) | Test whether the argument is `NULL` | | ||||||
| [`IS NULL`](https://dev.mysql.com/doc/refman/8.0/en/comparison-operators.html#operator_is-null) | Test whether a value is `NULL` | | ||||||
| [`ISNULL()`](https://dev.mysql.com/doc/refman/8.0/en/comparison-operators.html#function_isnull) | Test whether the argument is `NULL` | | ||||||
| [`LEAST()`](https://dev.mysql.com/doc/refman/8.0/en/comparison-operators.html#function_least) | Return the smallest argument | | ||||||
| [`<`](https://dev.mysql.com/doc/refman/8.0/en/comparison-operators.html#operator_less-than) | Less than operator | | ||||||
| [`<=`](https://dev.mysql.com/doc/refman/8.0/en/comparison-operators.html#operator_less-than-or-equal) | Less than or equal operator | | ||||||
|
@@ -80,19 +80,19 @@ For more information about how vectors are compared, see [Vector Data Type | Com | |||||
|
||||||
**Control flow functions:** | ||||||
|
||||||
| Name | Description | | ||||||
| :------------------------------------------------------------------------------------------------ | :--------------------------- | | ||||||
| [`CASE`](https://dev.mysql.com/doc/refman/8.0/en/flow-control-functions.html#operator_case) | Case operator | | ||||||
| [`IF()`](https://dev.mysql.com/doc/refman/8.0/en/flow-control-functions.html#function_if) | If/else construct | | ||||||
| [`IFNULL()`](https://dev.mysql.com/doc/refman/8.0/en/flow-control-functions.html#function_ifnull) | Null if/else construct | | ||||||
| Name | Description | | ||||||
| :------------------------------------------------------------------------------------------------ | :----------------------------- | | ||||||
| [`CASE`](https://dev.mysql.com/doc/refman/8.0/en/flow-control-functions.html#operator_case) | Case operator | | ||||||
| [`IF()`](https://dev.mysql.com/doc/refman/8.0/en/flow-control-functions.html#function_if) | If/else construct | | ||||||
| [`IFNULL()`](https://dev.mysql.com/doc/refman/8.0/en/flow-control-functions.html#function_ifnull) | Null if/else construct | | ||||||
| [`NULLIF()`](https://dev.mysql.com/doc/refman/8.0/en/flow-control-functions.html#function_nullif) | Return `NULL` if expr1 = expr2 | | ||||||
|
||||||
**Cast functions:** | ||||||
|
||||||
| Name | Description | | ||||||
| :------------------------------------------------------------------------------------------ | :----------------------------- | | ||||||
| Name | Description | | ||||||
| :------------------------------------------------------------------------------------------ | :--------------------------------- | | ||||||
| [`CAST()`](https://dev.mysql.com/doc/refman/8.0/en/cast-functions.html#function_cast) | Cast a value as a string or vector | | ||||||
| [`CONVERT()`](https://dev.mysql.com/doc/refman/8.0/en/cast-functions.html#function_convert) | Cast a value as a string | | ||||||
| [`CONVERT()`](https://dev.mysql.com/doc/refman/8.0/en/cast-functions.html#function_convert) | Cast a value as a string | | ||||||
|
||||||
For more information about how to use `CAST()`, see [Vector Data Type | Cast](/tidb-cloud/vector-search-data-types.md#cast). | ||||||
|
||||||
|
@@ -222,7 +222,7 @@ Examples: | |||||
VEC_L2_NORM(vector) | ||||||
``` | ||||||
|
||||||
Calculates the [L2 norm](https://en.wikipedia.org/wiki/Norm_(mathematics)) (Euclidean norm) of a vector using the following formula: | ||||||
Calculates the [L2 norm](<https://en.wikipedia.org/wiki/Norm_(mathematics)>) (Euclidean norm) of a vector using the following formula: | ||||||
Why is there a pair of `<>` around the link URL?

It is added by the formatter by default, although the rendering is fine without it.
||||||
|
||||||
$NORM(p)=\sqrt {\sum \limits _{i=1}^{n}{p_{i}^{2}}}$ | ||||||
|
||||||
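As a quick check of the formula above, the L2 norm can be computed in a few lines of Python. This is a sketch of the formula only; the function name `l2_norm` is hypothetical and it does not call TiDB.

```python
import math


def l2_norm(p):
    # sqrt(p_1^2 + p_2^2 + ... + p_n^2), following the NORM(p) formula.
    return math.sqrt(sum(x * x for x in p))


print(l2_norm([3, 4]))  # prints 5.0
```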
|
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -11,7 +11,7 @@ Throughout this tutorial, you will develop this AI application using [TiDB Vecto | |||||
|
||||||
> **Note** | ||||||
> | ||||||
> TiDB Vector Search is currently in beta and is not available for [TiDB Cloud Dedicated](/tidb-cloud/select-cluster-tier.md#tidb-cloud-dedicated) clusters. | ||||||
> TiDB Vector Search is only available for TiDB (>= v8.4) and [TiDB Cloud Serverless](/tidb-cloud/select-cluster-tier.md#tidb-cloud-serverless). It is not available for [TiDB Cloud Dedicated](/tidb-cloud/select-cluster-tier.md#tidb-cloud-dedicated). | ||||||
|
||||||
## Prerequisites | ||||||
|
||||||
|
@@ -54,28 +54,28 @@ pip install sqlalchemy pymysql sentence-transformers tidb-vector python-dotenv | |||||
|
||||||
3. Ensure the configurations in the connection dialog match your operating environment. | ||||||
|
||||||
- **Connection Type** is set to `Public`. | ||||||
- **Branch** is set to `main`. | ||||||
- **Connect With** is set to `SQLAlchemy`. | ||||||
- **Operating System** matches your environment. | ||||||
- **Connection Type** is set to `Public`. | ||||||
- **Branch** is set to `main`. | ||||||
- **Connect With** is set to `SQLAlchemy`. | ||||||
- **Operating System** matches your environment. | ||||||
Comment on lines +57 to +60:

Maybe four spaces are correct.

This is automatic formatting by a very recent version of Prettier. Since older versions of Prettier do not format lists this way, I guess this is an improvement.

I found the feature here, which was introduced 2 months ago: prettier/prettier@6168d1e. The format rule is as follows:
|
||||||
|
||||||
> **Tip:** | ||||||
> | ||||||
> If your program is running in Windows Subsystem for Linux (WSL), switch to the corresponding Linux distribution. | ||||||
> **Tip:** | ||||||
> | ||||||
> If your program is running in Windows Subsystem for Linux (WSL), switch to the corresponding Linux distribution. | ||||||
4. Click the **PyMySQL** tab and copy the connection string. | ||||||
|
||||||
> **Tip:** | ||||||
> | ||||||
> If you have not set a password yet, click **Generate Password** to generate a random password. | ||||||
> **Tip:** | ||||||
> | ||||||
> If you have not set a password yet, click **Generate Password** to generate a random password. | ||||||
5. In the root directory of your Python project, create a `.env` file and paste the connection string into it. | ||||||
|
||||||
The following is an example for macOS: | ||||||
The following is an example for macOS: | ||||||
|
||||||
```dotenv | ||||||
TIDB_DATABASE_URL="mysql+pymysql://<prefix>.root:<password>@gateway01.<region>.prod.aws.tidbcloud.com:4000/test?ssl_ca=/etc/ssl/cert.pem&ssl_verify_cert=true&ssl_verify_identity=true" | ||||||
``` | ||||||
```dotenv | ||||||
TIDB_DATABASE_URL="mysql+pymysql://<prefix>.root:<password>@gateway01.<region>.prod.aws.tidbcloud.com:4000/test?ssl_ca=/etc/ssl/cert.pem&ssl_verify_cert=true&ssl_verify_identity=true" | ||||||
``` | ||||||
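To make the parts of the connection string in the `.env` example easier to see, the following sketch splits a URL of the same shape using only the Python standard library. The function name `describe_database_url` and the sample user, password, and region values are hypothetical placeholders; SQLAlchemy performs its own parsing when it connects, so this is for inspection only.

```python
from urllib.parse import parse_qs, urlsplit


def describe_database_url(url):
    # Split the URL into its components without opening a connection.
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    return {
        "driver": parts.scheme,                  # for example "mysql+pymysql"
        "username": parts.username,              # "<prefix>.root" in the example
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
        "ssl_ca": query.get("ssl_ca", [None])[0],
        "verifies_identity": query.get("ssl_verify_identity", ["false"])[0] == "true",
    }


# Hypothetical values standing in for <prefix>, <password>, and <region>.
sample_url = (
    "mysql+pymysql://demo.root:secret@gateway01.us-east-1.prod.aws.tidbcloud.com:4000/test"
    "?ssl_ca=/etc/ssl/cert.pem&ssl_verify_cert=true&ssl_verify_identity=true"
)
info = describe_database_url(sample_url)
```

A check like this can catch a missing `ssl_ca` path or port before the application tries to connect.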
|
||||||
### Step 4. Initialize the embedding model | ||||||
|
||||||
|
@@ -192,4 +192,4 @@ Therefore, according to the output, the swimming animal is most likely a fish, o | |||||
## See also | ||||||
|
||||||
- [Vector Data Types](/tidb-cloud/vector-search-data-types.md) | ||||||
- [Vector Search Index](/tidb-cloud/vector-search-index.md) | ||||||
- [Vector Search Index](/tidb-cloud/vector-search-index.md) |
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -16,7 +16,7 @@ This tutorial demonstrates how to get started with TiDB Vector Search just using | |||||
|
||||||
> **Note** | ||||||
> | ||||||
> TiDB Vector Search is currently in beta and is not available for [TiDB Cloud Dedicated](/tidb-cloud/select-cluster-tier.md#tidb-cloud-dedicated) clusters. | ||||||
> TiDB Vector Search is only available for TiDB (>= v8.4) and [TiDB Cloud Serverless](/tidb-cloud/select-cluster-tier.md#tidb-cloud-serverless). It is not available for [TiDB Cloud Dedicated](/tidb-cloud/select-cluster-tier.md#tidb-cloud-dedicated). | ||||||
|
||||||
|
||||||
## Prerequisites | ||||||
|
||||||
|
@@ -39,9 +39,9 @@ To complete this tutorial, you need: | |||||
|
||||||
5. Copy the connection command and paste it into your terminal. The following is an example for macOS: | ||||||
|
||||||
```bash | ||||||
mysql -u '<prefix>.root' -h '<host>' -P 4000 -D 'test' --ssl-mode=VERIFY_IDENTITY --ssl-ca=/etc/ssl/cert.pem -p'<password>' | ||||||
``` | ||||||
```bash | ||||||
mysql -u '<prefix>.root' -h '<host>' -P 4000 -D 'test' --ssl-mode=VERIFY_IDENTITY --ssl-ca=/etc/ssl/cert.pem -p'<password>' | ||||||
``` | ||||||
|
||||||
### Step 2. Create a vector table | ||||||
|
||||||
|
preview: https://pr.pingcap-docsite-preview.pages.dev/tidbcloud/vector-search-overview