[Bug]: the performance of partition keys is far inferior to scalar retrieval #38574

Open
1 task done
douglarek opened this issue Dec 19, 2024 · 17 comments
Labels: kind/bug, triage/needs-information

Comments


douglarek commented Dec 19, 2024

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version: 2.3.12 & 2.5.0-beta
- Deployment mode(standalone or cluster): cluster
- MQ type(rocksmq, pulsar or kafka):    external kafka
- SDK version(e.g. pymilvus v2.0.0rc2): golang-sdk v2.4.2
- OS(Ubuntu or CentOS): k8s
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

First, I would like to confirm one point: when a search filters on a scalar field, is a partition key expected to improve performance? If so, should partition-key search in theory be faster than a search that filters on a plain scalar field?

In my tests, the search performance of the two differs greatly, and the partition-key search is the slower one.

Steps:

  1. Insert data into the collection, as shown in the code and figure below

    func insertData(ctx context.Context, c client.Client) {
            var embeddingList [][]float32
            embeddingList = make([][]float32, 0, *nEntities)
            for i := 0; i < *nEntities; i++ {
                    vec := make([]float32, 0, *dim)
                    for j := 0; j < *dim; j++ {
                            vec = append(vec, rand.Float32())
                    }
                    embeddingList = append(embeddingList, vec)
            }
    
            embeddingColData := entity.NewColumnFloatVector(embeddingCol, *dim, embeddingList)
    
            start := time.Now()
            var err error
            if *enableScalar {
                    colorList := make([]string, 0, *nEntities)
                    color := colors[rand.Intn(len(colors))] // Choose one color for the batch
                    for i := 0; i < *nEntities; i++ {
                            colorList = append(colorList, color)
                    }
                    colorColData := entity.NewColumnVarChar(colorCol, colorList) // Changed from NewColumnString to NewColumnVarChar
    
                    if *enablePartition {
                            partitionName := fmt.Sprintf("partition_%s", color)
                            _, err = c.Insert(ctx, *milvusCollection, partitionName, embeddingColData, colorColData)
                    } else if *enableScalarKey {
                            keyColData := entity.NewColumnVarChar(colorKeyCol, colorList)
                            _, err = c.Insert(ctx, *milvusCollection, "", embeddingColData, colorColData, keyColData)
                    } else {
                            _, err = c.Insert(ctx, *milvusCollection, "", embeddingColData, colorColData)
                    }
            } else {
                    _, err = c.Insert(ctx, *milvusCollection, "", embeddingColData)
            }
    
            // ...
    }
    
    
    [image]

    Collection schema information (JSON):

     {
       "data": [
         {
           "collection_name": "benchmark_v4",
           "schema": {
             "fields": [
               {
                 "type_params": [],
                 "index_params": [],
                 "fieldID": "100",
                 "name": "id",
                 "is_primary_key": true,
                 "description": "",
                 "data_type": "Int64",
                 "autoID": true,
                 "state": "FieldCreated",
                 "element_type": "None",
                 "default_value": null,
                 "is_dynamic": false,
                 "is_partition_key": false,
                 "is_clustering_key": false,
                 "dataType": 5,
                 "dimension": -1,
                 "maxCapacity": -1,
                 "maxLength": -1
               },
               {
                 "type_params": [
                   {
                     "key": "dim",
                     "value": "768"
                   }
                 ],
                 "index_params": [],
                 "fieldID": "101",
                 "name": "vec",
                 "is_primary_key": false,
                 "description": "",
                 "data_type": "FloatVector",
                 "autoID": false,
                 "state": "FieldCreated",
                 "element_type": "None",
                 "default_value": null,
                 "is_dynamic": false,
                 "is_partition_key": false,
                 "is_clustering_key": false,
                 "dataType": 101,
                 "index": {
                   "params": [
                     {
                       "key": "index_type",
                       "value": "IVF_FLAT"
                     },
                     {
                       "key": "metric_type",
                       "value": "COSINE"
                     },
                     {
                       "key": "params",
                       "value": "{\"nlist\":1024}"
                     }
                   ],
                   "index_name": "vec",
                   "indexID": "452967585148662141",
                   "field_name": "vec",
                   "indexed_rows": "11760180",
                   "total_rows": "11760180",
                   "state": "Finished",
                   "index_state_fail_reason": "",
                   "pending_index_rows": "0",
                   "indexType": "IVF_FLAT",
                   "metricType": "COSINE",
                   "indexParameterPairs": [
                     {
                       "key": "metric_type",
                       "value": "COSINE"
                     },
                     {
                       "key": "nlist",
                       "value": 1024
                     }
                   ]
                 },
                 "dimension": 768,
                 "maxCapacity": -1,
                 "maxLength": -1
               },
               {
                 "type_params": [
                   {
                     "key": "max_length",
                     "value": "32"
                   }
                 ],
                 "index_params": [],
                 "fieldID": "102",
                 "name": "scalar_colors",
                 "is_primary_key": false,
                 "description": "",
                 "data_type": "VarChar",
                 "autoID": false,
                 "state": "FieldCreated",
                 "element_type": "None",
                 "default_value": null,
                 "is_dynamic": false,
                 "is_partition_key": false,
                 "is_clustering_key": false,
                 "dataType": 21,
                 "index": {
                   "params": [
                     {
                       "key": "index_type",
                       "value": "Trie"
                     },
                     {
                       "key": "metric_type",
                       "value": ""
                     },
                     {
                       "key": "params",
                       "value": "{}"
                     }
                   ],
                   "index_name": "scalar_colors",
                   "indexID": "452967585148662190",
                   "field_name": "scalar_colors",
                   "indexed_rows": "11760180",
                   "total_rows": "11760180",
                   "state": "Finished",
                   "index_state_fail_reason": "",
                   "pending_index_rows": "0",
                   "indexType": "Trie",
                   "metricType": "",
                   "indexParameterPairs": [
                     {
                       "key": "metric_type",
                       "value": ""
                     }
                   ]
                 },
                 "dimension": -1,
                 "maxCapacity": -1,
                 "maxLength": 32
               },
               {
                 "type_params": [
                   {
                     "key": "max_length",
                     "value": "32"
                   }
                 ],
                 "index_params": [],
                 "fieldID": "103",
                 "name": "scalar_colors_k",
                 "is_primary_key": false,
                 "description": "",
                 "data_type": "VarChar",
                 "autoID": false,
                 "state": "FieldCreated",
                 "element_type": "None",
                 "default_value": null,
                 "is_dynamic": false,
                 "is_partition_key": true,
                 "is_clustering_key": false,
                 "dataType": 21,
                 "index": {
                   "params": [
                     {
                       "key": "index_type",
                       "value": "Trie"
                     },
                     {
                       "key": "metric_type",
                       "value": ""
                     },
                     {
                       "key": "params",
                       "value": "{}"
                     }
                   ],
                   "index_name": "scalar_colors_k",
                   "indexID": "452967585148662220",
                   "field_name": "scalar_colors_k",
                   "indexed_rows": "11760180",
                   "total_rows": "11760180",
                   "state": "Finished",
                   "index_state_fail_reason": "",
                   "pending_index_rows": "0",
                   "indexType": "Trie",
                   "metricType": "",
                   "indexParameterPairs": [
                     {
                       "key": "metric_type",
                       "value": ""
                     }
                   ]
                 },
                 "dimension": -1,
                 "maxCapacity": -1,
                 "maxLength": 32
               }
             ],
             "properties": [],
             "name": "benchmark_v4",
             "description": "",
             "autoID": false,
             "enable_dynamic_field": true,
             "primaryField": {
               "type_params": [],
               "index_params": [],
               "fieldID": "100",
               "name": "id",
               "is_primary_key": true,
               "description": "",
               "data_type": "Int64",
               "autoID": true,
               "state": "FieldCreated",
               "element_type": "None",
               "default_value": null,
               "is_dynamic": false,
               "is_partition_key": false,
               "is_clustering_key": false,
               "dataType": 5,
               "dimension": -1,
               "maxCapacity": -1,
               "maxLength": -1
             },
             "hasVectorIndex": true,
             "enablePartitionKey": true,
             "scalarFields": [
               {
                 "type_params": [],
                 "index_params": [],
                 "fieldID": "100",
                 "name": "id",
                 "is_primary_key": true,
                 "description": "",
                 "data_type": "Int64",
                 "autoID": true,
                 "state": "FieldCreated",
                 "element_type": "None",
                 "default_value": null,
                 "is_dynamic": false,
                 "is_partition_key": false,
                 "is_clustering_key": false,
                 "dataType": 5,
                 "dimension": -1,
                 "maxCapacity": -1,
                 "maxLength": -1
               },
               {
                 "type_params": [
                   {
                     "key": "max_length",
                     "value": "32"
                   }
                 ],
                 "index_params": [],
                 "fieldID": "102",
                 "name": "scalar_colors",
                 "is_primary_key": false,
                 "description": "",
                 "data_type": "VarChar",
                 "autoID": false,
                 "state": "FieldCreated",
                 "element_type": "None",
                 "default_value": null,
                 "is_dynamic": false,
                 "is_partition_key": false,
                 "is_clustering_key": false,
                 "dataType": 21,
                 "index": {
                   "params": [
                     {
                       "key": "index_type",
                       "value": "Trie"
                     },
                     {
                       "key": "metric_type",
                       "value": ""
                     },
                     {
                       "key": "params",
                       "value": "{}"
                     }
                   ],
                   "index_name": "scalar_colors",
                   "indexID": "452967585148662190",
                   "field_name": "scalar_colors",
                   "indexed_rows": "11760180",
                   "total_rows": "11760180",
                   "state": "Finished",
                   "index_state_fail_reason": "",
                   "pending_index_rows": "0",
                   "indexType": "Trie",
                   "metricType": "",
                   "indexParameterPairs": [
                     {
                       "key": "metric_type",
                       "value": ""
                     }
                   ]
                 },
                 "dimension": -1,
                 "maxCapacity": -1,
                 "maxLength": 32
               },
               {
                 "type_params": [
                   {
                     "key": "max_length",
                     "value": "32"
                   }
                 ],
                 "index_params": [],
                 "fieldID": "103",
                 "name": "scalar_colors_k",
                 "is_primary_key": false,
                 "description": "",
                 "data_type": "VarChar",
                 "autoID": false,
                 "state": "FieldCreated",
                 "element_type": "None",
                 "default_value": null,
                 "is_dynamic": false,
                 "is_partition_key": true,
                 "is_clustering_key": false,
                 "dataType": 21,
                 "index": {
                   "params": [
                     {
                       "key": "index_type",
                       "value": "Trie"
                     },
                     {
                       "key": "metric_type",
                       "value": ""
                     },
                     {
                       "key": "params",
                       "value": "{}"
                     }
                   ],
                   "index_name": "scalar_colors_k",
                   "indexID": "452967585148662220",
                   "field_name": "scalar_colors_k",
                   "indexed_rows": "11760180",
                   "total_rows": "11760180",
                   "state": "Finished",
                   "index_state_fail_reason": "",
                   "pending_index_rows": "0",
                   "indexType": "Trie",
                   "metricType": "",
                   "indexParameterPairs": [
                     {
                       "key": "metric_type",
                       "value": ""
                     }
                   ]
                 },
                 "dimension": -1,
                 "maxCapacity": -1,
                 "maxLength": 32
               }
             ],
             "vectorFields": [
               {
                 "type_params": [
                   {
                     "key": "dim",
                     "value": "768"
                   }
                 ],
                 "index_params": [],
                 "fieldID": "101",
                 "name": "vec",
                 "is_primary_key": false,
                 "description": "",
                 "data_type": "FloatVector",
                 "autoID": false,
                 "state": "FieldCreated",
                 "element_type": "None",
                 "default_value": null,
                 "is_dynamic": false,
                 "is_partition_key": false,
                 "is_clustering_key": false,
                 "dataType": 101,
                 "index": {
                   "params": [
                     {
                       "key": "index_type",
                       "value": "IVF_FLAT"
                     },
                     {
                       "key": "metric_type",
                       "value": "COSINE"
                     },
                     {
                       "key": "params",
                       "value": "{\"nlist\":1024}"
                     }
                   ],
                   "index_name": "vec",
                   "indexID": "452967585148662141",
                   "field_name": "vec",
                   "indexed_rows": "11760180",
                   "total_rows": "11760180",
                   "state": "Finished",
                   "index_state_fail_reason": "",
                   "pending_index_rows": "0",
                   "indexType": "IVF_FLAT",
                   "metricType": "COSINE",
                   "indexParameterPairs": [
                     {
                       "key": "metric_type",
                       "value": "COSINE"
                     },
                     {
                       "key": "nlist",
                       "value": 1024
                     }
                   ]
                 },
                 "dimension": 768,
                 "maxCapacity": -1,
                 "maxLength": -1
               }
             ],
             "dynamicFields": [
               {
                 "name": "$meta",
                 "data_type": "JSON",
                 "type_params": [],
                 "description": "",
                 "index_params": [],
                 "dimension": -1,
                 "maxCapacity": -1,
                 "maxLength": -1,
                 "autoID": false,
                 "fieldID": "",
                 "state": "",
                 "dataType": 23
               }
             ]
           },
           "rowCount": 11760180,
           "createdTime": 1734073852866,
           "aliases": [],
           "description": "",
           "autoID": true,
           "id": "452967585148661839",
           "loadedPercentage": 100,
           "consistency_level": "Bounded",
           "replicas": [
             {
               "partition_ids": [],
               "shard_replicas": [
                 {
                   "node_ids": [
                     "55"
                   ],
                   "leaderID": "55",
                   "leader_addr": "10.203.27.51:21123",
                   "dm_channel_name": "milvus-zxy-sync-5-rootcoord-dml_14_452967585148661839v0"
                 }
               ],
               "node_ids": [
                 "11",
                 "4",
                 "6",
                 "5",
                 "21",
                 "56",
                 "54",
                 "1",
                 "9",
                 "46",
                 "22",
                 "55",
                 "12",
                 "7",
                 "10",
                 "20",
                 "3",
                 "47"
               ],
               "num_outbound_node": {},
               "replicaID": "452967593885040663",
               "collectionID": "452967585148661839",
               "resource_group_name": "__default_resource_group"
             }
           ],
           "loaded": true,
           "status": "loaded",
           "properties": []
         }
       ],
       "statusCode": 200
     }
     
  2. Run a benchmark that calls the golang-sdk. The key search function is:

    func worker(jobs <-chan int, results chan<- []client.SearchResult, c client.Client, ctx context.Context, sp entity.SearchParam, opt client.SearchQueryOptionFunc) {
            for range jobs {
                    start := time.Now()
                    searchResult, err := c.Search(
                            ctx,
                            *milvusCollection,
                            []string{},
                            *expr,
                            nil,
                            []entity.Vector{entity.FloatVector(generateRandomVec(*dim))},
                            *vectorField,
                            entity.MetricType(*metricType),
                            *topK,
                            sp,
                            opt,
                    )
                    if err != nil {
                            log.Fatal("fail to search collection:", err.Error())
                    }
                    // ...
            }
    }

    Here *expr selects between the two modes: scalar_colors == "red" for scalar search, and scalar_colors_key == "red" for partition-key search.
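
For reference, the two filter strings passed as *expr can be built with a small helper. This is only a sketch: `eqFilter` is a hypothetical name, not part of the golang-sdk, and it assumes plain ASCII values so that Go's `%q` quoting produces a string literal the filter accepts. The field names are passed in as parameters and should match the schema exactly; the schema above lists the partition-key field as `scalar_colors_k`.

```go
package main

import "fmt"

// eqFilter builds an equality filter such as `scalar_colors == "red"`.
// Hypothetical helper for illustration; not part of the Milvus SDK.
// Assumption: value is plain ASCII without quotes or backslashes, so
// Go's %q quoting yields a literal the filter expression accepts.
func eqFilter(field, value string) string {
	return fmt.Sprintf("%s == %q", field, value)
}

func main() {
	// Plain scalar field from the schema in this report.
	fmt.Println(eqFilter("scalar_colors", "red"))
	// Partition-key field, using the name listed in the schema.
	fmt.Println(eqFilter("scalar_colors_k", "red"))
}
```

Building both expressions through the same helper keeps the two benchmark runs identical apart from the field name, which makes it easier to rule out a mistyped field as the cause of a difference.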

  3. Search QPS benchmark

    • milvus 2.3.12

      # scalar search
      {"time":"2024-12-12T17:41:35.988044033+08:00","level":"INFO","msg":"QPS","v":386}
      {"time":"2024-12-12T17:41:35.98829498+08:00","level":"INFO","msg":"P99","v":51}
      {"time":"2024-12-12T17:41:35.988305907+08:00","level":"INFO","msg":"P95","v":46}
      {"time":"2024-12-12T17:41:40.989039638+08:00","level":"INFO","msg":"QPS","v":376}
      {"time":"2024-12-12T17:41:40.989490032+08:00","level":"INFO","msg":"P99","v":52}
      {"time":"2024-12-12T17:41:40.989500774+08:00","level":"INFO","msg":"P95","v":46}
      {"time":"2024-12-12T17:41:45.989555065+08:00","level":"INFO","msg":"QPS","v":393}
      {"time":"2024-12-12T17:41:45.989796831+08:00","level":"INFO","msg":"P99","v":48}
      {"time":"2024-12-12T17:41:45.989804232+08:00","level":"INFO","msg":"P95","v":43}
      {"time":"2024-12-12T17:41:50.990203755+08:00","level":"INFO","msg":"QPS","v":383}
      {"time":"2024-12-12T17:41:50.990456089+08:00","level":"INFO","msg":"P99","v":51}
      {"time":"2024-12-12T17:41:50.990463261+08:00","level":"INFO","msg":"P95","v":45}
      {"time":"2024-12-12T17:41:55.990861482+08:00","level":"INFO","msg":"QPS","v":392}
      {"time":"2024-12-12T17:41:55.991100164+08:00","level":"INFO","msg":"P99","v":49}
      {"time":"2024-12-12T17:41:55.991108686+08:00","level":"INFO","msg":"P95","v":44}

      # partition key
      {"time":"2024-12-12T17:36:08.882460339+08:00","level":"INFO","msg":"QPS","v":16}
      {"time":"2024-12-12T17:36:08.88252359+08:00","level":"INFO","msg":"P99","v":620}
      {"time":"2024-12-12T17:36:08.882530898+08:00","level":"INFO","msg":"P95","v":591}
      {"time":"2024-12-12T17:36:13.883288182+08:00","level":"INFO","msg":"QPS","v":16}
      {"time":"2024-12-12T17:36:13.88335226+08:00","level":"INFO","msg":"P99","v":706}
      {"time":"2024-12-12T17:36:13.88336183+08:00","level":"INFO","msg":"P95","v":588}
      {"time":"2024-12-12T17:36:18.884237264+08:00","level":"INFO","msg":"QPS","v":16}
      {"time":"2024-12-12T17:36:18.884317977+08:00","level":"INFO","msg":"P99","v":606}
      {"time":"2024-12-12T17:36:18.884327059+08:00","level":"INFO","msg":"P95","v":586}                                                                                                                                                                                                        
      {"time":"2024-12-12T17:36:23.885271055+08:00","level":"INFO","msg":"QPS","v":15}                                                                                                                                                                                                         
      {"time":"2024-12-12T17:36:23.885338278+08:00","level":"INFO","msg":"P99","v":598}                                                                                                                                                                                                        
      {"time":"2024-12-12T17:36:23.885348057+08:00","level":"INFO","msg":"P95","v":590}                                                                                                                                                                                                        
      {"time":"2024-12-12T17:36:28.88613793+08:00","level":"INFO","msg":"QPS","v":16}                                                                                                                                                                                                          
      {"time":"2024-12-12T17:36:28.886199197+08:00","level":"INFO","msg":"P99","v":603}                                                                                                                                                                                                        
      {"time":"2024-12-12T17:36:28.886207714+08:00","level":"INFO","msg":"P95","v":584}
      
    • milvus 2.5.0-beta

      # scalar search                                                                                                                                                                                                                                                                           
      {"time":"2024-12-12T17:49:15.820700081+08:00","level":"INFO","msg":"QPS","v":542}                                                                                                                                                                                                        
      {"time":"2024-12-12T17:49:15.82105169+08:00","level":"INFO","msg":"P99","v":49}                                                                                                                                                                                                          
      {"time":"2024-12-12T17:49:15.821068566+08:00","level":"INFO","msg":"P95","v":45}                                                                                                                                                                                                         
      {"time":"2024-12-12T17:49:20.821332247+08:00","level":"INFO","msg":"QPS","v":546}                                                                                                                                                                                                        
      {"time":"2024-12-12T17:49:20.821694137+08:00","level":"INFO","msg":"P99","v":49}                                                                                                                                                                                                         
      {"time":"2024-12-12T17:49:20.821706095+08:00","level":"INFO","msg":"P95","v":46}
      {"time":"2024-12-12T17:49:25.822302477+08:00","level":"INFO","msg":"QPS","v":543}
      {"time":"2024-12-12T17:49:25.822640691+08:00","level":"INFO","msg":"P99","v":50}
      {"time":"2024-12-12T17:49:25.822651138+08:00","level":"INFO","msg":"P95","v":45}
      {"time":"2024-12-12T17:49:30.822674675+08:00","level":"INFO","msg":"QPS","v":552}
      {"time":"2024-12-12T17:49:30.823024752+08:00","level":"INFO","msg":"P99","v":49}
      {"time":"2024-12-12T17:49:30.823036281+08:00","level":"INFO","msg":"P95","v":44}
      {"time":"2024-12-12T17:49:35.823171368+08:00","level":"INFO","msg":"QPS","v":563}
      {"time":"2024-12-12T17:49:35.823545029+08:00","level":"INFO","msg":"P99","v":46}
      {"time":"2024-12-12T17:49:35.823557088+08:00","level":"INFO","msg":"P95","v":42}
      
      # partition key 
      {"time":"2024-12-12T17:45:08.399602673+08:00","level":"INFO","msg":"QPS","v":13}
      {"time":"2024-12-12T17:45:08.399707573+08:00","level":"INFO","msg":"P99","v":907}
      {"time":"2024-12-12T17:45:08.399742638+08:00","level":"INFO","msg":"P95","v":818}
      {"time":"2024-12-12T17:45:13.399804611+08:00","level":"INFO","msg":"QPS","v":15}
      {"time":"2024-12-12T17:45:13.399945101+08:00","level":"INFO","msg":"P99","v":872}
      {"time":"2024-12-12T17:45:13.399956018+08:00","level":"INFO","msg":"P95","v":776}
      {"time":"2024-12-12T17:45:18.400527855+08:00","level":"INFO","msg":"QPS","v":15}
      {"time":"2024-12-12T17:45:18.400615751+08:00","level":"INFO","msg":"P99","v":796}
      {"time":"2024-12-12T17:45:18.400623359+08:00","level":"INFO","msg":"P95","v":723}
      {"time":"2024-12-12T17:45:23.400921698+08:00","level":"INFO","msg":"QPS","v":15}
      {"time":"2024-12-12T17:45:23.400975064+08:00","level":"INFO","msg":"P99","v":805}
      {"time":"2024-12-12T17:45:23.400981427+08:00","level":"INFO","msg":"P95","v":727}
      {"time":"2024-12-12T17:45:28.401998004+08:00","level":"INFO","msg":"QPS","v":15}
      {"time":"2024-12-12T17:45:28.402056908+08:00","level":"INFO","msg":"P99","v":870}
      {"time":"2024-12-12T17:45:28.402064275+08:00","level":"INFO","msg":"P95","v":703}
      
      

Expected Behavior

One surprising aspect of Milvus 2.5.0 is that scalar search performance has indeed improved significantly, by about 40%. However, the gap between partition-key search and scalar search is too large (about 15 QPS versus several hundred QPS in the logs above), which does not align with the theory in the Milvus documentation. Of course, it could also be an issue with my stress test, which is why I'm raising this issue.
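
To put a number on the gap, the benchmark output can be averaged per metric. The following is only a minimal sketch: `logLine`, `meanByMetric`, and the two constants are names made up for this example, and the sample lines are the first two reporting windows of each 2.5.0-beta run copied from the logs above.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// Two reporting windows per mode, copied from the 2.5.0-beta logs above.
const scalarLogs = `
{"time":"2024-12-12T17:49:15.820700081+08:00","level":"INFO","msg":"QPS","v":542}
{"time":"2024-12-12T17:49:15.82105169+08:00","level":"INFO","msg":"P99","v":49}
{"time":"2024-12-12T17:49:15.821068566+08:00","level":"INFO","msg":"P95","v":45}
{"time":"2024-12-12T17:49:20.821332247+08:00","level":"INFO","msg":"QPS","v":546}
{"time":"2024-12-12T17:49:20.821694137+08:00","level":"INFO","msg":"P99","v":49}
{"time":"2024-12-12T17:49:20.821706095+08:00","level":"INFO","msg":"P95","v":46}
`

const partitionKeyLogs = `
# partition key
{"time":"2024-12-12T17:45:08.399602673+08:00","level":"INFO","msg":"QPS","v":13}
{"time":"2024-12-12T17:45:08.399707573+08:00","level":"INFO","msg":"P99","v":907}
{"time":"2024-12-12T17:45:08.399742638+08:00","level":"INFO","msg":"P95","v":818}
{"time":"2024-12-12T17:45:13.399804611+08:00","level":"INFO","msg":"QPS","v":15}
{"time":"2024-12-12T17:45:13.399945101+08:00","level":"INFO","msg":"P99","v":872}
{"time":"2024-12-12T17:45:13.399956018+08:00","level":"INFO","msg":"P95","v":776}
`

// logLine mirrors the fields of the benchmark output that matter here.
type logLine struct {
	Msg string  `json:"msg"` // "QPS", "P99" or "P95"
	V   float64 `json:"v"`
}

// meanByMetric averages "v" per metric name. Lines that are not valid JSON
// (blank lines, "# partition key" headers) are skipped.
func meanByMetric(logs string) map[string]float64 {
	sum := map[string]float64{}
	count := map[string]int{}
	sc := bufio.NewScanner(strings.NewReader(logs))
	for sc.Scan() {
		var l logLine
		if err := json.Unmarshal([]byte(strings.TrimSpace(sc.Text())), &l); err != nil {
			continue
		}
		sum[l.Msg] += l.V
		count[l.Msg]++
	}
	mean := map[string]float64{}
	for name, s := range sum {
		mean[name] = s / float64(count[name])
	}
	return mean
}

func main() {
	scalar := meanByMetric(scalarLogs)
	pkey := meanByMetric(partitionKeyLogs)
	for _, m := range []string{"QPS", "P95", "P99"} {
		fmt.Printf("%s: scalar=%.1f partition-key=%.1f\n", m, scalar[m], pkey[m])
	}
	fmt.Printf("QPS ratio (scalar / partition key): %.1fx\n", scalar["QPS"]/pkey["QPS"])
}
```

Feeding the complete log output through the same function would give a steadier average than these two windows.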

Steps To Reproduce

No response

Milvus Log

No response

Anything else?

No response

@douglarek douglarek added kind/bug Issues or changes related a bug needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Dec 19, 2024
@yanliang567
Contributor

@douglarek That's an interesting test. Thank you for your updates. Quick questions:

  1. How many colors are there in total when you insert?
  2. Referring to the insert scripts, I think we used the 2nd way for insert(), am I right?
  3. I noticed that you ran the partition key search before the scalar search. Have you rerun it several times and gotten the same results?
  4. Could you please upload the Milvus logs for investigation? Please refer to this doc to export the whole Milvus logs.

/assign @douglarek

@yanliang567 yanliang567 added triage/needs-information Indicates an issue needs more information in order to work on it. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Dec 20, 2024
@douglarek
Author

douglarek commented Dec 20, 2024

> @douglarek That's an interesting test. Thank you for your updates. Quick questions:
>
> 1. How many colors are there in total when you insert?
> 2. Referring to the insert scripts, I think we used the 2nd way for insert(), am I right?
> 3. I noticed that you ran the partition key search before the scalar search. Have you rerun it several times and gotten the same results?
> 4. Could you please upload the Milvus logs for investigation? Please refer to this [doc](https://github.com/milvus-io/milvus/tree/master/deployments/export-log) to export the whole Milvus logs.
>
> /assign @douglarek

  1. 11 colors in total: var colors = []string{"green", "blue", "yellow", "red", "black", "white", "purple", "pink", "orange", "brown", "grey"}
  2. The insert call:

         else if *enableScalarKey {
                 keyColData := entity.NewColumnVarChar(colorKeyCol, colorList)
                 _, err = c.Insert(ctx, *milvusCollection, "", embeddingColData, colorColData, keyColData)
         }

  3. I have run it many times, each for a long time.
  4. Please wait for another benchmark next week.
     milvus-log.tar.gz

@douglarek douglarek removed their assignment Dec 23, 2024
@douglarek
Author

@yanliang567 All information has been provided.

@yanliang567
Contributor

@douglarek thank you for your quick update. Could you also provide the birdwatcher backup file for investigation? Please refer to this doc: https://github.com/milvus-io/birdwatcher to back up etcd with birdwatcher.

@xiaofan-luan
Collaborator

What is "scalar search"? Are you searching by partition, or searching with a filter?
Partition key search should be faster than search with a filter, but slower than search within a single partition.
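
One way to picture that ordering is to count how many rows each strategy has to look at. This is a toy model only, not how Milvus is implemented: the FNV hash, the bucket count of 4, and the names `bucketOf` and `rowsScanned` are all assumptions made for illustration. The color list is the one from the benchmark.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

var colors = []string{"green", "blue", "yellow", "red", "black", "white", "purple", "pink", "orange", "brown", "grey"}

// bucketOf maps a key to one of n buckets. FNV-1a is an arbitrary choice
// for this sketch, not the hash Milvus uses.
func bucketOf(key string, n int) int {
	h := fnv.New32a()
	h.Write([]byte(key))
	return int(h.Sum32() % uint32(n))
}

// rowsScanned counts the candidate rows for the predicate color == target
// under three strategies:
//   - filter:       every row is a candidate
//   - partitionKey: only rows whose key hashes to the target's bucket
//     (the bucket may also hold other colors)
//   - onePartition: a dedicated partition that holds only the target color
func rowsScanned(rows []string, target string, numBuckets int) (filter, partitionKey, onePartition int) {
	filter = len(rows)
	targetBucket := bucketOf(target, numBuckets)
	for _, c := range rows {
		if bucketOf(c, numBuckets) == targetBucket {
			partitionKey++
		}
		if c == target {
			onePartition++
		}
	}
	return
}

func main() {
	// 1000 rows per color, spread round-robin.
	rows := make([]string, 0, 11000)
	for i := 0; i < 11000; i++ {
		rows = append(rows, colors[i%len(colors)])
	}
	f, pk, one := rowsScanned(rows, "red", 4)
	fmt.Printf("filter=%d partition-key=%d one-partition=%d\n", f, pk, one)
}
```

In this model the partition-key count always lies between the other two, which matches the expected ordering. The benchmark above shows the opposite, which is why it looks like a bug or a test setup problem.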

@xiaofan-luan
Collaborator

And how is the table defined?

@xiaofan-luan
Collaborator

the QPS seems to be too low under whatever test case you are testing

@yanliang567
Contributor

@douglarek thank you for the update. I checked the logs, but they contain only INFO-level entries, so we cannot find any search info in them. If convenient, could you please set the log level to debug, reproduce the issue, and collect the logs again for investigation?

@douglarek
Author

> @douglarek thank you for the update. I checked the logs, but they contain only INFO-level entries, so we cannot find any search info in them. If convenient, could you please set the log level to debug, reproduce the issue, and collect the logs again for investigation?

Okay, I'll set the log level to debug this afternoon and collect the logs again. I will also provide the birdwatcher backup. Thank you for paying attention to this issue.

@douglarek
Author

> the QPS seems to be too low under whatever test case you are testing

Perhaps so. I will first provide some necessary information. I am not sure whether my approach is incorrect or there is an issue with Milvus, so I reserve judgment for now.

@douglarek
Author

douglarek commented Dec 24, 2024

> @douglarek thank you for your quick update. Could you also provide the birdwatcher backup file for investigation? Please refer to this doc: https://github.com/milvus-io/birdwatcher to back up etcd with birdwatcher.

@yanliang567 Here are the relevant logs and the backup.

milvus log(debug enabled):
milvus-log.tar.gz

birdwatcher backup:
bw_etcd_ALL.241224-182256.bak.gz

@yanliang567
Contributor

@douglarek thank you for the update. Unfortunately, it seems that the birdwatcher backup file is broken or the backup did not complete successfully. After I loaded the backup file, it showed 0 collections and 0 segments. Could you please retry and create a new backup file for us?
Moreover, I checked the Milvus logs and found 2 different types of search requests for the same collection, benchmark_v4.
The first type searched without an expr, and the search param was nprobe:
[2024/12/24 09:48:01.170 +00:00] [DEBUG] [proxy/impl.go:3114] ["Search done"] [traceID=8a4f3b706d7e41465513fd15177fe187] [role=proxy] [db=default] [collection=benchmark_v4] [partitions="[]"] [dsl=] [len(PlaceholderGroup)=3084] [OutputFields="[id,scalar_colors,scalar_colors_k,$meta]"] [search_params="[{\"key\":\"anns_field\",\"value\":\"vec\"},{\"key\":\"params\",\"value\":\"{\\\"nprobe\\\":1}\"},{\"key\":\"topk\",\"value\":\"50\"},{\"key\":\"offset\",\"value\":\"0\"},{\"key\":\"metric_type\"},{\"key\":\"ignore_growing\",\"value\":\"false\"}]"] [ConsistencyLevel=Bounded] [useDefaultConsistency=false]
The 2nd type searched with an expr, and the search param was ef. Having checked the code you shared at the beginning of this issue, I think the index is IVF_FLAT, so ef is not a valid param.
[2024/12/24 09:51:21.400 +00:00] [DEBUG] [proxy/impl.go:3057] ["Search received"] [traceID=717f231e3400ca548c68a55c20d63919] [role=proxy] [db=default] [collection=benchmark_v4] [partitions="[]"] [dsl="scalar_colors_key==\"red\""] [len(PlaceholderGroup)=3084] [OutputFields="[]"] [search_params="[{\"key\":\"group_by_field\"},{\"key\":\"round_decimal\",\"value\":\"-1\"},{\"key\":\"anns_field\",\"value\":\"vec\"},{\"key\":\"topk\",\"value\":\"10\"},{\"key\":\"params\",\"value\":\"{\\\"ef\\\":100,\\\"for_tuning\\\":false}\"},{\"key\":\"metric_type\",\"value\":\"COSINE\"},{\"key\":\"ignore_growing\",\"value\":\"false\"},{\"key\":\"offset\",\"value\":\"0\"}]"] [ConsistencyLevel=Strong] [useDefaultConsistency=false]
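
To make the mismatch concrete, here is a minimal sketch that checks whether the `params` JSON of a request carries the search parameter its index type reads. The table covers only the two index types mentioned in this thread, and `expectedParam` and `hasExpectedParam` are names invented for this example, not Milvus or SDK APIs. The two `params` strings are taken from the log lines above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// expectedParam maps an index type to the search-time parameter it reads.
// Only the index types discussed in this thread are listed.
var expectedParam = map[string]string{
	"IVF_FLAT": "nprobe",
	"HNSW":     "ef",
}

// hasExpectedParam reports whether rawParams (the JSON value of the
// "params" search parameter) contains the key the index type expects.
func hasExpectedParam(indexType, rawParams string) (bool, error) {
	want, ok := expectedParam[indexType]
	if !ok {
		return false, fmt.Errorf("index type %q is not covered by this sketch", indexType)
	}
	var params map[string]interface{}
	if err := json.Unmarshal([]byte(rawParams), &params); err != nil {
		return false, fmt.Errorf("parse params: %w", err)
	}
	_, found := params[want]
	return found, nil
}

func main() {
	// "params" values copied from the two proxy log lines above.
	requests := []string{
		`{"nprobe":1}`,
		`{"ef":100,"for_tuning":false}`,
	}
	for _, raw := range requests {
		ok, err := hasExpectedParam("IVF_FLAT", raw)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("IVF_FLAT with %s: expected param present = %v\n", raw, ok)
	}
}
```

A check like this in the benchmark client would have flagged the leftover HNSW parameter before the run.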

@douglarek
Author

douglarek commented Dec 24, 2024

@yanliang567 Before I started debugging, I made a backup (bw_etcd_ALL.241224-073220.bak.gz; perhaps this data is also acceptable?). During debugging, I was too hasty and forcibly deleted some pods with kubectl, which left certain etcd nodes in a kill-triggered state. Birdwatcher reported errors during the export but still generated files; unexpectedly, the files are unusable.

> The 2nd type searched with an expr, and the search param was ef. Having checked the code you shared at the beginning of this issue, I think the index is IVF_FLAT, so ef is not a valid param.

Yes, my mistake. My test program was previously used to test the HNSW index, and this parameter was probably never removed. Will this parameter affect IVF_FLAT? (Yes, that is the index in use.) Perhaps I should correct the parameter and retest?

> I checked the Milvus logs and found 2 different types

Yes, the first one is the query I ran when loading the collection in Attu.

@yanliang567
Contributor

Checking the meta info with bw_etcd_ALL.241224-073220.bak.gz still failed :(

@douglarek
Author

douglarek commented Dec 25, 2024

> Checking the meta info with bw_etcd_ALL.241224-073220.bak.gz still failed :(

Oh, very strange. I followed birdwatcher's README: connected to etcd, ran the backup, and then used kubectl cp to copy the file to the host. I will try again, or I can try the original etcd backup command. Can you share the correct backup method and a proper example?

[Image]

@douglarek
Author

@yanliang567 Sorry for the late reply. I ended up using plain etcdctl to back up the data. I'm not sure whether you can use it.

etcdctl --endpoints=xxx.com:2379 get /milvus-dev-next --prefix --write-out=json > milvus-dev-next.json

milvus-dev-next.json
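
A note for anyone reading such a dump: with `--write-out=json`, etcdctl base64-encodes every key and value, so the file is not directly readable. The sketch below lists the decoded keys. The type and function names are invented for this example, and the sample key and value are made up. Values under the Milvus prefix are generally binary, so only their length is printed.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// kv and dump cover only the fields of the etcdctl JSON output used here.
// encoding/json decodes base64 strings into []byte fields automatically.
type kv struct {
	Key   []byte `json:"key"`
	Value []byte `json:"value"`
}

type dump struct {
	Kvs   []kv  `json:"kvs"`
	Count int64 `json:"count"`
}

// listKeys parses a dump and returns the decoded keys in file order.
func listKeys(raw []byte) ([]string, error) {
	var d dump
	if err := json.Unmarshal(raw, &d); err != nil {
		return nil, fmt.Errorf("parse etcdctl json: %w", err)
	}
	keys := make([]string, 0, len(d.Kvs))
	for _, e := range d.Kvs {
		keys = append(keys, string(e.Key))
	}
	return keys, nil
}

// sampleDump builds a tiny stand-in for the real file. The key and value
// are hypothetical; json.Marshal base64-encodes the []byte fields the same
// way the dump stores them.
func sampleDump() []byte {
	d := dump{
		Kvs:   []kv{{Key: []byte("/milvus-dev-next/hypothetical/key"), Value: []byte("demo")}},
		Count: 1,
	}
	raw, err := json.Marshal(d)
	if err != nil {
		panic(err)
	}
	return raw
}

func main() {
	raw := sampleDump()
	fmt.Println("encoded:", string(raw))
	keys, err := listKeys(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, k := range keys {
		fmt.Println("decoded key:", k)
	}
}
```

To inspect the real file, read milvus-dev-next.json with `os.ReadFile` and pass the bytes to `listKeys` instead of using `sampleDump`.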

@yanliang567
Contributor

/assign @congqixia
Please help take a look.
