[Bug]: [benchmark][cluster] Milvus is stuck and the client has no response #38609

Open
1 task done
wangting0128 opened this issue Dec 20, 2024 · 8 comments
Labels: kind/bug, test/benchmark, triage/accepted

@wangting0128
Contributor

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version: master-20241219-306e5e68-amd64
- Deployment mode (standalone or cluster): cluster
- MQ type (rocksmq, pulsar or kafka): pulsar
- SDK version (e.g. pymilvus v2.0.0rc2): 2.5.0rc124
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

argo task: fouramf-46vpj

server:

NAME                                                              READY   STATUS      RESTARTS       AGE     IP              NODE         NOMINATED NODE   READINESS GATES
fouramf-qz2xh-28-2434-etcd-0                                      1/1     Running     0              39h     10.104.24.169   4am-node29   <none>           <none>
fouramf-qz2xh-28-2434-etcd-1                                      1/1     Running     0              39h     10.104.27.189   4am-node31   <none>           <none>
fouramf-qz2xh-28-2434-etcd-2                                      1/1     Running     0              39h     10.104.25.170   4am-node30   <none>           <none>
fouramf-qz2xh-28-2434-milvus-datanode-67f777d989-d9479            1/1     Running     0              21h     10.104.33.6     4am-node36   <none>           <none>
fouramf-qz2xh-28-2434-milvus-indexnode-85c5b599d4-2d2cd           1/1     Running     0              21h     10.104.21.12    4am-node24   <none>           <none>
fouramf-qz2xh-28-2434-milvus-indexnode-85c5b599d4-78xnh           1/1     Running     0              21h     10.104.17.74    4am-node23   <none>           <none>
fouramf-qz2xh-28-2434-milvus-indexnode-85c5b599d4-dcrm4           1/1     Running     0              20h     10.104.27.156   4am-node31   <none>           <none>
fouramf-qz2xh-28-2434-milvus-indexnode-85c5b599d4-dgcf6           1/1     Running     0              20h     10.104.23.28    4am-node27   <none>           <none>
fouramf-qz2xh-28-2434-milvus-indexnode-85c5b599d4-nm792           1/1     Running     0              21h     10.104.9.175    4am-node14   <none>           <none>
fouramf-qz2xh-28-2434-milvus-indexnode-85c5b599d4-vvxtf           1/1     Running     0              20h     10.104.25.150   4am-node30   <none>           <none>
fouramf-qz2xh-28-2434-milvus-indexnode-85c5b599d4-zgqpg           1/1     Running     0              21h     10.104.34.232   4am-node37   <none>           <none>
fouramf-qz2xh-28-2434-milvus-indexnode-85c5b599d4-zl7kv           1/1     Running     0              20h     10.104.20.22    4am-node22   <none>           <none>
fouramf-qz2xh-28-2434-milvus-mixcoord-857b5f7d4-czj5b             1/1     Running     0              21h     10.104.21.11    4am-node24   <none>           <none>
fouramf-qz2xh-28-2434-milvus-proxy-5d6c86c556-99bgp               1/1     Running     0              21h     10.104.23.228   4am-node27   <none>           <none>
fouramf-qz2xh-28-2434-milvus-querynode-7575fcff6-wkcxf            1/1     Running     0              21h     10.104.33.7     4am-node36   <none>           <none>
fouramf-qz2xh-28-2434-minio-0                                     1/1     Running     0              3d1h    10.104.24.77    4am-node29   <none>           <none>
fouramf-qz2xh-28-2434-minio-1                                     1/1     Running     0              3d1h    10.104.25.206   4am-node30   <none>           <none>
fouramf-qz2xh-28-2434-minio-2                                     1/1     Running     0              3d1h    10.104.15.36    4am-node20   <none>           <none>
fouramf-qz2xh-28-2434-minio-3                                     1/1     Running     0              3d1h    10.104.27.135   4am-node31   <none>           <none>
fouramf-qz2xh-28-2434-pulsarv3-bookie-0                           1/1     Running     0              3d1h    10.104.25.205   4am-node30   <none>           <none>
fouramf-qz2xh-28-2434-pulsarv3-bookie-1                           1/1     Running     0              3d1h    10.104.27.132   4am-node31   <none>           <none>
fouramf-qz2xh-28-2434-pulsarv3-bookie-2                           1/1     Running     0              3d1h    10.104.24.81    4am-node29   <none>           <none>
fouramf-qz2xh-28-2434-pulsarv3-broker-0                           1/1     Running     0              3d1h    10.104.15.31    4am-node20   <none>           <none>
fouramf-qz2xh-28-2434-pulsarv3-broker-1                           1/1     Running     0              3d1h    10.104.13.106   4am-node16   <none>           <none>
fouramf-qz2xh-28-2434-pulsarv3-proxy-0                            1/1     Running     0              3d1h    10.104.25.197   4am-node30   <none>           <none>
fouramf-qz2xh-28-2434-pulsarv3-proxy-1                            1/1     Running     0              3d1h    10.104.13.104   4am-node16   <none>           <none>
fouramf-qz2xh-28-2434-pulsarv3-recovery-0                         1/1     Running     0              3d1h    10.104.15.34    4am-node20   <none>           <none>
fouramf-qz2xh-28-2434-pulsarv3-zookeeper-0                        1/1     Running     0              3d1h    10.104.24.78    4am-node29   <none>           <none>
fouramf-qz2xh-28-2434-pulsarv3-zookeeper-1                        1/1     Running     0              3d1h    10.104.25.204   4am-node30   <none>           <none>
fouramf-qz2xh-28-2434-pulsarv3-zookeeper-2                        1/1     Running     0              3d1h    10.104.20.24    4am-node22   <none>           <none>


The following requests received no response:

client logs:

1. [2024-12-20 01:57:21,408 - DEBUG - fouram]: [Base] Params of partition:scene_test_partition_hybrid_search_dior520E hybrid_search: reqs:[{'anns_field': 'float_vector', 'param': {'metric_type': 'L2', 'params': {'ef': 32}}, 'limit': 10, 'expr': None, 'nq': 1}, {'anns_field': 'float_vector_1', 'param': {'metric_type': 'IP', 'params': {'search_list': 30}}, 'limit': 10, 'expr': None, 'nq': 1}, {'anns_field': 'sparse_float_vector', 'param': {'metric_type': 'IP', 'params': {'drop_ratio_search': 0.3}}, 'limit': 30, 'expr': None, 'nq': 1}, {'anns_field': 'bfloat16_vector', 'param': {'metric_type': 'L2', 'params': {'nprobe': 16}}, 'limit': 400, 'expr': None, 'nq': 1}], rerank:{'strategy': 'rrf', 'params': {'k': 60}}, limit:1, timeout:6000, kwargs:{'check_task': 'check_response'} (base.py:868)
[2024-12-20 01:57:21,408 - DEBUG - fouram]: (api_request)  : [Partition.hybrid_search] args: [[<pymilvus.client.abstract.AnnSearchRequest object at 0x7f4e01754d60>, <pymilvus.client.abstract.AnnSearchRequest object at 0x7f4e01754610>, <pymilvus.client.abstract.AnnSearchRequest object at 0x7f4e01778d90>, <pymilvus.client.abstract.AnnSearchRequest object at 0x7f4e01778dc0>], <pymilvus.client.abstract.RRFRanker object at 0x7f4bf504b1f0>, 1, ['*'], 6000, -1], kwargs: {}, [requestId: c009b696-be75-11ef-8acf-7ee0480c3f50] (api_request.py:77)

2. [2024-12-20 01:57:21,411 - DEBUG - fouram]: (api_request)  : [Partition] args: [<Collection>:
-------------
<name>: fouram_WlWdokxT
<description>: 
<schema>: {'auto_id': False, 'description': '', 'fields': [{'name': 'id', 'description': '', 'type': <DataType.INT64: 5>, 'is_primary': True, 'auto_id': False}, {'name': 'float_vector', 'description': '', 'type': <DataType.FLOAT_VECTOR: 101>, 'params': {'dim': 128}}, {'name': 'float_vector_1', 'description': '', 'type': <DataType.FLOAT_VECTOR: 101>, 'params': {'dim': 768}}, {'name': 'sparse_float_vector', 'description': '', 'type': <DataType.SPARSE_FLOAT_VECTOR: 104>}, {'name': 'bfloat16_vector', 'description': '', 'type': <DataType.BFLOAT16_VECTOR: 103>, 'params': {'dim': 256}}, {'name': 'int64_1', 'description': '', 'type': <DataType.INT64: 5>}, {'name': 'varchar_1', 'description': '', 'type': <DataType.VARCHAR: 21>, 'params': {'max_length': 256}}], 'enable_dynamic_field': False}
, 'scene_test_partition_hybrid_search_rCTUELMU', ''], kwargs: {}, [requestId: c00a1e9c-be75-11ef-8acf-7ee0480c3f50] (api_request.py:77)


3. [2024-12-20 01:57:23,434 - DEBUG - fouram]: [Base] Params of partition:scene_test_partition_hybrid_search_pln8a36H hybrid_search: reqs:[{'anns_field': 'float_vector', 'param': {'metric_type': 'L2', 'params': {'ef': 32}}, 'limit': 10, 'expr': None, 'nq': 1}, {'anns_field': 'float_vector_1', 'param': {'metric_type': 'IP', 'params': {'search_list': 30}}, 'limit': 10, 'expr': None, 'nq': 1}, {'anns_field': 'sparse_float_vector', 'param': {'metric_type': 'IP', 'params': {'drop_ratio_search': 0.3}}, 'limit': 30, 'expr': None, 'nq': 1}, {'anns_field': 'bfloat16_vector', 'param': {'metric_type': 'L2', 'params': {'nprobe': 16}}, 'limit': 400, 'expr': None, 'nq': 1}], rerank:{'strategy': 'rrf', 'params': {'k': 60}}, limit:1, timeout:6000, kwargs:{'check_task': 'check_response'} (base.py:868)
[2024-12-20 01:57:23,434 - DEBUG - fouram]: (api_request)  : [Partition.hybrid_search] args: [[<pymilvus.client.abstract.AnnSearchRequest object at 0x7f4e0173aa00>, <pymilvus.client.abstract.AnnSearchRequest object at 0x7f4e01750e80>, <pymilvus.client.abstract.AnnSearchRequest object at 0x7f4e01750100>, <pymilvus.client.abstract.AnnSearchRequest object at 0x7f4e017509d0>], <pymilvus.client.abstract.RRFRanker object at 0x7f4bf504b1f0>, 1, ['*'], 6000, -1], kwargs: {}, [requestId: c13ede74-be75-11ef-8acf-7ee0480c3f50] (api_request.py:77)

4. [2024-12-20 01:57:23,437 - DEBUG - fouram]: [Base] Params of partition:scene_test_partition_hybrid_search_EVUcR0GU hybrid_search: reqs:[{'anns_field': 'float_vector', 'param': {'metric_type': 'L2', 'params': {'ef': 32}}, 'limit': 10, 'expr': None, 'nq': 1}, {'anns_field': 'float_vector_1', 'param': {'metric_type': 'IP', 'params': {'search_list': 30}}, 'limit': 10, 'expr': None, 'nq': 1}, {'anns_field': 'sparse_float_vector', 'param': {'metric_type': 'IP', 'params': {'drop_ratio_search': 0.3}}, 'limit': 30, 'expr': None, 'nq': 1}, {'anns_field': 'bfloat16_vector', 'param': {'metric_type': 'L2', 'params': {'nprobe': 16}}, 'limit': 400, 'expr': None, 'nq': 1}], rerank:{'strategy': 'rrf', 'params': {'k': 60}}, limit:1, timeout:6000, kwargs:{'check_task': 'check_response'} (base.py:868)
[2024-12-20 01:57:23,437 - DEBUG - fouram]: (api_request)  : [Partition.hybrid_search] args: [[<pymilvus.client.abstract.AnnSearchRequest object at 0x7f4e0173eb20>, <pymilvus.client.abstract.AnnSearchRequest object at 0x7f4e0173e130>, <pymilvus.client.abstract.AnnSearchRequest object at 0x7f4e0173e520>, <pymilvus.client.abstract.AnnSearchRequest object at 0x7f4e0173e7f0>], <pymilvus.client.abstract.RRFRanker object at 0x7f4bf504b1f0>, 1, ['*'], 6000, -1], kwargs: {}, [requestId: c13f512e-be75-11ef-8acf-7ee0480c3f50] (api_request.py:77)

5. [2024-12-20 01:57:23,443 - DEBUG - fouram]: (api_request)  : [Collection] args: ['scene_hybrid_search_test_32JqTbe6', {'auto_id': False, 'description': '', 'fields': [{'name': 'id', 'description': '', 'type': <DataType.INT64: 5>, 'is_primary': True, 'auto_id': False}, {'name': 'float_vector', 'description': '', 'type': <DataType.FLOAT_VECTOR: 101>, 'params': {'dim': 128}}, {'name': 'binary_vector_scene_hybrid_search_test_1', 'description': '', 'type': <DataType.BINARY_VECTOR: 100>, 'params': {'dim': 512}}, {'name': 'float16_vector_scene_hybrid_search_test_2', 'description': '', 'type': <DataType.FLOAT16_VECTOR: 102>, 'params': {'dim': 64}}, {'name': 'sparse_float_vector_scene_hybrid_search_test_3', 'description': '', 'type': <DataType.SPARSE_FLOAT_VECTOR: 104>}, {'name': 'int64_1', 'description': '', 'type': <DataType.INT64: 5>}, {'name': 'bool_1', 'description': '', 'type': <DataType.BOOL: 1>}, {'name': 'varchar_1', 'description': '', 'type': <DataType.VARCHAR: 21>, 'params': {'max_length': 256}}], 'enable_dynamic_field': False}, 'default'], kwargs: {'shards_num': 2}, [requestId: c1403008-be75-11ef-8acf-7ee0480c3f50] (api_request.py:77)

6. [2024-12-20 01:57:23,478 - DEBUG - fouram]: [Base] Params of partition:scene_test_partition_hybrid_search_2OpXfVmv hybrid_search: reqs:[{'anns_field': 'float_vector', 'param': {'metric_type': 'L2', 'params': {'ef': 32}}, 'limit': 10, 'expr': None, 'nq': 1}, {'anns_field': 'float_vector_1', 'param': {'metric_type': 'IP', 'params': {'search_list': 30}}, 'limit': 10, 'expr': None, 'nq': 1}, {'anns_field': 'sparse_float_vector', 'param': {'metric_type': 'IP', 'params': {'drop_ratio_search': 0.3}}, 'limit': 30, 'expr': None, 'nq': 1}, {'anns_field': 'bfloat16_vector', 'param': {'metric_type': 'L2', 'params': {'nprobe': 16}}, 'limit': 400, 'expr': None, 'nq': 1}], rerank:{'strategy': 'rrf', 'params': {'k': 60}}, limit:1, timeout:6000, kwargs:{'check_task': 'check_response'} (base.py:868)
[2024-12-20 01:57:23,479 - DEBUG - fouram]: (api_request)  : [Partition.hybrid_search] args: [[<pymilvus.client.abstract.AnnSearchRequest object at 0x7f4d0868c160>, <pymilvus.client.abstract.AnnSearchRequest object at 0x7f4d0868cca0>, <pymilvus.client.abstract.AnnSearchRequest object at 0x7f4d0868c700>, <pymilvus.client.abstract.AnnSearchRequest object at 0x7f4d0868c7c0>], <pymilvus.client.abstract.RRFRanker object at 0x7f4bf504b1f0>, 1, ['*'], 6000, -1], kwargs: {}, [requestId: c1459d86-be75-11ef-8acf-7ee0480c3f50] (api_request.py:77)

7. [2024-12-20 01:57:23,483 - DEBUG - fouram]: [Base] Params of partition:scene_test_partition_hybrid_search_sLCneRiV hybrid_search: reqs:[{'anns_field': 'float_vector', 'param': {'metric_type': 'L2', 'params': {'ef': 32}}, 'limit': 10, 'expr': None, 'nq': 1}, {'anns_field': 'float_vector_1', 'param': {'metric_type': 'IP', 'params': {'search_list': 30}}, 'limit': 10, 'expr': None, 'nq': 1}, {'anns_field': 'sparse_float_vector', 'param': {'metric_type': 'IP', 'params': {'drop_ratio_search': 0.3}}, 'limit': 30, 'expr': None, 'nq': 1}, {'anns_field': 'bfloat16_vector', 'param': {'metric_type': 'L2', 'params': {'nprobe': 16}}, 'limit': 400, 'expr': None, 'nq': 1}], rerank:{'strategy': 'rrf', 'params': {'k': 60}}, limit:1, timeout:6000, kwargs:{'check_task': 'check_response'} (base.py:868)
[2024-12-20 01:57:23,483 - DEBUG - fouram]: (api_request)  : [Partition.hybrid_search] args: [[<pymilvus.client.abstract.AnnSearchRequest object at 0x7f4e0173f400>, <pymilvus.client.abstract.AnnSearchRequest object at 0x7f4e0173f910>, <pymilvus.client.abstract.AnnSearchRequest object at 0x7f4e0173f070>, <pymilvus.client.abstract.AnnSearchRequest object at 0x7f4e0173f0a0>], <pymilvus.client.abstract.RRFRanker object at 0x7f4bf504b1f0>, 1, ['*'], 6000, -1], kwargs: {}, [requestId: c14653ac-be75-11ef-8acf-7ee0480c3f50] (api_request.py:77)

8. [2024-12-20 01:57:23,487 - DEBUG - fouram]: [Base] Params of partition:scene_test_partition_hybrid_search_DRbzmWnb hybrid_search: reqs:[{'anns_field': 'float_vector', 'param': {'metric_type': 'L2', 'params': {'ef': 32}}, 'limit': 10, 'expr': None, 'nq': 1}, {'anns_field': 'float_vector_1', 'param': {'metric_type': 'IP', 'params': {'search_list': 30}}, 'limit': 10, 'expr': None, 'nq': 1}, {'anns_field': 'sparse_float_vector', 'param': {'metric_type': 'IP', 'params': {'drop_ratio_search': 0.3}}, 'limit': 30, 'expr': None, 'nq': 1}, {'anns_field': 'bfloat16_vector', 'param': {'metric_type': 'L2', 'params': {'nprobe': 16}}, 'limit': 400, 'expr': None, 'nq': 1}], rerank:{'strategy': 'rrf', 'params': {'k': 60}}, limit:1, timeout:6000, kwargs:{'check_task': 'check_response'} (base.py:868)
[2024-12-20 01:57:23,487 - DEBUG - fouram]: (api_request)  : [Partition.hybrid_search] args: [[<pymilvus.client.abstract.AnnSearchRequest object at 0x7f4d0868c1f0>, <pymilvus.client.abstract.AnnSearchRequest object at 0x7f4d0868cb20>, <pymilvus.client.abstract.AnnSearchRequest object at 0x7f4e017483d0>, <pymilvus.client.abstract.AnnSearchRequest object at 0x7f4e01748100>], <pymilvus.client.abstract.RRFRanker object at 0x7f4bf504b1f0>, 1, ['*'], 6000, -1], kwargs: {}, [requestId: c146ef4c-be75-11ef-8acf-7ee0480c3f50] (api_request.py:77)

9. [2024-12-20 01:57:23,537 - DEBUG - fouram]: [Base] Params of partition:scene_test_partition_hybrid_search_zk5IhoeN hybrid_search: reqs:[{'anns_field': 'float_vector', 'param': {'metric_type': 'L2', 'params': {'ef': 32}}, 'limit': 10, 'expr': None, 'nq': 1}, {'anns_field': 'float_vector_1', 'param': {'metric_type': 'IP', 'params': {'search_list': 30}}, 'limit': 10, 'expr': None, 'nq': 1}, {'anns_field': 'sparse_float_vector', 'param': {'metric_type': 'IP', 'params': {'drop_ratio_search': 0.3}}, 'limit': 30, 'expr': None, 'nq': 1}, {'anns_field': 'bfloat16_vector', 'param': {'metric_type': 'L2', 'params': {'nprobe': 16}}, 'limit': 400, 'expr': None, 'nq': 1}], rerank:{'strategy': 'rrf', 'params': {'k': 60}}, limit:1, timeout:6000, kwargs:{'check_task': 'check_response'} (base.py:868)
[2024-12-20 01:57:23,537 - DEBUG - fouram]: (api_request)  : [Partition.hybrid_search] args: [[<pymilvus.client.abstract.AnnSearchRequest object at 0x7f4e0174ea30>, <pymilvus.client.abstract.AnnSearchRequest object at 0x7f4e0174e2e0>, <pymilvus.client.abstract.AnnSearchRequest object at 0x7f4e0174e6d0>, <pymilvus.client.abstract.AnnSearchRequest object at 0x7f4e0174eac0>], <pymilvus.client.abstract.RRFRanker object at 0x7f4bf504b1f0>, 1, ['*'], 6000, -1], kwargs: {}, [requestId: c14e7cc6-be75-11ef-8acf-7ee0480c3f50] (api_request.py:77)

10. [2024-12-20 01:57:23,538 - DEBUG - fouram]: (api_request)  : [load_state] args: ['scene_hybrid_search_test_KpWfACG6', None, 'default'], kwargs: {}, [requestId: c14eb4ca-be75-11ef-8acf-7ee0480c3f50] (api_request.py:77)

11. [2024-12-20 01:57:23,541 - DEBUG - fouram]: [Base] Params of concurrent_hybrid_search reqs: [{'anns_field': 'float_vector', 'param': {'metric_type': 'L2', 'params': {'ef': 32}}, 'limit': 10, 'expr': 'int64_1 > 100000', 'nq': 1}, {'anns_field': 'float_vector_1', 'param': {'metric_type': 'IP', 'params': {'search_list': 30}}, 'limit': 10, 'expr': 'id < 900000', 'nq': 1}, {'anns_field': 'sparse_float_vector', 'param': {'metric_type': 'IP', 'params': {'drop_ratio_search': 0.3}}, 'limit': 30, 'expr': 'varchar_1 > "1"', 'nq': 1}, {'anns_field': 'bfloat16_vector', 'param': {'metric_type': 'L2', 'params': {'nprobe': 16}}, 'limit': 400, 'expr': None, 'nq': 1}], {'rerank': {'strategy': 'weighted', 'params': {'weights': [0.85, 0.95, 0.51, 0.32]}}, 'limit': 100, 'output_fields': ['*'], 'ignore_growing': False, 'guarantee_timestamp': None, 'partition_names': ['_default'], 'timeout': 6000} (base.py:882)
[2024-12-20 01:57:23,541 - DEBUG - fouram]: (api_request)  : [Collection.hybrid_search] args: [[<pymilvus.client.abstract.AnnSearchRequest object at 0x7f4e0173a130>, <pymilvus.client.abstract.AnnSearchRequest object at 0x7f4e0173a1f0>, <pymilvus.client.abstract.AnnSearchRequest object at 0x7f4e0173adc0>, <pymilvus.client.abstract.AnnSearchRequest object at 0x7f4e0173ad00>], <pymilvus.client.abstract.WeightedRanker object at 0x7f4bf504b7f0>, 100, ['_default'], ['*'], 6000, -1], kwargs: {}, [requestId: c14f14ce-be75-11ef-8acf-7ee0480c3f50] (api_request.py:77)

12. [2024-12-20 01:57:32,792 - DEBUG - fouram]: [Base] Params of partition:scene_test_partition_hybrid_search_GnMvTND0 hybrid_search: reqs:[{'anns_field': 'float_vector', 'param': {'metric_type': 'L2', 'params': {'ef': 32}}, 'limit': 10, 'expr': None, 'nq': 1}, {'anns_field': 'float_vector_1', 'param': {'metric_type': 'IP', 'params': {'search_list': 30}}, 'limit': 10, 'expr': None, 'nq': 1}, {'anns_field': 'sparse_float_vector', 'param': {'metric_type': 'IP', 'params': {'drop_ratio_search': 0.3}}, 'limit': 30, 'expr': None, 'nq': 1}, {'anns_field': 'bfloat16_vector', 'param': {'metric_type': 'L2', 'params': {'nprobe': 16}}, 'limit': 400, 'expr': None, 'nq': 1}], rerank:{'strategy': 'rrf', 'params': {'k': 60}}, limit:1, timeout:6000, kwargs:{'check_task': 'check_response'} (base.py:868)
[2024-12-20 01:57:32,792 - DEBUG - fouram]: (api_request)  : [Partition.hybrid_search] args: [[<pymilvus.client.abstract.AnnSearchRequest object at 0x7f4d97da3700>, <pymilvus.client.abstract.AnnSearchRequest object at 0x7f4e01322af0>, <pymilvus.client.abstract.AnnSearchRequest object at 0x7f4e01322040>, <pymilvus.client.abstract.AnnSearchRequest object at 0x7f4e01322820>], <pymilvus.client.abstract.RRFRanker object at 0x7f4bf504b1f0>, 1, ['*'], 6000, -1], kwargs: {}, [requestId: c6d2b90a-be75-11ef-8acf-7ee0480c3f50] (api_request.py:77)

13. [2024-12-20 01:57:34,091 - DEBUG - fouram]: [Base] Params of partition:scene_test_partition_hybrid_search_znlSqRpc hybrid_search: reqs:[{'anns_field': 'float_vector', 'param': {'metric_type': 'L2', 'params': {'ef': 32}}, 'limit': 10, 'expr': None, 'nq': 1}, {'anns_field': 'float_vector_1', 'param': {'metric_type': 'IP', 'params': {'search_list': 30}}, 'limit': 10, 'expr': None, 'nq': 1}, {'anns_field': 'sparse_float_vector', 'param': {'metric_type': 'IP', 'params': {'drop_ratio_search': 0.3}}, 'limit': 30, 'expr': None, 'nq': 1}, {'anns_field': 'bfloat16_vector', 'param': {'metric_type': 'L2', 'params': {'nprobe': 16}}, 'limit': 400, 'expr': None, 'nq': 1}], rerank:{'strategy': 'rrf', 'params': {'k': 60}}, limit:1, timeout:6000, kwargs:{'check_task': 'check_response'} (base.py:868)
[2024-12-20 01:57:34,091 - DEBUG - fouram]: (api_request)  : [Partition.hybrid_search] args: [[<pymilvus.client.abstract.AnnSearchRequest object at 0x7f4e01322a00>, <pymilvus.client.abstract.AnnSearchRequest object at 0x7f4e0174b700>, <pymilvus.client.abstract.AnnSearchRequest object at 0x7f4e017547f0>, <pymilvus.client.abstract.AnnSearchRequest object at 0x7f4e017542b0>], <pymilvus.client.abstract.RRFRanker object at 0x7f4bf504b1f0>, 1, ['*'], 6000, -1], kwargs: {}, [requestId: c798f7be-be75-11ef-8acf-7ee0480c3f50] (api_request.py:77)

14. [2024-12-20 00:17:23,486 - DEBUG - fouram]: (api_request)  : [Collection] args: ['scene_test_fcIPtZco', {'auto_id': False, 'description': '', 'fields': [{'name': 'id', 'description': '', 'type': <DataType.INT64: 5>, 'is_primary': True, 'auto_id': False}, {'name': 'float_vector', 'description': '', 'type': <DataType.FLOAT_VECTOR: 101>, 'params': {'dim': 128}}], 'enable_dynamic_field': False}, 'default'], kwargs: {'shards_num': 2}, [requestId: c8ff5d04-be67-11ef-8acf-7ee0480c3f50] (api_request.py:77)

15. [2024-12-20 00:17:32,789 - DEBUG - fouram]: (api_request)  : [Collection] args: ['scene_test_2cEmF7ST', {'auto_id': False, 'description': '', 'fields': [{'name': 'id', 'description': '', 'type': <DataType.INT64: 5>, 'is_primary': True, 'auto_id': False}, {'name': 'float_vector', 'description': '', 'type': <DataType.FLOAT_VECTOR: 101>, 'params': {'dim': 128}}], 'enable_dynamic_field': False}, 'default'], kwargs: {'shards_num': 2}, [requestId: ce8aff76-be67-11ef-8acf-7ee0480c3f50] (api_request.py:77)

16. [2024-12-19 22:37:23,492 - DEBUG - fouram]: (api_request)  : [Collection.query] args: ['int64_1 > -1 &&   11013559 < id < 11013559 + 1000000', ['*'], ['_default'], 6000], kwargs: {'limit': 10}, [requestId: d0b8fae0-be59-11ef-8acf-7ee0480c3f50] (api_request.py:77)

17. [2024-12-19 22:37:23,674 - DEBUG - fouram]: (api_request)  : [Collection.query] args: ['int64_1 > -1 &&   3099030 < id < 3099030 + 1000000', ['*'], ['_default'], 6000], kwargs: {'limit': 10}, [requestId: d0d4af92-be59-11ef-8acf-7ee0480c3f50] (api_request.py:77)

18. [2024-12-19 22:37:25,927 - DEBUG - fouram]: (api_request)  : [Index] args: [<Collection>:
-------------
<name>: scene_hybrid_search_test_0gCWkq1i
<description>: 
<schema>: {'auto_id': False, 'description': '', 'fields': [{'name': 'id', 'description': '', 'type': <DataType.INT64: 5>, 'is_primary': True, 'auto_id': False}, {'name': 'float_vector', 'description': '', 'type': <DataType.FLOAT_VECTOR: 101>, 'params': {'dim': 128}}, {'name': 'binary_vector_scene_hybrid_search_test_1', 'description': '', 'type': <DataType.BINARY_VECTOR: 100>, 'params': {'dim': 512}}, {'name': 'float16_vector_scene_hybrid_search_test_2', 'description': '', 'type': <DataType.FLOAT16_VECTOR: 102>, 'params': {'dim': 64}}, {'name': 'sparse_float_vector_scene_hybrid_search_test_3', 'description': '', 'type': <DataType.SPARSE_FLOAT_VECTOR: 104>}, {'name': 'int64_1', 'description': '', 'type': <DataType.INT64: 5>}, {'name': 'bool_1', 'description': '', 'type': <DataType.BOOL: 1>}, {'name': 'varchar_1', 'description': '', 'type': <DataType.VARCHAR: 21>, 'params': {'max_length': 256}}], 'enable_dynamic_field': False}
, 'varchar_1', {'index_type': 'INVERTED'}], kwargs: {}, [requestId: d22c93a0-be59-11ef-8acf-7ee0480c3f50] (api_request.py:77)


19. [2024-12-19 22:37:28,701 - DEBUG - fouram]: (api_request)  : [Index] args: [<Collection>:
-------------
<name>: scene_test_0OLcYjs3
<description>: 
<schema>: {'auto_id': False, 'description': '', 'fields': [{'name': 'id', 'description': '', 'type': <DataType.INT64: 5>, 'is_primary': True, 'auto_id': False}, {'name': 'float_vector', 'description': '', 'type': <DataType.FLOAT_VECTOR: 101>, 'params': {'dim': 128}}], 'enable_dynamic_field': False}
, 'float_vector', {'index_type': 'IVF_SQ8', 'metric_type': 'L2', 'params': {'nlist': 2048}}], kwargs: {}, [requestId: d3d3dcea-be59-11ef-8acf-7ee0480c3f50] (api_request.py:77)

20. [2024-12-19 22:37:28,702 - DEBUG - fouram]: (api_request)  : [Index] args: [<Collection>:
-------------
<name>: scene_test_INZX5wLa
<description>: 
<schema>: {'auto_id': False, 'description': '', 'fields': [{'name': 'id', 'description': '', 'type': <DataType.INT64: 5>, 'is_primary': True, 'auto_id': False}, {'name': 'float_vector', 'description': '', 'type': <DataType.FLOAT_VECTOR: 101>, 'params': {'dim': 128}}], 'enable_dynamic_field': False}
, 'float_vector', {'index_type': 'IVF_SQ8', 'metric_type': 'L2', 'params': {'nlist': 2048}}], kwargs: {}, [requestId: d3d3f658-be59-11ef-8acf-7ee0480c3f50] (api_request.py:77)
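
To triage a list like the one above, it can help to reduce each entry to its timestamp, API name and requestId. The snippet below is a minimal sketch of that, assuming the fouram log layout shown in this issue; the function name `pending_requests` and the regular expressions are illustrative and not part of fouram or pymilvus. The third sample entry is abridged (its description and schema lines are omitted).

```python
import re
from collections import Counter

# Header of an api_request entry: "[<timestamp> - DEBUG - fouram]: (api_request)  : [<API>] args: ..."
API_CALL = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) - \w+ - fouram\]:\s*"
    r"\(api_request\)\s*:\s*\[(?P<api>[^\]]+)\]\s+args:"
)
# The requestId may sit on a later line when the args span several lines.
REQUEST_ID = re.compile(r"\[requestId: (?P<rid>[0-9a-f-]+)\]")


def pending_requests(log_text):
    """Return (timestamp, api, request_id) for every api_request entry in the text."""
    found = []
    current = None  # (timestamp, api) of the entry being read
    for line in log_text.splitlines():
        header = API_CALL.search(line)
        if header:
            current = (header["ts"], header["api"])
        rid = REQUEST_ID.search(line)
        if rid and current:
            found.append((current[0], current[1], rid["rid"]))
            current = None
    return found


SAMPLE = """\
[2024-12-20 01:57:23,538 - DEBUG - fouram]: (api_request)  : [load_state] args: ['scene_hybrid_search_test_KpWfACG6', None, 'default'], kwargs: {}, [requestId: c14eb4ca-be75-11ef-8acf-7ee0480c3f50] (api_request.py:77)
[2024-12-19 22:37:23,492 - DEBUG - fouram]: (api_request)  : [Collection.query] args: ['int64_1 > -1 &&   11013559 < id < 11013559 + 1000000', ['*'], ['_default'], 6000], kwargs: {'limit': 10}, [requestId: d0b8fae0-be59-11ef-8acf-7ee0480c3f50] (api_request.py:77)
[2024-12-19 22:37:28,702 - DEBUG - fouram]: (api_request)  : [Index] args: [<Collection>:
-------------
<name>: scene_test_INZX5wLa
, 'float_vector', {'index_type': 'IVF_SQ8', 'metric_type': 'L2', 'params': {'nlist': 2048}}], kwargs: {}, [requestId: d3d3f658-be59-11ef-8acf-7ee0480c3f50] (api_request.py:77)
"""

if __name__ == "__main__":
    requests = pending_requests(SAMPLE)
    for ts, api, rid in sorted(requests):
        print(ts, api, rid)
    # Count of unanswered requests per API
    print(Counter(api for _, api, _ in requests))
```

Sorting by timestamp makes it easier to see that the oldest unanswered requests (index builds and queries) predate the hybrid_search calls by several hours.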

Expected Behavior

No response

Steps To Reproduce

1. Rebuild the indexes on an existing collection with 20m rows
   - HNSW: float_vector
   - DISKANN: float_vector_1
   - SPARSE_INVERTED_INDEX: sparse_float_vector
   - IVF_SQ8: bfloat16_vector
   - INVERTED: int64_1, varchar_1
2. load collection
3. send concurrent requests:
   - scene_hybrid_search_test
     (collection: create->insert->flush->index->load->hybrid_search->drop)
   - scene_test
     (collection: create->insert->flush->index->drop)
   - scene_test_partition_hybrid_search
     (partition: create->insert->flush->index again->load->hybrid_search->release->hybrid_search failed->drop)
   - search
   - hybrid_search
   - query
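
Comparing the `scene_test_partition_hybrid_search` task config with the request parameters in the client logs suggests how each configured `reqs` entry turns into one ANN request. The sketch below reproduces that mapping in plain Python. It is inferred from this issue only and is not fouram's actual code: `to_request_spec` and `METRIC_BY_FIELD` are hypothetical names, and the metric types are taken from the index settings in the client config.

```python
# Metric type per vector field, as configured for the indexes in this issue.
METRIC_BY_FIELD = {
    "float_vector": "L2",
    "float_vector_1": "IP",
    "sparse_float_vector": "IP",
    "bfloat16_vector": "L2",
}

# `reqs` of the scene_test_partition_hybrid_search task, copied from the client config.
CONFIG_REQS = [
    {"search_param": {"ef": 32}, "anns_field": "float_vector", "top_k": 10},
    {"search_param": {"search_list": 30}, "anns_field": "float_vector_1", "top_k": 10},
    {"search_param": {"drop_ratio_search": 0.3}, "anns_field": "sparse_float_vector", "top_k": 30},
    {"search_param": {"nprobe": 16}, "anns_field": "bfloat16_vector", "top_k": 400},
]


def to_request_spec(req, nq):
    """Build the per-field request dict in the shape printed in the client log."""
    field = req["anns_field"]
    return {
        "anns_field": field,
        "param": {"metric_type": METRIC_BY_FIELD[field], "params": dict(req["search_param"])},
        "limit": req["top_k"],      # top_k in the config is logged as limit
        "expr": req.get("expr"),    # None when the task sets no filter
        "nq": nq,
    }


if __name__ == "__main__":
    for spec in (to_request_spec(r, nq=1) for r in CONFIG_REQS):
        print(spec)
```

With `nq=1`, the four dicts have the same content as the `reqs` list logged for each `Partition.hybrid_search` call.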

Milvus Log

No response

Anything else?

server config:

    extraConfigFiles:
      user.yaml: |+
        indexCoord:
          scheduler:
            interval: 1
        queryNode:
          mmap:
            vectorField: true
            vectorIndex: true
            scalarField: true
            scalarIndex: true
    queryNode:
      resources:
        limits:
          cpu: '32'
          memory: 32Gi
        requests:
          cpu: '16'
          memory: 32Gi
      replicas: 1
      nodeSelector:
        node-role/nvme: 'true'
    indexNode:
      resources:
        limits:
          cpu: '4.0'
          memory: 16Gi
        requests:
          cpu: '2.0'
          memory: 4Gi
      replicas: 8
    dataNode:
      resources:
        limits:
          cpu: '2.0'
          memory: 16Gi
        requests:
          cpu: '2.0'
          memory: 5Gi

client config: fouramf-client-all-vector-types-dql-ddl

    dataset_params:
      metric_type: L2
      dim: 128
      scalars_index:
        int64_1:
          index_type: INVERTED
        varchar_1:
          index_type: INVERTED
      vectors_index:
        float_vector_1:
          index_type: DISKANN
          index_param: {}
          metric_type: IP
        sparse_float_vector:
          index_type: SPARSE_INVERTED_INDEX
          index_param:
            drop_ratio_build: 0.2
          metric_type: IP
        bfloat16_vector:
          index_type: IVF_SQ8
          index_param:
            nlist: 2048
          metric_type: L2
      scalars_params:
        float_vector_1:
          params:
            dim: 768
          other_params:
            dataset: laion2b_multi
            column_name: float32_vector
        sparse_float_vector:
          other_params:
            dim: 10000
            sparse_range:
            - 1
            - 20
        bfloat16_vector:
          params:
            dim: 256
      dataset_name: sift
      dataset_size: 20m
      ni_per: 10000
    collection_params:
      other_fields:
      - float_vector_1
      - sparse_float_vector
      - bfloat16_vector
      - int64_1
      - varchar_1
      shards_num: 2
    index_params:
      index_type: HNSW
      index_param:
        M: 8
        efConstruction: 200
    concurrent_params:
      concurrent_number: 20
      during_time: 24h
      interval: 20
    concurrent_tasks:
    - type: scene_hybrid_search_test
      weight: 1
      params:
        nq: 2
        top_k: 5
        reqs:
        - search_param:
            nprobe: 128
          anns_field: float_vector
          expr: bool_1 == True
          top_k: 100
        - search_param:
            nprobe: 32
          anns_field: binary_vector_scene_hybrid_search_test_1
          expr: bool_1 != True
          top_k: 10
        - search_param:
            search_list: 30
          anns_field: float16_vector_scene_hybrid_search_test_2
          expr: int64_1 >= 1500
          top_k: 5
        - search_param:
            drop_ratio_search: 0.1
          anns_field: sparse_float_vector_scene_hybrid_search_test_3
          expr: varchar_1 like "1%"
          top_k: 10
        rerank:
          RRFRanker: []
        output_fields:
        - "*"
        timeout: 600
        random_data: true
        dataset: local
        dim: 128
        shards_num: 2
        data_size: 3000
        nb: 3000
        index_type: IVF_SQ8
        index_param:
          nlist: 2048
        metric_type: L2
        other_fields:
        - binary_vector_scene_hybrid_search_test_1
        - float16_vector_scene_hybrid_search_test_2
        - sparse_float_vector_scene_hybrid_search_test_3
        - int64_1
        - bool_1
        - varchar_1
        replica_number: 1
        scalars_params:
          binary_vector_scene_hybrid_search_test_1:
            params:
              dim: 512
            other_params:
              dataset: binary
          float16_vector_scene_hybrid_search_test_2:
            params:
              dim: 64
        scalars_index:
          int64_1: {}
          bool_1:
            index_type: BITMAP
          varchar_1:
            index_type: INVERTED
        vectors_index:
          binary_vector_scene_hybrid_search_test_1:
            index_type: BIN_IVF_FLAT
            index_param:
              nlist: 2048
            metric_type: JACCARD
          float16_vector_scene_hybrid_search_test_2:
            index_type: DISKANN
            index_param: {}
            metric_type: IP
          sparse_float_vector_scene_hybrid_search_test_3:
            index_type: SPARSE_WAND
            index_param:
              drop_ratio_build: 0.2
            metric_type: IP
        hybrid_search_counts: 10
    - type: scene_test
      weight: 1
      params:
        dim: 128
        data_size: 3000
        nb: 3000
        index_type: IVF_SQ8
        index_param:
          nlist: 2048
        metric_type: L2
    - type: scene_test_partition_hybrid_search
      weight: 1
      params:
        nq: 1
        top_k: 1
        reqs:
        - search_param:
            ef: 32
          anns_field: float_vector
          top_k: 10
        - search_param:
            search_list: 30
          anns_field: float_vector_1
          top_k: 10
        - search_param:
            drop_ratio_search: 0.3
          anns_field: sparse_float_vector
          top_k: 30
        - search_param:
            nprobe: 16
          anns_field: bfloat16_vector
          top_k: 400
        rerank:
          RRFRanker: []
        output_fields:
        - "*"
        timeout: 6000
        random_data: true
        hybrid_search_counts: 10
        data_size: 3000
        ni: 3000
    - type: search
      weight: 1
      params:
        nq: 1000
        top_k: 1
        search_param:
          nprobe: 1000
        expr: int64_1 >= 0
        timeout: 6000
        random_data: true
        partition_names:
        - _default
    - type: hybrid_search
      weight: 1
      params:
        nq: 1
        top_k: 100
        reqs:
        - search_param:
            ef: 32
          anns_field: float_vector
          expr: int64_1 > 100000
          top_k: 10
        - search_param:
            search_list: 30
          anns_field: float_vector_1
          expr: id < 900000
          top_k: 10
        - search_param:
            drop_ratio_search: 0.3
          anns_field: sparse_float_vector
          expr: varchar_1 > "1"
          top_k: 30
        - search_param:
            nprobe: 16
          anns_field: bfloat16_vector
          top_k: 400
        rerank:
          WeightedRanker:
          - 0.85
          - 0.95
          - 0.51
          - 0.32
        output_fields:
        - "*"
        partition_names:
        - _default
        timeout: 6000
        random_data: true
    - type: query
      weight: 1
      params:
        expr: 'int64_1 > -1 && '
        output_fields:
        - "*"
        partition_names:
        - _default
        limit: 10
        timeout: 6000
        custom_expr: " {0} < id < {0} + 1000000"
        custom_range:
        - 0
        - 20000000
@wangting0128 wangting0128 added kind/bug Issues or changes related a bug needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. test/benchmark benchmark test labels Dec 20, 2024
@wangting0128 wangting0128 added this to the 2.5.0 milestone Dec 20, 2024
@yanliang567
Contributor

/assign @sunby
/unassign

@sre-ci-robot sre-ci-robot assigned sunby and unassigned yanliang567 Dec 20, 2024
@yanliang567 yanliang567 added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Dec 20, 2024
@yanliang567 yanliang567 modified the milestones: 2.5.0, 2.5.1 Dec 24, 2024
@xiaofan-luan
Collaborator

/assign @aoiasd
please help on this

@aoiasd
Contributor

aoiasd commented Dec 27, 2024

The problem started on the 19th at 15:58, and the proxy started reporting "disk not found" at the same time.
60a6a571-661b-47e7-83fb-f43ae529cf79
So the disk may have a problem.
But the proxy disk error is not the main reason for the hang.
Some channels, such as by-dev-rootcoord-dml_1_454664015923904872v1, cannot fetch messages from the dispatcher, which causes queries and searches to get stuck and time out (after 1h30min, which seems too long; possibly another error?).
Other channels, such as by-dev-rootcoord-dml_1_454664015923904872v0, still work normally.
image

@aoiasd
Contributor

aoiasd commented Dec 27, 2024

But rootcoord is working normally, so the problem seems to be in the dispatcher or msgstream.

@aoiasd
Contributor

aoiasd commented Dec 27, 2024

Another problem is that our querynode tt lag metric is reported as the current time minus the msg tt, and it is labeled by collection ID. As a result, the metric looks healthy when consumption on one channel is stuck but other channels still work.
We should report the msg tt itself and calculate the tt lag in Prometheus. Will fix later.
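
The masking effect described above can be illustrated with a small sketch. It is not Milvus code: the function names, the millisecond timestamps, and the "last write wins" gauge behavior are assumptions made for illustration, and the 1h30min lag is a hypothetical value borrowed from the timeout mentioned earlier. The point is only that a stuck channel stops updating a gauge keyed by collection ID, so the healthy channel's value is the one that remains visible.

```python
from typing import Dict, List, Tuple

# (collection_id, channel, msg_tt_ms): hypothetical samples, in report order.
Report = Tuple[int, str, int]


def lag_by_collection(reports: List[Report], now_ms: int) -> Dict[int, int]:
    """Gauge keyed by collection ID only: a later report overwrites an earlier one."""
    gauge: Dict[int, int] = {}
    for collection_id, _channel, msg_tt_ms in reports:
        gauge[collection_id] = now_ms - msg_tt_ms
    return gauge


def lag_by_channel(reports: List[Report], now_ms: int) -> Dict[Tuple[int, str], int]:
    """Gauge keyed by (collection ID, channel): each channel keeps its own lag."""
    gauge: Dict[Tuple[int, str], int] = {}
    for collection_id, channel, msg_tt_ms in reports:
        gauge[(collection_id, channel)] = now_ms - msg_tt_ms
    return gauge


now_ms = 10_000_000
collection_id = 454664015923904872
reports = [
    # Stuck channel: its last consumed message is 1h30min old (hypothetical).
    (collection_id, "by-dev-rootcoord-dml_1_454664015923904872v1", now_ms - 5_400_000),
    # Healthy channel: consumed a message 200 ms ago and reported afterwards.
    (collection_id, "by-dev-rootcoord-dml_1_454664015923904872v0", now_ms - 200),
]

print(lag_by_collection(reports, now_ms))  # only the healthy channel's lag survives
print(lag_by_channel(reports, now_ms))     # the stuck channel's lag stays visible
```

Exporting the raw msg tt per channel and computing the lag on the Prometheus side, as proposed, would have the same effect as the second function: the lag of a stuck channel keeps growing instead of being hidden.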

@yanliang567 yanliang567 modified the milestones: 2.5.1, 2.5.2 Dec 30, 2024
@wangting0128
Contributor Author

wangting0128 commented Dec 30, 2024

Different case, same problem.

argo task:weekly-stab-1735430400
image: 2.4-20241227-2f208ebc-amd64

server:

[2024-12-30 03:46:30,176 -  INFO - fouram]: [Base] Deploy initial state: 
I1229 00:07:08.357917    2979 request.go:665] Waited for 1.174095807s due to client-side throttling, not priority and fairness, request: GET:https://kubernetes.default.svc.cluster.local/apis/policy/v1beta1?timeout=32s
I1229 00:07:18.556301    2979 request.go:665] Waited for 11.365960559s due to client-side throttling, not priority and fairness, request: GET:https://kubernetes.default.svc.cluster.local/apis/flowcontrol.apiserver.k8s.io/v1beta2?timeout=32s
I1229 00:07:28.557313    2979 request.go:665] Waited for 8.199023965s due to client-side throttling, not priority and fairness, request: GET:https://kubernetes.default.svc.cluster.local/apis/policy/v1beta1?timeout=32s
NAME                                                              READY   STATUS                   RESTARTS        AGE     IP              NODE         NOMINATED NODE   READINESS GATES
weekly-stab-17330400-1-48-7639-etcd-0                             1/1     Running                  0               4m37s   10.104.18.33    4am-node25   <none>           <none>
weekly-stab-17330400-1-48-7639-etcd-1                             1/1     Running                  0               4m36s   10.104.21.143   4am-node24   <none>           <none>
weekly-stab-17330400-1-48-7639-etcd-2                             1/1     Running                  0               4m36s   10.104.17.81    4am-node23   <none>           <none>
weekly-stab-17330400-1-48-7639-milvus-datanode-7666999cd6-6htbw   1/1     Running                  4 (3m3s ago)    4m40s   10.104.9.112    4am-node14   <none>           <none>
weekly-stab-17330400-1-48-7639-milvus-indexnode-5bb45d4d7f7j2xv   1/1     Running                  4 (3m8s ago)    4m40s   10.104.16.172   4am-node21   <none>           <none>
weekly-stab-17330400-1-48-7639-milvus-mixcoord-df5997f5d-2vcz7    1/1     Running                  4 (3m6s ago)    4m40s   10.104.16.171   4am-node21   <none>           <none>
weekly-stab-17330400-1-48-7639-milvus-proxy-dc75455cb-fklsg       1/1     Running                  4 (3m8s ago)    4m40s   10.104.16.170   4am-node21   <none>           <none>
weekly-stab-17330400-1-48-7639-milvus-querynode-6bbd89cdf7zr2cf   1/1     Running                  4 (3m3s ago)    4m40s   10.104.32.203   4am-node39   <none>           <none>
weekly-stab-17330400-1-48-7639-minio-0                            1/1     Running                  0               4m37s   10.104.17.79    4am-node23   <none>           <none>
weekly-stab-17330400-1-48-7639-minio-1                            1/1     Running                  0               4m37s   10.104.23.60    4am-node27   <none>           <none>
weekly-stab-17330400-1-48-7639-minio-2                            1/1     Running                  0               4m36s   10.104.21.142   4am-node24   <none>           <none>
weekly-stab-17330400-1-48-7639-minio-3                            1/1     Running                  0               4m36s   10.104.18.34    4am-node25   <none>           <none>
weekly-stab-17330400-1-48-7639-pulsarv3-bookie-0                  1/1     Running                  0               4m37s   10.104.18.31    4am-node25   <none>           <none>
weekly-stab-17330400-1-48-7639-pulsarv3-bookie-1                  1/1     Running                  0               4m37s   10.104.23.61    4am-node27   <none>           <none>
weekly-stab-17330400-1-48-7639-pulsarv3-bookie-2                  1/1     Running                  0               4m36s   10.104.21.144   4am-node24   <none>           <none>
weekly-stab-17330400-1-48-7639-pulsarv3-bookie-init-7sc57         0/1     Completed                0               4m40s   10.104.9.108    4am-node14   <none>           <none>
weekly-stab-17330400-1-48-7639-pulsarv3-broker-0                  1/1     Running                  0               4m38s   10.104.14.62    4am-node18   <none>           <none>
weekly-stab-17330400-1-48-7639-pulsarv3-broker-1                  1/1     Running                  0               4m37s   10.104.34.54    4am-node37   <none>           <none>
weekly-stab-17330400-1-48-7639-pulsarv3-proxy-0                   1/1     Running                  0               4m38s   10.104.14.61    4am-node18   <none>           <none>
weekly-stab-17330400-1-48-7639-pulsarv3-proxy-1                   1/1     Running                  0               4m38s   10.104.13.201   4am-node16   <none>           <none>
weekly-stab-17330400-1-48-7639-pulsarv3-pulsar-init-hhggt         0/1     Completed                0               4m40s   10.104.9.110    4am-node14   <none>           <none>
weekly-stab-17330400-1-48-7639-pulsarv3-recovery-0                1/1     Running                  0               4m37s   10.104.13.202   4am-node16   <none>           <none>
weekly-stab-17330400-1-48-7639-pulsarv3-zookeeper-0               1/1     Running                  0               4m39s   10.104.34.99    4am-node37   <none>           <none>
weekly-stab-17330400-1-48-7639-pulsarv3-zookeeper-1               1/1     Running                  0               4m38s   10.104.21.138   4am-node24   <none>           <none>
weekly-stab-17330400-1-48-7639-pulsarv3-zookeeper-2               1/1     Running                  0               4m37s   10.104.17.78    4am-node23   <none>           <none> (base.py:261)
[2024-12-30 03:46:30,176 -  INFO - fouram]: [Cmd Exe]  kubectl get pods  -n qa-milvus  -o wide | grep -E 'NAME|weekly-stab-17330400-1-48-7639-milvus|weekly-stab-17330400-1-48-7639-minio|weekly-stab-17330400-1-48-7639-etcd|weekly-stab-17330400-1-48-7639-pulsar|weekly-stab-17330400-1-48-7639-zookeeper|weekly-stab-17330400-1-48-7639-kafka|weekly-stab-17330400-1-48-7639-log|weekly-stab-17330400-1-48-7639-tikv'  (util_cmd.py:14)
[2024-12-30 03:46:59,851 -  INFO - fouram]: [CliClient] pod details of release(weekly-stab-17330400-1-48-7639): 
 I1230 03:46:31.419993    3110 request.go:665] Waited for 1.166229202s due to client-side throttling, not priority and fairness, request: GET:https://kubernetes.default.svc.cluster.local/apis/tekton.dev/v1alpha1?timeout=32s
I1230 03:46:41.420380    3110 request.go:665] Waited for 11.165672193s due to client-side throttling, not priority and fairness, request: GET:https://kubernetes.default.svc.cluster.local/apis/helm.toolkit.fluxcd.io/v2beta2?timeout=32s
I1230 03:46:51.619900    3110 request.go:665] Waited for 8.197592108s due to client-side throttling, not priority and fairness, request: GET:https://kubernetes.default.svc.cluster.local/apis/networking.k8s.io/v1?timeout=32s
NAME                                                              READY   STATUS                   RESTARTS         AGE     IP              NODE         NOMINATED NODE   READINESS GATES
weekly-stab-17330400-1-48-7639-etcd-0                             1/1     Running                  0                27h     10.104.18.33    4am-node25   <none>           <none>
weekly-stab-17330400-1-48-7639-etcd-1                             1/1     Running                  0                27h     10.104.21.143   4am-node24   <none>           <none>
weekly-stab-17330400-1-48-7639-etcd-2                             1/1     Running                  0                27h     10.104.17.81    4am-node23   <none>           <none>
weekly-stab-17330400-1-48-7639-milvus-datanode-7666999cd6-6htbw   1/1     Running                  4 (27h ago)      27h     10.104.9.112    4am-node14   <none>           <none>
weekly-stab-17330400-1-48-7639-milvus-indexnode-5bb45d4d7f7j2xv   1/1     Running                  4 (27h ago)      27h     10.104.16.172   4am-node21   <none>           <none>
weekly-stab-17330400-1-48-7639-milvus-mixcoord-df5997f5d-2vcz7    1/1     Running                  4 (27h ago)      27h     10.104.16.171   4am-node21   <none>           <none>
weekly-stab-17330400-1-48-7639-milvus-proxy-dc75455cb-fklsg       1/1     Running                  4 (27h ago)      27h     10.104.16.170   4am-node21   <none>           <none>
weekly-stab-17330400-1-48-7639-milvus-querynode-6bbd89cdf7zr2cf   1/1     Running                  4 (27h ago)      27h     10.104.32.203   4am-node39   <none>           <none>
weekly-stab-17330400-1-48-7639-minio-0                            1/1     Running                  0                27h     10.104.17.79    4am-node23   <none>           <none>
weekly-stab-17330400-1-48-7639-minio-1                            0/1     ContainerCreating        0                24h     <none>          4am-node27   <none>           <none>
weekly-stab-17330400-1-48-7639-minio-2                            1/1     Running                  0                27h     10.104.21.142   4am-node24   <none>           <none>
weekly-stab-17330400-1-48-7639-minio-3                            1/1     Running                  0                27h     10.104.18.34    4am-node25   <none>           <none>
weekly-stab-17330400-1-48-7639-pulsarv3-bookie-0                  1/1     Running                  0                27h     10.104.18.31    4am-node25   <none>           <none>
weekly-stab-17330400-1-48-7639-pulsarv3-bookie-1                  1/1     Running                  0                24h     10.104.23.69    4am-node27   <none>           <none>
weekly-stab-17330400-1-48-7639-pulsarv3-bookie-2                  1/1     Running                  0                27h     10.104.21.144   4am-node24   <none>           <none>
weekly-stab-17330400-1-48-7639-pulsarv3-bookie-init-7sc57         0/1     Completed                0                27h     10.104.9.108    4am-node14   <none>           <none>
weekly-stab-17330400-1-48-7639-pulsarv3-broker-0                  1/1     Running                  0                27h     10.104.14.62    4am-node18   <none>           <none>
weekly-stab-17330400-1-48-7639-pulsarv3-broker-1                  1/1     Running                  0                27h     10.104.34.54    4am-node37   <none>           <none>
weekly-stab-17330400-1-48-7639-pulsarv3-proxy-0                   1/1     Running                  0                27h     10.104.14.61    4am-node18   <none>           <none>
weekly-stab-17330400-1-48-7639-pulsarv3-proxy-1                   1/1     Running                  0                27h     10.104.13.201   4am-node16   <none>           <none>
weekly-stab-17330400-1-48-7639-pulsarv3-pulsar-init-hhggt         0/1     Completed                0                27h     10.104.9.110    4am-node14   <none>           <none>
weekly-stab-17330400-1-48-7639-pulsarv3-recovery-0                1/1     Running                  0                27h     10.104.13.202   4am-node16   <none>           <none>
weekly-stab-17330400-1-48-7639-pulsarv3-zookeeper-0               1/1     Running                  0                27h     10.104.34.99    4am-node37   <none>           <none>
weekly-stab-17330400-1-48-7639-pulsarv3-zookeeper-1               1/1     Running                  0                27h     10.104.21.138   4am-node24   <none>           <none>
weekly-stab-17330400-1-48-7639-pulsarv3-zookeeper-2               1/1     Running                  0                27h     10.104.17.78    4am-node23   <none>           <none>

Client logs for requests that received no response:

[2024-12-29 02:47:02,781 - DEBUG - fouram]: (api_request)  : [Collection.flush] args: [], kwargs: {'timeout': None}, [requestId: 2ecac802-c58f-11ef-856c-9e7ed1f62469] (api_request.py:77)

[2024-12-29 02:47:05,905 - DEBUG - fouram]: (api_request)  : [Collection.flush] args: [], kwargs: {'timeout': None}, [requestId: 30a77580-c58f-11ef-856c-9e7ed1f62469] (api_request.py:77)

[2024-12-29 02:47:25,433 - DEBUG - fouram]: (api_request)  : [Collection.flush] args: [], kwargs: {'timeout': None}, [requestId: 3c4b2de6-c58f-11ef-856c-9e7ed1f62469] (api_request.py:77)

[2024-12-29 02:47:29,084 - DEBUG - fouram]: (api_request)  : [Collection.flush] args: [], kwargs: {'timeout': None}, [requestId: 3e786458-c58f-11ef-856c-9e7ed1f62469] (api_request.py:77)

[2024-12-29 02:47:29,523 - DEBUG - fouram]: (api_request)  : [Collection.flush] args: [], kwargs: {'timeout': None}, [requestId: 3ebb5a1a-c58f-11ef-856c-9e7ed1f62469] (api_request.py:77)

[2024-12-29 02:54:42,382 - DEBUG - fouram]: (api_request)  : [Collection.load] args: [None, 1, None], kwargs: {}, [requestId: 40bc7136-c590-11ef-856c-9e7ed1f62469] (api_request.py:77)

[2024-12-29 02:54:42,383 - DEBUG - fouram]: (api_request)  : [Collection.load] args: [None, 1, None], kwargs: {}, [requestId: 40bc8e00-c590-11ef-856c-9e7ed1f62469] (api_request.py:77)

[2024-12-29 02:54:42,384 - DEBUG - fouram]: (api_request)  : [Collection.load] args: [None, 1, None], kwargs: {}, [requestId: 40bcae6c-c590-11ef-856c-9e7ed1f62469] (api_request.py:77)

[2024-12-29 02:54:42,385 - DEBUG - fouram]: (api_request)  : [Collection.load] args: [None, 1, None], kwargs: {}, [requestId: 40bccfdc-c590-11ef-856c-9e7ed1f62469] (api_request.py:77)

[2024-12-29 02:47:36,646 - DEBUG - fouram]: (api_request)  : [Collection.flush] args: [], kwargs: {'timeout': None}, [requestId: 42fa4bae-c58f-11ef-856c-9e7ed1f62469] (api_request.py:77)

[2024-12-29 02:47:47,881 - DEBUG - fouram]: (api_request)  : [Collection.flush] args: [], kwargs: {'timeout': None}, [requestId: 49ac7972-c58f-11ef-856c-9e7ed1f62469] (api_request.py:77)

[2024-12-29 02:47:48,859 - DEBUG - fouram]: (api_request)  : [Collection.flush] args: [], kwargs: {'timeout': None}, [requestId: 4a41bec4-c58f-11ef-856c-9e7ed1f62469] (api_request.py:77)

[2024-12-29 02:48:09,522 - DEBUG - fouram]: (api_request)  : [Collection.flush] args: [], kwargs: {'timeout': None}, [requestId: 5692a38c-c58f-11ef-856c-9e7ed1f62469] (api_request.py:77)

[2024-12-29 03:31:27,473 - DEBUG - fouram]: (api_request)  : [Collection.insert] args: <Collection.insert fields: 3, length: 1, content: [ [ `type<class 'int'>, dtype<>` 1010 ... ], [ `type<class 'dict'>, dtype<>` {140817: 0.4555932375689 ... ], [ `type<class 'numpy.float32'>, dtype<float32>` 1010.0 ... ] ]>, [None], kwargs: {'timeout': None}, [requestId: 63129486-c595-11ef-856c-9e7ed1f62469] (api_request.py:77)

[2024-12-29 02:48:40,347 - DEBUG - fouram]: (api_request)  : [Collection.flush] args: [], kwargs: {'timeout': None}, [requestId: 68f24906-c58f-11ef-856c-9e7ed1f62469] (api_request.py:77)

[2024-12-29 02:48:40,362 - DEBUG - fouram]: (api_request)  : [Collection.flush] args: [], kwargs: {'timeout': None}, [requestId: 68f48b80-c58f-11ef-856c-9e7ed1f62469] (api_request.py:77)

[2024-12-29 02:56:56,955 - DEBUG - fouram]: (api_request)  : [Collection] args: ['scene_search_test_NKnEiSxp', {'auto_id': False, 'description': '', 'fields': [{'name': 'id', 'description': '', 'type': <DataType.INT64: 5>, 'is_primary': True, 'auto_id': False}, {'name': 'sparse_float_vector', 'description': '', 'type': <DataType.SPARSE_FLOAT_VECTOR: 104>}], 'enable_dynamic_field': False}, 'default'], kwargs: {'shards_num': 2}, [requestId: 90f29680-c590-11ef-856c-9e7ed1f62469] (api_request.py:77)

[2024-12-29 03:19:12,714 - DEBUG - fouram]: (api_request)  : [Collection] args: ['scene_search_test_1fHXvWzq', {'auto_id': False, 'description': '', 'fields': [{'name': 'id', 'description': '', 'type': <DataType.INT64: 5>, 'is_primary': True, 'auto_id': False}, {'name': 'sparse_float_vector', 'description': '', 'type': <DataType.SPARSE_FLOAT_VECTOR: 104>}], 'enable_dynamic_field': False}, 'default'], kwargs: {'shards_num': 2}, [requestId: ad1f4f58-c593-11ef-856c-9e7ed1f62469] (api_request.py:77)

[2024-12-29 02:45:34,832 - DEBUG - fouram]: (api_request)  : [Collection.flush] args: [], kwargs: {'timeout': None}, [requestId: fa5ed978-c58e-11ef-856c-9e7ed1f62469] (api_request.py:77)
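
The pending requests above are not listed in time order. A minimal sketch for sorting such lines by timestamp and counting them per API is shown below; the regular expression is an assumption derived from the line format in this log, and the three sample lines are copied from it.

```python
import re
from collections import Counter
from datetime import datetime

# Pattern inferred from the fouram client log lines above (an assumption).
LINE_RE = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} - \s*\w+ - \w+\]: "
    r"\(api_request\)\s+: \[(?P<api>[\w.]+)\].*\[requestId: (?P<rid>[0-9a-f-]+)\]"
)


def parse(lines):
    """Return (timestamp, api, request_id) tuples sorted by timestamp."""
    parsed = []
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            ts = datetime.strptime(match["ts"], "%Y-%m-%d %H:%M:%S")
            parsed.append((ts, match["api"], match["rid"]))
    return sorted(parsed)


sample = [
    "[2024-12-29 02:54:42,382 - DEBUG - fouram]: (api_request)  : [Collection.load] args: [None, 1, None], kwargs: {}, [requestId: 40bc7136-c590-11ef-856c-9e7ed1f62469] (api_request.py:77)",
    "[2024-12-29 02:47:02,781 - DEBUG - fouram]: (api_request)  : [Collection.flush] args: [], kwargs: {'timeout': None}, [requestId: 2ecac802-c58f-11ef-856c-9e7ed1f62469] (api_request.py:77)",
    "[2024-12-29 02:45:34,832 - DEBUG - fouram]: (api_request)  : [Collection.flush] args: [], kwargs: {'timeout': None}, [requestId: fa5ed978-c58e-11ef-856c-9e7ed1f62469] (api_request.py:77)",
]

pending = parse(sample)
per_api = Counter(api for _, api, _ in pending)

for ts, api, rid in pending:
    print(ts, api, rid)
print(per_api)
```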

test result:

[2024-12-30 00:36:36,457 -  INFO - fouram]: Print locust final stats. (locust_runner.py:56)
[2024-12-30 00:36:36,457 -  INFO - fouram]: Type     Name                                                                          # reqs      # fails |    Avg     Min     Max    Med |   req/s  failures/s (stats.py:789)
[2024-12-30 00:36:36,458 -  INFO - fouram]: --------|----------------------------------------------------------------------------|-------|-------------|-------|-------|-------|-------|--------|----------- (stats.py:789)
[2024-12-30 00:36:36,458 -  INFO - fouram]: grpc     load                                                                             988     2(0.20%) |    384       5   30003      9 |    0.09        0.00 (stats.py:789)
[2024-12-30 00:36:36,458 -  INFO - fouram]: grpc     query                                                                          10286    23(0.22%) |    323       1   60004      4 |    0.98        0.00 (stats.py:789)
[2024-12-30 00:36:36,458 -  INFO - fouram]: grpc     scene_insert_delete_flush                                                        997  248(24.87%) | 117569    2551  336347  82000 |    0.10        0.02 (stats.py:789)
[2024-12-30 00:36:36,458 -  INFO - fouram]: grpc     scene_search_test                                                               1015     1(0.10%) |  34832   13418  351949  32000 |    0.10        0.00 (stats.py:789)
[2024-12-30 00:36:36,458 -  INFO - fouram]: grpc     search                                                                         20731    50(0.24%) |    284      14   60142     20 |    1.98        0.00 (stats.py:789)
[2024-12-30 00:36:36,458 -  INFO - fouram]: --------|----------------------------------------------------------------------------|-------|-------------|-------|-------|-------|-------|--------|----------- (stats.py:789)
[2024-12-30 00:36:36,458 -  INFO - fouram]:          Aggregated                                                                     34017   324(0.95%) |   4767       1  351949     19 |    3.24        0.03 (stats.py:789)
[2024-12-30 00:36:36,458 -  INFO - fouram]:  (stats.py:790)
[2024-12-30 00:36:36,460 -  INFO - fouram]: [PerfTemplate] Report data: 
{'server': {'deploy_tool': 'helm',
            'deploy_mode': 'cluster',
            'config_name': 'cluster_8c16m',
            'config': {'queryNode': {'resources': {'limits': {'cpu': 8.0, 'memory': '32Gi'}, 'requests': {'cpu': 6.0, 'memory': '16Gi'}}, 'replicas': 1},
                       'indexNode': {'resources': {'limits': {'cpu': '8.0', 'memory': '16Gi'}, 'requests': {'cpu': '5.0', 'memory': '9Gi'}}},
                       'dataNode': {'resources': {'limits': {'cpu': '8.0', 'memory': '16Gi'}, 'requests': {'cpu': '5.0', 'memory': '9Gi'}}},
                       'cluster': {'enabled': True},
                       'pulsarv3': {},
                       'kafka': {},
                       'minio': {'metrics': {'podMonitor': {'enabled': True}}},
                       'etcd': {'metrics': {'enabled': True, 'podMonitor': {'enabled': True}}},
                       'metrics': {'serviceMonitor': {'enabled': True}},
                       'log': {'level': 'debug'},
                       'image': {'all': {'repository': 'harbor.milvus.io/milvus/milvus', 'tag': '2.4-20241227-2f208ebc-amd64'}}},
            'host': 'weekly-stab-17330400-1-48-7639-milvus.qa-milvus.svc.cluster.local',
            'port': '19530',
            'uri': ''},
 'client': {'test_case_type': 'ConcurrentClientBase',
            'test_case_name': 'test_concurrent_locust_custom_parameters',
            'test_case_params': {'dataset_params': {'metric_type': 'IP',
                                                    'vector_field_name': 'sparse_float_vector',
                                                    'dim': 400000,
                                                    'sparse_range': [200, 300],
                                                    'dataset_name': 'sparse_full',
                                                    'dataset_size': '4m',
                                                    'ni_per': 10000},
                                 'collection_params': {'other_fields': ['float_1'], 'shards_num': 2},
                                 'resource_groups_params': {'reset': False},
                                 'database_user_params': {'reset_rbac': False, 'reset_db': False},
                                 'index_params': {'index_type': 'SPARSE_INVERTED_INDEX', 'index_param': {'drop_ratio_build': 0.2}},
                                 'concurrent_params': {'concurrent_number': 20, 'during_time': '24h', 'interval': 20, 'spawn_rate': None},
                                 'concurrent_tasks': [{'type': 'search',
                                                       'weight': 20,
                                                       'params': {'nq': 10,
                                                                  'top_k': 10,
                                                                  'search_param': {'drop_ratio_search': 0.2},
                                                                  'expr': {'float_1': {'GT': -1, 'LT': 50000}},
                                                                  'ignore_growing': False,
                                                                  'timeout': 60,
                                                                  'random_data': True}},
                                                      {'type': 'query',
                                                       'weight': 10,
                                                       'params': {'ids': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
                                                                  'ignore_growing': False,
                                                                  'timeout': 60,
                                                                  'random_data': False,
                                                                  'random_count': 0,
                                                                  'random_range': [0, 1],
                                                                  'field_name': 'id',
                                                                  'field_type': 'int64'}},
                                                      {'type': 'load', 'weight': 1, 'params': {'replica_number': 1, 'timeout': 30}},
                                                      {'type': 'scene_insert_delete_flush',
                                                       'weight': 1,
                                                       'params': {'insert_length': 1,
                                                                  'delete_length': 1,
                                                                  'start_id': 0,
                                                                  'random_id': True,
                                                                  'random_vector': True,
                                                                  'varchar_filled': True}},
                                                      {'type': 'scene_search_test',
                                                       'weight': 1,
                                                       'params': {'dataset': 'sparse_full',
                                                                  'dim': 40000,
                                                                  'data_size': '10000',
                                                                  'nb': 3000,
                                                                  'index_type': 'SPARSE_INVERTED_INDEX',
                                                                  'index_param': {'drop_ratio_build': 0.3},
                                                                  'metric_type': 'IP',
                                                                  'nq': 10,
                                                                  'top_k': 10,
                                                                  'search_param': {'drop_ratio_search': 0.5},
                                                                  'search_counts': 20}}]},
            'run_id': 2024122905719807,
            'datetime': '2024-12-29 00:02:51.267319',
            'client_version': '2.4.0'},
 'result': {'test_result': {'index': {'RT': 5.0409},
                            'insert': {'total_time': 1604.6918, 'VPS': 2492.6905, 'batch_time': 4.0117, 'batch': 10000},
                            'flush': {'RT': 3.0278},
                            'load': {'RT': 11.3451},
                            'Locust': {'Aggregated': {'Requests': 34017,
                                                      'Fails': 324,
                                                      'RPS': 3.24,
                                                      'fail_s': 0.01,
                                                      'RT_max': 351949.11,
                                                      'RT_avg': 4767.49,
                                                      'TP50': 19,
                                                      'TP99': 183000.0},
                                       'load': {'Requests': 988,
                                                'Fails': 2,
                                                'RPS': 0.09,
                                                'fail_s': 0.0,
                                                'RT_max': 30003.57,
                                                'RT_avg': 384.88,
                                                'TP50': 9,
                                                'TP99': 4900.0},
                                       'query': {'Requests': 10286,
                                                 'Fails': 23,
                                                 'RPS': 0.98,
                                                 'fail_s': 0.0,
                                                 'RT_max': 60004.0,
                                                 'RT_avg': 323.81,
                                                 'TP50': 4,
                                                 'TP99': 3700.0},
                                       'scene_insert_delete_flush': {'Requests': 997,
                                                                     'Fails': 248,
                                                                     'RPS': 0.1,
                                                                     'fail_s': 0.25,
                                                                     'RT_max': 336347.13,
                                                                     'RT_avg': 117569.43,
                                                                     'TP50': 82000.0,
                                                                     'TP99': 308000.0},
                                       'scene_search_test': {'Requests': 1015,
                                                             'Fails': 1,
                                                             'RPS': 0.1,
                                                             'fail_s': 0.0,
                                                             'RT_max': 351949.11,
                                                             'RT_avg': 34832.53,
                                                             'TP50': 32000.0,
                                                             'TP99': 77000.0},
                                       'search': {'Requests': 20731,
                                                  'Fails': 50,
                                                  'RPS': 1.98,
                                                  'fail_s': 0.0,
                                                  'RT_max': 60142.8,
                                                  'RT_avg': 284.26,
                                                  'TP50': 20,
                                                  'TP99': 1900.0}}}}}
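
As a quick consistency check on the numbers above, the failure percentages in the Locust table can be recomputed from the request and failure counts. The counts are copied from the table; `fail_pct` is an illustrative helper, not part of the test framework.

```python
# (requests, failures) per task, copied from the Locust final stats above.
stats = {
    "load": (988, 2),
    "query": (10286, 23),
    "scene_insert_delete_flush": (997, 248),
    "scene_search_test": (1015, 1),
    "search": (20731, 50),
}


def fail_pct(reqs: int, fails: int) -> float:
    """Failure percentage rounded to two decimals, as shown in the table."""
    return round(100.0 * fails / reqs, 2)


for name, (reqs, fails) in stats.items():
    print(f"{name}: {fail_pct(reqs, fails):.2f}%")

total_reqs = sum(reqs for reqs, _ in stats.values())
total_fails = sum(fails for _, fails in stats.values())
print(f"Aggregated: {total_reqs} requests, {total_fails} failures, "
      f"{fail_pct(total_reqs, total_fails):.2f}%")
```

Almost all failures come from `scene_insert_delete_flush`, which is consistent with the unanswered `Collection.flush` requests in the client log.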

@wangting0128
Contributor Author

Different case, same problem.

argo task:weekly-stab-1735430400
image:2.5-20241227-ef400227-amd64

server:

[2024-12-30 03:51:02,680 -  INFO - fouram]: [Base] Deploy initial state: 
I1229 00:06:48.584960    3205 request.go:665] Waited for 1.177487897s due to client-side throttling, not priority and fairness, request: GET:https://kubernetes.default.svc.cluster.local/apis/monitoring.coreos.com/v1?timeout=32s
I1229 00:06:58.784267    3205 request.go:665] Waited for 11.374268165s due to client-side throttling, not priority and fairness, request: GET:https://kubernetes.default.svc.cluster.local/apis/agent.k8s.elastic.co/v1alpha1?timeout=32s
I1229 00:07:08.784681    3205 request.go:665] Waited for 8.197107417s due to client-side throttling, not priority and fairness, request: GET:https://kubernetes.default.svc.cluster.local/apis/storage.k8s.io/v1?timeout=32s
NAME                                                              READY   STATUS                   RESTARTS          AGE     IP              NODE         NOMINATED NODE   READINESS GATES
weekly-stab-17330400-1-75-6382-etcd-0                             1/1     Running                  0                 4m37s   10.104.23.44    4am-node27   <none>           <none>
weekly-stab-17330400-1-75-6382-etcd-1                             1/1     Running                  0                 4m37s   10.104.34.64    4am-node37   <none>           <none>
weekly-stab-17330400-1-75-6382-etcd-2                             1/1     Running                  0                 4m37s   10.104.27.77    4am-node31   <none>           <none>
weekly-stab-17330400-1-75-6382-milvus-datanode-69974d5968-l2757   1/1     Running                  3 (3m46s ago)     4m37s   10.104.33.239   4am-node36   <none>           <none>
weekly-stab-17330400-1-75-6382-milvus-indexnode-d79cd4847-whz2d   1/1     Running                  3 (3m44s ago)     4m37s   10.104.32.196   4am-node39   <none>           <none>
weekly-stab-17330400-1-75-6382-milvus-mixcoord-6546549bb4-kfq44   1/1     Running                  3 (3m44s ago)     4m37s   10.104.32.197   4am-node39   <none>           <none>
weekly-stab-17330400-1-75-6382-milvus-proxy-7748965744-kh7dg      1/1     Running                  3 (3m45s ago)     4m37s   10.104.34.52    4am-node37   <none>           <none>
weekly-stab-17330400-1-75-6382-milvus-querynode-7956f54765vvssk   1/1     Running                  3 (3m42s ago)     4m37s   10.104.30.7     4am-node38   <none>           <none>
weekly-stab-17330400-1-75-6382-minio-0                            1/1     Running                  0                 4m37s   10.104.23.45    4am-node27   <none>           <none>
weekly-stab-17330400-1-75-6382-minio-1                            1/1     Running                  0                 4m37s   10.104.34.71    4am-node37   <none>           <none>
weekly-stab-17330400-1-75-6382-minio-2                            1/1     Running                  0                 4m36s   10.104.27.76    4am-node31   <none>           <none>
weekly-stab-17330400-1-75-6382-minio-3                            1/1     Running                  0                 4m36s   10.104.20.65    4am-node22   <none>           <none>
weekly-stab-17330400-1-75-6382-pulsarv3-bookie-0                  1/1     Running                  0                 4m37s   10.104.27.71    4am-node31   <none>           <none>
weekly-stab-17330400-1-75-6382-pulsarv3-bookie-1                  1/1     Running                  0                 4m37s   10.104.23.46    4am-node27   <none>           <none>
weekly-stab-17330400-1-75-6382-pulsarv3-bookie-2                  1/1     Running                  0                 4m36s   10.104.34.73    4am-node37   <none>           <none>
weekly-stab-17330400-1-75-6382-pulsarv3-bookie-init-7gjlw         0/1     Completed                0                 4m37s   10.104.9.92     4am-node14   <none>           <none>
weekly-stab-17330400-1-75-6382-pulsarv3-broker-0                  1/1     Running                  0                 4m37s   10.104.9.88     4am-node14   <none>           <none>
weekly-stab-17330400-1-75-6382-pulsarv3-broker-1                  1/1     Running                  0                 4m37s   10.104.13.190   4am-node16   <none>           <none>
weekly-stab-17330400-1-75-6382-pulsarv3-proxy-0                   1/1     Running                  0                 4m37s   10.104.9.90     4am-node14   <none>           <none>
weekly-stab-17330400-1-75-6382-pulsarv3-proxy-1                   1/1     Running                  0                 4m37s   10.104.14.58    4am-node18   <none>           <none>
weekly-stab-17330400-1-75-6382-pulsarv3-pulsar-init-cwzh5         0/1     Completed                0                 4m37s   10.104.9.98     4am-node14   <none>           <none>
weekly-stab-17330400-1-75-6382-pulsarv3-recovery-0                1/1     Running                  0                 4m37s   10.104.9.89     4am-node14   <none>           <none>
weekly-stab-17330400-1-75-6382-pulsarv3-zookeeper-0               1/1     Running                  0                 4m37s   10.104.23.42    4am-node27   <none>           <none>
weekly-stab-17330400-1-75-6382-pulsarv3-zookeeper-1               1/1     Running                  0                 4m37s   10.104.34.57    4am-node37   <none>           <none>
weekly-stab-17330400-1-75-6382-pulsarv3-zookeeper-2               1/1     Running                  0                 4m37s   10.104.27.75    4am-node31   <none>           <none> (base.py:261)
[2024-12-30 03:51:02,681 -  INFO - fouram]: [Cmd Exe]  kubectl get pods  -n qa-milvus  -o wide | grep -E 'NAME|weekly-stab-17330400-1-75-6382-milvus|weekly-stab-17330400-1-75-6382-minio|weekly-stab-17330400-1-75-6382-etcd|weekly-stab-17330400-1-75-6382-pulsar|weekly-stab-17330400-1-75-6382-zookeeper|weekly-stab-17330400-1-75-6382-kafka|weekly-stab-17330400-1-75-6382-log|weekly-stab-17330400-1-75-6382-tikv'  (util_cmd.py:14)
[2024-12-30 03:51:32,319 -  INFO - fouram]: [CliClient] pod details of release(weekly-stab-17330400-1-75-6382): 
 I1230 03:51:03.932027    3340 request.go:665] Waited for 1.168514318s due to client-side throttling, not priority and fairness, request: GET:https://kubernetes.default.svc.cluster.local/apis/jaegertracing.io/v1?timeout=32s
I1230 03:51:14.130065    3340 request.go:665] Waited for 11.365479355s due to client-side throttling, not priority and fairness, request: GET:https://kubernetes.default.svc.cluster.local/apis/operator.tekton.dev/v1alpha1?timeout=32s
I1230 03:51:24.130395    3340 request.go:665] Waited for 8.193502521s due to client-side throttling, not priority and fairness, request: GET:https://kubernetes.default.svc.cluster.local/apis/argoproj.io/v1alpha1?timeout=32s
NAME                                                              READY   STATUS                   RESTARTS         AGE     IP              NODE         NOMINATED NODE   READINESS GATES
weekly-stab-17330400-1-75-6382-etcd-0                             1/1     Running                  0                24h     10.104.23.81    4am-node27   <none>           <none>
weekly-stab-17330400-1-75-6382-etcd-1                             1/1     Running                  0                27h     10.104.34.64    4am-node37   <none>           <none>
weekly-stab-17330400-1-75-6382-etcd-2                             1/1     Running                  0                27h     10.104.27.77    4am-node31   <none>           <none>
weekly-stab-17330400-1-75-6382-milvus-datanode-69974d5968-l2757   1/1     Running                  3 (27h ago)      27h     10.104.33.239   4am-node36   <none>           <none>
weekly-stab-17330400-1-75-6382-milvus-indexnode-d79cd4847-whz2d   1/1     Running                  3 (27h ago)      27h     10.104.32.196   4am-node39   <none>           <none>
weekly-stab-17330400-1-75-6382-milvus-mixcoord-6546549bb4-kfq44   1/1     Running                  3 (27h ago)      27h     10.104.32.197   4am-node39   <none>           <none>
weekly-stab-17330400-1-75-6382-milvus-proxy-7748965744-kh7dg      1/1     Running                  3 (27h ago)      27h     10.104.34.52    4am-node37   <none>           <none>
weekly-stab-17330400-1-75-6382-milvus-querynode-7956f54765vvssk   1/1     Running                  3 (27h ago)      27h     10.104.30.7     4am-node38   <none>           <none>
weekly-stab-17330400-1-75-6382-minio-0                            1/1     Running                  0                24h     10.104.23.91    4am-node27   <none>           <none>
weekly-stab-17330400-1-75-6382-minio-1                            1/1     Running                  0                27h     10.104.34.71    4am-node37   <none>           <none>
weekly-stab-17330400-1-75-6382-minio-2                            1/1     Running                  0                27h     10.104.27.76    4am-node31   <none>           <none>
weekly-stab-17330400-1-75-6382-minio-3                            1/1     Running                  0                27h     10.104.20.65    4am-node22   <none>           <none>
weekly-stab-17330400-1-75-6382-pulsarv3-bookie-0                  1/1     Running                  0                27h     10.104.27.71    4am-node31   <none>           <none>
weekly-stab-17330400-1-75-6382-pulsarv3-bookie-1                  1/1     Running                  0                25h     10.104.23.93    4am-node27   <none>           <none>
weekly-stab-17330400-1-75-6382-pulsarv3-bookie-2                  1/1     Running                  0                27h     10.104.34.73    4am-node37   <none>           <none>
weekly-stab-17330400-1-75-6382-pulsarv3-bookie-init-7gjlw         0/1     Completed                0                27h     10.104.9.92     4am-node14   <none>           <none>
weekly-stab-17330400-1-75-6382-pulsarv3-broker-0                  1/1     Running                  0                27h     10.104.9.88     4am-node14   <none>           <none>
weekly-stab-17330400-1-75-6382-pulsarv3-broker-1                  1/1     Running                  0                27h     10.104.13.190   4am-node16   <none>           <none>
weekly-stab-17330400-1-75-6382-pulsarv3-proxy-0                   1/1     Running                  0                27h     10.104.9.90     4am-node14   <none>           <none>
weekly-stab-17330400-1-75-6382-pulsarv3-proxy-1                   1/1     Running                  0                27h     10.104.14.58    4am-node18   <none>           <none>
weekly-stab-17330400-1-75-6382-pulsarv3-pulsar-init-cwzh5         0/1     Completed                0                27h     10.104.9.98     4am-node14   <none>           <none>
weekly-stab-17330400-1-75-6382-pulsarv3-recovery-0                1/1     Running                  0                27h     10.104.9.89     4am-node14   <none>           <none>
weekly-stab-17330400-1-75-6382-pulsarv3-zookeeper-0               1/1     Running                  0                24h     10.104.23.84    4am-node27   <none>           <none>
weekly-stab-17330400-1-75-6382-pulsarv3-zookeeper-1               1/1     Running                  0                27h     10.104.34.57    4am-node37   <none>           <none>
weekly-stab-17330400-1-75-6382-pulsarv3-zookeeper-2               1/1     Running                  0                27h     10.104.27.75    4am-node31   <none>           <none>

client log:

[2024-12-29 03:00:28,914 - DEBUG - fouram]: (api_request)  : [Collection.insert] args: <Collection.insert fields: 3, length: 1, content: [ [ `type<class 'int'>, dtype<>` 921 ... ], [ `type<class 'dict'>, dtype<>` {271113: 0.7575791492644 ... ], [ `type<class 'numpy.float32'>, dtype<float32>` 921.0 ... ] ]>, [None], kwargs: {'timeout': None}, [requestId: 0f48e14c-c591-11ef-834e-1a1804a2bd9c] (api_request.py:77)

[2024-12-29 02:53:38,157 - DEBUG - fouram]: (api_request)  : [Collection.insert] args: <Collection.insert fields: 3, length: 1, content: [ [ `type<class 'int'>, dtype<>` 920 ... ], [ `type<class 'dict'>, dtype<>` {6672: 0.715189674531461 ... ], [ `type<class 'numpy.float32'>, dtype<float32>` 920.0 ... ] ]>, [None], kwargs: {'timeout': None}, [requestId: 1a746cf4-c590-11ef-834e-1a1804a2bd9c] (api_request.py:77)

[2024-12-29 02:47:44,228 - DEBUG - fouram]: (api_request)  : [Collection.flush] args: [], kwargs: {'timeout': None}, [requestId: 477f2a0a-c58f-11ef-834e-1a1804a2bd9c] (api_request.py:77)

[2024-12-29 02:47:53,188 - DEBUG - fouram]: (api_request)  : [Collection.flush] args: [], kwargs: {'timeout': None}, [requestId: 4cd648ee-c58f-11ef-834e-1a1804a2bd9c] (api_request.py:77)

[2024-12-29 02:47:53,210 - DEBUG - fouram]: (api_request)  : [Collection.flush] args: [], kwargs: {'timeout': None}, [requestId: 4cd9a2aa-c58f-11ef-834e-1a1804a2bd9c] (api_request.py:77)

[2024-12-29 02:47:53,552 - DEBUG - fouram]: (api_request)  : [Collection.flush] args: [], kwargs: {'timeout': None}, [requestId: 4d0de2a4-c58f-11ef-834e-1a1804a2bd9c] (api_request.py:77)

[2024-12-29 02:48:13,442 - DEBUG - fouram]: (api_request)  : [Collection.flush] args: [], kwargs: {'timeout': None}, [requestId: 58e8e8f8-c58f-11ef-834e-1a1804a2bd9c] (api_request.py:77)

[2024-12-29 02:49:19,134 - DEBUG - fouram]: (api_request)  : [Collection.flush] args: [], kwargs: {'timeout': None}, [requestId: 8010a010-c58f-11ef-834e-1a1804a2bd9c] (api_request.py:77)

[2024-12-29 02:49:19,738 - DEBUG - fouram]: (api_request)  : [Collection.flush] args: [], kwargs: {'timeout': None}, [requestId: 806cd84e-c58f-11ef-834e-1a1804a2bd9c] (api_request.py:77)

[2024-12-29 02:49:35,450 - DEBUG - fouram]: (api_request)  : [Collection.flush] args: [], kwargs: {'timeout': None}, [requestId: 89ca58a8-c58f-11ef-834e-1a1804a2bd9c] (api_request.py:77)

[2024-12-29 02:49:45,088 - DEBUG - fouram]: (api_request)  : [Collection.flush] args: [], kwargs: {'timeout': None}, [requestId: 8f88f79a-c58f-11ef-834e-1a1804a2bd9c] (api_request.py:77)

[2024-12-29 02:50:02,888 - DEBUG - fouram]: (api_request)  : [Collection.flush] args: [], kwargs: {}, [requestId: 9a2505cc-c58f-11ef-834e-1a1804a2bd9c] (api_request.py:77)

[2024-12-29 02:50:02,903 - DEBUG - fouram]: (api_request)  : [Collection.flush] args: [], kwargs: {}, [requestId: 9a2738d8-c58f-11ef-834e-1a1804a2bd9c] (api_request.py:77)

[2024-12-29 02:50:02,924 - DEBUG - fouram]: (api_request)  : [Collection.flush] args: [], kwargs: {}, [requestId: 9a2a712e-c58f-11ef-834e-1a1804a2bd9c] (api_request.py:77)

[2024-12-29 02:50:04,742 - DEBUG - fouram]: (api_request)  : [Collection.insert] args: <Collection.insert fields: 2, length: 3000, content: [ [ `type<class 'int'>, dtype<>` 0 ... ], [ `type<class 'scipy.sparse._csr.csr_matrix'>, dtype<float32>`   (0, 1012)	0.36311877
  ... ] ]>, [None], kwargs: {}, [requestId: 9b3fcf0a-c58f-11ef-834e-1a1804a2bd9c] (api_request.py:77)

[2024-12-29 02:50:06,569 - DEBUG - fouram]: (api_request)  : [Collection.insert] args: <Collection.insert fields: 2, length: 3000, content: [ [ `type<class 'int'>, dtype<>` 0 ... ], [ `type<class 'scipy.sparse._csr.csr_matrix'>, dtype<float32>`   (0, 1012)	0.36311877
  ... ] ]>, [None], kwargs: {}, [requestId: 9c568b90-c58f-11ef-834e-1a1804a2bd9c] (api_request.py:77)

[2024-12-29 02:50:07,752 - DEBUG - fouram]: (api_request)  : [Collection] args: ['scene_search_test_mOmbGKDA', {'auto_id': False, 'description': '', 'fields': [{'name': 'id', 'description': '', 'type': <DataType.INT64: 5>, 'is_primary': True, 'auto_id': False}, {'name': 'sparse_float_vector', 'description': '', 'type': <DataType.SPARSE_FLOAT_VECTOR: 104>}], 'enable_dynamic_field': False}, 'default'], kwargs: {'shards_num': 2}, [requestId: 9d0b1f9c-c58f-11ef-834e-1a1804a2bd9c] (api_request.py:77)

[2024-12-29 02:50:07,752 - DEBUG - fouram]: (api_request)  : [Collection.load] args: [None, 1, None], kwargs: {}, [requestId: 9d0b3b30-c58f-11ef-834e-1a1804a2bd9c] (api_request.py:77)

[2024-12-29 02:50:08,008 - DEBUG - fouram]: (api_request)  : [Collection.insert] args: <Collection.insert fields: 3, length: 1, content: [ [ `type<class 'int'>, dtype<>` 919 ... ], [ `type<class 'dict'>, dtype<>` {251846: 0.1587360175998 ... ], [ `type<class 'numpy.float32'>, dtype<float32>` 919.0 ... ] ]>, [None], kwargs: {'timeout': None}, [requestId: 9d323c58-c58f-11ef-834e-1a1804a2bd9c] (api_request.py:77)

[2024-12-29 03:06:08,348 - DEBUG - fouram]: (api_request)  : [Collection] args: ['scene_search_test_nqMnfyiB', {'auto_id': False, 'description': '', 'fields': [{'name': 'id', 'description': '', 'type': <DataType.INT64: 5>, 'is_primary': True, 'auto_id': False}, {'name': 'sparse_float_vector', 'description': '', 'type': <DataType.SPARSE_FLOAT_VECTOR: 104>}], 'enable_dynamic_field': False}, 'default'], kwargs: {'shards_num': 2}, [requestId: d99a804a-c591-11ef-834e-1a1804a2bd9c] (api_request.py:77)
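
The client log entries above are not pasted in chronological order. For anyone triaging them, here is a minimal sketch that extracts the timestamp, API name and requestId from each entry and sorts them by time. It assumes the fouram log line format shown above; `parse_requests` is a hypothetical helper written for this comment, not part of fouram or pymilvus.

```python
import re
from datetime import datetime

# Assumed format, copied from the log lines above:
# [YYYY-MM-DD HH:MM:SS,mmm - LEVEL - fouram]: (api_request)  : [Api.name] ... [requestId: <uuid>]
ENTRY_RE = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(?P<ms>\d{3}) - +\w+ - fouram\]: "
    r"\(api_request\)\s+: \[(?P<api>[\w.]+)\].*?\[requestId: (?P<rid>[0-9a-f-]+)\]",
    re.MULTILINE | re.DOTALL,  # DOTALL lets an entry's args span several lines
)

def parse_requests(log_text):
    """Return (timestamp, api, request_id) tuples sorted by timestamp."""
    rows = []
    for m in ENTRY_RE.finditer(log_text):
        ts = datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S")
        ts = ts.replace(microsecond=int(m["ms"]) * 1000)
        rows.append((ts, m["api"], m["rid"]))
    return sorted(rows)
```

Calling `parse_requests(open("client.log").read())` (the file name is only an example) gives a time-ordered list, which makes it easier to see which call was issued first and to match each requestId against the server-side logs.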

test result:

[2024-12-30 00:35:56,646 -  INFO - fouram]: Print locust final stats. (locust_runner.py:56)
[2024-12-30 00:35:56,647 -  INFO - fouram]: Type     Name                                                                          # reqs      # fails |    Avg     Min     Max    Med |   req/s  failures/s (stats.py:789)
[2024-12-30 00:35:56,647 -  INFO - fouram]: --------|----------------------------------------------------------------------------|-------|-------------|-------|-------|-------|-------|--------|----------- (stats.py:789)
[2024-12-30 00:35:56,647 -  INFO - fouram]: grpc     load                                                                             969     3(0.31%) |    479       4   30003      9 |    0.11        0.00 (stats.py:789)
[2024-12-30 00:35:56,647 -  INFO - fouram]: grpc     query                                                                           8965     5(0.06%) |    274       4   60008      8 |    0.99        0.00 (stats.py:789)
[2024-12-30 00:35:56,647 -  INFO - fouram]: grpc     scene_insert_delete_flush                                                        910  146(16.04%) |  89175    2470  286287  40000 |    0.10        0.02 (stats.py:789)
[2024-12-30 00:35:56,647 -  INFO - fouram]: grpc     scene_search_test                                                                875     0(0.00%) |  84472   14765  211600  80000 |    0.10        0.00 (stats.py:789)
[2024-12-30 00:35:56,647 -  INFO - fouram]: grpc     search                                                                         18293    22(0.12%) |    199      13   60016     20 |    2.03        0.00 (stats.py:789)
[2024-12-30 00:35:56,647 -  INFO - fouram]: --------|----------------------------------------------------------------------------|-------|-------------|-------|-------|-------|-------|--------|----------- (stats.py:789)
[2024-12-30 00:35:56,647 -  INFO - fouram]:          Aggregated                                                                     30012   176(0.59%) |   5385       4  286287     19 |    3.33        0.02 (stats.py:789)
[2024-12-30 00:35:56,647 -  INFO - fouram]:  (stats.py:790)
[2024-12-30 00:35:56,649 -  INFO - fouram]: [PerfTemplate] Report data: 
{'server': {'deploy_tool': 'helm',
            'deploy_mode': 'cluster',
            'config_name': 'cluster_8c16m',
            'config': {'queryNode': {'resources': {'limits': {'cpu': 8.0, 'memory': '32Gi'}, 'requests': {'cpu': 6.0, 'memory': '16Gi'}}, 'replicas': 1},
                       'indexNode': {'resources': {'limits': {'cpu': '8.0', 'memory': '16Gi'}, 'requests': {'cpu': '5.0', 'memory': '9Gi'}}},
                       'dataNode': {'resources': {'limits': {'cpu': '8.0', 'memory': '16Gi'}, 'requests': {'cpu': '5.0', 'memory': '9Gi'}}},
                       'cluster': {'enabled': True},
                       'pulsarv3': {},
                       'kafka': {},
                       'minio': {'metrics': {'podMonitor': {'enabled': True}}},
                       'etcd': {'metrics': {'enabled': True, 'podMonitor': {'enabled': True}}},
                       'metrics': {'serviceMonitor': {'enabled': True}},
                       'log': {'level': 'debug'},
                       'image': {'all': {'repository': 'harbor.milvus.io/milvus/milvus', 'tag': '2.5-20241227-ef400227-amd64'}}},
            'host': 'weekly-stab-17330400-1-75-6382-milvus.qa-milvus.svc.cluster.local',
            'port': '19530',
            'uri': ''},
 'client': {'test_case_type': 'ConcurrentClientBase',
            'test_case_name': 'test_concurrent_locust_custom_parameters',
            'test_case_params': {'dataset_params': {'metric_type': 'IP',
                                                    'vector_field_name': 'sparse_float_vector',
                                                    'dim': 400000,
                                                    'sparse_range': [200, 300],
                                                    'dataset_name': 'sparse_full',
                                                    'dataset_size': '4m',
                                                    'ni_per': 10000},
                                 'collection_params': {'other_fields': ['float_1'], 'shards_num': 2},
                                 'resource_groups_params': {'reset': False},
                                 'database_user_params': {'reset_rbac': False, 'reset_db': False},
                                 'index_params': {'index_type': 'SPARSE_INVERTED_INDEX', 'index_param': {'drop_ratio_build': 0.2}},
                                 'concurrent_params': {'concurrent_number': 20, 'during_time': '24h', 'interval': 20, 'spawn_rate': None},
                                 'concurrent_tasks': [{'type': 'search',
                                                       'weight': 20,
                                                       'params': {'nq': 10,
                                                                  'top_k': 10,
                                                                  'search_param': {'drop_ratio_search': 0.2},
                                                                  'expr': {'float_1': {'GT': -1, 'LT': 50000}},
                                                                  'ignore_growing': False,
                                                                  'timeout': 60,
                                                                  'random_data': True}},
                                                      {'type': 'query',
                                                       'weight': 10,
                                                       'params': {'ids': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
                                                                  'ignore_growing': False,
                                                                  'timeout': 60,
                                                                  'random_data': False,
                                                                  'random_count': 0,
                                                                  'random_range': [0, 1],
                                                                  'field_name': 'id',
                                                                  'field_type': 'int64'}},
                                                      {'type': 'load', 'weight': 1, 'params': {'replica_number': 1, 'timeout': 30}},
                                                      {'type': 'scene_insert_delete_flush',
                                                       'weight': 1,
                                                       'params': {'insert_length': 1,
                                                                  'delete_length': 1,
                                                                  'start_id': 0,
                                                                  'random_id': True,
                                                                  'random_vector': True,
                                                                  'varchar_filled': True}},
                                                      {'type': 'scene_search_test',
                                                       'weight': 1,
                                                       'params': {'dataset': 'sparse_full',
                                                                  'dim': 40000,
                                                                  'data_size': '10000',
                                                                  'nb': 3000,
                                                                  'index_type': 'SPARSE_INVERTED_INDEX',
                                                                  'index_param': {'drop_ratio_build': 0.3},
                                                                  'metric_type': 'IP',
                                                                  'nq': 10,
                                                                  'top_k': 10,
                                                                  'search_param': {'drop_ratio_search': 0.5},
                                                                  'search_counts': 20}}]},
            'run_id': 2024122905532039,
            'datetime': '2024-12-29 00:02:33.657959',
            'client_version': '2.5.0'},
 'result': {'test_result': {'index': {'RT': 4.5405},
                            'insert': {'total_time': 1584.767, 'VPS': 2524.0303, 'batch_time': 3.9619, 'batch': 10000},
                            'flush': {'RT': 3.0302},
                            'load': {'RT': 12.1418},
                            'Locust': {'Aggregated': {'Requests': 30012,
                                                      'Fails': 176,
                                                      'RPS': 3.33,
                                                      'fail_s': 0.01,
                                                      'RT_max': 286287.88,
                                                      'RT_avg': 5385.67,
                                                      'TP50': 19,
                                                      'TP99': 157000.0},
                                       'load': {'Requests': 969,
                                                'Fails': 3,
                                                'RPS': 0.11,
                                                'fail_s': 0.0,
                                                'RT_max': 30003.86,
                                                'RT_avg': 479.37,
                                                'TP50': 9,
                                                'TP99': 7800.0},
                                       'query': {'Requests': 8965,
                                                 'Fails': 5,
                                                 'RPS': 0.99,
                                                 'fail_s': 0.0,
                                                 'RT_max': 60008.97,
                                                 'RT_avg': 274.59,
                                                 'TP50': 8,
                                                 'TP99': 4900.0},
                                       'scene_insert_delete_flush': {'Requests': 910,
                                                                     'Fails': 146,
                                                                     'RPS': 0.1,
                                                                     'fail_s': 0.16,
                                                                     'RT_max': 286287.88,
                                                                     'RT_avg': 89175.12,
                                                                     'TP50': 40000.0,
                                                                     'TP99': 268000.0},
                                       'scene_search_test': {'Requests': 875,
                                                             'Fails': 0,
                                                             'RPS': 0.1,
                                                             'fail_s': 0.0,
                                                             'RT_max': 211600.24,
                                                             'RT_avg': 84472.35,
                                                             'TP50': 80000.0,
                                                             'TP99': 190000.0},
                                       'search': {'Requests': 18293,
                                                  'Fails': 22,
                                                  'RPS': 2.03,
                                                  'fail_s': 0.0,
                                                  'RT_max': 60016.02,
                                                  'RT_avg': 199.3,
                                                  'TP50': 20,
                                                  'TP99': 1900.0}}}}}
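
To compare this run with the earlier one, the per-task failure percentage can be recomputed from the `Requests` and `Fails` fields of the `Locust` section of the report. This is a minimal sketch, not part of the benchmark tooling: `failure_percent` is a hypothetical helper, and the sample values are copied from the report above.

```python
def failure_percent(locust_stats):
    """Map each task name to Fails / Requests as a percentage, rounded to 2 places."""
    result = {}
    for name, stats in locust_stats.items():
        requests = stats["Requests"]
        # Guard against tasks that never ran.
        result[name] = round(100.0 * stats["Fails"] / requests, 2) if requests else 0.0
    return result

# Values copied from the 'Locust' section of the report above.
locust_stats = {
    "Aggregated": {"Requests": 30012, "Fails": 176},
    "scene_insert_delete_flush": {"Requests": 910, "Fails": 146},
    "search": {"Requests": 18293, "Fails": 22},
}

for task, pct in failure_percent(locust_stats).items():
    print(f"{task}: {pct}%")
```

The results agree with the percentages in the locust summary table: the `scene_insert_delete_flush` task accounts for 146 of the 176 failures.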

@xiaofan-luan
Collaborator

@aoiasd
Is this a pprof for the querynode?
