[Performance Analysis] DPM/ACA gRPC Performance Report #384

haboy52581 · 2020-09-15T23:48:42Z

No description provided.

update table for test result

xieus

@haboy52581 Some initial comments. Thanks for setting up tests and collecting the data points.

xieus · 2020-09-24T22:54:55Z

docs/modules/ROOT/pages/performance_analysis/dpmAcaGrpcTest.adoc

@@ -0,0 +1,215 @@
+= ALCOR CONTROL AGENT-ALCOR DATAPLANE MANAGER Test Report 


Suggest changing the title to "Alcor gRPC Performance Test Report".

xieus · 2020-09-24T22:56:36Z

docs/modules/ROOT/pages/performance_analysis/dpmAcaGrpcTest.adoc

+|*cpu MHz* |2231.772 |2599.079
+|*Memory* |192GB |386GB
+|*Network* |NetXtreme BCM5719 Gigabit Ethernet PCIe (GB network) |82599ES 10-Gigabit SFI/SFP+ Network Connection
+|*Storage* |LSI raid (no ssd) |AVAGO (no ssd)


I think the DPM machine (.188) has 6 x 1600GB SSDs. Could you confirm?

xieus · 2020-09-24T22:58:18Z

docs/modules/ROOT/pages/performance_analysis/dpmAcaGrpcTest.adoc

+|*Model Name* |Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz |Intel(R) Xeon(R) CPU E5-2697 v3 @ 2.60GHz
+|*cpu MHz* |2231.772 |2599.079
+|*Memory* |192GB |386GB
+|*Network* |NetXtreme BCM5719 Gigabit Ethernet PCIe (GB network) |82599ES 10-Gigabit SFI/SFP+ Network Connection


Please check the network bandwidth. The results show that the DPM client is network-bound, so we need to revisit this configuration.
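
A rough back-of-the-envelope check of why the link speed matters, as a sketch only. It assumes the 2MB payload quoted in the report for 10,000 neighbors, an ideal line rate, and no protocol or serialization overhead; `transfer_time_ms` is a hypothetical helper name, not part of the test code.

```python
def transfer_time_ms(payload_bytes: float, link_bits_per_s: float) -> float:
    """Ideal time to push one payload through a link, ignoring all overhead."""
    return payload_bytes * 8 / link_bits_per_s * 1000


# Assumption: "2MB" is read as 2 * 10^6 bytes.
PAYLOAD_BYTES = 2_000_000

# Compare the two NIC classes listed in the hardware table.
for label, rate in (("1 Gbit/s", 1e9), ("10 Gbit/s", 10e9)):
    print(f"{label}: {transfer_time_ms(PAYLOAD_BYTES, rate):.1f} ms per payload")
```

Under these assumptions a single payload takes about 16 ms on the Gigabit NIC versus about 1.6 ms on the 10-Gigabit NIC, so many concurrent senders would saturate the slower link much sooner.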

xieus · 2020-09-24T23:02:40Z

docs/modules/ROOT/pages/performance_analysis/dpmAcaGrpcTest.adoc

+[arabic, start=2]
+. *Test step:*
+
+F send goal state message to A-E at the same time concurrently after first warming up then wait for the response, goal state message is different in each payload


Can you upload the test scripts or code that generates the payload to https://github.com/futurewei-cloud/alcor-int/tree/master/tools? This can be done in a separate PR.

xieus · 2020-09-24T23:03:18Z

docs/modules/ROOT/pages/performance_analysis/dpmAcaGrpcTest.adoc

+
+F send goal state message to A-E at the same time concurrently after first warming up then wait for the response, goal state message is different in each payload
+
+On A-E there are 2600 ACA running on each box, ACA code has been revised to cut off the ovsdb and mq operations


Is it 2,600 or 2,000? I thought 2,000 was the stable setup. The image needs to be updated accordingly.

xieus · 2020-09-24T23:28:21Z

docs/modules/ROOT/pages/performance_analysis/dpmAcaGrpcTest.adoc

+image::128-2.png["128 thread 2nd time",width=262,height=156]
+____
+
+for 256 threads and below, the success rate is 100%


Can we add one more data point for 256 threads? People will be interested in seeing the limit.

Also, can we add resource utilization diagrams covering CPU, RAM, disk I/O, and network I/O for this extreme case? That would help.

xieus · 2020-09-24T23:30:40Z

docs/modules/ROOT/pages/performance_analysis/dpmAcaGrpcTest.adoc

+____
+
+____
+* 10k neighbor, every connection time cost for different concurrent thread number*


Please explain the x-axis: what do those numbers represent? For example, the first number is the number of threads and the second is the number of successful runs out of a total of 10K runs.

xieus · 2020-09-24T23:31:47Z

docs/modules/ROOT/pages/performance_analysis/dpmAcaGrpcTest.adoc

+____
+
+____
+* 10k neighbor, every connection time cost for different concurrent thread number*


Also, as discussed, we need to verify the extremely large value (5,594,098) and rerun the test.
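
One possible way to spot such values automatically before a rerun, as a minimal sketch. The median-based threshold, the `factor` default, and the helper name `flag_outliers` are all assumptions for illustration; the sample timings are hypothetical except for 5,594,098, which is the value in question.

```python
from statistics import median


def flag_outliers(samples, factor=100):
    """Return samples larger than `factor` times the median of all samples."""
    threshold = median(samples) * factor
    return [s for s in samples if s > threshold]


# Hypothetical per-connection timings; only 5_594_098 comes from the report.
timings = [10, 12, 11, 5_594_098, 13]
print(flag_outliers(timings))
```

The median is used instead of the mean because a single extreme sample barely moves it, so the threshold stays meaningful even when the outlier is in the data.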

xieus · 2020-09-24T23:36:16Z

docs/modules/ROOT/pages/performance_analysis/dpmAcaGrpcTest.adoc

+
+
+____
+* when neighbor number changed, every connection time cost and overall time cost for different concurrent thread number*


This image is important. Let us collect more data along two dimensions (number of concurrent threads and number of neighbors), fixing one and varying the other.
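
The "fix one, adjust the other" plan could be enumerated roughly as follows. This is only a sketch: `sweep_plan` is a hypothetical helper, and the thread and neighbor values are illustrative rather than the agreed test matrix.

```python
def sweep_plan(thread_values, neighbor_values, fixed_threads, fixed_neighbors):
    """Build (threads, neighbors) runs: vary one dimension while the other is fixed."""
    # First sweep: vary the thread count at a fixed neighbor count.
    runs = [(t, fixed_neighbors) for t in thread_values]
    # Second sweep: vary the neighbor count at a fixed thread count.
    runs += [(fixed_threads, n) for n in neighbor_values]
    # Drop the run shared by both sweeps while keeping the original order.
    return list(dict.fromkeys(runs))


# Illustrative values only, not the agreed test matrix.
plan = sweep_plan(
    thread_values=[1, 16, 64, 128, 256],
    neighbor_values=[1, 100, 1000, 10000],
    fixed_threads=128,
    fixed_neighbors=10000,
)
for threads, neighbors in plan:
    print(f"threads={threads} neighbors={neighbors}")
```

Each sweep yields one curve per chart, and the shared point (here 128 threads with 10,000 neighbors) is run only once.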

xieus · 2020-09-24T23:40:19Z

docs/modules/ROOT/pages/performance_analysis/dpmAcaGrpcTest.adoc

+____
+
+____
+* when neighbor number changed, overall time cost for different concurrent thread number*


Same comment as for image other-ov-jc.png:

"Let us work to collect more data based on two dimensions (concurrent thread # and neighbor numbers), fix one and adjust the other."

We can take out the data point for "1t-1w" and explain it in the text.

xieus · 2020-10-03T13:13:36Z

docs/modules/ROOT/pages/performance_analysis/dpmAcaGrpcTest.adoc

@@ -65,6 +65,28 @@ image::p1.png["Test Deployment",width=488,height=302]
 |*90% TILE* |12 |11 |32 |28 |78 |84 |292 |262
 |===  

+different payload sizes vary from 1 neighbor to 10000 neighbor(2MB) each
+
+*1WR+other OV-MAX+average*


Could you elaborate on what this means?

xieus · 2020-10-03T13:15:27Z

docs/modules/ROOT/pages/performance_analysis/dpmAcaGrpcTest.adoc

@@ -65,6 +65,28 @@ image::p1.png["Test Deployment",width=488,height=302]
 |*90% TILE* |12 |11 |32 |28 |78 |84 |292 |262
 |===  


The rows and columns of this table are transposed relative to the next one. Could we make them consistent?

haboy52581 and others added 10 commits September 15, 2020 16:47

add adoc for dpm aca grpc analysis

76a66fc

update the images

effc445

adoc is not full compatible with word

5e5e7a1

update table for test result

051c7a7

update table for test result

add pic

9fd3615

update adoc with pic

8a43224

Update dpmAcaGrpcTest.adoc

85cdfc7

add images

8bc0ec3

add images

c0d9ea8

add execution detail chart

d759317

xieus changed the title ~~add adoc for dpm aca grpc analysis~~ [Performance Analysis] DPM/ACA gRPC Performance Report Sep 24, 2020

xieus requested review from xieus, chenpiaoping, Eric-Yuan and er1cthe0ne September 24, 2020 22:53

xieus assigned haboy52581 Sep 24, 2020

xieus added the perf testing Performance Testing label Sep 24, 2020

xieus added this to the Version 0.9.2020.09.30 milestone Sep 24, 2020

xieus suggested changes Sep 24, 2020


add matrix based on variety of dpm payload size

7fb47ed

xieus modified the milestones: Version 0.9.2020.09.30, Version 1.0.2020.11.30 Oct 3, 2020

xieus reviewed Oct 3, 2020
