OPTE maintains two sets of benchmarks: userland microbenchmarks, and kernel module benchmarks. Userland benchmarks can be run on most development machines, while the kernel module benchmarks will require a full Helios install and additional lab setup depending on what benchmarks you want to run.
Benchmark outputs are located in opte/target/criterion
, and any flamegraphs built during kmod benchmarks are placed into opte/target/xde-bench
.
We use criterion
to measure and profile individual packet processing times for slow-/fast-path traffic as well as generated hairpin packets.
These can be called using cargo ubench
, or cargo bench --package opte-bench --bench userland — <options>
.
This benchmark runner uses the standard criterion CLI.
To see a clean list of available benchmarks, use the cargo ubench --list 2> /dev/null | sort | uniq
command.
Benchmarks are split into several categories:
-
Metric:
wallclock
,alloc_ct
,alloc_sz
. -
Action:
parse
,process
. -
Packet family.
The kernel module benchmarks can be called using cargo kbench
, or cargo bench --package opte-bench --bench xde — <options>
.
They require that:
-
you are running on an up-to-date Helios instance.
-
the XDE kernel module and
opteadm
are installed, either via IPS or thecargo xtask install
command. -
you have installed the IPS packages
flamegraph
,demangle
,iperf
andsparse
.
They implement zont-to-zone iperf traffic in two scenarios:
-
cargo kbench local
on one machine. This uses an identical test setup toxde-tests/loopback
. Two sparse zones will be created on the current machine, with simnet links being used as an underlay network. This is lower fidelity than the below two-node setup. -
cargo kbench server
andcargo kbench remote <SERVER_IP>
on two separate machines. One zone will be created on each machine (running an iperf server and client respectively), using the shared lab/home network to exchange link local addresses.
Below you can find a lab setup which suffices for the second option.
Currently, linklocals must be created with the name syntax <nic>/ll
: this can be done using, e.g., pfexec ipadm create-addr igb0/ll -T addrconf
.
The benchmark defaults to using the NICs igb0
and igb1
, and can be overridden to match your setup using the --underlay-nics
option.
E.g., when testing over a Chelsio NIC --underlay-nics cxgbe0 cxgbe1
will select these devices and use the link-local addresses cxgbe0/ll
and cxgbe1/ll
.
Additionally, MTUs should be set to 9000
for physical underlay links.
fe80::a236:9fff:fe0c:2586 fe80::a236:9fff:fe0c:25b6
fe80::a236:9fff:fe0c:2587 fe80::a236:9fff:fe0c:25b7
┌─────────────────────────────────────┐
│ │
│ ┌─────────────────┐ │
│ │ │ │
igb0┌┴┐ ┌┴┐igb1 igb1┌┴┐ ┌┴┐igb0
╔═╩═╩═══════╩═╩═╗ ╔═╩═╩═══════╩═╩═╗
║ cargo kbench ║░ ║ cargo kbench ║░
║ remote ║░ ║ server ║░
║ 10.0.125.173 ║░ ║ ║░
╚══════╦═╦══════╝░ ╚══════╦═╦══════╝░
░░░░░░░│░░░░░░░░░ ░░░░░░░│░░░░░░░░░
10.0.147.187/8 10.0.125.173/8
│ ┌ ─ ─ ─ ─ ─ ┐ │
Lab/Home
└ ─ ─ ▶│ Network │◀ ─ ─ ─ ┘
─ ─ ─ ─ ─ ─
Connecting igb0<→igb0
, etc., is not a requirement, as NDP tables are inspected for inserting underlay network routes.
In both scenarios, the benchmark harness will run iperf in client-to-server and server-to-client modes, and will record periodic stack information and timings using dtrace
.
These are converted into flamegraphs and timing data for further analysis by criterion.
The kernel module benchmark harness can be moved onto a gimlet or other development system for measurement. The path to the binary can be found using the command:
cargo bench --package opte-bench \
--no-run --message-format json-render-diagnostics \
| jq -r -s "map( \
select(.reason==\"compiler-artifact\") \
| select( \
.target.kind\
| map_values(.==\"bench\") \
| any \
) \
| select(.target.name==\"xde\") \
) | map(.executable)"
Once the binary is moved onto the global zone of a target machine, measurements can be taken using xde in-situ
.
On a gimlet we add the -d
flag as we do not have access to flamegraph
.
This places captured stacks into the xde-bench
folder.
$ ./xde in-situ expt-name -d
# ...
exit
$ ls -R xde-bench
xde-bench:
expt-name
xde-bench/expt-name:
histos.out raw.stacks
Measured data in xde-bench
can be moved and processed into flamegraphs and histograms on any development machine using the command ./xde in-situ expt-name -c none
.