
Baremetal e2e scripts #248

Merged: 26 commits, merged Oct 28, 2021
Showing changes from 16 commits
b760183
changes for baremetal
jdowni000 Aug 10, 2021
faf4c7e
Nodedensity (#5)
mkarg75 Aug 11, 2021
fa36045
exiting if benchmark state becomes Failed
jdowni000 Aug 11, 2021
e093af1
Merge branch 'master' of https://github.com/jdowni000/e2e-benchmarking
jdowni000 Aug 11, 2021
443d56a
removing duplicated check for baremetal from PR
jdowni000 Aug 11, 2021
aa2f5a9
adding log() for kube-burner
jdowni000 Aug 11, 2021
b208b54
to pick 1 client pod (#6)
mukrishn Aug 11, 2021
7af2068
Improve update (#7)
jdowni000 Aug 31, 2021
398719f
fixed conflicts
jdowni000 Sep 1, 2021
320994b
Merge branch 'master' into master
jdowni000 Sep 1, 2021
eb5d646
Merge branch 'master' into master
jdowni000 Sep 1, 2021
cfaef9c
moved script to upgrade dir
mukrishn Sep 8, 2021
77697e1
rebased and fixed conflict
mukrishn Sep 8, 2021
9993e9b
modified mc selector
mukrishn Sep 9, 2021
09c3be9
excluding few more workers
mukrishn Sep 15, 2021
dc32aa5
static mb configuration
mukrishn Sep 15, 2021
8b7bc18
corrected log format
mukrishn Sep 16, 2021
4037270
added logs and missing variable
mukrishn Sep 16, 2021
fdb68a6
removed operator delete command
mukrishn Sep 16, 2021
268c5cb
Updated pod selector
mukrishn Sep 24, 2021
282339f
removed cleanup functions
mukrishn Sep 24, 2021
a103f2c
Merge branch 'master' into final_patch
mukrishn Oct 12, 2021
f6d0276
Merge branch 'master' into final_patch
mukrishn Oct 21, 2021
3aed0e9
python3.8 for ripsaw-cli
mukrishn Oct 26, 2021
433d76d
Small logging fixes
rsevilla87 Oct 28, 2021
04d9156
Improve benchmark finished message
rsevilla87 Oct 28, 2021
45 changes: 31 additions & 14 deletions workloads/kube-burner/common.sh
@@ -2,6 +2,11 @@

source env.sh


log() {
echo ${bold}$(date -u): ${@}${normal}
}

# If INDEXING is disabled we disable metadata collection
if [[ ${INDEXING} == "false" ]]; then
export METADATA_COLLECTION=false
@@ -12,21 +17,34 @@ fi
export TOLERATIONS="[{key: role, value: workload, effect: NoSchedule}]"
export UUID=$(uuidgen)

log() {
echo -e "\033[1m$(date "+%d-%m-%YT%H:%M:%S") ${@}\033[0m"
}
# Check if we're on bareMetal
export baremetalCheck=$(oc get infrastructure cluster -o json | jq .spec.platformSpec.type)

#Check to see if the infrastructure type is baremetal to adjust script as necessary
if [[ "${baremetalCheck}" == '"BareMetal"' ]]; then
log "BareMetal infrastructure: setting isBareMetal accordingly"
export isBareMetal=true
else
export isBareMetal=false
fi


deploy_operator() {
log "Removing benchmark-operator namespace, if it already exists"
oc delete namespace benchmark-operator --ignore-not-found
log "Cloning benchmark-operator from branch ${OPERATOR_BRANCH} of ${OPERATOR_REPO}"
rm -rf benchmark-operator
git clone --single-branch --branch ${OPERATOR_BRANCH} ${OPERATOR_REPO} --depth 1
(cd benchmark-operator && make deploy)
kubectl apply -f benchmark-operator/resources/backpack_role.yaml
kubectl apply -f benchmark-operator/resources/kube-burner-role.yml
log "Waiting for benchmark-operator to be running"
oc wait --for=condition=available "deployment/benchmark-controller-manager" -n benchmark-operator --timeout=300s
if [[ "${isBareMetal}" == "false" ]]; then

Review thread on this line:
- reviewer: tbh i think we should just leave the operator around each time and change this to apply to all cluster types. I don't see a reason why we need to delete/recreate the operator
- mukrishn (Collaborator, author, Sep 16, 2021): +1
- rsevilla87 (Member, Sep 16, 2021): Once the operator is deployed, we shouldn't delete it. I'd modify the current code to always try to deploy the operator. With the current make deploy implementation, if the operator is already running it won't be redeployed.
- mukrishn (Collaborator, author): Removed delete command

log "Removing benchmark-operator namespace, if it already exists"
oc delete namespace benchmark-operator --ignore-not-found
log "Cloning benchmark-operator from branch ${OPERATOR_BRANCH} of ${OPERATOR_REPO}"
else
log "Baremetal infrastructure: Keeping benchmark-operator namespace"
log "Cloning benchmark-operator from branch ${OPERATOR_BRANCH} ${OPERATOR_REPO}"
fi
rm -rf benchmark-operator
git clone --single-branch --branch ${OPERATOR_BRANCH} ${OPERATOR_REPO} --depth 1
(cd benchmark-operator && make deploy)
kubectl apply -f benchmark-operator/resources/backpack_role.yaml
kubectl apply -f benchmark-operator/resources/kube-burner-role.yml
log "Waiting for benchmark-operator to be running"
oc wait --for=condition=available "deployment/benchmark-controller-manager" -n benchmark-operator --timeout=300s
}
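The review thread above settled on relying on `make deploy` being idempotent, so the delete command was simply removed. A guard of the kind the reviewers describe (deploy only when absent) can be sketched generically; `deploy_if_absent` is a hypothetical helper, not part of this PR, and in this PR's context the check command would be `oc get deployment/benchmark-controller-manager -n benchmark-operator`:

```shell
# Hypothetical helper: run the deploy command only when the check command fails.
deploy_if_absent() {
  local check_cmd=$1 deploy_cmd=$2
  if eval "$check_cmd" >/dev/null 2>&1; then
    echo "already deployed, skipping"
  else
    eval "$deploy_cmd"
  fi
}

# Illustrative call (the real check/deploy commands require a cluster):
#   deploy_if_absent "oc get deployment/benchmark-controller-manager -n benchmark-operator" \
#                    "make -C benchmark-operator deploy"
```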

deploy_workload() {
@@ -135,4 +153,3 @@ check_running_benchmarks() {
cleanup() {
oc delete ns -l kube-burner-uuid=${UUID}
}

106 changes: 80 additions & 26 deletions workloads/network-perf/common.sh
@@ -30,6 +30,7 @@ export_defaults() {
operator_repo=${OPERATOR_REPO:=https://github.com/cloud-bulldozer/benchmark-operator.git}
operator_branch=${OPERATOR_BRANCH:=master}
CRD=${CRD:-ripsaw-uperf-crd.yaml}
export cr_name=${BENCHMARK:=benchmark}
export _es=${ES_SERVER:-https://search-perfscale-dev-chmf5l4sh66lvxbnadi4bznl3a.us-west-2.es.amazonaws.com:443}
_es_baseline=${ES_SERVER_BASELINE:-https://search-perfscale-dev-chmf5l4sh66lvxbnadi4bznl3a.us-west-2.es.amazonaws.com:443}
export _metadata_collection=${METADATA_COLLECTION:=true}
@@ -43,11 +44,41 @@ export_defaults() {
export pin=true
export networkpolicy=${NETWORK_POLICY:=false}
export multi_az=${MULTI_AZ:=true}
export baremetalCheck=$(oc get infrastructure cluster -o json | jq .spec.platformSpec.type)
zones=($(oc get nodes -l node-role.kubernetes.io/workload!=,node-role.kubernetes.io/infra!=,node-role.kubernetes.io/worker -o go-template='{{ range .items }}{{ index .metadata.labels "topology.kubernetes.io/zone" }}{{ "\n" }}{{ end }}' | uniq))
platform=$(oc get infrastructure cluster -o jsonpath='{.status.platformStatus.type}' | tr '[:upper:]' '[:lower:]')
log "Platform is found to be : ${platform} "
# If multi_az we use one node from the two first AZs
if [[ ${platform} == "vsphere" ]]; then

#Check to see if the infrastructure type is baremetal to adjust script as necessary
if [[ "${baremetalCheck}" == '"BareMetal"' ]]; then
log "BareMetal infrastructure: setting isBareMetal accordingly"
export isBareMetal=true
else
export isBareMetal=false
fi

#If using baremetal we use different query to find worker nodes
if [[ "${isBareMetal}" == "true" ]]; then
log "Colocating uperf pods for baremetal"
Review thread on this line:
- reviewer (Collaborator): We are not really colocating are we?
- jdowni000 (Collaborator, author): right, we are not. we randomly pick 2 worker nodes. I will delete this log.
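The two-node selection logic in this hunk can be exercised in isolation. A minimal sketch, with the node count hardcoded instead of taken from the `oc get nodes` query (which requires a cluster):

```shell
#!/usr/bin/env bash
# Mirror of the PR's logic: pick two distinct 1-based indices so the
# uperf server and client land on different worker nodes.
nodeCount=5  # stand-in for: oc get nodes --no-headers -l node-role.kubernetes.io/worker | wc -l
serverNumber=$(( RANDOM % nodeCount + 1 ))
clientNumber=$(( RANDOM % nodeCount + 1 ))
# Re-roll the client index until it differs from the server index.
while (( serverNumber == clientNumber )); do
  clientNumber=$(( RANDOM % nodeCount + 1 ))
done
echo "server index: ${serverNumber}, client index: ${clientNumber}"
```

In the script itself, each index is then resolved to a node name via `awk 'NR==N{print $1}'` over the worker-node list.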

nodeCount=$(oc get nodes --no-headers -l node-role.kubernetes.io/worker | wc -l)
if [[ ${nodeCount} -ge 2 ]]; then
serverNumber=$(( $RANDOM %${nodeCount} + 1 ))
clientNumber=$(( $RANDOM %${nodeCount} + 1 ))
while (( $serverNumber == $clientNumber ))
do
clientNumber=$(( $RANDOM %${nodeCount} + 1 ))
done
export server=$(oc get nodes --no-headers -l node-role.kubernetes.io/worker | awk 'NR=='${serverNumber}'{print $1}')
export client=$(oc get nodes --no-headers -l node-role.kubernetes.io/worker | awk 'NR=='${clientNumber}'{print $1}')
else
log "At least 2 worker nodes are required"
exit 1
fi
log "Finished assigning server and client nodes"
log "Server to be scheduled on node: $server"
log "Client to be scheduled on node: $client"
# If multi_az we use one node from the two first AZs
elif [[ ${platform} == "vsphere" ]]; then
nodes=($(oc get nodes -l node-role.kubernetes.io/worker,node-role.kubernetes.io/workload!="",node-role.kubernetes.io/infra!="" -o jsonpath='{range .items[*]}{ .metadata.labels.kubernetes\.io/hostname}{"\n"}{end}'))
if [[ ${#nodes[@]} -lt 2 ]]; then
log "At least 2 worker nodes placed are required"
@@ -83,9 +114,13 @@ export_defaults() {
export serviceip=false
elif [ ${WORKLOAD} == "service" ]
then
export _metadata_targeted=false
export hostnetwork=false
export serviceip=true
if [[ "${isBareMetal}" == "true" ]]; then
export _metadata_targeted=true
else
export _metadata_targeted=false
fi
else
export hostnetwork=false
export serviceip=false
@@ -132,17 +167,21 @@ export_defaults() {
}

deploy_operator() {
log "Removing benchmark-operator namespace, if it already exists"
oc delete namespace benchmark-operator --ignore-not-found
log "Cloning benchmark-operator from branch ${operator_branch} of ${operator_repo}"
rm -rf benchmark-operator
git clone --single-branch --branch ${operator_branch} ${operator_repo} --depth 1
(cd benchmark-operator && make deploy)
kubectl apply -f benchmark-operator/resources/backpack_role.yaml
oc wait --for=condition=available "deployment/benchmark-controller-manager" -n benchmark-operator --timeout=300s
oc adm policy -n benchmark-operator add-scc-to-user privileged -z benchmark-operator
oc adm policy -n benchmark-operator add-scc-to-user privileged -z backpack-view
oc patch scc restricted --type=merge -p '{"allowHostNetwork": true}'
if [[ "${isBareMetal}" == "false" ]]; then
log "Removing benchmark-operator namespace, if it already exists"
oc delete namespace benchmark-operator --ignore-not-found
log "Cloning benchmark-operator from branch ${operator_branch} of ${operator_repo}"
else
log "Baremetal infrastructure: Keeping benchmark-operator namespace"
log "Cloning benchmark-operator from branch ${operator_branch} of ${operator_repo}"
fi
rm -rf benchmark-operator
git clone --single-branch --branch ${operator_branch} ${operator_repo} --depth 1
(cd benchmark-operator && make deploy)
oc wait --for=condition=available "deployment/benchmark-controller-manager" -n benchmark-operator --timeout=300s
oc adm policy -n benchmark-operator add-scc-to-user privileged -z benchmark-operator
oc adm policy -n benchmark-operator add-scc-to-user privileged -z backpack-view
oc patch scc restricted --type=merge -p '{"allowHostNetwork": true}'
}

deploy_workload() {
@@ -153,7 +192,8 @@
}

check_logs_for_errors() {
client_pod=$(oc get pods -n benchmark-operator --no-headers | awk '{print $1}' | grep uperf-client | awk 'NR==1{print $1}')
uuid=$(oc describe -n benchmark-operator benchmarks/uperf-${cr_name}-${WORKLOAD}-network-${pairs} | grep Suuid | awk '{print $2}')
client_pod=$(oc get pods -n benchmark-operator --no-headers | awk '{print $1}' | grep $uuid | grep uperf-client | awk 'NR==1{print $1}')
if [ ! -z "$client_pod" ]; then
num_critical=$(oc logs ${client_pod} -n benchmark-operator | grep CRITICAL | wc -l)
if [ $num_critical -gt 3 ] ; then
@@ -174,7 +214,11 @@ wait_for_benchmark() {
log "Cerberus status is False, Cluster is unhealthy"
exit 1
fi
oc describe -n benchmark-operator benchmarks/uperf-benchmark-${WORKLOAD}-network-${pairs} | grep State | grep Complete
if [ "${benchmark_state}" == "Failed" ]; then
log "Benchmark state is Failed, exiting"
exit 1
fi
oc describe -n benchmark-operator benchmarks/uperf-${cr_name}-${WORKLOAD}-network-${pairs} | grep State | grep Complete
if [ $? -eq 0 ]; then
log "uperf workload done!"
uperf_state=$?
@@ -224,31 +268,39 @@ assign_uuid() {
}

run_benchmark_comparison() {
log "Beginning benchmark comparison"
../../utils/touchstone-compare/run_compare.sh uperf ${baseline_uperf_uuid} ${compare_uperf_uuid} ${pairs}
pairs_array=( "${pairs_array[@]}" "compare_output_${pairs}.yaml" )
log "Finished benchmark comparison"
}

generate_csv() {
log "Generating CSV"
python3 csv_gen.py --files $(echo "${pairs_array[@]}") --latency_tolerance=$latency_tolerance --throughput_tolerance=$throughput_tolerance
log "Finished generating CSV"
}

init_cleanup() {
log "Cloning benchmark-operator from branch ${operator_branch} of ${operator_repo}"
rm -rf /tmp/benchmark-operator
git clone --single-branch --branch ${operator_branch} ${operator_repo} /tmp/benchmark-operator --depth 1
oc delete -f /tmp/benchmark-operator/deploy
oc delete -f /tmp/benchmark-operator/resources/crds/ripsaw_v1alpha1_ripsaw_crd.yaml
oc delete -f /tmp/benchmark-operator/resources/operator.yaml
if [[ "${isBareMetal}" == "false" ]]; then
log "Cloning benchmark-operator from branch ${operator_branch} of ${operator_repo}"
rm -rf /tmp/benchmark-operator
git clone --single-branch --branch ${operator_branch} ${operator_repo} /tmp/benchmark-operator --depth 1
oc delete -f /tmp/benchmark-operator/deploy
oc delete -f /tmp/benchmark-operator/resources/crds/ripsaw_v1alpha1_ripsaw_crd.yaml
oc delete -f /tmp/benchmark-operator/resources/operator.yaml
else
log "BareMetal Infrastructure: Skipping cleanup"
fi
}

delete_benchmark() {
oc delete benchmarks.ripsaw.cloudbulldozer.io/uperf-benchmark-${WORKLOAD}-network-${pairs} -n benchmark-operator
oc delete benchmarks.ripsaw.cloudbulldozer.io/uperf-${cr_name}-${WORKLOAD}-network-${pairs} -n benchmark-operator
}

update() {
benchmark_state=$(oc get benchmarks.ripsaw.cloudbulldozer.io/uperf-benchmark-${WORKLOAD}-network-${pairs} -n benchmark-operator -o jsonpath='{.status.state}')
benchmark_uuid=$(oc get benchmarks.ripsaw.cloudbulldozer.io/uperf-benchmark-${WORKLOAD}-network-${pairs} -n benchmark-operator -o jsonpath='{.status.uuid}')
benchmark_current_pair=$(oc get benchmarks.ripsaw.cloudbulldozer.io/uperf-benchmark-${WORKLOAD}-network-${pairs} -n benchmark-operator -o jsonpath='{.spec.workload.args.pair}')
benchmark_state=$(oc get benchmarks.ripsaw.cloudbulldozer.io/uperf-${cr_name}-${WORKLOAD}-network-${pairs} -n benchmark-operator -o jsonpath='{.status.state}')
benchmark_uuid=$(oc get benchmarks.ripsaw.cloudbulldozer.io/uperf-${cr_name}-${WORKLOAD}-network-${pairs} -n benchmark-operator -o jsonpath='{.status.uuid}')
benchmark_current_pair=$(oc get benchmarks.ripsaw.cloudbulldozer.io/uperf-${cr_name}-${WORKLOAD}-network-${pairs} -n benchmark-operator -o jsonpath='{.spec.workload.args.pair}')
}

get_gold_ocp_version(){
@@ -257,6 +309,7 @@
}

print_uuid() {
log "Logging uuid.txt"
cat uuid.txt
}

@@ -270,3 +323,4 @@
init_cleanup
check_cluster_health
deploy_operator

3 changes: 2 additions & 1 deletion workloads/network-perf/requirements.txt
@@ -1,4 +1,5 @@
gspread
gspread-formatting
oauth2client
pyyaml
PyYAML>=5.4.1
make
2 changes: 1 addition & 1 deletion workloads/network-perf/ripsaw-uperf-crd.yaml
@@ -3,7 +3,7 @@
apiVersion: ripsaw.cloudbulldozer.io/v1alpha1
kind: Benchmark
metadata:
name: uperf-benchmark-${WORKLOAD}-network-${pairs}
name: uperf-${cr_name}-${WORKLOAD}-network-${pairs}
namespace: benchmark-operator
spec:
elasticsearch:
@@ -27,4 +27,5 @@ if [[ ${ENABLE_SNAPPY_BACKUP} == "true" ]] ; then
../../utils/snappy-move-results/run_snappy.sh metadata.json $snappy_path
store_on_elastic
rm -rf files_list
fi
fi
echo -e "${bold}Finished workload run_hostnetwork_network_test_gromgit.sh"
1 change: 1 addition & 0 deletions workloads/network-perf/run_multus_network_tests_fromgit.sh
Original file line number Diff line number Diff line change
Expand Up @@ -114,4 +114,5 @@ fi
# Cleanup
rm -rf /tmp/benchmark-operator
rm -f compare_output_*.yaml
echo -e "${bold}Finished workload run_multus_network_tests_fromgit.sh"
exit 0
@@ -15,3 +15,4 @@ delete_benchmark
done
print_uuid
generate_csv
echo -e "${bold}Finished workload run_pod_network_policy_test_fromgit.sh"
3 changes: 2 additions & 1 deletion workloads/network-perf/run_pod_network_test_fromgit.sh
@@ -31,4 +31,5 @@ if [[ ${ENABLE_SNAPPY_BACKUP} == "true" ]] ; then
../../utils/snappy-move-results/run_snappy.sh metadata.json $snappy_path
store_on_elastic
rm -rf files_list
fi
fi
echo -e "${bold}Finished workload run_pod_network_test_fromgit.sh"
@@ -15,3 +15,4 @@ delete_benchmark
done
print_uuid
generate_csv
echo -e "${bold}Finished workload run_serviceip_network_policy_test_fromgit.sh"
3 changes: 2 additions & 1 deletion workloads/network-perf/run_serviceip_network_test_fromgit.sh
@@ -31,4 +31,5 @@ if [[ ${ENABLE_SNAPPY_BACKUP} == "true" ]] ; then
../../utils/snappy-move-results/run_snappy.sh metadata.json $snappy_path
store_on_elastic
rm -rf files_list
fi
fi
echo -e "${bold}Finished workload run_serviceip_network_test_fromgit.sh"
3 changes: 2 additions & 1 deletion workloads/network-perf/smoke_test.sh
@@ -31,4 +31,5 @@ if [[ ${ENABLE_SNAPPY_BACKUP} == "true" ]] ; then
../../utils/snappy-move-results/run_snappy.sh metadata.json $snappy_path
store_on_elastic
rm -rf files_list
fi
fi
echo -e "${bold}Finished workload smoke_test.sh"