Add Tls secret to the reader sidecar container (#906)
[comment]: # (Note that your PR title should follow the conventional commit format: https://conventionalcommits.org/en/v1.0.0/#summary)

# PR Description

Test Cluster: https://ms.portal.azure.com/#@microsoft.onmicrosoft.com/resource/subscriptions/1a3fd8b1-7a92-4730-8e47-dec9e67f49a9/resourceGroups/testrecalertswcussoham/providers/Microsoft.ContainerService/managedClusters/TestRecAlertsWcusSoham/overview

OpenSSL command used to generate the certs with a specific IP SAN:

`openssl req -new -newkey rsa:2048 -days 365 -nodes -x509 -keyout client-key.pem -out client-cert.pem -subj "/C=US/ST=WA/L=Seattle/O=Microsoft/CN=PrometheusClient" -addext "subjectAltName = IP:10.224.0.4"`

This fix adds the TLS secret to the reader sidecar container.

Testing with configmap:

1. Create the secret: `kubectl create secret generic ama-metrics-mtls-secret --from-file=client-cert.pem=client-cert.pem --from-file=client-key.pem=client-key.pem -n kube-system`
2. Configmap used: https://github.com/Azure/prometheus-collector/blob/main/internal/referenceapp/linux-https-scrape-config.yaml

<img width="1889" alt="configmaptls" src="https://github.com/Azure/prometheus-collector/assets/31517098/ac20ffa9-6225-4e16-afd3-005c3b7a2338">
<img width="1892" alt="targetstls" src="https://github.com/Azure/prometheus-collector/assets/31517098/3d39971b-ffef-4995-936d-b597118ca7e8">

I deleted the secret and recreated it with an invalid/corrupted cert. The pods restarted and metric flow stopped due to invalid auth. I then deleted the secret and recreated it with the correct cert using the command: `kubectl create secret generic ama-metrics-mtls-secret --from-file=client-cert.pem=client-cert.pem --from-file=client-key.pem=client-key.pem`. Metric flow resumed after the pods restarted.

<img width="1859" alt="invalidtovalid" src="https://github.com/Azure/prometheus-collector/assets/31517098/dc03304b-c2e9-4e16-af0c-06d544643532">

Testing with CRD:

1. I deleted the secret and created a secret for the CRD with the command: `kubectl create secret generic ama-metrics-mtls-secret --from-file=secret_kube-system_ama-metrics-mtls-secret_client-cert.pem=secret_kube-system_ama-metrics-mtls-secret_client-cert.pem --from-file=secret_kube-system_ama-metrics-mtls-secret_client-key.pem=secret_kube-system_ama-metrics-mtls-secret_client-key.pem -n kube-system`
2. I then deleted the configmap and created a podmonitor. File used: https://github.com/Azure/prometheus-collector/blob/main/otelcollector/deploy/example-custom-resources/pod-monitor/pod-monitor-reference-app-mtls.yaml

Metrics flow.

<img width="1637" alt="crdsettings" src="https://github.com/Azure/prometheus-collector/assets/31517098/e142ad7c-6e3b-4819-aa23-466a581b7d22">
<img width="1694" alt="crdrestart" src="https://github.com/Azure/prometheus-collector/assets/31517098/8cee2c0d-86c3-4cbc-be55-6a25a05d3340">
<img width="1873" alt="crdtls" src="https://github.com/Azure/prometheus-collector/assets/31517098/fa93d183-9452-48aa-bd41-59357db9e1ed">

[comment]: # (The below checklist is for PRs adding new features. If a box is not checked, add a reason why it's not needed.)

# New Feature Checklist

- [ ] List telemetry added about the feature.
- [ ] Link to the one-pager about the feature.
- [ ] List any tasks necessary for release (3P docs, AKS RP chart changes, etc.) after merging the PR.
- [ ] Attach results of scale and perf testing.

[comment]: # (The below checklist is for code changes. Not all boxes necessarily need to be checked. Build, doc, and template changes do not need to fill out the checklist.)

# Tests Checklist

- [ ] Have end-to-end Ginkgo tests been run on your cluster and passed? To bootstrap your cluster to run the tests, follow [these instructions](/otelcollector/test/README.md#bootstrap-a-dev-cluster-to-run-ginkgo-tests).
  - Labels used when running the tests on your cluster:
    - [ ] `operator`
    - [ ] `windows`
    - [ ] `arm64`
    - [ ] `arc-extension`
    - [ ] `fips`
- [ ] Have new tests been added? For features, have tests been added for this feature? For fixes, is there a test that could have caught this issue and could validate that the fix works?
  - [ ] Is a new scrape job needed?
    - [ ] The scrape job was added to the folder [test-cluster-yamls](/otelcollector/test/test-cluster-yamls/) in the correct configmap or as a CR.
  - [ ] Was a new test label added?
    - [ ] A string constant for the label was added to [constants.go](/otelcollector/test/utils/constants.go).
    - [ ] The label and description were added to the [test README](/otelcollector/test/README.md).
    - [ ] The label was added to this [PR checklist](/.github/pull_request_template).
    - [ ] The label was added as needed to [testkube-test-crs.yaml](/otelcollector/test/testkube/testkube-test-crs.yaml).
  - [ ] Are additional API server permissions needed for the new tests?
    - [ ] These permissions have been added to [api-server-permissions.yaml](/otelcollector/test/testkube/api-server-permissions.yaml).
  - [ ] Was a new test suite (a new folder under `/tests`) added?
    - [ ] The new test suite is included in [testkube-test-crs.yaml](/otelcollector/test/testkube/testkube-test-crs.yaml).
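
The CRD test in the PR description uses secret keys of the form `secret_<namespace>_<secret-name>_<file>`. The sketch below shows one way to build those key names and print the matching `kubectl create secret generic` command. The pattern is an assumption inferred only from the key names shown in the description. The helper names `crd_secret_key` and `print_crd_secret_cmd` are hypothetical. Unlike the command in the description, the sketch maps each key to the original file on disk (`--from-file=<key>=<path>`) rather than to a renamed copy.

```shell
#!/bin/sh
# Sketch only: the key pattern secret_<namespace>_<secret-name>_<file> is
# inferred from the key names used in the CRD test, not from documentation.

# crd_secret_key NAMESPACE SECRET_NAME FILE
# Prints the secret key name for one cert/key file.
crd_secret_key() {
  printf 'secret_%s_%s_%s\n' "$1" "$2" "$3"
}

# print_crd_secret_cmd NAMESPACE SECRET_NAME FILE...
# Prints (does not run) a kubectl command that stores each FILE under its
# CRD-style key. Review the output before running it against a cluster.
print_crd_secret_cmd() {
  ns="$1"
  name="$2"
  shift 2
  cmd="kubectl create secret generic $name"
  for file in "$@"; do
    cmd="$cmd --from-file=$(crd_secret_key "$ns" "$name" "$file")=$file"
  done
  printf '%s -n %s\n' "$cmd" "$ns"
}

# Example: print the command for the two files generated by openssl.
print_crd_secret_cmd kube-system ama-metrics-mtls-secret client-cert.pem client-key.pem
```

Printing the command instead of executing it keeps the sketch safe to run anywhere. Piping the output to `sh` would run it, assuming `kubectl` is configured for the target cluster.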