POC: agent integration tests using test containers #322

pablochacin · 2023-08-25T11:24:25Z

Description

The agent tests usually require running the agent process and the SUT process, which is the target of the test requests. For example, for HTTP fault injection the SUT is usually httpbin. The agent uses iptables to redirect the traffic from the SUT to itself.

The main difficulty when testing the agent was setting up an environment in which the agent could safely modify iptables rules.

A Kubernetes pod seemed a reasonable option as each pod can run multiple containers sharing the same network stack. Besides, the most common deployment of the agent is as an ephemeral container in a Pod.

However, implementing the integration tests in this way created several issues:

Tests were slow because they required creating a cluster (this has been recently mitigated by Add tool for e2e test cluster setup and cleanup #313) and also deploying pods in the cluster for each test
The setup was complex and obscured the purpose of the tests

This PR is a proof of concept of using TestContainers for the agent integration tests. TestContainers allows spawning multiple containers with the components of the tests. It also offers a library of utilities for setting up the containers and retrieving information such as the container's exposed ports.

Under the philosophy of TestContainers, the agent and the SUT should run as two independent containers. However, as explained above, the agent needs to share the network stack with the SUT in order to inject the traffic redirection rules.

The test implemented in this POC exploits a poorly documented Docker feature that allows attaching a container to the network stack (or network namespace) of another container.

This workaround could be avoided by creating a test image that includes not only the agent but also other components such as httpbin and grpcbin, and starting them as processes in the same container. However, this approach introduces several issues, such as creating the test image and launching each component as a process inside the container.

Known issues and limitations

The tests do not seem to work on macOS workers. They fail with this error:

2023/08/25 12:06:05 failed getting information about docker server: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?

Currently the tests use the latest tag for the agent's image. As explained in detail in Use the current branch's commit as the tag for Agent's integration tests #324, doing so introduces the risk of testing the version from the main branch instead of the one from the current branch. This could be solved relatively easily when testing locally, but in CI it requires adding steps to publish the image with a tag that refers to the current branch.

Checklist:

I have performed a self-review of my code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
I have added tests that prove my fix is effective or that my feature works.
I have run linter locally (make lint) and all checks pass.
I have run tests locally (make test) and all tests pass.
I have run relevant e2e test locally (make e2e-xxx for agent, disruptors, kubernetes or cluster related changes)
Any dependent changes have been merged and published in downstream modules

roobre

These look very nice! I just gave them a couple of runs locally and they run very fast. The network mode workaround looks pretty neat, as the TestContainers API supports it nicely.

RE the tests not working on macOS: that's interesting, as the log seems to imply that TestContainers is attempting to connect to Docker through the usual unix socket. If I recall correctly, on macOS DOCKER_HOST should be set to a tcp://somethingsomething address, and TestContainers claims to honor DOCKER_HOST.

Perhaps it would be worth checking the value of DOCKER_HOST on that machine to see whether the system is misconfigured or whether this is a bug in TestContainers.
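A small diagnostic along these lines could be run on the affected machine. It is a simplified sketch: it only looks at DOCKER_HOST and falls back to the usual socket path from the error message, whereas the resolution performed by TestContainers may consider other sources. The function names are hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// defaultDockerSocket is the usual location of the Docker unix socket.
const defaultDockerSocket = "/var/run/docker.sock"

// resolveDockerHost returns DOCKER_HOST when it is set, otherwise the
// default unix socket endpoint. getenv is injected to ease testing.
func resolveDockerHost(getenv func(string) string) string {
	if host := strings.TrimSpace(getenv("DOCKER_HOST")); host != "" {
		return host
	}
	return "unix://" + defaultDockerSocket
}

// socketPath extracts the filesystem path from a unix:// endpoint.
// It reports false for any other scheme, such as tcp://.
func socketPath(host string) (string, bool) {
	const prefix = "unix://"
	if !strings.HasPrefix(host, prefix) {
		return "", false
	}
	return strings.TrimPrefix(host, prefix), true
}

func main() {
	host := resolveDockerHost(os.Getenv)
	fmt.Println("Docker endpoint:", host)

	path, ok := socketPath(host)
	if !ok {
		fmt.Println("endpoint is not a unix socket; skipping file check")
		return
	}
	// A missing socket file would match the "Cannot connect" error.
	if _, err := os.Stat(path); err != nil {
		fmt.Println("socket not accessible:", err)
		return
	}
	fmt.Println("socket file exists:", path)
}
```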

pablochacin · 2023-08-25T16:44:13Z

For me, the tests not working on macOS in the CI is a minor issue, as we can run them only on Linux.

What really worries me is that we may not find a workaround for developers trying to contribute to the project.

roobre · 2023-08-25T19:02:14Z

I'll see if I can get my hands on an OSX laptop to see if I can reproduce the issue.

roobre · 2023-08-28T10:35:53Z

I was able to successfully run these on a borrowed M1 MacBook Pro with Docker Desktop 4.21.1, without any additional changes (go test -tags integration -v -cover -race ./...).

On that machine, DOCKER_HOST was unset, and there was a unix socket file in the usual path (/var/run/docker.sock).

Signed-off-by: Pablo Chacin <[email protected]>

pablochacin marked this pull request as draft August 25, 2023 11:30

pablochacin force-pushed the agent-integration-tests branch 4 times, most recently from 857a26c to 181657c Compare August 25, 2023 13:42

pablochacin marked this pull request as ready for review August 25, 2023 14:37

pablochacin requested a review from roobre August 25, 2023 14:37

roobre approved these changes Aug 25, 2023

pablochacin mentioned this pull request Aug 26, 2023

Use the current branch's commit as the tag for Agent's integration tests #324

Open

pablochacin added 3 commits August 29, 2023 18:58

POC: agent integration tests using test containers

95bf137

Signed-off-by: Pablo Chacin <[email protected]>

Add target in Makefile for integration tests

b848926

Signed-off-by: Pablo Chacin <[email protected]>

Document integration tests

fb6d1f5

Signed-off-by: Pablo Chacin <[email protected]>

pablochacin force-pushed the agent-integration-tests branch from 4acd929 to fb6d1f5 Compare August 29, 2023 17:04

pablochacin merged commit 09fcf9e into main Aug 29, 2023
6 checks passed

pablochacin deleted the agent-integration-tests branch August 29, 2023 17:10

pablochacin mentioned this pull request Aug 30, 2023

Complete agent integration tests #327

Merged

8 tasks
