Based on Fu Wei's idea, we employ blocking on L7 but without using
external tools.
[Main ideas]
The main ideas are:
1) Utilize the `X-PeerURLs` field from the header, as we know nearly all
traffic to a peer contains this header field (exceptions are listed
under [Issues]).
2) As all nodes create direct connections with their peers, the traffic
blocking has to happen at all nodes' proxies, contrary to the current
design, where only the proxy of the target peer is blackholed.
Based on the main ideas, we introduce an SSL termination proxy so we can
obtain the `X-PeerURLs` field from the header.
[Issues]
There are 2 known issues with this approach:
1) It still leaks some traffic. However, the leaked traffic (as discussed
later) won't affect the blackhole behavior that we would like to achieve, as
stream and pipeline traffic between raft nodes is now properly terminated.
2) We need to employ an SSL termination proxy, which might lead to a
small performance hit.
For 1), this way of blocking (by utilizing `X-PeerURLs` from the
header) will miss traffic to certain endpoints, as requests to those
endpoints don't carry the `X-PeerURLs` field.
Currently, the known ones are: /members, /version, and /raft/probing.
As you can see from the log, their headers don't contain the `X-PeerURLs`
field, but only the following fields:
- map[Accept-Encoding:[gzip] User-Agent:[Go-http-client/1.1]] /members
- map[Accept-Encoding:[gzip] User-Agent:[Go-http-client/1.1]] /version
- map[Accept-Encoding:[gzip] User-Agent:[Go-http-client/1.1]] /raft/probing
For 2), in order to read `X-PeerURLs` from the header, we need to
terminate the SSL connection, as we can't drop cleartext traffic
(ref [1]). Thus, a new option `e2e.WithSSLTerminationProxy(true)`
is introduced, which changes the network flow to:
```
A -- B's SSL termination proxy - B's transparent proxy - B
     ^ newly introduced          ^ in the original codebase
```
[Known improvements required before turning RFC into PR]
The prototype needs to be further improved for code review after
fixing the following issues:
- blocking only RX or TX traffic (currently a call to `blackholeTX`
or `blackholeRX` will blackhole both TX and RX traffic instead of just
the specified one)
- slowness when performing test cleanup (I think this is related to the
SSL timeout setting, but I haven't looked into it yet)
- coding style improvements
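For the first item in the list above, one possible shape for direction-aware state is sketched below. It only illustrates the intended behavior; the `blackholeState` type, the `direction` constants, and `shouldDrop` are hypothetical and do not reflect the prototype or the existing proxy code. It needs Go 1.19 or later for `atomic.Bool`.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// direction is relative to the blackholed member.
type direction int

const (
	tx direction = iota // traffic sent by the member
	rx                  // traffic received by the member
)

// blackholeState keeps one independent flag per direction.
type blackholeState struct {
	tx atomic.Bool
	rx atomic.Bool
}

func (s *blackholeState) blackholeTX() { s.tx.Store(true) }
func (s *blackholeState) blackholeRX() { s.rx.Store(true) }

// unblackhole clears both flags.
func (s *blackholeState) unblackhole() {
	s.tx.Store(false)
	s.rx.Store(false)
}

// shouldDrop reports whether traffic in direction d is currently blocked.
func (s *blackholeState) shouldDrop(d direction) bool {
	if d == tx {
		return s.tx.Load()
	}
	return s.rx.Load()
}

func main() {
	var s blackholeState
	s.blackholeTX()
	fmt.Println(s.shouldDrop(tx), s.shouldDrop(rx)) // true false
}
```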
References:
[1] etcd-io#15595