wireguard: bridge fdb entry sometimes missing? #147
Comments
You did keep in mind that some of our tools filter the |
No, I didn't. Which tools filter it? |
I thought we did in wg_established; but that one's implicit, as the entry simply never completes a handshake and is therefore filtered out by awk. |
E.g. the statistics export 5d22e24 |
True, but not what I had in mind.
|
@lemoer Why should there be an fdb entry for 00:00:00:00:00:00? Does it have anything to do with the dummy peer at all? |
Interesting:
|
Maybe we should add a script that reproduces this issue with less effort, so that more people can look at it? |
Maybe it's not a bridge fdb problem. I just observed on sn01, that the sn01:
sn09:
|
As a quick fix I ran:
My node is now properly connected. But I am not sure whether all problems described in this issue are solved. |
I'll look into it tomorrow in the afternoon. |
I added the milestone "Beginn der stabilen Phase" (start of the stable phase), as this is likely a bug. But since it happens only sporadically, I am not sure whether we will resolve this issue before the stable phase begins. |
On some supernodes, this did not work on system boot. The vx-* interfaces were added to batman but directly removed again. This was because the interfaces were not up. Now we set them up before the vx-* is added to batman. Discussed in #147. #147 (comment)
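The ordering described in this commit message can be sketched as follows. This is only an illustration under stated assumptions: the helper name `batman_attach_commands`, the mesh interface name `bat0`, and the use of `ip link` and `batctl` command lines are not taken from the referenced commit, which may implement the change differently.

```python
from __future__ import annotations


def batman_attach_commands(vx_iface: str, mesh_iface: str = "bat0") -> list[list[str]]:
    """Return the commands in the order they should run (hypothetical helper).

    The interface is brought up first; only then is it added to batman,
    because an interface that is down was observed to be removed again.
    """
    return [
        # Step 1: bring the vx-* interface up.
        ["ip", "link", "set", "dev", vx_iface, "up"],
        # Step 2: add it to the batman mesh interface (name is an assumption).
        ["batctl", "meshif", mesh_iface, "interface", "add", vx_iface],
    ]


if __name__ == "__main__":
    for command in batman_attach_commands("vx-99"):
        print(" ".join(command))
```

The function only builds the command lists; actually running them (for example via `subprocess.run`) requires root privileges on the supernode.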
I implemented a fix for the mentioned issue in 5fc0673. But I am not sure whether all problems described in this issue are solved. |
If what you did in 5fc0673 is indeed a fix, |
I think the problem discussed here is the same as #175. |
Today a similar issue appeared, but this time only the route is missing and the fdb entry is there. Maybe it's related, maybe not... (Originally reported by @bschelm via mail.) I collected some data:
WG is established:
IPv6 of the router:
But no appropriate route is installed:
Bridge fdb entry is ok:
|
Even if we restart the service, the route is not created...
Some analysis follows. Here we see that we have 91 peers per interface:
A small patch applied to netlink.py:
diff --git a/netlink.py b/netlink.py
index 31a1e76..743dfdb 100644
--- a/netlink.py
+++ b/netlink.py
@@ -97,10 +97,13 @@ class ConfigManager:
         with WireGuard() as wg:
             clients = wg.info(self.wg_interface)[0].WGDEVICE_A_PEERS.value
+            print(f"LEN: {len(clients)}, iface={self.wg_interface}")
             for client in clients:
                 latest_handshake = client.WGPEER_A_LAST_HANDSHAKE_TIME["tv_sec"]
                 public_key = client.WGPEER_A_PUBLIC_KEY["value"].decode("utf-8")
+                print(f"A: {public_key}")
+
                 peer = self.find_by_public_key(public_key)
                 if len(peer) < 1:
                     peer = WireGuardPeer(public_key)
With the patch applied, the output shows only 89 or 90 peers:
(Even if 90 were only an off-by-one discrepancy, the discrepancy would not be consistent, as a few interfaces show only 89.) |
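To narrow down which peers go missing, one could diff the public keys reported by the two sources. A minimal sketch, assuming the reference list comes from something like `wg show <iface> peers` (one public key per line) and the second list is the set of keys printed by the patched netlink.py. The function name `missing_peers` and the sample keys are hypothetical.

```python
from __future__ import annotations


def missing_peers(reference_output: str, netlink_keys: list[str]) -> list[str]:
    """Return keys listed in reference_output that do not occur in netlink_keys."""
    # One public key per non-empty line is assumed for the reference listing.
    reference = [line.strip() for line in reference_output.splitlines() if line.strip()]
    seen = set(netlink_keys)
    # Preserve the order of the reference listing for easier reading.
    return [key for key in reference if key not in seen]


if __name__ == "__main__":
    # Placeholder keys, not real WireGuard public keys.
    reference_output = "keyA=\nkeyB=\nkeyC=\n"
    print(missing_peers(reference_output, ["keyA=", "keyC="]))
```

Running this per interface would show whether the same peers are missing every time or whether the gap moves around.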
This week's finding has now been fixed in freifunkh/wireguard-vxlan-glue@7c876de. |
Can this be closed as of the last comment? Or is there anything we can / should regularly test (via monitoring)? |
I think it reads as if the finding of that week and not the whole issue was resolved. |
Added further monitoring in Zabbix (FF Wireguard Template). Per supernode:
Added a trigger if these numbers mismatch. |
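The mismatch trigger can be thought of as a simple equality check over the per-supernode numbers. The sketch below is not the Zabbix template itself; the metric names (`wg_established_peers`, `fdb_entries`, `routes`) are assumptions chosen for illustration, since the monitored items are not listed here.

```python
from __future__ import annotations


def counts_mismatch(counts: dict[str, int]) -> bool:
    """Return True if the monitored counts are not all equal."""
    return len(set(counts.values())) > 1


if __name__ == "__main__":
    # Hypothetical metric names and values for one supernode.
    sample = {"wg_established_peers": 91, "fdb_entries": 91, "routes": 90}
    print(counts_mismatch(sample))
```

In practice the compared numbers must count the same thing, so any entry that exists only in one of the tables would have to be excluded before comparing.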
I just found for my router that the bridge fdb entry for 00:00:00:00:00:00 was missing when I used bridge fdb. Only 72:4c:e2:db:6f:37 dev vx-99 dst fe80::247:34ff:fef4:26cc via wg-99 self is visible.
Details:
systemctl restart wg_netlink.service didn't help. The entry for 00:00:00:00:00:00 still does not exist.
We should keep an eye on this.
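A check for the missing entry could be automated by scanning the output of bridge fdb. The following is a minimal sketch that assumes each relevant line has the shape of the entry quoted above (`<mac> dev <iface> dst <addr> via <wg-iface> self`); the helper names are hypothetical, and the output is passed in as text rather than collected from the system.

```python
from __future__ import annotations

ALL_ZERO_MAC = "00:00:00:00:00:00"


def fdb_macs(bridge_fdb_output: str, dev: str) -> set[str]:
    """Collect the MAC addresses that bridge fdb lists for the given device."""
    macs = set()
    for line in bridge_fdb_output.splitlines():
        fields = line.split()
        # Assumed line shape: "<mac> dev <iface> ..."
        if len(fields) >= 3 and fields[1] == "dev" and fields[2] == dev:
            macs.add(fields[0].lower())
    return macs


def has_all_zero_entry(bridge_fdb_output: str, dev: str) -> bool:
    """Return True if the all-zero MAC entry exists for the given device."""
    return ALL_ZERO_MAC in fdb_macs(bridge_fdb_output, dev)


if __name__ == "__main__":
    output = "72:4c:e2:db:6f:37 dev vx-99 dst fe80::247:34ff:fef4:26cc via wg-99 self\n"
    print(has_all_zero_entry(output, "vx-99"))
```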