This repository has been archived by the owner on May 16, 2024. It is now read-only.
Hi, I've run into a problem and have no idea how to fix it. I have several nodes deployed with the RDMA SR-IOV device plugin and the SR-IOV CNI. Everything worked fine: pods could communicate with each other via their vhca devices, whether they were launched on the same node or not. But one day one of the nodes went bad, and new pods launched on it fail to acquire a vhca device, even though they start normally and reach the Running phase; everything else seems fine.
I've checked the logs as shown below.
I use test-sriov-pod.yaml to create a test pod. The pod launches normally and reaches the Running phase, but its network interface is not a vhca device, and show_gids finds no vhca devices:
# ethtool -i eth0
driver: veth
version: 1.0
firmware-version:
expansion-rom-version:
bus-info:
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: no
# show_gids
DEV PORT INDEX GID IPv4 VER DEV
--- ---- ----- --- ------------ --- ---
n_gids_found=0
The SR-IOV CNI configuration is as below, and it's the only CNI on that node:
Besides, I found that all the vhca interfaces are in the down state according to ip a. I brought them up manually with ifconfig <eth-name> up, but nothing changed.
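For anyone scripting this check, here is a small sketch of how the link state can be extracted from ip output. The interface name and the sample line are hypothetical, just illustrating the DOWN state reported above:

```shell
# Hypothetical `ip -o link` output line for a VF netdev that is
# administratively down, matching the state seen with `ip a`:
line='3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT'

# Extract just the operational state field:
echo "$line" | grep -o 'state [A-Z]*'
# -> state DOWN

# The iproute2 equivalent of `ifconfig eth1 up` would be:
#   ip link set dev eth1 up
```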
Thanks for your help!
Hi @flymark2010, I see from the output of ethtool -i eth0 that the driver is veth instead of mlx5.
It indicates that some other CNI, not sriov-cni, provided the eth device. You might want to check whether another CNI config file is taking priority due to lexical ordering, as you faced last time.
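To illustrate the lexical-ordering pitfall: the container runtime picks the lexically first config file in /etc/cni/net.d, so a lower-numbered file from another CNI silently wins over the SR-IOV one. A minimal sketch, with hypothetical filenames:

```shell
# Two hypothetical config files in /etc/cni/net.d; the runtime uses
# the lexically first one, so the flannel (veth-based) config wins:
printf '%s\n' '99-sriov.conf' '10-flannel.conflist' | sort | head -n 1
# -> 10-flannel.conflist
```

In that situation eth0 in the pod would be a veth created by the other plugin, which matches the ethtool output above. Listing the directory with `ls /etc/cni/net.d/` on the bad node should show whether such a file exists.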
@flymark2010, veth is certainly not an mlx5 driver, so something went wrong there.
You can try unloading the veth driver on that host with rmmod veth and check whether the interface disappears from the Pod.