[BUG] ovs process killed, coredump #4645

Open
bobz965 opened this issue Oct 22, 2024 · 7 comments

Collaborator

bobz965 commented Oct 22, 2024

Kube-OVN Version

master

Kubernetes Version

1.31

Operation-system/Kernel Version

6.8 github ci

Description

image

ovs killed

Steps To Reproduce

github ci

Current Behavior

ovs process killed

Expected Behavior

ovs process running

@bobz965 bobz965 added the bug Something isn't working label Oct 22, 2024
Collaborator Author

bobz965 commented Oct 22, 2024

Tested in an arm environment:

image

image

The e2e test failed, but the ovs pod is still running (it did not crash).

Collaborator Author

bobz965 commented Oct 22, 2024

image

@bobz965 bobz965 self-assigned this Oct 22, 2024
Collaborator Author

bobz965 commented Oct 22, 2024

First, fix the issue where lsp-type OVN EIPs cannot become ready in the arm environment: #4647

Collaborator Author

bobz965 commented Oct 23, 2024

image

@zcq98 It is now largely confirmed that the ovn-controller process in the ovs pod crashes after this step.

Collaborator Author

bobz965 commented Oct 23, 2024

Confirmed: creating only the second VLAN subnet, without creating an lrp-type EIP in that second subnet, does not crash ovn-controller.

image

image

As shown, extra is not yet connected to the VPC.
image

Collaborator Author

bobz965 commented Oct 23, 2024

As soon as the extra VLAN subnet is bound to the VPC and the lrp is created, ovn-controller crashes. @zcq98

image

image

Collaborator Author

bobz965 commented Oct 23, 2024

(ae86) ➜  ovs git:(main) k ko nbctl show
switch 845c29dc-823b-497e-a2b9-32c4a4a8cb25 (extra)
    port localnet.extra
        type: localnet
        addresses: ["unknown"]
    port extra-no-bfd-vpc-124613236
        type: router
        router-port: no-bfd-vpc-124613236-extra
switch 5f9821a5-b892-48b7-889e-e414c64e0efb (ovn-default)
    port kube-ovn-pinger-vksmb.kube-system
        addresses: ["5e:75:0a:8e:a7:b4 10.16.0.9"]
    port coredns-6f6b679f8f-ghz49.kube-system
        addresses: ["36:ca:58:3f:be:a8 10.16.0.7"]
    port ovn-default-ovn-cluster
        type: router
        router-port: ovn-cluster-ovn-default
    port kube-ovn-pinger-nl6jj.kube-system
        addresses: ["3e:df:d4:7f:17:77 10.16.0.8"]
    port coredns-6f6b679f8f-gzxmm.kube-system
        addresses: ["fe:fc:af:4f:5b:a7 10.16.0.6"]
switch abab4d01-5632-40b8-9dd8-fde6935ac865 (join)
    port node-kube-ovn-worker
        addresses: ["2e:d9:82:be:74:fd 100.64.0.2"]
    port join-ovn-cluster
        type: router
        router-port: ovn-cluster-join
    port node-kube-ovn-control-plane
        addresses: ["42:6e:be:d2:5f:2a 100.64.0.3"]
switch fc2d7d79-5e84-4354-9d4b-37aa30e7c80c (no-bfd-subnet-186440052)
    port no-bfd-kube-ovn-worker.ovn-vpc-nat-gw-3437
        addresses: ["ee:02:04:ca:b3:51 192.168.0.3"]
    port no-bfd-kube-ovn-control-plane.ovn-vpc-nat-gw-3437
        addresses: ["fe:34:eb:0d:e5:d3 192.168.0.2"]
    port fip-pod-141121111.ovn-vpc-nat-gw-3437
        addresses: ["d2:65:92:2c:e7:b1 192.168.0.4"]
    port no-bfd-subnet-186440052-no-bfd-vpc-124613236
        type: router
        router-port: no-bfd-vpc-124613236-no-bfd-subnet-186440052
switch 5a722dff-9753-44ee-bc87-654cda9ed951 (external)
    port localnet.external
        type: localnet
        addresses: ["unknown"]
    port external-ovn-cluster
        type: router
        router-port: ovn-cluster-external
    port external-no-bfd-vpc-124613236
        type: router
        router-port: no-bfd-vpc-124613236-external
router 04174800-2e31-446a-a754-cd8702e4ac70 (no-bfd-vpc-124613236)
    port no-bfd-vpc-124613236-no-bfd-subnet-186440052
        mac: "1a:bc:de:93:5e:eb"
        networks: ["192.168.0.1/24"]
    port no-bfd-vpc-124613236-extra
        mac: "66:97:98:78:6f:0b"
        networks: ["172.20.0.4/16"]
        gateway chassis: [8ba50fca-0bfd-4bcc-bf4d-804f2defa7a6 91d42c2a-5131-4eb1-9139-a78a7a21c34f]
    port no-bfd-vpc-124613236-external
        mac: "12:9e:c2:4d:8e:d1"
        networks: ["172.19.0.5/16"]
        gateway chassis: [8ba50fca-0bfd-4bcc-bf4d-804f2defa7a6 91d42c2a-5131-4eb1-9139-a78a7a21c34f]
    nat 1afa9eb9-24ec-4382-a7e4-fb95819a4104
        external ip: "172.19.0.5"
        logical ip: "192.168.0.0/24"
        type: "snat"
    nat 2f32799d-b30e-4b6d-8acb-8984586e905b
        external ip: "172.19.0.5"
        logical ip: "192.168.0.5"
        type: "dnat_and_snat"
    nat d771a18f-75fd-4452-ba42-f4df56b4b7c1
        external ip: "172.19.0.7"
        logical ip: "192.168.0.4"
        type: "dnat_and_snat"
router 9e47e901-d7dc-47a5-a678-919bea116d0f (ovn-cluster)
    port ovn-cluster-external
        mac: "42:cf:21:b1:4e:c9"
        networks: ["172.19.0.6/16"]
        gateway chassis: [91d42c2a-5131-4eb1-9139-a78a7a21c34f 8ba50fca-0bfd-4bcc-bf4d-804f2defa7a6]
    port ovn-cluster-ovn-default
        mac: "66:32:ec:72:c1:36"
        networks: ["10.16.0.1/16"]
    port ovn-cluster-join
        mac: "56:b3:e3:3e:2c:ea"
        networks: ["100.64.0.1/16"]
(ae86) ➜  ovs git:(main)
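
In the output above, the router no-bfd-vpc-124613236 has two ports that list gateway chassis (the extra and external lrps), which is the state reached once the extra VLAN subnet is bound to the VPC. The following is a minimal sketch, not part of Kube-OVN, for picking that state out of saved `nbctl show` text. The function name `gateway_ports_by_router` and the trimmed `SAMPLE` are hypothetical, and the parser assumes the indentation layout shown above. Whether multiple gateway-chassis ports on one router is the actual crash trigger is not established in this thread.

```python
import re
from collections import defaultdict


def gateway_ports_by_router(show_output: str) -> dict[str, list[str]]:
    """Map each router name to its ports that list gateway chassis."""
    result = defaultdict(list)
    router = None  # name of the router block being read, None inside a switch
    port = None    # name of the port block being read, if any
    for line in show_output.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if not line.startswith(" "):
            # Top-level line: "switch <uuid> (<name>)" or "router <uuid> (<name>)"
            m = re.match(r"router \S+ \((.+)\)$", stripped)
            router = m.group(1) if m else None
            port = None
        elif router and stripped.startswith("port "):
            port = stripped.split(None, 1)[1]
        elif router and stripped.startswith("nat "):
            # NAT entries follow the ports; stop attributing lines to a port
            port = None
        elif router and port and stripped.startswith("gateway chassis:"):
            result[router].append(port)
    return dict(result)


# Trimmed copy of the output pasted above (MACs and most ports omitted).
SAMPLE = """\
switch 845c29dc-823b-497e-a2b9-32c4a4a8cb25 (extra)
    port extra-no-bfd-vpc-124613236
        type: router
        router-port: no-bfd-vpc-124613236-extra
router 04174800-2e31-446a-a754-cd8702e4ac70 (no-bfd-vpc-124613236)
    port no-bfd-vpc-124613236-no-bfd-subnet-186440052
        networks: ["192.168.0.1/24"]
    port no-bfd-vpc-124613236-extra
        networks: ["172.20.0.4/16"]
        gateway chassis: [8ba50fca-0bfd-4bcc-bf4d-804f2defa7a6 91d42c2a-5131-4eb1-9139-a78a7a21c34f]
    port no-bfd-vpc-124613236-external
        networks: ["172.19.0.5/16"]
        gateway chassis: [8ba50fca-0bfd-4bcc-bf4d-804f2defa7a6 91d42c2a-5131-4eb1-9139-a78a7a21c34f]
    nat 1afa9eb9-24ec-4382-a7e4-fb95819a4104
        external ip: "172.19.0.5"
        logical ip: "192.168.0.0/24"
        type: "snat"
router 9e47e901-d7dc-47a5-a678-919bea116d0f (ovn-cluster)
    port ovn-cluster-external
        networks: ["172.19.0.6/16"]
        gateway chassis: [91d42c2a-5131-4eb1-9139-a78a7a21c34f 8ba50fca-0bfd-4bcc-bf4d-804f2defa7a6]
    port ovn-cluster-join
        networks: ["100.64.0.1/16"]
"""

if __name__ == "__main__":
    for name, ports in gateway_ports_by_router(SAMPLE).items():
        flag = "  <-- more than one gateway port" if len(ports) > 1 else ""
        print(f"{name}: {ports}{flag}")
```

Running this against output captured before and after binding the extra subnet would show whether the second gateway-chassis port appears at the same moment ovn-controller dies.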
