CDI import from HTTP source gives certificate errors. #3443

Open
keepthemomentum opened this issue Sep 20, 2024 · 7 comments
Labels
lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale.

Comments

@keepthemomentum

keepthemomentum commented Sep 20, 2024

I'm trying to import an Ubuntu cloud image using CDI DataVolumes, and I get the following certificate error.
This was working fine until recently. I have updated the CDI operator and cdi-cr from v1.59.0 to v1.60.1, but no luck.

Warning Error 2m41s (x5 over 5m44s) datavolume-import-controller Unable to connect to http data source: HTTP request errored: Get "https://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64.img": tls: failed to verify certificate: x509:

Below is my VM template; I'm using hostPath volumes. I have tried different URLs to pull the cloud image, but I get the same error message.

+++++++++++++++++++++++++++++

---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage-stg-vm01
reclaimPolicy: Retain
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: stg-vm01-datavolume
  labels:
    type: local
spec:
  storageClassName: local-storage-stg-vm01
  capacity:
    storage: 15Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: "/datavolume/stg-vm01"
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - vm01-stg-se-hq
---
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: vm-stg-vm01
spec:
  running: true
  template:
    metadata:
      labels:
        kubevirt.io/size: small
        kubevirt.io/domain: vm-stg-vm01
    spec:
      nodeSelector:
        kubernetes.io/hostname: vm01-stg-se-hq
      domain:
        cpu:
          cores: 2
        devices:
          disks:
            - name: datavolume
              disk:
                bus: virtio
            - name: cloudinitdisk
              disk:
                bus: virtio
          interfaces:
          - name: kubevirt-bridge
            bridge: {}
          - name: default
            bridge: {}
        resources:
          requests:
            memory: 4Gi
      networks:
      - name: kubevirt-bridge
        multus:
          networkName: ovs-kubevirt-bridge-static-stg-vm01
      - name: default
        pod: {}
      volumes:
        - name: cloudinitdisk
          cloudInitNoCloud:
            secretRef:
              name: vm-cloud-config
            networkData: |
              version: 2
              ethernets:
                enp1s0:
                  dhcp4: true
        - name: datavolume
          dataVolume:
            name: data-volume-stg-vm01

  dataVolumeTemplates:
  - metadata:
      name: data-volume-stg-vm01
    spec:
      pvc:
        storageClassName: local-storage-stg-vm01
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 15Gi
      source:
        http:
          url: "https://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64.img"
@awels
Member

awels commented Sep 20, 2024

Do you have an HTTP proxy between you and the server? I just tried here with the latest from main and it imported just fine.

@keepthemomentum
Author

I have listed all the resources running in my k8s cluster, and I don't appear to be running any proxy services in it.
I do have a KubeVirt VM running Traefik, but it has been running for a long time. There is no proxy between me and the k8s cluster.

datavolume-import-controller Unable to connect to http data source: HTTP request errored: Get "https://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64.img": tls: failed to verify certificate: x509: certificate is valid for df18656602a54d1a81f437bf661f244e.c276ae934b166edd23d6f5f105f00f17.traefik.default, not cloud-images.ubuntu.com

I see a lot of TLS errors from the 'cdi-apiserver' pod.

2024/09/21 07:14:31 http: TLS handshake error from 192.168.10.162:41329: EOF
2024/09/21 07:15:31 http: TLS handshake error from 192.168.10.162:13026: EOF
2024/09/21 07:16:31 http: TLS handshake error from 192.168.10.162:7965: EOF
2024/09/21 07:17:31 http: TLS handshake error from 192.168.10.162:38824: EOF
2024/09/21 07:18:31 http: TLS handshake error from 192.168.10.162:45439: EOF
2024/09/21 07:19:31 http: TLS handshake error from 192.168.10.162:8760: EOF
2024/09/21 07:20:31 http: TLS handshake error from 192.168.10.162:42814: EOF
2024/09/21 07:22:31 http: TLS handshake error from 192.168.10.162:11894: EOF
2024/09/21 07:23:31 http: TLS handshake error from 192.168.10.162:3782: EOF

@akalenyu
Collaborator

df18656602a54d1a81f437bf661f244e.c276ae934b166edd23d6f5f105f00f17.traefik.default

This name suggests there is a proxy in the way (in the default namespace). Maybe there is a way to opt out of it?
https://traefik.io/traefik/
Community members have previously submitted PRs to opt out of service meshes: #3186
/cc @bc185174
Did your install end up unmeshed, or did we ever figure out a way to mesh the importer pod?
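
For readers triaging the same error, the reasoning above (the certificate names the server that actually answered, not the host that was requested) can be sketched in a few lines of Python. This is only an illustration: the regular expression is an assumption based on the error text quoted in this thread, the helper names are hypothetical, and the `.traefik.default` suffix check simply mirrors the certificate name reported here.

```python
import re

# Assumed shape of the hostname-mismatch message, taken from the error quoted in this thread.
MISMATCH = re.compile(
    r"x509: certificate is valid for (?P<names>.+?), not (?P<host>\S+)"
)


def parse_mismatch(message):
    """Return (certificate_names, requested_host), or None if the message does not match."""
    match = MISMATCH.search(message)
    if match is None:
        return None
    names = [name.strip() for name in match.group("names").split(",")]
    return names, match.group("host")


def looks_like_traefik_default(names):
    """True if any certificate name ends with the suffix reported in this thread."""
    return any(name.endswith(".traefik.default") for name in names)
```

If `looks_like_traefik_default` returns true for the parsed names, the TLS handshake was answered by something presenting a Traefik-style certificate rather than by cloud-images.ubuntu.com, which points at routing or name resolution and not at CDI's trust store.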

@keepthemomentum
Author

keepthemomentum commented Sep 23, 2024

Traefik runs as a container in a KubeVirt VM in my k8s cluster. It is one of many KubeVirt VMs, but I'm unsure why the Traefik container gets the request when I try to import the datavolume from https://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64.img.
Does this mean the Traefik container is somehow receiving the request? When I apply the VM template I provided above, I see this in the Traefik container logs.

time="2024-09-23T16:54:51+02:00" level=debug msg="Serving default certificate for request: "cloud-images.ubuntu.com""
time="2024-09-23T16:54:51+02:00" level=debug msg="http: TLS handshake error from 10.0.0.2:15803: remote error: tls: bad certificate"
time="2024-09-23T16:54:51+02:00" level=debug msg="Serving default certificate for request: "cloud-images.ubuntu.com""
time="2024-09-23T16:54:51+02:00" level=debug msg="http: TLS handshake error from 10.0.0.2:59660: remote error: tls: bad certificate"

This is from the CDI importer pod.

I0923 15:00:22.785640 1 importer.go:107] Starting importer
I0923 15:00:22.785691 1 importer.go:182] begin import process
E0923 15:00:22.885924 1 importer.go:347] Get "https://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64.img": tls: failed to verify certificate: x509: certificate is valid for 684689cd270b581267ecdb47244e7812.c92ef085da7e6782f82e8e1c6b2cd628.traefik.default, not cloud-images.ubuntu.com
HTTP request errored
kubevirt.io/containerized-data-importer/pkg/importer.createHTTPReader
pkg/importer/http-datasource.go:350
kubevirt.io/containerized-data-importer/pkg/importer.NewHTTPDataSource
pkg/importer/http-datasource.go:102
main.newDataSource
cmd/cdi-importer/importer.go:272
main.handleImport
cmd/cdi-importer/importer.go:184
main.main
cmd/cdi-importer/importer.go:148
runtime.main
GOROOT/src/runtime/proc.go:271
runtime.goexit
src/runtime/asm_amd64.s:1695
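
One way to check whether the request really lands on an in-cluster endpoint is to look at what the hostname resolves to from inside a pod on the same node. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, and it assumes that a private address for a public hostname is a sign of interception, which may not hold in every network.

```python
import ipaddress
import socket


def resolve(host, port=443):
    """Return the sorted set of addresses the local resolver gives for host."""
    infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def internal_addresses(addresses):
    """Keep only addresses that are in private (non-public) ranges."""
    return [addr for addr in addresses if ipaddress.ip_address(addr).is_private]


# Usage (run from a pod in the cluster, since it needs the cluster's DNS view):
#   addrs = resolve("cloud-images.ubuntu.com")
#   print(addrs, internal_addresses(addrs))
```

If `internal_addresses` returns anything for cloud-images.ubuntu.com, the name is being answered by local DNS or routing rather than by the public server, which would be consistent with the Traefik log lines above.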

@awels
Member

awels commented Sep 23, 2024

That is what it looks like to us. Somehow traefik is intercepting the request and acting like a proxy for it. This is what is causing the certificate failure in the importer pod. We don't know anything about how traefik works, but that looks like the culprit here.
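
One possible explanation, not confirmed anywhere in this thread, is DNS search-domain expansion. Pods using the default cluster DNS policy get `ndots:5` in their resolver configuration, so a name with fewer than five dots is first tried with each search domain appended. If one of those search domains has a wildcard record pointing at Traefik, the expanded name resolves there. The sketch below lists the lookup order a glibc-style resolver would use; the function name and the `example.lan` domain are hypothetical.

```python
def candidate_names(host, search_domains, ndots=5):
    """Return the names a glibc-style resolver would try, in order."""
    if host.endswith("."):
        # A trailing dot marks the name as absolute, so the search list is skipped.
        return [host]
    expanded = [f"{host}.{domain}" for domain in search_domains]
    absolute = host + "."
    if host.count(".") >= ndots:
        # Enough dots: try the name as given first, then the search list.
        return [absolute] + expanded
    # Too few dots: try the search list first, then the name as given.
    return expanded + [absolute]


# Hypothetical search list: one cluster domain plus a node-level domain.
for name in candidate_names(
    "cloud-images.ubuntu.com",
    ["default.svc.cluster.local", "example.lan"],
):
    print(name)
```

Comparing this list against the `search` line in the importer pod's `/etc/resolv.conf` shows which expanded names are tried before the real one. Under this assumption, a wildcard DNS record on any of those domains would be enough to send the request to Traefik.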

@keepthemomentum
Author

Thank you so much for your help! I will look further into the Traefik configuration and try to solve it.

@kubevirt-bot
Contributor

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@kubevirt-bot added the lifecycle/stale label Dec 22, 2024