Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug] Boot timer seems off #4429

Closed
3 tasks done
DavidVentura opened this issue Feb 7, 2024 · 6 comments
Closed
3 tasks done

[Bug] Boot timer seems off #4429

DavidVentura opened this issue Feb 7, 2024 · 6 comments
Assignees
Labels
Priority: Low Indicates that an issue or pull request should be resolved behind issues or pull requests labelled ` Status: Awaiting author Indicates that an issue or pull request requires author action

Comments

@DavidVentura
Copy link
Contributor

Boot timer calculations do not match my expectations

When running a minimal Linux kernel in firecracker, on x86_64, the measured boot time seems off.

To Reproduce

Run this program as /init (creates the /mem char dev)

// Based on https://github.com/firecracker-microvm/firecracker/blob/main/resources/tests/init.c

#include <sys/types.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <stdio.h>

#include <sys/types.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>

// Base address values are defined in arch/src/lib.rs as arch::MMIO_MEM_START.
// Values are computed in arch/src/<arch>/mod.rs from the architecture layouts.
// Position on the bus is defined by MMIO_LEN increments, where MMIO_LEN is
// defined as 0x1000 in vmm/src/device_manager/mmio.rs.
#ifdef __x86_64__
#define MAGIC_MMIO_SIGNAL_GUEST_BOOT_COMPLETE 0xd0000000
#endif
#ifdef __aarch64__
#define MAGIC_MMIO_SIGNAL_GUEST_BOOT_COMPLETE 0x40000000
#endif

#define MAGIC_VALUE_SIGNAL_GUEST_BOOT_COMPLETE 123

int main () {
    dev_t devnum = makedev(1, 1); // Major number 1, Minor number 1
    if (mknod("/mem", S_IFCHR | 0666, devnum) == -1) {
        perror("mknod");
        return 1;
    }
   int fd = open("/mem", (O_RDWR | O_SYNC | O_CLOEXEC));
   if (fd < 0) {
       perror("open");
       return 1;
   }
   int mapped_size = getpagesize();

   char *map_base = mmap(NULL,
        mapped_size,
        PROT_WRITE,
        MAP_SHARED,
        fd,
        MAGIC_MMIO_SIGNAL_GUEST_BOOT_COMPLETE);

   *map_base = MAGIC_VALUE_SIGNAL_GUEST_BOOT_COMPLETE;
   msync(map_base, mapped_size, MS_ASYNC);
   printf("Hello!\n");
}

Use this config (though I don't think it's relevant):

{
  "boot-source": {
    "kernel_image_path": "your kernel",
    "boot_args": "earlyprintk=serial,ttyS0 console=ttyS0,115200 panic=-1 reboot=t no_timer_check printk.time=1  cryptomgr.notests tsc=reliable 8250.nr_uarts=1 iommu=off pci=off ip.dev_wait_ms=0 mitigations=off root=/dev/vda  ip=172.16.0.2::172.16.0.1:255.255.255.0:hostname:eth0:off quiet init=/init"
  },
  "machine-config": {
    "vcpu_count": 1,
    "backed_by_hugepages": true,
    "mem_size_mib": 128
  },
  "drives": [{
    "drive_id": "rootfs",
    "path_on_host": "your ext4 rootfs, only needs the init",
    "is_root_device": true,
    "is_read_only": false
  }],
  "network-interfaces": [{
    "iface_id": "net1",
    "guest_mac": "06:00:AC:10:00:02",
    "host_dev_name": "tap0"
  }]
}

Start firecracker with

./firecracker --boot-timer --api-sock /tmp/sock --config-file config.json | ts -s '%.S'

Command output

Case 1, SMP kernel:

00.000009 2024-02-07T14:14:16.680232379 [anonymous-instance:main] Running Firecracker v1.7.0-dev
00.036482 2024-02-07T14:14:16.726643784 [anonymous-instance:main] Artificially kick devices.
00.036612 2024-02-07T14:14:16.726710009 [anonymous-instance:main] Successfully started microvm that was configured from one single json
00.037664 Linux version 6.7.3 (david@dev) (gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #154 SMP Wed Feb  7 12:57:33 UTC 2024
00.038646 Command line: earlyprintk=serial,ttyS0 console=ttyS0,115200 panic=-1 reboot=t no_timer_check printk.time=1  cryptomgr.notests tsc=reliable 8250.nr_uarts=1 iommu=off pci=off ip.dev_wait_ms=0 mitigations=off root=/dev/vda  ip=172.16.0.2::172.16.0.1:255.255.255.0:hostname:eth0:off quiet init=/magic root=/dev/vda rw virtio_mmio.device=4K@0xd0001000:5 virtio_mmio.device=4K@0xd0002000:6
00.038830 [Firmware Bug]: TSC doesn't count with P0 frequency!
00.038942 BIOS-provided physical RAM map:
00.039148 BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
00.039358 BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
00.039568 BIOS-e820: [mem 0x0000000000100000-0x0000000007ffffff] usable
00.039719 printk: legacy bootconsole [earlyser0] enabled
00.069534 2024-02-07T14:14:16.759738272 [anonymous-instance:fc_vcpu 0] Guest-boot-time =  78955 us 78 ms,  15879 CPU us 15 CPU ms
00.069608 Hello!
00.069755 [    0.027539] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000000
00.069882 [    0.027755] Kernel Offset: disabled
00.069914 2024-02-07T14:14:16.760155159 [anonymous-instance:fc_vcpu 0] Received KVM_EXIT_SHUTDOWN signal
00.069974 2024-02-07T14:14:16.760213740 [anonymous-instance:main] Vmm is stopping.
00.070350 2024-02-07T14:14:16.760586684 [anonymous-instance:main] Vmm is stopping.
00.154732 2024-02-07T14:14:16.844886877 [anonymous-instance:main] Firecracker exiting successfully. exit_code=0

Case 2, no-SMP kernel:

00.000018 2024-02-07T14:22:29.345724412 [anonymous-instance:main] Running Firecracker v1.7.0-dev
00.024534 2024-02-07T14:22:29.381460004 [anonymous-instance:main] Artificially kick devices.
00.024594 2024-02-07T14:22:29.381512423 [anonymous-instance:main] Successfully started microvm that was configured from one single json
00.025864 Linux version 6.7.3 (david@dev) (gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #155 Wed Feb  7 13:22:09 UTC 2024
00.027037 Command line: earlyprintk=serial,ttyS0 console=ttyS0,115200 panic=-1 reboot=t no_timer_check printk.time=1  cryptomgr.notests tsc=reliable 8250.nr_uarts=1 iommu=off pci=off ip.dev_wait_ms=0 mitigations=off root=/dev/vda  ip=172.16.0.2::172.16.0.1:255.255.255.0:hostname:eth0:off quiet init=/magic root=/dev/vda rw virtio_mmio.device=4K@0xd0001000:5 virtio_mmio.device=4K@0xd0002000:6
00.027204 [Firmware Bug]: TSC doesn't count with P0 frequency!
00.027298 BIOS-provided physical RAM map:
00.027454 BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
00.027613 BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
00.027768 BIOS-e820: [mem 0x0000000000100000-0x0000000007ffffff] usable
00.027897 printk: legacy bootconsole [earlyser0] enabled
00.032816 2024-02-07T14:22:29.389725092 [anonymous-instance:fc_vcpu 0] Guest-boot-time =  43539 us 43 ms,  12921 CPU us 12 CPU ms
00.032852 Hello!
00.032971 [    0.004877] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000000
00.033038 [    0.005021] Kernel Offset: disabled
00.033052 2024-02-07T14:22:29.390008166 [anonymous-instance:fc_vcpu 0] Received KVM_EXIT_SHUTDOWN signal
00.033062 2024-02-07T14:22:29.390014799 [anonymous-instance:main] Vmm is stopping.
00.033217 2024-02-07T14:22:29.390138352 [anonymous-instance:main] Vmm is stopping.
00.092148 2024-02-07T14:22:29.449031397 [anonymous-instance:main] Firecracker exiting successfully. exit_code=0

Expected behaviour

In case 1, I'd expect the Guest-boot-time to be 33ms, as that's how much time elapsed between the startup time and the message being sent.

If the measurement includes the VM creation time, then 78ms still doesn't make sense, as ts is measuring 70ms.

Similarly for case 2, I'd expect 8ms for the Guest-boot-time and 32ms for the guest boot + VM creation, instead of 43ms.

Environment

  • Firecracker version: v1.7.0-dev, also tried on v1.6.0
  • Host and guest kernel versions: Host=6.2.0-39-generic, guest=6.7.3
  • Rootfs used: Single binary in /init, source pasted above
  • Architecture: x86_64
  • Any other relevant software versions: No

Additional context

I'm trying to measure boot times with more accuracy than using ts offsets, but the results don't seem to line up.
I've not yet looked into the source code for possible reasons.

Checks

  • Have you searched the Firecracker Issues database for similar problems?
  • Have you read the existing relevant Firecracker documentation?
  • Are you certain the bug being reported is a Firecracker issue?
@xmarcalx xmarcalx added the Priority: Low Indicates that an issue or pull request should be resolved behind issues or pull requests labelled ` label Feb 12, 2024
@pb8o
Copy link
Contributor

pb8o commented Feb 12, 2024

What CPU type are you using? The [Firmware Bug]: TSC doesn't count with P0 frequency! seems similar to #4099 Is it also an AMD CPU? Can you reproduce these differences in an Intel or an ARM CPU?

Also, could you try to run your test with --level Debug ? That outputs some information with event_start/event_end that could be useful for timing.

@DavidVentura
Copy link
Contributor Author

I'm on an AMD Ryzen 5 7640U, I can try to repro on ARM

@roypat roypat added the Status: Awaiting author Indicates that an issue or pull request requires author action label Feb 28, 2024
@pb8o
Copy link
Contributor

pb8o commented Apr 22, 2024

Hi, can you try the same workaround as in #4099, i.e. use lapic=notscdeadline as a Linux boot option? Does that workaround the issue?

@billywhizz
Copy link

i've been looking at firecracker source in recent days and i saw this issue so thought i would investigate. i see same behaviour on ubuntu kernel 6 host/guest.

if we look at the source, we can see the initial timestamp for boot_timer is recorded here:

https://github.com/firecracker-microvm/firecracker/blob/v1.7.0/src/vmm/src/builder.rs#L233

this is before all the work is done to load kernel, initrd (optionally) from disk, configure any boot disk, setup the cpus and guest memory, start the guest kernel and attach devices etc.

you can see the "kick devices" message is happening after all that work has been done when resume_vm is called here

00.036482 2024-02-07T14:14:16.726643784 [anonymous-instance:main] Artificially kick devices.

https://github.com/firecracker-microvm/firecracker/blob/v1.7.0/src/vmm/src/builder.rs#L385

so the 8ms you are talking about is just the time to give the guest thread state machines a kick and start running i think? 🤔

imo, 78ms and 43 ms are the correct timings if what we want to measure includes all the setup and configuration of the vm to the time the port gets kicked in the guest.

@xmarcalx
Copy link
Contributor

xmarcalx commented Jun 10, 2024

Thanks @billywhizz for your dive deep!

@DavidVentura does the explanation provided by @billywhizz solve your concerns?

@pb8o
Copy link
Contributor

pb8o commented Jul 15, 2024

Closing the issue since it seems the issue is resolved. Feel free to reopen if there are any followups.

@pb8o pb8o closed this as completed Jul 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Priority: Low Indicates that an issue or pull request should be resolved behind issues or pull requests labelled ` Status: Awaiting author Indicates that an issue or pull request requires author action
Projects
None yet
Development

No branches or pull requests

5 participants