Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mount control #14

Open
l0kod opened this issue Jan 19, 2024 · 9 comments
Open

Mount control #14

l0kod opened this issue Jan 19, 2024 · 9 comments
Assignees
Labels
enhancement New feature or request

Comments

@l0kod
Copy link
Member

l0kod commented Jan 19, 2024

To avoid filesystem (FS) security policy bypass, a landlocked process with FS restrictions cannot do any FS topology changes (see d722036), which include any mount calls.

Even with FS restrictions, it would be useful for some use cases to be able to safely do new mount and umount.

The main issue I see is that we may want to allow a set of accesses on the newly mount points, independently from the existing path_beneath rules because the new mount point would overlap part of the initial file hierarchy. For this reason, I think we could have a new type of rule dedicated to mount access rights, something like LANDLOCK_RULE_MOUNT. With a dedicated attr struct, probably with landlock_path_beneath_attr's equivalent fields, we'll be able to configure a whole mount point and enforce specific options such as ro and noexec. I guess the current LSM hooks should be enough.

Because a mount would change the file hierarchy, we would also need a dedicated LANDLOCK_ACCESS_FS_MOUNT right to control this change. Everything beneath such mount point will get the source's LANDLOCK_RULE_MOUNT properties/restrictions.

For bind mounts, I think we can follow the same checks as for LANDLOCK_ACCESS_FS_REFER with the LANDLOCK_ACCESS_FS_MOUNT (which could then be used for source and destination).

This approach should also enable to allow a service to do mounts without giving it the right to access them.

I suggest to start with the bind mount case and then incrementally add support for the block device mount case (and the new related rule type).

I now think unmounts should never be denied though, so we may want to add a new flag at the ruleset level to only manage compatibility.

Related chromeOS CL: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/5077507

@l0kod l0kod added the enhancement New feature or request label Jan 19, 2024
@supersonnic
Copy link

This is useful, thanks for the details. I am working on updating that patch to use the refer logic instead of naively allowing a bind mount. I will update as I make progress.

@l0kod l0kod added this to Landlock Feb 1, 2024
@l0kod l0kod moved this to In Progress in Landlock Feb 1, 2024
@supersonnic
Copy link

Hi @l0kod, I updated the patch to support the same checks as REFER for bind-mounts and wanted to get some feedback on the high-level design. One interesting side-effect is that for the bind-mount to succeed, the source's parent needs to carry the LANDLOCK_ACCESS_FS_MOUNT rule, in addition to the destination directory (i.e. mount point). I think this is OK, as it works the same way for rename and link. For example, if I want to move /parent/child, I need to have the REFER rule applied to /parent. Similarly, to mount /parent/child the MOUNT rule should be applied to /parent. Let me know if the patch is inline with what you envisioned. Thanks.

@l0kod
Copy link
Member Author

l0kod commented Feb 13, 2024

This looks good overall.

Thinking more about it, we should probably (re)use LANDLOCK_ACCESS_FS_REFER (instead of LANDLOCK_ACCESS_FS_MOUNT) to check the right to "refer" a file/directory because the related Landlock constraints are the same. This is for the UAPI, but the kernel implementation may differ a bit because of different assumptions (e.g. EXDEV, EACCES, or EPERM error codes).

LANDLOCK_ACCESS_FS_MOUNT should still be required on the destination though, similarly to LANDLOCK_ACCESS_FS_MAKE_*.

I'm worried about a race conditions in hook_sb_mount() regarding dev_name. I think we need a new LSM hook for bind mounts to avoid this kind of issue.

With that in mind, and a clang-formatted code, you can send a first RFC patch series, we'll review it there. Please add at least a few tests for now and propose a set of test scenarios for when we'll be OK with the approach.

You can use https://github.com/landlock-lsm/landlock-test-tools on top of my next branch to test it, or at least on top of b60a173 to avoid the IOCTL patches.

@supersonnic
Copy link

LANDLOCK_ACCESS_FS_MOUNT should still be required on the destination though, similarly to LANDLOCK_ACCESS_FS_MAKE_*.

Doesn’t this mean in practice we are also requiring LANDLOCK_ACCESS_FS_MOUNT on the source? If the rule only exists on the destination, the source would gain a new privilege (mount privilege) through the bind mount operation so the operation would fail with EXDEV. If it is missing on the destination, we get EACCES. Is that the behavior we want?

I'm worried about a race conditions in hook_sb_mount() regarding dev_name. I think we need a new LSM hook for bind mounts to avoid this kind of issue.

I think I found a place where we can add a hook for bind mount after the path is evaluated to eliminate the race.

Will work on the tests and preparing for RFC patches.

@l0kod
Copy link
Member Author

l0kod commented Mar 11, 2024

LANDLOCK_ACCESS_FS_MOUNT should still be required on the destination though, similarly to LANDLOCK_ACCESS_FS_MAKE_*.

Doesn’t this mean in practice we are also requiring LANDLOCK_ACCESS_FS_MOUNT on the source? If the rule only exists on the destination, the source would gain a new privilege (mount privilege) through the bind mount operation so the operation would fail with EXDEV. If it is missing on the destination, we get EACCES. Is that the behavior we want?

Because of the way mounts propagate by default, we should only allow bind mounts on private mounts if the source doesn't have LANDLOCK_ACCESS_FS_MOUNT. We should then be able to create a new bind mount on the destination directory and mark it as private, and then do another bind mount with the source directory on the destination one (covering the private mount point).

We should also be careful to only create mount point that are not less-restrictive than the source (e.g. removing ro).

For mount operation, no EXDEV error should be returned, only EPERM or EACCES (see mount(2)).

I'm worried about a race conditions in hook_sb_mount() regarding dev_name. I think we need a new LSM hook for bind mounts to avoid this kind of issue.

I think I found a place where we can add a hook for bind mount after the path is evaluated to eliminate the race.

Looking at it again with an up-to-date mount utility, the new mount syscalls (e.g. open_tree, move_mount) are now used and security_move_mount() may be enough.

Will work on the tests and preparing for RFC patches.

Great, we can continue the discussion there.

@l0kod
Copy link
Member Author

l0kod commented Mar 11, 2024

Because of the way mounts propagate by default, we should only allow bind mounts on private mounts if the source doesn't have LANDLOCK_ACCESS_FS_MOUNT.

Well, to allow a bind mount on a shared mount point, it's not the source that should have LANDLOCK_ACCESS_FS_MOUNT but the destination (shared) mount point (not only the destination file hierarchy). collect_domain_accesses() may be useful to check that.

However, if the destination mount point is private, we can look for LANDLOCK_ACCESS_FS_MOUNT in its parent mount point and continue looking for these properties in the file hierarchy...

l0kod pushed a commit that referenced this issue Sep 9, 2024
[ Upstream commit a699781 ]

A sysfs reader can race with a device reset or removal, attempting to
read device state when the device is not actually present. eg:

     [exception RIP: qed_get_current_link+17]
  #8 [ffffb9e4f2907c48] qede_get_link_ksettings at ffffffffc07a994a [qede]
  #9 [ffffb9e4f2907cd8] __rh_call_get_link_ksettings at ffffffff992b01a3
 #10 [ffffb9e4f2907d38] __ethtool_get_link_ksettings at ffffffff992b04e4
 #11 [ffffb9e4f2907d90] duplex_show at ffffffff99260300
 #12 [ffffb9e4f2907e38] dev_attr_show at ffffffff9905a01c
 #13 [ffffb9e4f2907e50] sysfs_kf_seq_show at ffffffff98e0145b
 #14 [ffffb9e4f2907e68] seq_read at ffffffff98d902e3
 #15 [ffffb9e4f2907ec8] vfs_read at ffffffff98d657d1
 #16 [ffffb9e4f2907f00] ksys_read at ffffffff98d65c3f
 #17 [ffffb9e4f2907f38] do_syscall_64 at ffffffff98a052fb

 crash> struct net_device.state ffff9a9d21336000
    state = 5,

state 5 is __LINK_STATE_START (0b1) and __LINK_STATE_NOCARRIER (0b100).
The device is not present, note lack of __LINK_STATE_PRESENT (0b10).

This is the same sort of panic as observed in commit 4224cfd
("net-sysfs: add check for netdevice being present to speed_show").

There are many other callers of __ethtool_get_link_ksettings() which
don't have a device presence check.

Move this check into ethtool to protect all callers.

Fixes: d519e17 ("net: export device speed and duplex via sysfs")
Fixes: 4224cfd ("net-sysfs: add check for netdevice being present to speed_show")
Signed-off-by: Jamie Bainbridge <[email protected]>
Link: https://patch.msgid.link/8bae218864beaa44ed01628140475b9bf641c5b0.1724393671.git.jamie.bainbridge@gmail.com
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
@l0kod
Copy link
Member Author

l0kod commented Nov 14, 2024

@supersonnic, what is the status of your work? Do you need some help?

This looks ready for an RFC: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/5722641

@supersonnic
Copy link

supersonnic commented Nov 15, 2024

The main issue I'm having right now is that I cannot get LandLock to only accept the bind-mount if the LANDLOCK_ACCESS_FS_MOUNT rule is carried directly by the mount point, and not its parent directories. I tried to achieve this by adding the extra bind-mount check (in fs.c: 1144-1163 of the referenced patch). To do this I pass the parent of the target directory as the mt_root argument for the collect_domain_accesses function, hoping that it stops collecting rules at the parent, effectively only collecting the rules carried by the mount point, but this does not seem to be the case, as the parents' rules keep getting collected also.

In other words, the bug in the patch is that it allows mounts even when one of the mount point parent directories has the rule. This is a problem, as aloowing bind-mounts under an existing bind-mount can open a pathway for policy bypass.

This is an example of a policy bypass that can be achieved if we allow inheritance of the mount point rule. Consider the following directory structure. First we bind-mount /tmp/public to /usr/local/share using the --shared-mount option. This is allowed because /usr/local/share does carry the mount rule and there is no privileges gained (though execution is dropped). Then we bind-mount /bin/exe to /usr/local/share/b. This is allowed because the mount point rule is inherited from b's parent, share and there is no privilege change between /bin/exe and /usr/local/share/b/exe, they both have only RW. However, because the first mount was a shared-mount, the exe mount also propagates under /tmp/public/b through which it can now be executed, hence the privilege escalation!

PXL_20240416_190019349

I considered intercepting the propagate system call, but did not have success with that, so I think the best way to mitigate this is to modify the rule checking behavior for mount point, such that it cannot be inherited, which brings us back the issue I'm having with that patch. I just need a way to collect and check the rules carried by a single directory.

@l0kod
Copy link
Member Author

l0kod commented Nov 25, 2024

OK, this would indeed be a valid attack.

The issue is with the MS_SHARED option which sets the source as a future implicit mount destination. I think it would make sense to only allow MS_SHARED if the destination and the source have (or inherit) the LANDLOCK_ACCESS_FS_MOUNT right.

There is still the problem of the initial (potentially shared) mount points before entering a sandbox (which is already there), but that is outside the scope of Landlock, this is the responsibility of the system owner.

l0kod pushed a commit that referenced this issue Dec 16, 2024
commit 8d9ffb2 upstream.

The kdump kernel is broken on SME systems with CONFIG_IMA_KEXEC=y enabled.
Debugging traced the issue back to

  b69a2af ("x86/kexec: Carry forward IMA measurement log on kexec").

Testing was previously not conducted on SME systems with CONFIG_IMA_KEXEC
enabled, which led to the oversight, with the following incarnation:

...
  ima: No TPM chip found, activating TPM-bypass!
  Loading compiled-in module X.509 certificates
  Loaded X.509 cert 'Build time autogenerated kernel key: 18ae0bc7e79b64700122bb1d6a904b070fef2656'
  ima: Allocated hash algorithm: sha256
  Oops: general protection fault, probably for non-canonical address 0xcfacfdfe6660003e: 0000 [#1] PREEMPT SMP NOPTI
  CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.11.0-rc2+ #14
  Hardware name: Dell Inc. PowerEdge R7425/02MJ3T, BIOS 1.20.0 05/03/2023
  RIP: 0010:ima_restore_measurement_list
  Call Trace:
   <TASK>
   ? show_trace_log_lvl
   ? show_trace_log_lvl
   ? ima_load_kexec_buffer
   ? __die_body.cold
   ? die_addr
   ? exc_general_protection
   ? asm_exc_general_protection
   ? ima_restore_measurement_list
   ? vprintk_emit
   ? ima_load_kexec_buffer
   ima_load_kexec_buffer
   ima_init
   ? __pfx_init_ima
   init_ima
   ? __pfx_init_ima
   do_one_initcall
   do_initcalls
   ? __pfx_kernel_init
   kernel_init_freeable
   kernel_init
   ret_from_fork
   ? __pfx_kernel_init
   ret_from_fork_asm
   </TASK>
  Modules linked in:
  ---[ end trace 0000000000000000 ]---
  ...
  Kernel panic - not syncing: Fatal exception
  Kernel Offset: disabled
  Rebooting in 10 seconds..

Adding debug printks showed that the stored addr and size of ima_kexec buffer
are not decrypted correctly like:

  ima: ima_load_kexec_buffer, buffer:0xcfacfdfe6660003e, size:0xe48066052d5df359

Three types of setup_data info

  — SETUP_EFI,
  - SETUP_IMA, and
  - SETUP_RNG_SEED

are passed to the kexec/kdump kernel. Only the ima_kexec buffer
experienced incorrect decryption. Debugging identified a bug in
early_memremap_is_setup_data(), where an incorrect range calculation
occurred due to the len variable in struct setup_data ended up only
representing the length of the data field, excluding the struct's size,
and thus leading to miscalculation.

Address a similar issue in memremap_is_setup_data() while at it.

  [ bp: Heavily massage. ]

Fixes: b3c72fc ("x86/boot: Introduce setup_indirect")
Signed-off-by: Baoquan He <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Acked-by: Tom Lendacky <[email protected]>
Cc: <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
l0kod pushed a commit that referenced this issue Dec 16, 2024
[ Upstream commit 8619593 ]

I found the following bug in my fuzzer:

  UBSAN: array-index-out-of-bounds in drivers/net/wireless/ath/ath9k/htc_hst.c:26:51
  index 255 is out of range for type 'htc_endpoint [22]'
  CPU: 0 UID: 0 PID: 8 Comm: kworker/0:0 Not tainted 6.11.0-rc6-dirty #14
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
  Workqueue: events request_firmware_work_func
  Call Trace:
   <TASK>
   dump_stack_lvl+0x180/0x1b0
   __ubsan_handle_out_of_bounds+0xd4/0x130
   htc_issue_send.constprop.0+0x20c/0x230
   ? _raw_spin_unlock_irqrestore+0x3c/0x70
   ath9k_wmi_cmd+0x41d/0x610
   ? mark_held_locks+0x9f/0xe0
   ...

Since this bug has been confirmed to be caused by insufficient verification
of conn_rsp_epid, I think it would be appropriate to add a range check for
conn_rsp_epid to htc_connect_service() to prevent the bug from occurring.

Fixes: fb9987d ("ath9k_htc: Support for AR9271 chipset.")
Signed-off-by: Jeongjun Park <[email protected]>
Acked-by: Toke Høiland-Jørgensen <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
l0kod pushed a commit that referenced this issue Dec 16, 2024
commit 8d9ffb2 upstream.

The kdump kernel is broken on SME systems with CONFIG_IMA_KEXEC=y enabled.
Debugging traced the issue back to

  b69a2af ("x86/kexec: Carry forward IMA measurement log on kexec").

Testing was previously not conducted on SME systems with CONFIG_IMA_KEXEC
enabled, which led to the oversight, with the following incarnation:

...
  ima: No TPM chip found, activating TPM-bypass!
  Loading compiled-in module X.509 certificates
  Loaded X.509 cert 'Build time autogenerated kernel key: 18ae0bc7e79b64700122bb1d6a904b070fef2656'
  ima: Allocated hash algorithm: sha256
  Oops: general protection fault, probably for non-canonical address 0xcfacfdfe6660003e: 0000 [#1] PREEMPT SMP NOPTI
  CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.11.0-rc2+ #14
  Hardware name: Dell Inc. PowerEdge R7425/02MJ3T, BIOS 1.20.0 05/03/2023
  RIP: 0010:ima_restore_measurement_list
  Call Trace:
   <TASK>
   ? show_trace_log_lvl
   ? show_trace_log_lvl
   ? ima_load_kexec_buffer
   ? __die_body.cold
   ? die_addr
   ? exc_general_protection
   ? asm_exc_general_protection
   ? ima_restore_measurement_list
   ? vprintk_emit
   ? ima_load_kexec_buffer
   ima_load_kexec_buffer
   ima_init
   ? __pfx_init_ima
   init_ima
   ? __pfx_init_ima
   do_one_initcall
   do_initcalls
   ? __pfx_kernel_init
   kernel_init_freeable
   kernel_init
   ret_from_fork
   ? __pfx_kernel_init
   ret_from_fork_asm
   </TASK>
  Modules linked in:
  ---[ end trace 0000000000000000 ]---
  ...
  Kernel panic - not syncing: Fatal exception
  Kernel Offset: disabled
  Rebooting in 10 seconds..

Adding debug printks showed that the stored addr and size of ima_kexec buffer
are not decrypted correctly like:

  ima: ima_load_kexec_buffer, buffer:0xcfacfdfe6660003e, size:0xe48066052d5df359

Three types of setup_data info

  — SETUP_EFI,
  - SETUP_IMA, and
  - SETUP_RNG_SEED

are passed to the kexec/kdump kernel. Only the ima_kexec buffer
experienced incorrect decryption. Debugging identified a bug in
early_memremap_is_setup_data(), where an incorrect range calculation
occurred due to the len variable in struct setup_data ended up only
representing the length of the data field, excluding the struct's size,
and thus leading to miscalculation.

Address a similar issue in memremap_is_setup_data() while at it.

  [ bp: Heavily massage. ]

Fixes: b3c72fc ("x86/boot: Introduce setup_indirect")
Signed-off-by: Baoquan He <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Acked-by: Tom Lendacky <[email protected]>
Cc: <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
l0kod pushed a commit that referenced this issue Dec 16, 2024
[ Upstream commit 8619593 ]

I found the following bug in my fuzzer:

  UBSAN: array-index-out-of-bounds in drivers/net/wireless/ath/ath9k/htc_hst.c:26:51
  index 255 is out of range for type 'htc_endpoint [22]'
  CPU: 0 UID: 0 PID: 8 Comm: kworker/0:0 Not tainted 6.11.0-rc6-dirty #14
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
  Workqueue: events request_firmware_work_func
  Call Trace:
   <TASK>
   dump_stack_lvl+0x180/0x1b0
   __ubsan_handle_out_of_bounds+0xd4/0x130
   htc_issue_send.constprop.0+0x20c/0x230
   ? _raw_spin_unlock_irqrestore+0x3c/0x70
   ath9k_wmi_cmd+0x41d/0x610
   ? mark_held_locks+0x9f/0xe0
   ...

Since this bug has been confirmed to be caused by insufficient verification
of conn_rsp_epid, I think it would be appropriate to add a range check for
conn_rsp_epid to htc_connect_service() to prevent the bug from occurring.

Fixes: fb9987d ("ath9k_htc: Support for AR9271 chipset.")
Signed-off-by: Jeongjun Park <[email protected]>
Acked-by: Toke Høiland-Jørgensen <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
l0kod pushed a commit that referenced this issue Dec 16, 2024
commit 9af2efe upstream.

The fields in the hist_entry are filled on-demand which means they only
have meaningful values when relevant sort keys are used.

So if neither of 'dso' nor 'sym' sort keys are used, the map/symbols in
the hist entry can be garbage.  So it shouldn't access it
unconditionally.

I got a segfault, when I wanted to see cgroup profiles.

  $ sudo perf record -a --all-cgroups --synth=cgroup true

  $ sudo perf report -s cgroup

  Program received signal SIGSEGV, Segmentation fault.
  0x00005555557a8d90 in map__dso (map=0x0) at util/map.h:48
  48		return RC_CHK_ACCESS(map)->dso;
  (gdb) bt
  #0  0x00005555557a8d90 in map__dso (map=0x0) at util/map.h:48
  #1  0x00005555557aa39b in map__load (map=0x0) at util/map.c:344
  #2  0x00005555557aa592 in map__find_symbol (map=0x0, addr=140736115941088) at util/map.c:385
  #3  0x00005555557ef000 in hists__findnew_entry (hists=0x555556039d60, entry=0x7fffffffa4c0, al=0x7fffffffa8c0, sample_self=true)
      at util/hist.c:644
  #4  0x00005555557ef61c in __hists__add_entry (hists=0x555556039d60, al=0x7fffffffa8c0, sym_parent=0x0, bi=0x0, mi=0x0, ki=0x0,
      block_info=0x0, sample=0x7fffffffaa90, sample_self=true, ops=0x0) at util/hist.c:761
  #5  0x00005555557ef71f in hists__add_entry (hists=0x555556039d60, al=0x7fffffffa8c0, sym_parent=0x0, bi=0x0, mi=0x0, ki=0x0,
      sample=0x7fffffffaa90, sample_self=true) at util/hist.c:779
  #6  0x00005555557f00fb in iter_add_single_normal_entry (iter=0x7fffffffa900, al=0x7fffffffa8c0) at util/hist.c:1015
  #7  0x00005555557f09a7 in hist_entry_iter__add (iter=0x7fffffffa900, al=0x7fffffffa8c0, max_stack_depth=127, arg=0x7fffffffbce0)
      at util/hist.c:1260
  #8  0x00005555555ba7ce in process_sample_event (tool=0x7fffffffbce0, event=0x7ffff7c14128, sample=0x7fffffffaa90, evsel=0x555556039ad0,
      machine=0x5555560388e8) at builtin-report.c:334
  #9  0x00005555557b30c8 in evlist__deliver_sample (evlist=0x555556039010, tool=0x7fffffffbce0, event=0x7ffff7c14128,
      sample=0x7fffffffaa90, evsel=0x555556039ad0, machine=0x5555560388e8) at util/session.c:1232
  #10 0x00005555557b32bc in machines__deliver_event (machines=0x5555560388e8, evlist=0x555556039010, event=0x7ffff7c14128,
      sample=0x7fffffffaa90, tool=0x7fffffffbce0, file_offset=110888, file_path=0x555556038ff0 "perf.data") at util/session.c:1271
  #11 0x00005555557b3848 in perf_session__deliver_event (session=0x5555560386d0, event=0x7ffff7c14128, tool=0x7fffffffbce0,
      file_offset=110888, file_path=0x555556038ff0 "perf.data") at util/session.c:1354
  #12 0x00005555557affaf in ordered_events__deliver_event (oe=0x555556038e60, event=0x555556135aa0) at util/session.c:132
  #13 0x00005555557bb605 in do_flush (oe=0x555556038e60, show_progress=false) at util/ordered-events.c:245
  #14 0x00005555557bb95c in __ordered_events__flush (oe=0x555556038e60, how=OE_FLUSH__ROUND, timestamp=0) at util/ordered-events.c:324
  #15 0x00005555557bba46 in ordered_events__flush (oe=0x555556038e60, how=OE_FLUSH__ROUND) at util/ordered-events.c:342
  #16 0x00005555557b1b3b in perf_event__process_finished_round (tool=0x7fffffffbce0, event=0x7ffff7c15bb8, oe=0x555556038e60)
      at util/session.c:780
  #17 0x00005555557b3b27 in perf_session__process_user_event (session=0x5555560386d0, event=0x7ffff7c15bb8, file_offset=117688,
      file_path=0x555556038ff0 "perf.data") at util/session.c:1406

As you can see the entry->ms.map was NULL even if he->ms.map has a
value.  This is because 'sym' sort key is not given, so it cannot assume
whether he->ms.sym and entry->ms.sym is the same.  I only checked the
'sym' sort key here as it implies 'dso' behavior (so maps are the same).

Fixes: ac01c8c ("perf hist: Update hist symbol when updating maps")
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Matt Fleming <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this issue Dec 31, 2024
This patch introduces a new LandLock rule, LANDLOCK_ACCESS_FS_MOUNT for
controlling bind mount operations. While this patch focuses on bind
mounts only, the long-term goal is to utilize this rule for controlling
regular device mounts as well. To perform a successful bind mount, both
the source parent and the destination need to have the REFER rule and
the destination needs to carry the MOUNT rule. Note that this does imply
that the MOUNT rule needs to also exist in the source hierarchy, to
avoid rejection due to privilege escalation. The same path access logic
as REFER (used for rename/move and link) is used to check for any
possible privilege escalations before allowing the bind mount.

Additionally, only private bind mounts are allowed. This is to avoid
filesystem hierarchy changes that happen through mount propagation
between peer mount groups. For instance, bind mounting a directory in a
shared mount namespace, can result in hierarchy changes propagating
across the peer groups in that namespace. Such changes cannot be checked
for privilege escalation as the REFER check only covers the source and
the destination surfaced in the bind mount hook function.

If a bind mount request carries a --make-shared flag, or any flag other
than --make-private, the bind mount succeeds while the subsequent
propagation type change fails. This is because, the kernel calls the
mount hook once with the --bind mount flag, and then once more with the
flag for the propagation type change. If the bind mount request carries
the --make-private flag, both the mount operation and the subsequent
propagation type change succeed. Any mount request with the
--make-private flag only is also allowed. Such requests operate on
existing mount points, changing the propagation type to private. In this
case any previously propagated mounts would continue to exist, but
additional bind mounts under the newly privatized mount point, would not
propagate anymore.

Finally, any existing mounts or bind mounts before the process enters a
LandLock domain remain as they are. Such mounts can be of the shared
propagation type, and they would continue to share updates with the rest
of their peer group. While this is an existing behavior, after this
patch such mounts can also be remounted as private, or be unmounted
after the process enters the sandbox. Existing mounts are outside the
scope of LandLock and should be considered before entering the sandbox.

Tests: Regular mount is rejected. Bind mount is rejected if either the
source parent or the mount point do not have the REFER rule, or if the
destination does not have the MOUNT rule. Bind mount is allowed only if
the REFER check passes, i.e.: no restrictions are dropped. Bind mounting
a directory onto itself is always allowed. Mount calls with the flag
--make-private are always allowed. Unmounting is always allowed. Link
and rename functionality is unaffected. All LandLock self-tests pass.

Link: landlock-lsm#14
Signed-off-by: Shervin Oloumi <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
Status: In Progress
Development

No branches or pull requests

2 participants