Add events to enumerate kernel symbols #22
This adds an event for each kernel symbol during statedump. It uses the kallsyms_on_each_symbol kernel function to go through all available symbols.

Going through the symbols adds a fixed cost to the statedump when the tracepoints are _not_ enabled: 9ms on a machine where the statedump usually takes 20ms, an overhead slightly below 50%. With the symbol tracepoints enabled, it adds ~135 000 events and more than doubles the time of the statedump (adds 35ms, an overhead of 175% compared to a statedump without the symbols). It may cause lost events with the default channel value, since the event rate is very dense during that time.

A trace with only the kernel symbol events has a size of ~5MB on a normal desktop. In comparison, the /proc/kallsyms file is 6MB.

Signed-off-by: Geneviève Bastien <[email protected]>
lttng-kallsyms.c
/* Inspired by and partly taken from linux kernel's
 * module.c file, module_kallsyms_on_each_symbol function */
kallsyms = rcu_dereference(mod->kallsyms);
Why do you use rcu_dereference here? We are within a module coming notifier, before load_module invokes do_init_module, which does rcu_assign_pointer(mod->kallsyms, &mod->core_kallsyms);
So I don't see any point in doing an RCU dereference here, since we are serialized with respect to module loading.
Because of this comment in module.c, function module_kallsyms_on_each_symbol (line 4153)
/* We hold module_mutex: no need for rcu_dereference_sched */
struct mod_kallsyms *kallsyms = mod->kallsyms;
Since we don't hold the module_mutex here, I thought we needed the rcu_dereference call. But if you are sure we don't need it, I'll remove it.
Because the module coming notifier is called from module loading while holding an initial reference to the module (only released within do_init_module, after the coming notifier), we are guaranteed that no concurrent changes occur to this structure (it is not freed by module unloading because we still hold the initial reference).
The comment you refer to applies to accessing this structure from other execution contexts which are not part of the module notifiers.
Add tracepoints to enumerate the new symbols brought in when a new kernel module is loaded. The symbol enumeration is done using the struct module's kallsyms field. Part of the code for the enumeration is inspired by or taken from the linux kernel's module.c file.

The overhead when tracepoints are enabled depends on the number of symbols in the module, but is in the tens or hundreds of microseconds. With tracepoints disabled, overhead is below 10us in the cases tested.

When the module is unloaded, there is only one event to notify of the module unload.

The lttng_kallsyms_module_coming function is greatly inspired by linux's module.c file, and a few lines are copy-pasted from it.

Signed-off-by: Geneviève Bastien <[email protected]>
kallsyms_symbol_value() was introduced in kernel v5.0. Before that, module.c seems to access the st_value field directly. We might want to introduce a wrapper in lttng-modules that uses kallsyms_symbol_value for kernels 5.0+, and accesses the field directly prior to that.
Don't put extra time on this until we discuss the overall approach. Dumping 5MB worth of data in a statedump, mostly repeating information available from ELF and DWARF, seems inefficient. We might want to do what the lttng-ust statedump does: dump the base addresses where libraries are loaded, their path, and build ID. We just need to fetch the information required to map addresses back to offsets in the kernel image and in kernel modules. The rest could be done offline.
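To make the offline step concrete, here is a rough userspace sketch of how an analysis tool could map a traced address back to a symbol from base addresses alone. All struct names, function names, and sample values are made up for illustration; it assumes the trace records a base address and size per image, and that symbol offsets are read separately from the image's ELF data and sorted in ascending order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical statedump record: one per kernel image or module. */
struct image_info {
	const char *name; /* image or module name */
	uint64_t base;    /* load address recorded in the trace */
	uint64_t size;    /* size of the mapped region, in bytes */
};

/* Hypothetical symbol entry recovered offline from the ELF symbol table. */
struct sym_entry {
	uint64_t offset;  /* offset of the symbol from the image base */
	const char *name;
};

/* Made-up sample data, not taken from a real trace. */
static const struct image_info sample_images[] = {
	{ "vmlinux",     0xffffffff81000000ULL, 0x1000000ULL },
	{ "example_mod", 0xffffffffc0000000ULL, 0x4000ULL },
};
static const struct sym_entry sample_mod_syms[] = {
	{ 0x000, "example_init" },
	{ 0x120, "example_timer_fn" },
	{ 0x300, "example_exit" },
};

/* Return the image whose [base, base + size) range contains addr, or NULL. */
static const struct image_info *find_image(const struct image_info *imgs,
					   size_t n, uint64_t addr)
{
	for (size_t i = 0; i < n; i++) {
		if (addr >= imgs[i].base && addr - imgs[i].base < imgs[i].size)
			return &imgs[i];
	}
	return NULL;
}

/* Binary search for the last symbol whose offset is <= off, or NULL. */
static const struct sym_entry *find_symbol(const struct sym_entry *syms,
					   size_t n, uint64_t off)
{
	size_t lo = 0, hi = n;

	assert(syms != NULL || n == 0);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (syms[mid].offset <= off)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo ? &syms[lo - 1] : NULL;
}
```

With this layout, resolving an address is a find_image() call, a subtraction of the image base, and a find_symbol() call on that image's table. The trace itself only has to carry one small record per image rather than one event per symbol.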
The kernel symbols in the trace make it possible to resolve function pointers automatically from the trace's data. This can be useful for events that have function pointers as fields, like timers, or for the callstack-kernel context, which lists the kernel addresses of the current callstack.
Details on overhead and event count can be found in the specific commits.