-
Notifications
You must be signed in to change notification settings - Fork 285
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
IH-533: Remove usage of forkexecd daemon to execute processes #5995
base: master
Are you sure you want to change the base?
Conversation
// child | ||
if (pipe_fds[0] >= 0) | ||
close(pipe_fds[0]); | ||
execve(args[0], args, envs); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is now a lot closer to vfork
, and should avoid previous concerns about what functions can be safely called in-between :)
(I haven't reviewed the rest yet)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Trying to reduce that part (C code in Ocaml) the only type passed is a list of string (2 parameters) and the code doing the Ocaml -> C transformation (and vice versa) is in a separated function from the C function doing the job, as suggested by Ocaml manual.
Very happy to see this making progress again. Two questions:
|
} | ||
|
||
/* | ||
* Wait a process with a given timeout. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why can't this be implemented in OCaml? Is a binding to some function missing for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It could, but it was done in C for resource constraints. Launching a thread in higher level languages is not that light (note that there are additional code even to reduce C resource usage).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason about the thread is not that easy to grasp.
On newer Linux kernel (similarly to Windows) you could use pidfd
and a simple poll
to wait for a process easily (on Windows you use the handle to the process). But that would require a newer kernel with newer Glibc (and Linux, but that should not be an issue). Waiting a process without pidfd
and threads would require some global setting (like a signal handler) which, done in a library, is not nice (library should not change global settings) and racy.
Start to think this would be nice as a comment in the code...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where do these threads run? We wouldn't want to spawn additional C threads inside XAPI, if anything goes wrong there it could crash XAPI. Can we move the monitoring threads to forkexecd
and query its status via the socket? (although if the spawned process is not a child of forkexec, monitoring its status might be more difficult).
In fact in the future we probably want to move most long running processes to systemd transient units (we just need a newer systemd that supports file descriptor passing), and we can leave the monitoring (and logging) to systemd.
We could also have a single dedicated OCaml thread in XAPI, that is responsible for waiting for all children and notifying the appropriate condition variables. In fact would calling Unix.wait
in a loop on a dedicated thread achieve that? It wouldn't increase resource usage by much (a single thread should have a fixed cost), in fact with lots of processes it might even decrease the resources used compared to launching a C thread for each one.
This should then further reduce the size of the C code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry, your comment does not make sense, the intention is to remove forkexed
, it always has been, not to use it in different ways.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having a dedicated thread in XAPI doesn't require retaining forkexecd, these were 2 separate suggestions:
- keep the monitoring and vforking in forkexecd if that simplifies the code
- use a dedicated thread in XAPI to avoid the need for extra C code
We can do the latter without doing the former.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That explanation makes sense. Your main target is reducing C code. Specifically, this part waiting for a process, potentially with a timeout. However, my Ocaml skills are not that great, so I would probably end up with pretty bad Ocaml code. Can you propose something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can look in more detail on Friday, in principle something like this (but untested, and probably needs a bit more error handling):
let wait_pid_with_timeout ~timeout pid =
let need_to_kill = Atomic.make true in
let kill_pid () =
if Atomic.get need_to_kill then
try Unix.kill Sys.SIGKILL pid
with (Unix.Unix_error(Unix.ESRCH, _, _) -> ()
in
let some_unique_id = .... in
let finally () = Xapi_periodic_scheduler.remove_from_queue some_unique_id in
Xapi_periodic_scheduler.add_to_queue some_unique_id Xapi_periodic_scheduler.OneShot timeout kill_pid;
Fun.protect ~finally (fun () ->
let pid, status = Unix.waitpid [] pid in
Atomic.set need_to_kill false;
status
)
This has the race condition between a process exiting and the timeout kicking in just at the same time (although it is unlikely that we would've also went through all pids, rolled back and got something else meanwhile in that spot).
I see you improved that in your C code, so we could perhaps add a small binding to Unixext.waitid to waitid, so that we can use waitid
instead of waitpid
, and then do the 2 step:
let pid, status = Unixext.(waitid P_PID [WEXITED | WNOWAIT] pid) in
Atomic.set need_to_kill false;
let pid, status = Unix.waitpid [] pid in
4ea627d
to
996b02f
Compare
Yes, normally you don't want these behaviors, however I would prefer to not introduce any regression that could cause frictions or issues. I bet that changing 2) won't cause any issue, but on 1) it's not exactly easy to say. For instance, thinking about migration we launch a program that uses sockets and at some point socket will be closed, potentially causing SIGPIPE to get raised. Forkexecd was there "protecting" SIGPIPE, removing this protection would cause crashes if the program itself forgot to handle that signal. They can easily get fixed later. Testing was the main reason this project was stuck for months. Now the lab is in a better shape, and we fixed some issue, but I would prefer trying avoiding that path again. |
996b02f
to
1a457e5
Compare
} timeout_kill; | ||
|
||
static void * | ||
thread_proc_timeout_kill(void *arg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
XAPI already has a singleshot timer where we can register something to happen after a given time (in OCaml), so we don't necessarily need to spawn a separate thread to do that.
See 'make_timeboxed_rpc' for an example on how to use it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks a pretty regression issue on the design. Currently, this library is pretty independent, being able to test it separately. The code you pointed out is using some Xapi code that is really Xapi dependent, requiring some thread to be spawn.
We'll discuss Wednesday about.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To sum up discussion, we are rewriting this part in Ocaml trying to reuse Xapi scheduler (making it a separate library).
1a457e5
to
a4a2f94
Compare
361ebf4
to
0f0e1ca
Compare
0f0e1ca
to
cbfb2fd
Compare
cbfb2fd
to
d027414
Compare
d027414
to
e556722
Compare
The syslog feature allows to run a program and redirect its output/error stream to system log. It's triggered passing "syslog_stdout" argument to "Forkhelpers" functions like "execute_command_get_output". To test this feature use a small preload library to redirect syslog writing. This allows to test program without changing it. The log is redirected to /tmp/xyz instead of /dev/log. The name was chosen to allow for future static build redirection. The C program is used only for the test, so the code style take into account this (specifically it does not try to handle all redirect situation and all error paths). Signed-off-by: Frediano Ziglio <[email protected]>
Forkexecd was written to avoid some issues with Ocaml and multi-threading. Instead use C code to launch processes and avoid these issues. Interface remains unchanged from Ocaml side but implementation rely entirely on C code. vfork() is used to avoid performance memory issue. Reap of the processes are done directly. Code automatically reap child processes to avoid zombies. One small helper is used to better separate Ocaml and C code and handling syslog redirection. This allows to better debug in case of issues. Syslog handling is done in a separate process allowing to restart the toolstack and keep launched programs running; note that even with forkexecd daemon one process was used for this purpose. Code tries to keep compatibility with forkexecd, in particular: - SIGPIPE is ignored in the parent; - /dev/null is open with O_WRONLY even for stdin; - file descriptors are limited to 1024. We use close_range (if available) to reduce system calls to close file descriptors. Cgroup is set to avoid systemd closing processes on toolstack restart. There's a fuzzer program to check file remapping algorithm; for this reason the algorithm is in a separate file. To turn internal debug on you need to set FORKEXECD_DEBUG_LOGS C preprocessor macro to 1. Signed-off-by: Frediano Ziglio <[email protected]>
e556722
to
2d88bd1
Compare
We are waiting for a review of the C code by @andyhhp |
Forkexecd was written to avoid some issues with Ocaml and multi-threading.
Instead use C code to launch processes and avoid these issues.
Interface remains unchanged from Ocaml side but implementation rely entirely on C code.
vfork() is used to avoid performance memory issue.
Reap of the processes are done directly.
Code automatically reap child processes to avoid zombies.
One small helper is used to better separate Ocaml and C code and handling syslog redirection. This allows to better debug in case of issues.
Syslog handling is done in a separate process allowing to restart the toolstack and keep launched programs running; note that even with forkexecd daemon one process was used for this purpose.
Code tries to keep compatibility with forkexecd, in particular:
We use close_range (if available) to reduce system calls to close file descriptors.
Cgroup is set to avoid systemd closing processes on toolstack restart.
There's a fuzzer program to check file remapping algorithm; for this reason the algorithm is in a separate file.
To turn internal debug on, you need to set FORKEXECD_DEBUG_LOGS C preprocessor macro to 1.