-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: add lock for enif_send synchronization #19
Conversation
if(!cancelled) { | ||
enif_mutex_lock(future->msg_lock); | ||
msg = T2(future->msg_env, future->msg_ref, ATOM_ready); | ||
enif_send(NULL, &(future->pid), future->msg_env, msg); | ||
enif_mutex_unlock(future->msg_lock); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is buggy, see erlang/otp#8358 (comment) and the following comments. Both the part about not passing NULL
blindly as the environment for enif_send
, and creating a new message environment for each call to enif_send
(make sure to copy msg_ref
into that environment).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jhogberg really appreciated for your comments!
We are stuck here, for what to set for the caller_env
in enif_send
.
It is clear that we could use NULL
from callback in a 3rd party managed thread.
However as you mentioned in erlang/otp#8358 (comment), it is tricky to handle the sync callback from the 3rd party which is executed by beam managed scheduler thread.
In the PR https://github.com/emqx/couchdb-erlfdb/pull/18/files#diff-ceadfa165520612e5b6b1a255f7680bcaa9e2a67bf47677f6884fd1db3cb03f5L122 it tried to fix the issue of reusing the env
of previous nif function call, as the env
becomes invalid after that NIF function call. The old behaviour is to save the caller's env and reuse it in the caller. So we change it to NULL
to pretend the enif_send
is called from a none beam managed and it seems to work in our CI env.
as I see you propose following:
thread-specific data is probably the most straightforward way
Do you mean we could use "process independent environments " ? We are a bit unsure about that because it was not stated in the doc about call_env
caller_env
The environment of the calling thread ([process bound](https://www.erlang.org/doc/man/erl_nif#proc_bound_env) or [callback](https://www.erlang.org/doc/man/erl_nif#callback_env) environment) or NULL if calling from a custom thread not spawned by ERTS.
BTW I look into the enif_send
implementation and I couldn't see set caller_env
to NULL in scheduler thread could cause issue as when caller_env
is set to NULL, the c_p
is set to NULL then the side effect of it is just send the from
to undefined. I am double checking, will be appreciated if you could give me some hints.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I mean using enif_tsd_key_create
when the NIF is loaded and then enif_tsd_set(key_goes_here, nif_environment_goes_here)
on every entry into a NIF function/callback, and then passing the result of enif_tsd_get(key_goes_here)
as the caller environment of enif_send
.
That will ensure that the process-bound/callback environment is passed when running on a BEAM-managed thread, and NULL otherwise. :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the process-bound env safe to pass here? what if the process is dead or moved to other scheduler thread?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The process-bound or callback environment is guaranteed to be safe to use until its corresponding NIF or NIF callback returns. Thus, if you always set the environment through enif_tsd_set
on entry to NIFs/NIF callbacks, you can use enif_tsd_get
to get a safe environment in case erlfdb_future_cb
is called synchronously.
When erlfdb_future_cb
is called asynchronously, enif_tsd_get
will return NULL
since you've never set it on the third-party threads, which is fine since enif_send
requires NULL
in that case.
cancelled = future->cancelled; | ||
enif_mutex_unlock(future->lock); | ||
enif_mutex_unlock(future->cancel_lock); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we want to prevent duplicated message sending below, we should take the locker longer.
cancelled = future->cancelled; | ||
enif_mutex_unlock(future->lock); | ||
enif_mutex_unlock(future->cancel_lock); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we want to prevent duplicated message sending below, we should take the locker longer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please look at a newer version 😅: #20
#18 (comment)