Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: add lock for enif_send synchronization #19

Merged
merged 1 commit into from
Apr 9, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
23 changes: 14 additions & 9 deletions c_src/main.c
Original file line number Diff line number Diff line change
Expand Up @@ -78,17 +78,17 @@ erlfdb_future_cb(FDBFuture* fdb_future, void* data)
ERL_NIF_TERM msg;
bool cancelled;


enif_mutex_lock(future->lock);
enif_mutex_lock(future->cancel_lock);
cancelled = future->cancelled;
enif_mutex_unlock(future->lock);
enif_mutex_unlock(future->cancel_lock);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we want to prevent duplicated message sending below, we should take the locker longer.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we want to prevent duplicated message sending below, we should take the locker longer.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please look at a newer version 😅: #20


if(!cancelled) {
enif_mutex_lock(future->msg_lock);
msg = T2(future->msg_env, future->msg_ref, ATOM_ready);
enif_send(NULL, &(future->pid), future->msg_env, msg);
enif_mutex_unlock(future->msg_lock);
}
Comment on lines 85 to 90
Copy link

@jhogberg jhogberg Apr 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is buggy, see erlang/otp#8358 (comment) and the following comments. Both the part about not passing NULL blindly as the environment for enif_send, and creating a new message environment for each call to enif_send (make sure to copy msg_ref into that environment).

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jhogberg really appreciated for your comments!

We are stuck here, for what to set for the caller_env in enif_send.
It is clear that we could use NULL from callback in a 3rd party managed thread.

However as you mentioned in erlang/otp#8358 (comment), it is tricky to handle the sync callback from the 3rd party which is executed by beam managed scheduler thread.

In the PR https://github.com/emqx/couchdb-erlfdb/pull/18/files#diff-ceadfa165520612e5b6b1a255f7680bcaa9e2a67bf47677f6884fd1db3cb03f5L122 it tried to fix the issue of reusing the env of previous nif function call, as the env becomes invalid after that NIF function call. The old behaviour is to save the caller's env and reuse it in the caller. So we change it to NULL to pretend the enif_send is called from a none beam managed and it seems to work in our CI env.

as I see you propose following:

thread-specific data is probably the most straightforward way

Do you mean we could use "process independent environments " ? We are a bit unsure about that because it was not stated in the doc about call_env

caller_env
The environment of the calling thread ([process bound](https://www.erlang.org/doc/man/erl_nif#proc_bound_env) or [callback](https://www.erlang.org/doc/man/erl_nif#callback_env) environment) or NULL if calling from a custom thread not spawned by ERTS.

BTW I look into the enif_send implementation and I couldn't see set caller_env to NULL in scheduler thread could cause issue as when caller_env is set to NULL, the c_p is set to NULL then the side effect of it is just send the from to undefined. I am double checking, will be appreciated if you could give me some hints.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I mean using enif_tsd_key_create when the NIF is loaded and then enif_tsd_set(key_goes_here, nif_environment_goes_here) on every entry into a NIF function/callback, and then passing the result of enif_tsd_get(key_goes_here)as the caller environment of enif_send.

That will ensure that the process-bound/callback environment is passed when running on a BEAM-managed thread, and NULL otherwise. :-)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the process-bound env safe to pass here? what if the process is dead or moved to other scheduler thread?

Copy link

@jhogberg jhogberg Apr 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The process-bound or callback environment is guaranteed to be safe to use until its corresponding NIF or NIF callback returns. Thus, if you always set the environment through enif_tsd_set on entry to NIFs/NIF callbacks, you can use enif_tsd_get to get a safe environment in case erlfdb_future_cb is called synchronously.

When erlfdb_future_cb is called asynchronously, enif_tsd_get will return NULL since you've never set it on the third-party threads, which is fine since enif_send requires NULL in that case.



// We're now done with this future which means we need
// to release our handle to it. See erlfdb_create_future
// for more on why this happens here.
Expand All @@ -107,14 +107,19 @@ erlfdb_create_future(ErlNifEnv* env, FDBFuture* future, ErlFDBFutureGetter gette
ERL_NIF_TERM ret;
fdb_error_t err;

// TODO: check results of each operation.
// TODO: return error if any operation fails and clean previously allocated resources.

f = enif_alloc_resource(ErlFDBFutureRes, sizeof(ErlFDBFuture));
f->future = future;
f->fgetter = getter;
enif_self(env, &(f->pid));
f->msg_env = enif_alloc_env();
f->msg_ref = enif_make_copy(f->msg_env, ref);
f->lock = enif_mutex_create("fdb:future_lock");
f->msg_lock = enif_mutex_create("fdb:future_msg_lock");

f->cancelled = false;
f->cancel_lock = enif_mutex_create("fdb:future_cancel_lock");

// This resource reference counting dance is a bit
// awkward as erlfdb_future_cb can be called both
Expand Down Expand Up @@ -557,10 +562,10 @@ erlfdb_future_cancel(ErlNifEnv* env, int argc, const ERL_NIF_TERM argv[])
}
future = (ErlFDBFuture*) res;

enif_mutex_lock(future->lock);
enif_mutex_lock(future->cancel_lock);

future->cancelled = true;
enif_mutex_unlock(future->lock);
enif_mutex_unlock(future->cancel_lock);

fdb_future_cancel(future->future);

Expand Down Expand Up @@ -589,11 +594,11 @@ erlfdb_future_silence(ErlNifEnv* env, int argc, const ERL_NIF_TERM argv[])
}
future = (ErlFDBFuture*) res;

enif_mutex_lock(future->lock);
enif_mutex_lock(future->cancel_lock);

future->cancelled = true;

enif_mutex_unlock(future->lock);
enif_mutex_unlock(future->cancel_lock);

return ATOM_ok;
}
Expand Down
8 changes: 6 additions & 2 deletions c_src/resources.c
Original file line number Diff line number Diff line change
Expand Up @@ -87,8 +87,12 @@ erlfdb_future_dtor(ErlNifEnv* env, void* obj)
enif_free_env(f->msg_env);
}

if(f->lock != NULL) {
enif_mutex_destroy(f->lock);
if(f->cancel_lock != NULL) {
enif_mutex_destroy(f->cancel_lock);
}

if(f->msg_lock != NULL) {
enif_mutex_destroy(f->msg_lock);
}
}

Expand Down
4 changes: 3 additions & 1 deletion c_src/resources.h
Original file line number Diff line number Diff line change
Expand Up @@ -35,8 +35,10 @@ struct _ErlFDBFuture
ErlNifPid pid;
ErlNifEnv* msg_env;
ERL_NIF_TERM msg_ref;
ErlNifMutex* lock;
ErlNifMutex* msg_lock;

bool cancelled;
ErlNifMutex* cancel_lock;
};


Expand Down