Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Concurrency/Thread-safety #68

Open
banksJeremy opened this issue May 16, 2014 · 0 comments
Open

Concurrency/Thread-safety #68

banksJeremy opened this issue May 16, 2014 · 0 comments

Comments

@banksJeremy
Copy link
Collaborator

As discussed in #59, the public Client interface should be entirely thread-safe, whatever the users do. (The current implementation has probably unsafe behaviour internally, without any help from the users.)


The concurrency model of this code might have been sane before I touched it, but given the work I've done I'm sure isn't any more, and that there are probably many possible race conditions that could result in bugs.

For example, users from two different threads could both make requests and read and write from a Message event at the same time, possibly resulting in errors.

I propose that Client ensures that nothing modify its data from outside of a main worker thread. Anything that could modify data will need to be passed in through a queue, which will be processed by that single thread. Client will also manage our connections for throttling. Any public (non-prefixed) methods on Client should be safe to use from any thread.

Given that you have a message, and you access the missing message.user_name:

  • it calls message.scrape_transcript()
  • which queues a request for the worker thread, then blocks on a response queue
  • the worker thread makes the HTTP request through Browser (it uses the throttling > mechanism, so the request may be queued and not take place instantly)
  • once the worker thread gets the response, is updates all of the Messages and > other objects that it has learned about
  • the worker thread returns a value through the response queue
  • execution resumes in the initial thread, with the message.user_name value now > populated

The Client should also de-duplicate requests made at the same time, when possible. For example, if two different threads both call request_transcript around the same time because a field is missing, client should notice that they want the same information and only make a single request.

If a request method was called from the main worker thread, maybe it could recognize that and make the call without using the queue, to prevent a deadlock.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant