Hello there!
I'm still using this project and recently ran into a case where I was going to have multiple instances of an application running concurrently. They needed to share the same cache.
Because the cache writes in cache_to_disk are currently not atomic, this isn't a safe thing to do and can cause all sorts of erroneous cache misses if you have enough concurrent processes. You probably hadn't thought much about this use case, but I'm pretty sure you understand it to be a known limitation/design choice - it's not a bug.
I would like to implement this to solve my use case, and I'd like to send it as a PR if you're interested.
It could be an optional behavior, though I don't think there's much to lose in making it the default - the performance hit won't be bad in cases where it's not needed, since there will never be lock contention, and these locks are not heavyweight at all.
I'm thinking of implementing it as a file-based lock (via flock()) rather than introducing a lot of complexity with something like shared memory. It's only an "advisory" lock - any process could still mess with the files if it really wanted to - but that's not really a concern here.
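To make that concrete, here's a rough sketch of the kind of helper I have in mind (`exclusive_lock` is a made-up name; nothing like it exists in cache_to_disk yet):

```python
import fcntl
import os
from contextlib import contextmanager

@contextmanager
def exclusive_lock(path):
    # Take the advisory lock on a sidecar ".lock" file rather than on the
    # cache file itself, so the cache file can still be replaced atomically
    # (e.g. via os.rename) while the lock is held.
    fd = os.open(path + ".lock", os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # blocks until no other process holds it
        yield
    finally:
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)
```

One nice property of flock() here: the kernel releases the lock automatically when the process exits, so a crashed process can't leave the cache locked forever.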
I do understand that Python's multiprocessing can handle all of this fairly seamlessly with its own lock classes, but my use case doesn't involve concurrency via the multiprocessing (or even subprocess) packages. Instead, each instance of my application is created completely independently of the others. Imagine, for example, a non-Python web application that invokes a standalone Python application (which happens to use cache_to_disk).
Any thoughts on this?
It should be minimally invasive, because my use case is rather simple - and so is the logic, in the end.
For the cache reads, roughly:
```
lock = get_exclusive_lock(cache_file)
data = read_cache(cache_file)
release_exclusive_lock(cache_file, lock)
```
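For the cache writes, the same pattern, mirroring the read sketch above (write_cache here is just illustrative pseudocode):

```
lock = get_exclusive_lock(cache_file)
write_cache(cache_file, data)
release_exclusive_lock(cache_file, lock)
```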
I'll have to re-familiarize myself with the code, since I don't even recall how things are stored on the filesystem, but I don't see any reason this general approach wouldn't work.
It's possible this could even be done quite neatly with decorators, I guess.
Any feedback/ideas are welcome!
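For instance, the decorator flavor could look something like this (with_file_lock is a made-up name, and the lock is the flock()-based idea described earlier - purely a sketch, not how cache_to_disk does it today):

```python
import fcntl
import functools
import os
from contextlib import contextmanager

@contextmanager
def _exclusive_lock(path):
    # Minimal flock()-based advisory lock (illustrative helper).
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)
        yield
    finally:
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)

def with_file_lock(lock_path):
    # Decorator: hold the advisory lock for the duration of the call,
    # so cache reads/writes inside the function are serialized across
    # independent processes sharing the same lock file.
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            with _exclusive_lock(lock_path):
                return func(*args, **kwargs)
        return wrapper
    return decorator
```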
Thanks!