Potential extension to the library #45
Comments
Have you ever actually encountered an exception with
The thought of having our own internal job queue never occurred to me, since that's supposed to be the whole point of using something like Grid Engine in the first place, but I can see how that would solve the batching problem we've been discussing for a while in #27. That said, I think simply placing the jobs on hold after they get submitted to Grid Engine would be the better user experience, since they'd get to see the jobs as suspended/held in
I'd love the help with GridMap, so if you put together a PR that uses either approach, I'd gladly review it.
I will be modifying the library code because I need these changes to run my
What's a good way to go about architecting this from a code standpoint?
Hello...
Edit: I just realized that there is a related issue already opened: Add support for limiting the number of concurrently executing jobs #27. I believe this is also what I am trying to address with my approach below.
I am trying to modify gridmap's behavior. I have a large number of jobs (~10,000 or so) that I need to run to completion. I don't care about combining the results from the individual jobs; each one is self-contained and just needs to finish. I would like to build on top of your infrastructure to achieve this. I also don't want to submit all ~10,000 jobs to Grid Engine at once. Instead, I want to submit them in chunks, essentially building a pool of processes (a queue would probably be a better term). When one process finishes, whether by completing or by raising an exception, it is removed from the job queue and one or more new processes are added to the pool, depending on how many free slots are left.
As I see it, this would require at least two changes to the JobMonitor script:
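To make the pool idea concrete, here is a rough sketch of the refill loop I have in mind. It is not gridmap code: `run_with_bounded_pool`, `submit`, and `is_finished` are hypothetical names I made up for illustration, and the two callbacks stand in for whatever actually hands a job to Grid Engine and checks whether it has ended (successfully or with an exception).

```python
import time
from collections import deque


def run_with_bounded_pool(jobs, pool_size, submit, is_finished, poll_interval=0.0):
    """Keep at most `pool_size` jobs in flight until every job has ended.

    `submit(job)` is a hypothetical callback that starts a job and returns a
    hashable handle; `is_finished(handle)` is a hypothetical callback that
    returns True once that job has ended, whether it completed or failed.
    """
    pending = deque(jobs)   # jobs not yet submitted
    active = {}             # handle -> job, for jobs currently in flight
    finished = []           # jobs that have ended, in the order we noticed

    while pending or active:
        # Fill every free slot in the pool from the pending queue.
        while pending and len(active) < pool_size:
            job = pending.popleft()
            active[submit(job)] = job

        # Collect the handles that have ended, then free their slots.
        for handle in [h for h in active if is_finished(h)]:
            finished.append(active.pop(handle))

        # Avoid hammering the scheduler when polling a real backend.
        if active and poll_interval > 0:
            time.sleep(poll_interval)

    return finished
```

With a real backend, `poll_interval` would be set to a few seconds. The point of the sketch is only that submission is driven by free slots rather than by the total number of jobs.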
Do you think this will work? I'd appreciate your thoughts on this.