Defining the number of cores to utilize. #8

Open
jlaura opened this issue Aug 8, 2013 · 4 comments


jlaura commented Aug 8, 2013

We need to be careful using (ncores - 1) as the number of processing cores. This does not work on a dual-core machine when we use slice notation, e.g.:

step = nShapes / (cores - 1)
start = range(0, nShapes, step)[0:-1]
end = start[1:]
end.append(nShapes)
slices = zip(start, end)

for c in range(cores-1):
    pids = range(slices[c][0], slices[c][1])  # Throws an IndexError here: with cores == 2, slices is empty.

Probably the best bet, across pysal, is something like:

import multiprocessing as mp

def my_func(arg1, arg2, kwarg=None, cores=None):
    if cores is None:
        cores = mp.cpu_count()
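
A possible sketch of that pattern filled out, splitting across all resolved cores instead of (cores - 1) so a dual-core machine still produces valid slices (nShapes is the placeholder from the snippet above):

import multiprocessing as mp

def make_slices(nShapes, cores=None):
    if cores is None:
        cores = mp.cpu_count()
    # Integer-divide across all cores; max(1, ...) avoids a zero step when
    # nShapes < cores.
    step = max(1, nShapes // cores)
    start = list(range(0, nShapes, step))
    end = start[1:] + [nShapes]
    return list(zip(start, end))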

pedrovma commented Aug 8, 2013

In spreg we've adopted this as the default:

def __init__(self, y, x, regimes, w, cores=None):
    pool = mp.Pool(cores) 

I don't think the if statement is needed, and I haven't used it so far. In mp.Pool(processes), processes is the number of worker processes to use; if processes is None, then the number returned by cpu_count() is used, so that check seems redundant.
http://docs.python.org/2/library/multiprocessing.html#module-multiprocessing.pool
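
For the plain map case, a minimal sketch along those lines (the worker function and data are made up for illustration):

import multiprocessing as mp

def square(x):
    # Hypothetical per-item task.
    return x * x

def run(data, cores=None):
    # mp.Pool(None) falls back to cpu_count() internally, so no explicit check is needed.
    pool = mp.Pool(cores)
    try:
        return pool.map(square, data)
    finally:
        pool.close()
        pool.join()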


jlaura commented Aug 8, 2013

That works for instances where map is being used without explicitly defining a chunk size, but it fails if we need to determine a chunk size or a step size for apply_async; i.e., we could not compute the first block of code in the initial post, since cores would be None.
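
A sketch of that apply_async case, where cores has to be resolved to a real number before the chunk boundaries can be computed (using the builtin sum as the per-chunk task purely for illustration):

import multiprocessing as mp

def run_chunks(data, cores=None):
    # pool.map can take cores=None, but the chunk arithmetic below cannot,
    # so resolve it here.
    if cores is None:
        cores = mp.cpu_count()
    step = max(1, len(data) // cores)
    pool = mp.Pool(cores)
    jobs = [pool.apply_async(sum, (data[i:i + step],))
            for i in range(0, len(data), step)]
    pool.close()
    pool.join()
    return [job.get() for job in jobs]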


dfolch commented Aug 8, 2013

  1. Is this thread concerning parallel pysal (pPysal) or regular pysal? It seems that regular pysal should default to one core, and pPysal to all cores on the machine.

  2. A middle ground might be designating a pysal and pPysal global variable called CORES. At instantiation this is automatically set to mp.cpu_count().

    2a. For pPysal all relevant functions are defaulted to cores=CORES.
    2b. For pysal all relevant functions are defaulted to cores=1. But the user could always pass pysal.CORES to cores to get the maximum.

  3. Do we want to throw a warning if the user asks for 53 cores on a 4-core machine? Multiprocessing seems to be robust to this mistake and does not crash (or throw a warning, for that matter). For Jay's case this would mess up the chunking, so some kind of check would still need to be done whether the user is warned or not (a rough sketch of points 2 and 3 follows this list).
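
A rough sketch of points 2 and 3 combined, assuming a module-level constant set at import time and a small guard function (both names are made up for illustration):

import multiprocessing as mp
import warnings

# Point 2: set once when the package is imported.
CORES = mp.cpu_count()

def _resolve_cores(requested):
    # Point 3: warn and fall back if the user asks for more cores than exist.
    if requested > CORES:
        warnings.warn("Requested %d cores but only %d are available; using %d."
                      % (requested, CORES, CORES))
        return CORES
    return requested

A pPysal function could then default to cores=CORES, while the serial pysal version defaults to cores=1 and a user can still pass pysal.CORES explicitly.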


jlaura commented Aug 8, 2013

  1. We have a mix currently, with some spreg stuff already using multiprocessing (released in 1.6) and all of the pPysal stuff using multiprocessing. Defaulting to 1 will kill some of the code as written. I wonder if other projects have gone down this road and whether a community standard is starting to emerge (a PEP maybe?)
  2. Can we call a function in the list of default arguments? Something like the snippet below? (There is a note on evaluation time at the end of this comment.)
def my_func(arg1, arg2, cores=mp.cpu_count()):
    pass

I am hesitant to just default to 1. In cases where we want to support IPython integration, defaulting to 1 will crash on a dual-core machine, and in cases where we slice by index (original post) it will crash as well.

  3. I think that this is a good idea. Someone passing more cores than they have will (likely) see a speed decrease, since the chunks will get really small and the overhead from spawning the children will start to increase. So a warning (or a silent check) might be in order.

This hits what I see as the root questions: Is PySAL going to support multiprocessing in trunk? (It looks like yes.) If so, are we going to make it a black box that just works, or leave the interfacing to the user? The former requires that we perform these checks, etc.; the latter assumes that the developer using the library is fluent enough with the multiprocessing library not to break something.
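
On point 2 above: a default like cores=mp.cpu_count() is legal, but the call runs once when the def statement is executed (i.e., at import), not on every call. A quick illustration of the two spellings:

import multiprocessing as mp

# Evaluated a single time, at definition/import time.
def my_func(arg1, arg2, cores=mp.cpu_count()):
    return cores

# Deferred to call time via the None sentinel.
def my_func_lazy(arg1, arg2, cores=None):
    return mp.cpu_count() if cores is None else cores

For cpu_count() the result does not change at runtime, so the difference is mostly a style question, but the None sentinel keeps the behaviour consistent with mp.Pool itself.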
