cbi_toolbox.parallel
Module contents
The parallel package provides tools to split parallel computations.
- cbi_toolbox.parallel.distribute_bin(dimension, rank, workers)
Computes the start index and bin size to evenly split array-like data into multiple bins.
- Parameters:
dimension (int) – The size of the array to distribute.
rank (int, optional) – The rank of the worker.
workers (int, optional) – The total number of workers.
- Returns:
The start index of this bin, and its size. The distributed data should be array[start:start + bin_size].
- Return type:
(int, int)
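The contract above can be illustrated with a minimal sketch. `even_bin` is a hypothetical stand-in written for this example, not the library's implementation; in particular, handing the remainder to the lowest ranks is an assumption, and `distribute_bin` may place the larger bins differently.

```python
def even_bin(dimension, rank, workers):
    """Hypothetical stand-in: start index and size of bin `rank` out of `workers`."""
    if not 0 <= rank < workers:
        raise ValueError("rank must be in [0, workers)")
    base, extra = divmod(dimension, workers)
    # Assumption: the first `extra` bins each take one additional element.
    start = rank * base + min(rank, extra)
    bin_size = base + (1 if rank < extra else 0)
    return start, bin_size


data = list(range(10))
start, bin_size = even_bin(len(data), rank=1, workers=3)
# Slice the data exactly as described in the Returns entry.
chunk = data[start:start + bin_size]
```

Whatever the remainder policy, the bins of all ranks taken together should cover the array exactly once, with sizes differing by at most one element.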
- cbi_toolbox.parallel.distribute_bin_all(dimension, workers)
Computes the start indexes and bin sizes of all splits to distribute computations across multiple workers.
- Parameters:
dimension (int) – The size of the array to distribute.
workers (int, optional) – The number of workers.
- Returns:
The list of start indexes and the list of bin sizes to distribute data.
- Return type:
([int], [int])
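As a rough illustration of the return shape, the sketch below builds both lists for every worker at once. `even_bins_all` is a hypothetical helper for this example only, and it assumes the same remainder policy (extra elements go to the first bins), which may not match `distribute_bin_all`.

```python
def even_bins_all(dimension, workers):
    """Hypothetical stand-in: (starts, sizes) for all `workers` bins."""
    base, extra = divmod(dimension, workers)
    # Assumption: the first `extra` bins are one element larger.
    sizes = [base + (1 if rank < extra else 0) for rank in range(workers)]
    starts = []
    offset = 0
    for size in sizes:
        starts.append(offset)  # each bin starts where the previous one ended
        offset += size
    return starts, sizes


starts, sizes = even_bins_all(10, 3)
# Bin i then covers array[starts[i]:starts[i] + sizes[i]].
```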
- cbi_toolbox.parallel.parallelize(func, size, workers=None)
Launches a function multiple times in parallel using multithreading. Useful only if the GIL is released in the parallelized function (this is the case for many numpy and scipy routines).
- Parameters:
func (function (callable)) – The function that will be run in parallel. It must take 2 arguments, which are the returns of distribute_bin_all corresponding to the thread pool (the list of starting indexes of data bins, and the list of bin sizes).
size (int) – The size of the array that will be split between workers.
workers (int, optional) – The maximum number of workers, by default None (will be maximized for the system).
- Returns:
An iterator containing the results of the function calls, in a random order (see concurrent.futures.ThreadPoolExecutor.map).
- Return type:
iterator
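The pattern described above can be sketched with the standard library alone. This is not the code of `parallelize`: `split` and `partial_sum` are hypothetical names, and the sketch assumes each call receives one start index and one bin size, which is how `Executor.map` unpacks two lists; the library's actual calling convention may differ. The worker here is pure Python, so it shows the mechanics only; a real speed-up needs work that releases the GIL.

```python
import os
from concurrent.futures import ThreadPoolExecutor

data = list(range(1000))


def split(dimension, workers):
    """Hypothetical stand-in for distribute_bin_all (remainder to first bins)."""
    base, extra = divmod(dimension, workers)
    sizes = [base + (1 if rank < extra else 0) for rank in range(workers)]
    starts = [sum(sizes[:rank]) for rank in range(workers)]
    return starts, sizes


def partial_sum(start, bin_size):
    # Each worker processes only its own slice of the shared data.
    return sum(data[start:start + bin_size])


workers = min(4, os.cpu_count() or 1)
starts, sizes = split(len(data), workers)
with ThreadPoolExecutor(max_workers=workers) as executor:
    partials = list(executor.map(partial_sum, starts, sizes))

# Combining with a sum makes the result independent of result ordering.
total = sum(partials)
```

Reducing the partial results with an order-insensitive operation, as done here, avoids relying on the order in which results come back.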