cbi_toolbox.parallel.mpi
The mpi module distributes operations across MPI communicators.
It is an optional feature of cbi_toolbox that requires a working implementation of MPI.
- cbi_toolbox.parallel.mpi.compute_vector_extent(axis, array=None, shape=None, dtype=None)
Compute the extent in bytes of a sliced view of a given array.
- Parameters:
axis (int) – Axis on which the slices are taken.
array (numpy.ndarray, optional) – The array to slice, by default None (then shape and dtype must be given).
shape (tuple(int), optional) – The shape of the array to slice, by default None.
dtype (numpy.dtype or str, optional) – The datatype of the array, by default None.
- Returns:
The extent in bytes of the data underlying the slices.
- Return type:
int
- Raises:
ValueError – If array, shape and dtype are all None.
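The byte layout of slices in a C-ordered numpy array can be sketched with plain numpy. The helper name slice_spacing_bytes is hypothetical and not part of cbi_toolbox; it computes the byte distance between two consecutive slices along an axis. Whether compute_vector_extent returns exactly this quantity is an assumption that the description above does not confirm.

```python
import numpy as np


def slice_spacing_bytes(shape, dtype, axis):
    """Byte distance between consecutive slices along `axis` (C order assumed)."""
    itemsize = np.dtype(dtype).itemsize
    # Number of elements covered by one step along `axis`
    inner = int(np.prod(shape[axis + 1:], dtype=int))
    return inner * itemsize


shape, dtype = (4, 5, 6), "float32"
array = np.zeros(shape, dtype=dtype)

# For a C-contiguous array this matches numpy's own strides
spacings = [slice_spacing_bytes(shape, dtype, ax) for ax in range(len(shape))]
```

Because only shape and dtype are needed, the same value can be computed without allocating the array, which is consistent with the optional array argument above.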
- cbi_toolbox.parallel.mpi.create_slice_view(axis, n_slices, array=None, shape=None, dtype=None)
Create an MPI vector datatype to access given slices of a non-distributed array. If the array is not provided, its shape and dtype must be specified.
- Parameters:
axis (int) – The axis on which to slice.
n_slices (int) – How many contiguous slices to take.
array (numpy.ndarray, optional) – The array to slice, by default None (then shape and dtype must be given).
shape (tuple(int), optional) – The shape of the array to slice, by default None.
dtype (numpy.dtype or str, optional) – The datatype of the array, by default None.
- Returns:
The strided datatype that gives access to the slices in the array.
- Return type:
mpi4py.MPI.Datatype
- Raises:
ValueError – If array, shape and dtype are all None.
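A strided view of this kind can be described by three numbers (count, blocklength, stride), the same arguments that mpi4py's Datatype.Create_vector takes. The sketch below derives them for a C-ordered array and checks them with numpy only. The helper vector_layout is hypothetical, and it is an assumption that create_slice_view builds its datatype from these exact values.

```python
import numpy as np


def vector_layout(shape, axis, n_slices):
    """Return (count, blocklength, stride) in elements, assuming C order."""
    inner = int(np.prod(shape[axis + 1:], dtype=int))  # elements per single slice row
    outer = int(np.prod(shape[:axis], dtype=int))      # number of separate blocks
    count = outer
    blocklength = n_slices * inner
    stride = shape[axis] * inner
    return count, blocklength, stride


array = np.arange(2 * 5 * 3).reshape(2, 5, 3)
count, blocklength, stride = vector_layout(array.shape, axis=1, n_slices=2)

# Emulate the strided access on the flat buffer
flat = array.ravel()
picked = np.concatenate(
    [flat[i * stride: i * stride + blocklength] for i in range(count)]
)
# The first two slices along axis 1, taken the usual way
expected = array[:, 0:2, :].ravel()
```

The strided read from the flat buffer yields the same elements as ordinary slicing, which is what lets a single datatype stand in for a non-contiguous slice during communication.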
- cbi_toolbox.parallel.mpi.create_vector_type(src_axis, tgt_axis, array=None, shape=None, dtype=None, block_size=1)
Create an MPI vector datatype to communicate a distributed array and split it along a different axis.
- Parameters:
src_axis (int) – The original axis on which the array is distributed.
tgt_axis (int) – The axis on which the array is to be distributed.
array (numpy.ndarray, optional) – The array to slice, by default None (then shape and dtype must be given).
shape (tuple(int), optional) – The shape of the array to slice, by default None.
dtype (numpy.dtype or str, optional) – The datatype of the array, by default None.
block_size (int, optional) – The size of the distributed bin, by default 1.
- Returns:
The vector datatype used for transmission/reception of the data.
- Return type:
mpi4py.MPI.Datatype
- Raises:
ValueError – If array, shape and dtype are all None.
ValueError – If the source and destination axes are the same.
NotImplementedError – If the array has more than 4 axes (this should work, but it is untested).
ValueError – If the block size is bigger than the source axis.
- cbi_toolbox.parallel.mpi.distribute_mpi(dimension, mpi_comm=MPI.COMM_WORLD)
Compute the start index and bin size to evenly split array-like data into multiple bins on an MPI communicator.
- Parameters:
dimension (int) – The size of the array to distribute.
mpi_comm (mpi4py.MPI.Comm, optional) – The communicator, by default MPI.COMM_WORLD.
- Returns:
The start index of this bin, and its size. The distributed data should be array[start:start + bin_size].
- Return type:
(int, int)
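One common way to compute such a split is shown below as a minimal sketch. The function split_evenly is hypothetical and takes the rank and communicator size explicitly instead of a communicator. Giving the extra elements to the lowest ranks is an assumption; the library may place the remainder differently.

```python
def split_evenly(dimension, rank, size):
    """Return (start, bin_size) for one rank; lowest ranks absorb the remainder."""
    base, extra = divmod(dimension, size)
    bin_size = base + (1 if rank < extra else 0)
    # Every earlier rank holds `base` elements, plus one for each of the
    # first `extra` ranks
    start = rank * base + min(rank, extra)
    return start, bin_size


# Ten elements over three processes
bins = [split_evenly(10, rank, 3) for rank in range(3)]
```

Each process would then work on array[start:start + bin_size], as described in the return value above.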
- cbi_toolbox.parallel.mpi.distribute_mpi_all(dimension, mpi_comm=MPI.COMM_WORLD)
Compute the start indexes and bin sizes of all splits to distribute computations across an MPI communicator.
- Parameters:
dimension (int) – The size of the array to be distributed.
mpi_comm (mpi4py.MPI.Comm, optional) – The communicator, by default MPI.COMM_WORLD.
- Returns:
The list of start indexes and the list of bin sizes to distribute data.
- Return type:
([int], [int])
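The two lists can be built in one pass, as in the following sketch. The function split_all is hypothetical and takes the communicator size as a plain integer; the remainder policy (first ranks get one extra element) is an assumption, not the documented behaviour.

```python
from itertools import accumulate


def split_all(dimension, size):
    """Return (starts, sizes) covering range(dimension) with `size` bins."""
    base, extra = divmod(dimension, size)
    sizes = [base + 1] * extra + [base] * (size - extra)
    # Each start is the running total of the sizes before it
    starts = [0] + list(accumulate(sizes))[:-1]
    return starts, sizes


starts, sizes = split_all(11, 4)
```

Lists of this form are what variable-count collectives such as Scatterv and Gatherv expect as displacements and counts.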
- cbi_toolbox.parallel.mpi.gather_full_shape(array, axis, mpi_comm=MPI.COMM_WORLD)
Gather the full shape of an array distributed across an MPI communicator along a given axis.
- Parameters:
array (numpy.ndarray) – The distributed array.
axis (int) – The axis on which the array is distributed.
mpi_comm (mpi4py.MPI.Comm, optional) – The communicator, by default MPI.COMM_WORLD.
- Raises:
NotImplementedError – This is not implemented yet.
- cbi_toolbox.parallel.mpi.get_rank(mpi_comm=MPI.COMM_WORLD)
Get the rank of this process in the communicator.
- Parameters:
mpi_comm (mpi4py.MPI.Comm, optional) – The communicator, by default MPI.COMM_WORLD.
- Returns:
The rank of the process.
- Return type:
int
- cbi_toolbox.parallel.mpi.get_size(mpi_comm=MPI.COMM_WORLD)
Get the process count in the communicator.
- Parameters:
mpi_comm (mpi4py.MPI.Comm, optional) – The MPI communicator, by default MPI.COMM_WORLD.
- Returns:
The size of the MPI communicator.
- Return type:
int
- cbi_toolbox.parallel.mpi.is_root_process(mpi_comm=MPI.COMM_WORLD)
Check if the current process is root.
- Parameters:
mpi_comm (mpi4py.MPI.Comm, optional) – The MPI communicator, by default MPI.COMM_WORLD.
- Returns:
True if the current process is the root of the communicator.
- Return type:
bool
- cbi_toolbox.parallel.mpi.load(file_name, axis, mpi_comm=MPI.COMM_WORLD)
Load a numpy array across parallel jobs in the MPI communicator. The array is sliced along the chosen dimension, with minimal bandwidth.
- Parameters:
file_name (str) – The numpy array file to load.
axis (int) – The axis on which to distribute the array.
mpi_comm (mpi4py.MPI.Comm, optional) – The MPI communicator used to distribute, by default MPI.COMM_WORLD.
- Returns:
The distributed array, and the shape of the full array.
- Return type:
(numpy.ndarray, tuple(int))
- Raises:
ValueError – If the numpy version used to save the file is not supported.
NotImplementedError – If the array is saved in Fortran order.
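The idea of reading only the local part of a file can be illustrated without MPI. The sketch below swaps in a different technique, numpy memory mapping, so that only the requested slice is copied into memory; it is not the library's implementation. The helper load_local_slice and its start and bin_size arguments are hypothetical; in practice they would come from a split such as the one distribute_mpi computes.

```python
import os
import tempfile

import numpy as np


def load_local_slice(file_name, axis, start, bin_size):
    """Return (local block, full shape), copying only the requested slice."""
    mapped = np.load(file_name, mmap_mode="r")  # maps the file, reads nothing yet
    index = [slice(None)] * mapped.ndim
    index[axis] = slice(start, start + bin_size)
    local = np.array(mapped[tuple(index)])      # copy just this block into memory
    return local, mapped.shape


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "volume.npy")
    full = np.arange(4 * 6).reshape(4, 6)
    np.save(path, full)
    # Pretend this process owns columns 2..4 of the saved array
    local, full_shape = load_local_slice(path, axis=1, start=2, bin_size=3)
```

The pair returned here mirrors the documented return value: the local block together with the shape of the complete array.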
- cbi_toolbox.parallel.mpi.redistribute(array, src_axis, tgt_axis, full_shape=None, mpi_comm=MPI.COMM_WORLD)
Redistribute an array along a different dimension.
- Parameters:
array (numpy.ndarray) – The distributed array.
src_axis (int) – The original axis on which the array is distributed.
tgt_axis (int) – The axis on which the array is to be distributed.
full_shape (tuple(int), optional) – The full shape of the array, by default None.
mpi_comm (mpi4py.MPI.Comm, optional) – The MPI communicator used to distribute, by default MPI.COMM_WORLD.
- Returns:
The array distributed along the new axis.
- Return type:
numpy.ndarray
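The data movement behind a redistribution can be simulated in a single process, with a Python list standing in for the ranks. The function simulate_redistribute is hypothetical and uses numpy only; a real implementation would exchange the pieces with an all-to-all communication, and the bin sizes chosen by np.array_split are an assumption that may not match the library's split.

```python
import numpy as np


def simulate_redistribute(chunks, src_axis, tgt_axis):
    """chunks[r] is rank r's block along src_axis; return blocks along tgt_axis."""
    n_ranks = len(chunks)
    # Every rank cuts its block into one piece per destination rank
    pieces = [np.array_split(chunk, n_ranks, axis=tgt_axis) for chunk in chunks]
    # Rank r receives piece r from every sender and joins them along src_axis
    return [
        np.concatenate([pieces[s][r] for s in range(n_ranks)], axis=src_axis)
        for r in range(n_ranks)
    ]


full = np.arange(6 * 4).reshape(6, 4)
before = np.array_split(full, 3, axis=0)  # distributed along rows
after = simulate_redistribute(before, src_axis=0, tgt_axis=1)  # now along columns
```

After the exchange each simulated rank holds all rows but only part of the columns, and joining the blocks along the target axis recovers the full array.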
- cbi_toolbox.parallel.mpi.save(file_name, array, axis, full_shape=None, mpi_comm=MPI.COMM_WORLD)
Save a numpy array from parallel jobs in the MPI communicator. The array is gathered along the chosen dimension.
- Parameters:
file_name (str) – The numpy array file to save to.
array (numpy.ndarray) – The distributed array.
axis (int) – The axis on which the array is distributed.
full_shape (tuple(int), optional) – The shape of the full array, by default None.
mpi_comm (mpi4py.MPI.Comm, optional) – The MPI communicator used to distribute, by default MPI.COMM_WORLD.
- cbi_toolbox.parallel.mpi.to_mpi_datatype(np_datatype)
Return the MPI datatype corresponding to the numpy dtype provided.
- Parameters:
np_datatype (numpy.dtype or str) – The numpy datatype, or name.
- Returns:
The corresponding MPI datatype.
- Return type:
mpi4py.MPI.Datatype
- Raises:
NotImplementedError – If the numpy datatype is not listed in the conversion table.
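A table-driven conversion of this kind can be sketched as follows. The table below is hypothetical and is not the one used by cbi_toolbox. To keep the example runnable without an MPI installation, it maps numpy dtype names to the names of mpi4py constants (for example "DOUBLE" for MPI.DOUBLE) rather than to the datatype objects themselves.

```python
import numpy as np

# Hypothetical, partial table: numpy dtype name -> name of the mpi4py constant
_CONVERSION = {
    "float32": "FLOAT",
    "float64": "DOUBLE",
    "int32": "INT32_T",
    "int64": "INT64_T",
    "complex64": "C_FLOAT_COMPLEX",
    "complex128": "C_DOUBLE_COMPLEX",
}


def mpi_type_name(np_datatype):
    """Return the MPI constant name for a numpy dtype or dtype name."""
    name = np.dtype(np_datatype).name  # accepts both dtype objects and strings
    try:
        return _CONVERSION[name]
    except KeyError:
        raise NotImplementedError(
            f"No MPI type listed for numpy dtype {name!r}"
        ) from None


double_name = mpi_type_name(np.dtype("float64"))
single_name = mpi_type_name("float32")
```

With mpi4py available, getattr(MPI, name) would turn such a name into the datatype object; dtypes missing from the table raise NotImplementedError, matching the behaviour documented above.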