data_transformer
STREAM_POOL = []
module-attribute
INDICES_CACHE = {}
module-attribute
HaloExchangeSpec
dataclass
Memory description of the data exchanged.
The data stored here targets a single exchange, with an optional rotation to apply before packing. Slices are given as a tuple, following the convention of one slice per dimension.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| specification | QuantityHaloSpec | memory layout of the data | required |
| pack_slices | tuple[slice, ...] | indexing to pack, one slice per dimension | required |
| pack_clockwise_rotation | int | number of 90-degree rotations to perform before packing | required |
| unpack_slices | tuple[slice, ...] | indexing to unpack, one slice per dimension | required |
specification
instance-attribute
pack_slices
instance-attribute
pack_clockwise_rotation
instance-attribute
unpack_slices
instance-attribute
__init__(specification, pack_slices, pack_clockwise_rotation, unpack_slices)
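To make the roles of the slices and the rotation concrete, here is a minimal sketch. It does not use the real HaloExchangeSpec: `ExchangeSketch`, `pack`, and `unpack` are hypothetical stand-ins that omit the QuantityHaloSpec, and the mapping of "clockwise" onto `np.rot90` with a negative `k` is an assumption about the convention.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class ExchangeSketch:
    """Hypothetical stand-in for HaloExchangeSpec (no QuantityHaloSpec)."""

    pack_slices: tuple[slice, ...]
    pack_clockwise_rotation: int
    unpack_slices: tuple[slice, ...]


def pack(data: np.ndarray, spec: ExchangeSketch) -> np.ndarray:
    """Select the region to send, rotate it, then flatten it."""
    region = data[spec.pack_slices]
    # np.rot90 rotates counterclockwise for positive k, so negate k
    # to get a clockwise rotation (assumed convention).
    rotated = np.rot90(region, k=-spec.pack_clockwise_rotation)
    return rotated.ravel()


def unpack(buffer: np.ndarray, target: np.ndarray, spec: ExchangeSketch) -> None:
    """Write a flat buffer into the region selected by unpack_slices."""
    view = target[spec.unpack_slices]  # basic slicing returns a view
    view[...] = buffer.reshape(view.shape)


field = np.arange(16).reshape(4, 4)
spec = ExchangeSketch(
    pack_slices=(slice(0, 2), slice(0, 4)),    # first two rows
    pack_clockwise_rotation=1,                 # 2x4 region becomes 4x2
    unpack_slices=(slice(0, 4), slice(2, 4)),  # last two columns
)
packed = pack(field, spec)

destination = np.zeros_like(field)
unpack(packed, destination, spec)
```

Note that the rotation changes the shape of the packed region, so the unpack slices on the receiving side have to select a region with the rotated shape.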
HaloDataTransformer
Bases: ABC
Transform data to exchange in a format optimized for network communication.
Current strategy: pack/unpack multiple nD arrays into/from a single buffer. Offers a pack and an unpack buffer to use for communicating data.
The class is responsible for packing & unpacking, not communication. Order of operations:
- Get a HaloDataTransformer via get() with N transformations and the proper halo specifications. At the end of get(), a _compile() is triggered, readying the internal buffers.
- Call async_pack(quantities) to start packing the quantities into the internal buffer.
- Call synchronize() to make sure all operations are finished, or use get_pack_buffer() when ready to communicate, which calls synchronize() internally.
- [... user should communicate the buffers...]
- Call async_unpack(quantities) to start unpacking.
- Call synchronize() to finish all unpacking operations and make sure the quantities passed to async_unpack have been updated.
The class will hold onto the buffers up until deletion, where they will be returned to an internal buffer pool.
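The order of operations described above can be sketched with a toy class. `ToyTransformer` is hypothetical and only mimics the call order: it works on plain Python lists, defers work until synchronize(), and replaces the communication step with a local copy. It says nothing about how the real class schedules its work.

```python
class ToyTransformer:
    """Hypothetical stand-in that mimics the call order only."""

    def __init__(self, sizes):
        # Plays the role of _compile(): allocate both buffers up front.
        self._sizes = list(sizes)
        total = sum(self._sizes)
        self._pack_buffer = [0.0] * total
        self._unpack_buffer = [0.0] * total
        self._pending = []  # deferred operations, run by synchronize()

    def async_pack(self, quantities):
        def work():
            offset = 0
            for quantity, size in zip(quantities, self._sizes):
                self._pack_buffer[offset:offset + size] = quantity[:size]
                offset += size
        self._pending.append(work)

    def async_unpack(self, quantities):
        def work():
            offset = 0
            for quantity, size in zip(quantities, self._sizes):
                quantity[:size] = self._unpack_buffer[offset:offset + size]
                offset += size
        self._pending.append(work)

    def synchronize(self):
        for work in self._pending:
            work()
        self._pending.clear()

    def get_pack_buffer(self):
        self.synchronize()
        return self._pack_buffer

    def get_unpack_buffer(self):
        self.synchronize()
        return self._unpack_buffer


sender = ToyTransformer(sizes=[2, 3])
receiver = ToyTransformer(sizes=[2, 3])

sender.async_pack([[1.0, 2.0], [3.0, 4.0, 5.0]])
send_buffer = sender.get_pack_buffer()  # synchronizes internally

# Stand-in for the real communication step between ranks.
receiver.get_unpack_buffer()[:] = send_buffer

halo_a, halo_b = [0.0, 0.0], [0.0, 0.0, 0.0]
receiver.async_unpack([halo_a, halo_b])
receiver.synchronize()  # halo_a and halo_b are only updated after this call
```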
__init__(np_module, exchange_descriptors_x, exchange_descriptors_y=None)
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| np_module | ModuleType | numpy-like module for allocation | required |
| exchange_descriptors_x | Sequence[HaloExchangeSpec] | list of memory information describing an exchange. Used for scalar data and the x-component of vectors. | required |
| exchange_descriptors_y | Sequence[HaloExchangeSpec] \| None | list of memory information describing an exchange. Optional, used for the y-component of vectors only. If | None |
finalize()
Deletion routine, making sure all buffers were inserted back into cache.
get(np_module, exchange_descriptors_x, exchange_descriptors_y=None)
staticmethod
Construct a transformer from a numpy-like module.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| np_module | ModuleType | numpy-like module to determine child transformer type. | required |
| exchange_descriptors_x | Sequence[HaloExchangeSpec] | list of memory information describing an exchange. Used for scalar data and the x-component of vectors. | required |
| exchange_descriptors_y | Sequence[HaloExchangeSpec] \| None | list of memory information describing an exchange. Optional, used for the y-component of vectors only. If | None |
Returns:
| Type | Description |
|---|---|
| HaloDataTransformer | an initialized packed buffer. |
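A static factory that picks a child class from the array module can be sketched as follows. The class names and the rule used here (select the GPU variant when the module is named "cupy") are assumptions made for illustration; the documentation above only states that the module determines the child transformer type.

```python
from abc import ABC, abstractmethod
from types import ModuleType


class Transformer(ABC):
    """Hypothetical base class with a static factory."""

    def __init__(self, np_module: ModuleType) -> None:
        self.np_module = np_module

    @staticmethod
    def get(np_module: ModuleType) -> "Transformer":
        # Assumed rule: a module named "cupy" selects the GPU variant,
        # anything else falls back to the CPU variant.
        if np_module.__name__ == "cupy":
            return GPUTransformer(np_module)
        return CPUTransformer(np_module)

    @abstractmethod
    def synchronize(self) -> None:
        """Block until all pending operations are done."""


class CPUTransformer(Transformer):
    def synchronize(self) -> None:
        pass  # nothing is deferred in this sketch


class GPUTransformer(Transformer):
    def synchronize(self) -> None:
        pass  # a real GPU variant would wait on its streams here


# Empty modules used as stand-ins so the sketch runs without numpy or cupy.
cpu_transformer = Transformer.get(ModuleType("numpy"))
gpu_transformer = Transformer.get(ModuleType("cupy"))
```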
get_unpack_buffer()
Retrieve unpack buffer.
Synchronizes operations.
get_pack_buffer()
Retrieve pack buffer.
Synchronizes operations.
ready()
Check if the buffers are ready for communication.
async_pack(quantities_x, quantities_y=None)
abstractmethod
Pack all given quantities into a single send buffer.
Does not guarantee the buffer returned by get_pack_buffer has been filled; doing so requires calling synchronize.
Reaching for the buffer via get_pack_buffer() will call synchronize().
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| quantities_x | list[Quantity] | scalar or vector x-component quantities to pack, if one is vector they must all be vector | required |
| quantities_y | list[Quantity] \| None | if quantities are vector, the y-component quantities. | None |
async_unpack(quantities_x, quantities_y=None)
abstractmethod
Unpack the buffer into destination quantities.
Does not guarantee the buffer returned by get_unpack_buffer has received data; doing so requires calling synchronize.
Reaching for the buffer via get_unpack_buffer() will call synchronize().
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| quantities_x | Sequence[Quantity] | scalar or vector x-component quantities to be unpacked into, if one is vector they must all be vector | required |
| quantities_y | Sequence[Quantity] \| None | if quantities are vector, the y-component quantities. | None |
synchronize()
abstractmethod
Synchronize all operations.
Guarantees all memory is now safe to access.
HaloDataTransformerCPU
Bases: HaloDataTransformer
Pack/unpack data in a single buffer using numpy flattening & slicing.
Default behavior, could be done with any numpy-like library.
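Packing several arrays into one buffer with flattening and slicing might look like the sketch below. The helpers `pack_into` and `unpack_from` are hypothetical and ignore rotations and vector components; they only show the offset bookkeeping that a single shared buffer requires.

```python
import numpy as np


def pack_into(buffer, arrays, slices_per_array):
    """Copy each selected region, flattened, into consecutive buffer ranges."""
    offset = 0
    for array, slices in zip(arrays, slices_per_array):
        flat = array[slices].flatten()  # flatten() always returns a copy
        buffer[offset:offset + flat.size] = flat
        offset += flat.size
    return offset  # number of buffer elements used


def unpack_from(buffer, arrays, slices_per_array):
    """Write consecutive buffer ranges back into each selected region."""
    offset = 0
    for array, slices in zip(arrays, slices_per_array):
        target = array[slices]  # basic slicing gives a view of the array
        chunk = buffer[offset:offset + target.size]
        target[...] = chunk.reshape(target.shape)
        offset += target.size


a = np.arange(12.0).reshape(3, 4)
b = np.arange(100.0, 106.0).reshape(2, 3)
regions = [
    (slice(0, 1), slice(None)),  # first row of a: 4 elements
    (slice(None), slice(2, 3)),  # last column of b: 2 elements
]

buffer = np.zeros(6)
used = pack_into(buffer, [a, b], regions)

dst_a, dst_b = np.zeros_like(a), np.zeros_like(b)
unpack_from(buffer, [dst_a, dst_b], regions)
```

Because only slicing, flattening, and assignment are used, the same code should work with any module that follows the numpy array interface.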
synchronize()
async_pack(quantities_x, quantities_y=None)
async_unpack(quantities_x, quantities_y=None)
HaloDataTransformerGPU
Bases: HaloDataTransformer
Pack/unpack data in a single buffer using CUDA Kernels.
In order to efficiently pack/unpack on the GPU to a single GPU buffer,
we use streamed (i.e. async) kernels per quantity per edge to send. The
kernels are stored in cuda_kernels.py; they both follow the same simple pattern,
reading the indices into the device memory of the data to pack/unpack.
_flatten_indices is the routine that takes the layout of the memory and
the slice and computes an array of indices into the original memory.
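The idea of turning a shape and a tuple of slices into an array of flat indices can be sketched with numpy on the host. This is not the module's _flatten_indices: the function below is a hypothetical illustration that assumes a C-contiguous layout and ignores rotations.

```python
import numpy as np


def flatten_indices(shape, slices):
    """Return the flat (C-order) indices of the elements selected by slices."""
    all_indices = np.arange(int(np.prod(shape))).reshape(shape)
    return all_indices[slices].ravel()


shape = (3, 4)
edge = (slice(0, 3), slice(3, 4))  # last column of a 3x4 array
indices = flatten_indices(shape, edge)

data = np.arange(100, 112).reshape(shape)

# Gather: what a pack kernel does, one element per index.
packed = np.take(data, indices)

# Scatter: what an unpack kernel does, writing back through the same indices.
restored = np.zeros(shape, dtype=data.dtype)
np.put(restored, indices, packed)
```

Precomputing the indices once means the pack and unpack steps reduce to a plain gather and scatter, which is the kind of access pattern a simple kernel can implement.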
__init__(np_module, exchange_descriptors_x, exchange_descriptors_y=None)
synchronize()
async_pack(quantities_x, quantities_y=None)
Pack the quantities into a single buffer via streamed CUDA kernels.
Writes into self._pack_buffer using self._x_infos and self._y_infos to read the offsets and sizes per quantity.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| quantities_x | list[Quantity] | list of quantities to pack. Must fit the specifications given at init time. | required |
| quantities_y | list[Quantity] \| None | Same as above but optional, used only for vector transfer. | None |
async_unpack(quantities_x, quantities_y=None)
Unpack the quantities from a single buffer via streamed CUDA kernels.
Reads from self._unpack_buffer using self._x_infos and self._y_infos to read the offsets and sizes per quantity.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| quantities_x | Sequence[Quantity] | list of quantities to unpack. Must fit the specifications given at init time. | required |
| quantities_y | Sequence[Quantity] \| None | Same as above but optional, used only for vector transfer. | None |