km3pipe.io.hdf5

Read and write KM3NeT-formatted HDF5 files.

Module Contents

Classes

HDF5Header(data) Wrapper class for the /raw_header table in KM3HDF5
HDF5IndexTable(h5loc)
HDF5Sink() Write KM3NeT-formatted HDF5 files, event-by-event.
HDF5Pump() Read KM3NeT-formatted HDF5 files, event-by-event.
HDF5MetaData() Metadata to attach to the HDF5 file.

Functions

check_version(h5file)
create_index_tuple(group_ids) A helper function to create index tuples for fast lookup in HDF5Pump
convert_header_dict_to_table(header_dict) Converts a header dictionary (usually from aanet) to a Table
km3pipe.io.hdf5.jit
km3pipe.io.hdf5.log
km3pipe.io.hdf5.FORMAT_VERSION
km3pipe.io.hdf5.MINIMUM_FORMAT_VERSION
exception km3pipe.io.hdf5.H5VersionError

Bases: Exception

km3pipe.io.hdf5.check_version(h5file)
class km3pipe.io.hdf5.HDF5Header(data)

Bases: object

Wrapper class for the /raw_header table in KM3HDF5

classmethod from_table(cls, table)
classmethod from_hdf5(cls, filename)
classmethod from_pytable(cls, table)
class km3pipe.io.hdf5.HDF5IndexTable(h5loc)

Bases: object

data
append(self, n_items)
class km3pipe.io.hdf5.HDF5Sink

Bases: km3pipe.core.Module

Write KM3NeT-formatted HDF5 files, event-by-event.

The data can be a kp.Table, a numpy structured array, a pandas DataFrame, or a simple scalar.

The name of the corresponding H5 table is the decamelised blob-key, so values which are stored in the blob under FooBar will be written to /foo_bar in the HDF5 file.
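The key-to-table mapping described above can be illustrated with a short standalone sketch. The names `decamelise_key` and `h5_location` and the regular expressions are assumptions made for illustration; they are not taken from the km3pipe source, and the library's own conversion may treat edge cases such as acronyms or digits differently.

```python
import re


def decamelise_key(key):
    """Turn a CamelCase blob key into a snake_case name (illustrative only)."""
    # Put an underscore before a capitalised word that follows any character.
    partial = re.sub(r"(.)([A-Z][a-z]+)", r"\1_\2", key)
    # Put an underscore between a lowercase letter or digit and a capital.
    return re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", partial).lower()


def h5_location(key):
    """Hypothetical helper: build the HDF5 node path for a blob key."""
    return "/" + decamelise_key(key)


print(h5_location("FooBar"))  # /foo_bar
```

Under this sketch, a blob entry stored under `EventInfo` would map to `/event_info`.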

Parameters:
filename: str, optional [default: ‘dump.h5’]

Where to store the events.

h5file: pytables.File instance, optional [default: None]

Opened file to write to. This is mutually exclusive with filename.

complib : str [default: zlib]

Compression library that should be used. ‘zlib’, ‘lzf’, ‘blosc’ and all other PyTables filters are available.

complevel : int [default: 5]

Compression level.

chunksize : int [optional]

Chunksize that should be used for saving along the first axis of the input array.

flush_frequency: int, optional [default: 500]

The number of iterations to cache tables and arrays before dumping to disk.

pytab_file_args: dict [optional]

Additional arguments to pass to the PyTables File constructor.

n_rows_expected: int, optional [default: 10000]
append: bool, optional [default: False]
configure(self)
process(self, blob)
flush(self)

Flush tables and arrays to disk

finish(self)
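The effect of flush_frequency can be pictured with a toy writer that caches rows and writes them out in batches. This is a minimal sketch that does not use PyTables or km3pipe: the class `BufferedWriter` and its attributes are hypothetical, and writing the leftover cache in `finish` is an assumption about sensible behaviour rather than a statement about HDF5Sink.

```python
class BufferedWriter:
    """Toy stand-in for a sink that caches rows and writes them in batches."""

    def __init__(self, flush_frequency=500):
        self.flush_frequency = flush_frequency
        self.cache = []      # rows waiting to be written
        self.written = []    # stands in for the file on disk
        self.n_flushes = 0
        self._iteration = 0

    def process(self, row):
        """Cache one row and flush every `flush_frequency` iterations."""
        self.cache.append(row)
        self._iteration += 1
        if self._iteration % self.flush_frequency == 0:
            self.flush()
        return row

    def flush(self):
        """Move all cached rows to the 'disk' list and empty the cache."""
        self.written.extend(self.cache)
        self.cache.clear()
        self.n_flushes += 1

    def finish(self):
        """Assumption: write whatever is still cached at the end."""
        if self.cache:
            self.flush()


writer = BufferedWriter(flush_frequency=3)
for row in range(7):
    writer.process(row)
rows_on_disk_before_finish = len(writer.written)
writer.finish()
```

A larger flush_frequency means fewer, larger writes at the cost of holding more rows in memory between flushes.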
class km3pipe.io.hdf5.HDF5Pump

Bases: km3pipe.core.Pump

Read KM3NeT-formatted HDF5 files, event-by-event.

Parameters:
filename: str

File to read events from. Either this or filenames must be given.

filenames: list_like(str)

Multiple files to read events from. Either this or filename must be given.

skip_version_check: bool [default: False]

Don’t check the H5 version. Might lead to unintended consequences.

ignore_hits: bool [default: False]

If True, do not read any hit information.

cut_mask: str

H5 Node path to a boolean cut mask. If specified, use the boolean array found at this node as a mask. False means “skip this event”. Example: cut_mask="/pid/survives_precut"

shuffle: bool, optional [default: False]

Shuffle the group_ids, so that the blobs are mixed up.

shuffle_function: function, optional [default: np.random.shuffle]

The function to be used to shuffle the group IDs.

reset_index: bool, optional [default: True]

When shuffle is True, reset the group IDs so that the group_id of the emitted blobs starts counting from 0.

configure(self)
process(self, blob)
get_blob(self, index)
finish(self)
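How cut_mask, shuffle, shuffle_function and reset_index might interact can be sketched without any HDF5 file. The function `select_group_ids` below is hypothetical and not part of km3pipe. It assumes the mask holds one boolean per group ID, in order, and it uses the standard-library `random.shuffle` in place of the documented default `np.random.shuffle` to stay dependency-free.

```python
import random


def select_group_ids(group_ids, cut_mask=None, shuffle=False,
                     shuffle_function=random.shuffle, reset_index=True):
    """Return (source_group_id, emitted_group_id) pairs in read order."""
    if cut_mask is not None:
        # False in the mask means "skip this event".
        selected = [gid for gid, keep in zip(group_ids, cut_mask) if keep]
    else:
        selected = list(group_ids)

    if shuffle:
        # The shuffle function is expected to reorder the list in place.
        shuffle_function(selected)

    if shuffle and reset_index:
        # Renumber the emitted events 0, 1, 2, ... in their new order.
        return [(gid, new_id) for new_id, gid in enumerate(selected)]
    return [(gid, gid) for gid in selected]


random.seed(0)  # fixed seed only to make the sketch repeatable
pairs = select_group_ids(
    [0, 1, 2, 3, 4],
    cut_mask=[True, False, True, True, False],
    shuffle=True,
)
unshuffled = select_group_ids([0, 1, 2], cut_mask=[False, True, True])
```

In this sketch, `pairs` contains the three surviving source IDs in shuffled order, each paired with a fresh ID counted from 0, while `unshuffled` keeps the original IDs of the events that pass the mask.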
km3pipe.io.hdf5.create_index_tuple(group_ids)

A helper function to create index tuples for fast lookup in HDF5Pump
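One plausible reading of "index tuples" is a pair of lookup lists giving, for each group ID, the first row and the number of rows belonging to that group. The sketch below follows that reading only. The names `build_index` and `rows_for_group` are hypothetical, the actual return format of create_index_tuple is not stated here, and the code assumes that all rows of a group are stored contiguously.

```python
def build_index(group_ids):
    """Return (start_indices, counts), one entry per group ID from 0 to max."""
    n_groups = max(group_ids) + 1
    start_indices = [0] * n_groups
    counts = [0] * n_groups
    for row, gid in enumerate(group_ids):
        if counts[gid] == 0:
            # First row seen for this group: remember where it starts.
            start_indices[gid] = row
        counts[gid] += 1
    return start_indices, counts


def rows_for_group(index, group_id):
    """Look up the row range of one group without scanning the column."""
    start_indices, counts = index
    start = start_indices[group_id]
    return range(start, start + counts[group_id])


# group_id column of a hypothetical hits table: three events, ID 2 is absent
index = build_index([0, 0, 0, 1, 1, 3])
rows_of_event_1 = list(rows_for_group(index, 1))
rows_of_event_2 = list(rows_for_group(index, 2))
```

With such an index, fetching the rows of one event is a constant-time lookup followed by a contiguous slice, instead of a comparison against every row's group_id.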

class km3pipe.io.hdf5.HDF5MetaData

Bases: km3pipe.core.Module

Metadata to attach to the HDF5 file.

Parameters:
data: dict
configure(self)
km3pipe.io.hdf5.convert_header_dict_to_table(header_dict)

Converts a header dictionary (usually from aanet) to a Table