:mod:`km3pipe.io.hdf5`
======================

.. py:module:: km3pipe.io.hdf5

.. autoapi-nested-parse::

   Read and write KM3NeT-formatted HDF5 files.

   ..
       !! processed by numpydoc !!


Module Contents
---------------


Classes
~~~~~~~

.. autoapisummary::

   km3pipe.io.hdf5.HDF5Header
   km3pipe.io.hdf5.HDF5IndexTable
   km3pipe.io.hdf5.HDF5Sink
   km3pipe.io.hdf5.HDF5Pump
   km3pipe.io.hdf5.HDF5MetaData


Functions
~~~~~~~~~

.. autoapisummary::

   km3pipe.io.hdf5.check_version
   km3pipe.io.hdf5.create_index_tuple
   km3pipe.io.hdf5.convert_header_dict_to_table


.. data:: jit
   

.. data:: log
   

.. data:: FORMAT_VERSION
   

.. data:: MINIMUM_FORMAT_VERSION
   

.. py:exception:: H5VersionError

   Bases: :class:`Exception`

   
.. function:: check_version(h5file)

   
.. py:class:: HDF5Header(data)

   Bases: :class:`object`

   
   Wrapper class for the ``/raw_header`` table in KM3HDF5.


   ..
       !! processed by numpydoc !!


   .. classmethod:: from_table(cls, table)

      
   .. classmethod:: from_hdf5(cls, filename)

      
   .. classmethod:: from_pytable(cls, table)

      
.. py:class:: HDF5IndexTable(h5loc)

   Bases: :class:`object`

   
   .. attribute:: data
      

   .. method:: append(self, n_items)

      
.. py:class:: HDF5Sink

   Bases: :class:`km3pipe.core.Module`

   
   Write KM3NeT-formatted HDF5 files, event-by-event.

   The data can be a ``kp.Table``, a numpy structured array,
   a pandas DataFrame, or a simple scalar.

   The name of the corresponding H5 table is the decamelised
   blob key, so a value stored in the blob under ``FooBar``
   is written to ``/foo_bar`` in the HDF5 file.
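
   The naming rule can be illustrated with a small standalone sketch.
   ``decamelise_sketch`` and ``h5_location`` are hypothetical helpers written
   for this example. They are not km3pipe's own implementation, which may
   treat edge cases such as acronyms or digits differently.

   .. code-block:: python

      import re


      def decamelise_sketch(key):
          """Convert a CamelCase blob key to snake_case (illustrative only)."""
          # Insert an underscore before a capitalised word: "FooBar" -> "Foo_Bar"
          partial = re.sub(r"(.)([A-Z][a-z]+)", r"\1_\2", key)
          # Split a lowercase letter or digit from a capital that follows it
          partial = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", partial)
          return partial.lower()


      def h5_location(key):
          """Assumed mapping from a blob key to its HDF5 node path."""
          return "/" + decamelise_sketch(key)


      print(h5_location("FooBar"))  # /foo_bar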

   :Parameters:

       **filename: str, optional [default: 'dump.h5']**
           Where to store the events.

       **h5file: pytables.File instance, optional [default: None]**
           An already opened file to write to. Mutually exclusive with ``filename``.

       **complib** : str [default: zlib]
           Compression library that should be used.
           'zlib', 'lzf', 'blosc' and all other PyTables filters
           are available.

       **complevel** : int [default: 5]
           Compression level. 

       **chunksize** : int [optional]
           Chunk size to use along the first axis of the input array
           when saving.

       **flush_frequency: int, optional [default: 500]**
           The number of iterations to cache tables and arrays before
           dumping to disk.

       **pytab_file_args: dict [optional]**
           Additional arguments passed to the PyTables ``File`` constructor.

       **n_rows_expected: int, optional [default: 10000]**
           ..

       **append: bool, optional [default: False]**
           ..
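
   The effect of ``flush_frequency`` can be pictured with the following
   minimal sketch. ``BufferedWriterSketch`` is a hypothetical stand-in that
   uses plain lists instead of HDF5 tables. It assumes that rows are cached
   per iteration, written out every ``flush_frequency`` iterations, and that
   the remainder is written in ``finish``. The actual sink may differ in
   detail.

   .. code-block:: python

      class BufferedWriterSketch:
          """Cache rows and write them out every ``flush_frequency`` calls."""

          def __init__(self, flush_frequency=500):
              self.flush_frequency = flush_frequency
              self.cache = []      # rows waiting to be written
              self.written = []    # stands in for the on-disk table
              self.n_processed = 0

          def process(self, row):
              self.cache.append(row)
              self.n_processed += 1
              if self.n_processed % self.flush_frequency == 0:
                  self.flush()

          def flush(self):
              # Move everything from the cache to the "disk"
              self.written.extend(self.cache)
              self.cache.clear()

          def finish(self):
              # Write whatever is left when the pipeline ends
              self.flush()


      writer = BufferedWriterSketch(flush_frequency=3)
      for event_id in range(7):
          writer.process({"event_id": event_id})
      print(len(writer.written), len(writer.cache))  # 6 1
      writer.finish()
      print(len(writer.written), len(writer.cache))  # 7 0

   A larger ``flush_frequency`` trades memory for fewer write operations.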


   ..
       !! processed by numpydoc !!


   .. method:: configure(self)

      
   .. method:: process(self, blob)

      
   .. method:: flush(self)

      
      Flush tables and arrays to disk.


      ..
          !! processed by numpydoc !!

      
   .. method:: finish(self)

      
.. py:class:: HDF5Pump

   Bases: :class:`km3pipe.core.Pump`

   
   Read KM3NeT-formatted HDF5 files, event-by-event.


   :Parameters:

       **filename: str**
           From where to read events. Either this OR ``filenames`` needs to be
           defined.

       **filenames: list_like(str)**
           Multiple filenames. Either this OR ``filename`` needs to be defined.

       **skip_version_check: bool [default: False]**
           Do not check the H5 format version. Skipping the check might lead
           to unintended consequences.

       **ignore_hits: bool [default: False]**
           If True, do not read any hit information.

       **cut_mask: str**
           H5 Node path to a boolean cut mask. If specified, use the boolean array
           found at this node as a mask. ``False`` means "skip this event".
           Example: ``cut_mask="/pid/survives_precut"``

       **shuffle: bool, optional [default: False]**
           Shuffle the group_ids, so that the blobs are mixed up.

       **shuffle_function: function, optional [default: np.random.shuffle]**
           The function to be used to shuffle the group IDs.

       **reset_index: bool, optional [default: True]**
           When ``shuffle`` is True, reset the group IDs so that
           counting starts at 0.
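
   How ``cut_mask``, ``shuffle`` and ``reset_index`` interact can be sketched
   with plain Python lists. ``select_group_ids`` is a hypothetical helper
   written for this example, and ``random.shuffle`` stands in for
   ``np.random.shuffle``. The renumbering shown for ``reset_index`` is an
   assumption based on the parameter description, not a copy of the pump's
   code.

   .. code-block:: python

      import random


      def select_group_ids(group_ids, cut_mask=None, shuffle=False,
                           shuffle_function=random.shuffle):
          """Return the group IDs in the order in which blobs would be read."""
          if cut_mask is not None:
              # False in the mask means "skip this event"
              group_ids = [g for g, keep in zip(group_ids, cut_mask) if keep]
          else:
              group_ids = list(group_ids)
          if shuffle:
              shuffle_function(group_ids)  # shuffles the list in place
          return group_ids


      mask = [True, False, True, True, False]
      selected = select_group_ids(range(5), cut_mask=mask)
      print(selected)  # [0, 2, 3]

      # Assumed reset_index behaviour: blobs are renumbered from 0 in
      # reading order, regardless of their original group_id.
      shuffled = select_group_ids(range(5), cut_mask=mask, shuffle=True)
      for new_id, old_id in enumerate(shuffled):
          print(new_id, old_id)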


   ..
       !! processed by numpydoc !!


   .. method:: configure(self)

      
   .. method:: process(self, blob)

      
   .. method:: get_blob(self, index)

      
   .. method:: finish(self)

      
.. function:: create_index_tuple(group_ids)

   
   A helper function to create index tuples for fast lookup in ``HDF5Pump``.
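
   The docstring does not spell out what the index tuple contains. One
   plausible reading, shown here as a pure-Python sketch, is a pair of
   per-group lookup lists: the first row of each group and the number of rows
   it spans. ``index_tuple_sketch`` is a hypothetical name, and the sketch
   assumes non-negative integer group IDs whose rows are stored next to each
   other.

   .. code-block:: python

      def index_tuple_sketch(group_ids):
          """Return (start_indices, n_items), both indexed by group_id."""
          n_groups = max(group_ids) + 1
          start_indices = [0] * n_groups
          n_items = [0] * n_groups
          for row, gid in enumerate(group_ids):
              if n_items[gid] == 0:
                  start_indices[gid] = row  # first row of this group
              n_items[gid] += 1
          return start_indices, n_items


      group_ids = [0, 0, 0, 1, 2, 2]
      starts, counts = index_tuple_sketch(group_ids)
      print(starts)  # [0, 3, 4]
      print(counts)  # [3, 1, 2]

      # All rows of group 2 can now be sliced out without scanning the column
      gid = 2
      print(group_ids[starts[gid]:starts[gid] + counts[gid]])  # [2, 2]

   With such a lookup, fetching one event costs a single slice instead of a
   comparison against every row.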


   ..
       !! processed by numpydoc !!

   
.. py:class:: HDF5MetaData

   Bases: :class:`km3pipe.core.Module`

   
   Metadata to attach to the HDF5 file.


   :Parameters:

       **data: dict**
           ..


   ..
       !! processed by numpydoc !!


   .. method:: configure(self)

      
.. function:: convert_header_dict_to_table(header_dict)

   
   Convert a header dictionary (usually from aanet) to a ``Table``.


   ..
       !! processed by numpydoc !!