km3pipe.dataclasses
Dataclasses for internal use. Heavily based on Numpy arrays.
Module Contents
Classes
Table() | 2D generic Table with grouping index.
NDArray() | Array with HDF5 metadata.
Vec3(x, y, z) |
Functions
has_structured_dt(arr) | Check if the array representation has a structured dtype.
is_structured(dt) | Check if the dtype is structured.
inflate_dtype(arr, names) | Create structured dtype from a 2d ndarray with unstructured dtype.
km3pipe.dataclasses.has_structured_dt(arr)
Check if the array representation has a structured dtype.
km3pipe.dataclasses.inflate_dtype(arr, names)
Create structured dtype from a 2d ndarray with unstructured dtype.
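As a rough illustration of what these helpers check, the following NumPy-only sketch mirrors their documented behaviour. The `_sketch` functions are hypothetical stand-ins written for this example, not the km3pipe implementations.

```python
import numpy as np


def is_structured_sketch(dt):
    """Return True if the dtype has named fields (hypothetical helper)."""
    return np.dtype(dt).names is not None


def inflate_dtype_sketch(arr, names):
    """Build a structured dtype with one field per name (hypothetical helper).

    Assumption: every new field simply reuses the element dtype of `arr`.
    """
    arr = np.asanyarray(arr)
    if is_structured_sketch(arr.dtype):
        # Already structured: nothing to inflate.
        return arr.dtype
    return np.dtype([(name, arr.dtype) for name in names])


# A plain 2D float array with two columns and no field names.
flat = np.array([[1.0, 2.0], [3.0, 4.0]])
dt = inflate_dtype_sketch(flat, ["x", "y"])

# Turn the columns into a record array that uses the inflated dtype.
structured = np.rec.fromarrays(list(flat.T), dtype=dt)
```

Here `flat.dtype` is unstructured, while `dt` carries the field names `x` and `y`, so the columns of `structured` can be accessed by name.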
class km3pipe.dataclasses.Table
Bases: numpy.recarray
2D generic Table with grouping index.
This is a np.recarray subclass with some metadata and helper methods.
You can initialize it directly from a structured numpy array, a pandas DataFrame, or a dictionary of (columnar) arrays. To initialize it from a list of rows or a list of columns, use the appropriate factory (from_rows or from_columns).
This class adds the following to np.recarray:
Parameters:
- data: array-like or dict(array-like)
numpy array with structured/flat dtype, or dict of arrays.
- h5loc: str
Location in HDF5 file where to store the data. [default: ‘/misc’]
- h5singleton: bool
Tables defined as h5singletons are only written once to an HDF5 file. This is used for headers for example (default=False).
- dtype: numpy dtype
Datatype of the array. If not specified and data is an unstructured array, names needs to be specified. [default: None]
Attributes:
- h5loc: str
HDF5 group where to write into. (default=‘/misc’)
- split_h5: bool
Split the array into separate arrays, column-wise, when saving to hdf5? (default=False)
- name: str
Human-readable name, e.g. ‘Hits’
- h5singleton: bool
Tables defined as h5singletons are only written once to an HDF5 file. This is used for headers for example (default=False).
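The idea of a record array that carries HDF5 metadata can be sketched in plain NumPy. `MiniTable` below is a hypothetical, stripped-down stand-in and not the actual Table class; it only shows how attributes such as h5loc, h5singleton and name can travel with an np.recarray subclass, including through slicing. The default name "Generic Table" is an assumption made for this example.

```python
import numpy as np


class MiniTable(np.recarray):
    """Hypothetical stand-in for Table: a recarray with metadata attributes."""

    def __new__(cls, data, h5loc="/misc", h5singleton=False, name="Generic Table"):
        # View the structured input as this subclass; no data is copied.
        obj = np.asanyarray(data).view(cls)
        obj.h5loc = h5loc
        obj.h5singleton = h5singleton
        obj.name = name
        return obj

    def __array_finalize__(self, obj):
        # Called for views and slices. Let recarray finish its own setup,
        # then copy the metadata over from the parent array if it has any.
        super().__array_finalize__(obj)
        if obj is None:
            return
        self.h5loc = getattr(obj, "h5loc", "/misc")
        self.h5singleton = getattr(obj, "h5singleton", False)
        self.name = getattr(obj, "name", "Generic Table")


raw = np.array(
    [(1, 0.5), (2, 1.5)],
    dtype=[("dom_id", "i8"), ("time", "f8")],
)
hits = MiniTable(raw, h5loc="/hits", name="Hits")

# Slices keep the metadata because __array_finalize__ propagates it.
first = hits[:1]
```

Columns remain accessible as attributes (`hits.time`), and `first.h5loc` is still `"/hits"` after slicing.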
Methods
from_dict(arr_dict, dtype=None, **kwargs) | Create a Table from a dict of arrays (similar to pandas).
from_template(data, template, **kwargs) | Create an array from a dict of arrays with a predefined dtype.
sorted(by) | Sort the table by one of its columns.
append_columns(colnames, values) | Append new columns to the table.
to_dataframe() | Return as pandas dataframe.
from_dataframe(df, **kwargs) | Instantiate from a dataframe.
from_rows(list_of_rows, **kwargs) | Instantiate from an array-like with shape (n_rows, n_columns).
from_columns(list_of_columns, **kwargs) | Instantiate from an array-like with shape (n_columns, n_rows).
classmethod from_dict(cls, arr_dict, dtype=None, fillna=False, **kwargs)
Generate a table from a dictionary of arrays.
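To make the dict-of-arrays layout concrete, here is a minimal NumPy-only sketch of turning columnar arrays into a record array. It does not call from_dict and ignores the dtype and fillna options; the column names and values are made up for illustration.

```python
import numpy as np

# Hypothetical columnar data: one equally long array per column.
arr_dict = {
    "dom_id": np.array([1, 2, 3]),
    "time": np.array([10.5, 11.0, 12.25]),
}

# Keep the column order of the dict and build one record per row.
names = list(arr_dict.keys())
columns = [np.asarray(arr_dict[name]) for name in names]
rec = np.rec.fromarrays(columns, names=names)

# Sorting by a column, analogous in spirit to sorted(by).
by_time_desc = np.sort(rec, order="time")[::-1]
```

Each dict key becomes a field name, so `rec.time` returns the time column and `rec[0]` the first row.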
classmethod from_template(cls, data, template)
Create a table from a predefined datatype.
See the templates_avail property for available names.
Parameters:
- data
Data in a format that the __init__ understands.
- template: str or dict
Name of the dtype template to use from kp.dataclasses_templates or a dict containing the required attributes (see the other templates for reference).
append_columns(self, colnames, values, **kwargs)
Append new columns to the table.
When appending a single column, values can be a scalar or an array of either length 1 or the same length as this array (the one it’s appended to). In case of multiple columns, values must be a list of arrays (list(arrays)), and the length of each array has to match the length of this array.
See the docs for numpy.lib.recfunctions.append_fields for an explanation of the remaining options.
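Since the remaining options are those of numpy.lib.recfunctions.append_fields, a short sketch of that NumPy function on a plain structured array may help. It does not call append_columns itself. Expanding a scalar with np.full is only an assumption about how a scalar value could be broadcast, and the field names are made up.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

base = np.array(
    [(1, 0.5), (2, 1.5), (3, 2.5)],
    dtype=[("dom_id", "i8"), ("time", "f8")],
)

# Single column: one array with the same length as the base array.
charge = np.array([10.0, 20.0, 30.0])
extended = rfn.append_fields(base, "charge", charge, usemask=False)

# Multiple columns: a list of names and a matching list of arrays.
# Assumption: a scalar is expanded to full length before appending.
flag = np.full(len(base), 1)
multi = rfn.append_fields(base, ["charge", "flag"], [charge, flag], usemask=False)
```

With usemask=False the result is an ordinary structured ndarray instead of a masked array; `base` itself is left unchanged.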
drop_columns(self, colnames, **kwargs)
Drop columns from the table.
See the docs for numpy.lib.recfunctions.drop_fields for an explanation of the remaining options.
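For reference, the underlying NumPy function behaves as in the following sketch, which operates on a plain structured array and does not call drop_columns itself. The field names are made up for illustration.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

base = np.array(
    [(1, 0.5, 10.0), (2, 1.5, 20.0)],
    dtype=[("dom_id", "i8"), ("time", "f8"), ("charge", "f8")],
)

# Drop one or several fields; a new array is returned.
reduced = rfn.drop_fields(base, ["time", "charge"], usemask=False)
```

The original `base` keeps all three fields, while `reduced` only has `dom_id`.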