km3pipe.dataclasses
Dataclasses for internal use. Heavily based on Numpy arrays.
Module Contents
Classes
Table() | 2D generic Table with grouping index.
NDArray() | Array with HDF5 metadata.
Vec3(x, y, z) |
Functions
has_structured_dt(arr) | Check if the array representation has a structured dtype.
is_structured(dt) | Check if the dtype is structured.
inflate_dtype(arr, names) | Create structured dtype from a 2d ndarray with unstructured dtype.
km3pipe.dataclasses.has_structured_dt(arr)
Check if the array representation has a structured dtype.
km3pipe.dataclasses.inflate_dtype(arr, names)
Create structured dtype from a 2d ndarray with unstructured dtype.
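As a rough illustration of what these helpers deal with, the following NumPy-only sketch distinguishes structured from unstructured dtypes and builds a structured dtype for a 2D array. The `*_sketch` function names are hypothetical and this is not the km3pipe implementation; it assumes that every column of the 2D array becomes one named field with the array's own element type.

```python
import numpy as np


def is_structured_sketch(dt):
    # A structured dtype has named fields; for plain dtypes `names` is None.
    return np.dtype(dt).names is not None


def has_structured_dt_sketch(arr):
    # Look at the dtype of the array representation of the input.
    return is_structured_sketch(np.asanyarray(arr).dtype)


def inflate_dtype_sketch(arr, names):
    # Assumption: one field per name, all sharing the flat array's dtype.
    arr = np.asanyarray(arr)
    if is_structured_sketch(arr.dtype):
        return arr.dtype
    return np.dtype([(name, arr.dtype) for name in names])


flat = np.arange(6, dtype="f8").reshape(3, 2)
dt = inflate_dtype_sketch(flat, ["x", "y"])
```

Here `flat` has a plain `float64` dtype, while `dt` carries the field names `x` and `y`.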
class km3pipe.dataclasses.Table
Bases: numpy.recarray
2D generic Table with grouping index.
This is a np.recarray subclass with some metadata and helper methods.
You can initialize it directly from a structured numpy array, a pandas DataFrame, or a dictionary of (columnar) arrays. To initialize it from a list of rows or a list of columns, use the appropriate factory.
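Since the class is a np.recarray subclass, the input forms listed above can be pictured with plain NumPy. The sketch below does not use Table itself. It only shows how the same two-column data looks as a structured array, as a dict of columns, and as a list of rows, using NumPy's own record-array constructors. The column names `time` and `dom_id` are made up for illustration.

```python
import numpy as np

rows = [(1.0, 10), (2.0, 20), (3.0, 30)]
columns = {
    "time": np.array([1.0, 2.0, 3.0]),
    "dom_id": np.array([10, 20, 30]),
}

# Structured array viewed as a record array.
structured = np.array(rows, dtype=[("time", "f8"), ("dom_id", "i8")])
from_structured = structured.view(np.recarray)

# Record array built column-wise from the dict of arrays.
from_cols = np.rec.fromarrays(list(columns.values()), names=list(columns.keys()))

# Record array built row-wise from the list of tuples.
from_rows = np.rec.fromrecords(rows, names=["time", "dom_id"])
```

All three results expose their columns as attributes, for example `from_cols.time`.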
This class adds the following to np.recarray:
Parameters:
- data: array-like or dict(array-like)
numpy array with structured/flat dtype, or dict of arrays.
- h5loc: str
Location in HDF5 file where to store the data. [default: ‘/misc’]
- h5singleton: bool
Tables defined as h5singletons are only written once to an HDF5 file. This is used for headers for example (default=False).
- dtype: numpy dtype
Datatype over array. If not specified and data is an unstructured array, names needs to be specified. [default: None]
Attributes:
- h5loc: str
HDF5 group where to write into. (default=‘/misc’)
- split_h5: bool
Split the array into separate arrays, column-wise, when saving to hdf5? (default=False)
- name: str
Human-readable name, e.g. ‘Hits’
- h5singleton: bool
Tables defined as h5singletons are only written once to an HDF5 file. This is used for headers for example (default=False).
Methods
from_dict(arr_dict, dtype=None, **kwargs) | Create a Table from a dict of arrays (similar to pandas).
from_template(data, template, **kwargs) | Create an array from a dict of arrays with a predefined dtype.
sorted(by) | Sort the table by one of its columns.
append_columns(colnames, values) | Append new columns to the table.
to_dataframe() | Return as pandas dataframe.
from_dataframe(df, **kwargs) | Instantiate from a dataframe.
from_rows(list_of_rows, **kwargs) | Instantiate from an array-like with shape (n_rows, n_columns).
from_columns(list_of_columns, **kwargs) | Instantiate from an array-like with shape (n_columns, n_rows).
classmethod from_dict(cls, arr_dict, dtype=None, fillna=False, **kwargs)
Generate a table from a dictionary of arrays.
classmethod from_template(cls, data, template)
Create a table from a predefined datatype.
See the templates_avail property for available names.
Parameters:
- data
Data in a format that the __init__ understands.
- template: str or dict
Name of the dtype template to use from kp.dataclasses_templates, or a dict containing the required attributes (see the other templates for reference).
append_columns(self, colnames, values, **kwargs)
Append new columns to the table.
When appending a single column, values can be a scalar or an array of either length 1 or the same length as this array (the one it’s appended to). In case of multiple columns, values must have the shape list(arrays), and the dimension of each array has to match the length of this array.
See the docs for numpy.lib.recfunctions.append_fields for an explanation of the remaining options.
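Because the remaining options are those of numpy.lib.recfunctions.append_fields, it can help to see that function on a plain structured array. The sketch below uses NumPy only and does not call append_columns; the field names and values are made up. It shows the single-column case and the multi-column case where the data is a list of arrays, each as long as the base array.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

base = np.array(
    [(1, 0.5), (2, 1.5), (3, 2.5)],
    dtype=[("a", "i8"), ("b", "f8")],
)

# Single column: one name and one array with the same length as `base`.
one = rfn.append_fields(base, "c", np.array([10, 20, 30]), usemask=False)

# Multiple columns: a list of names and a matching list of arrays.
two = rfn.append_fields(
    base,
    ["c", "d"],
    [np.array([10, 20, 30]), np.array([0.1, 0.2, 0.3])],
    usemask=False,
)
```

Passing `usemask=False` makes NumPy return an ordinary structured array instead of a masked array.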
drop_columns(self, colnames, **kwargs)
Drop columns from the table.
See the docs for numpy.lib.recfunctions.drop_fields for an explanation of the remaining options.
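For reference, this is how numpy.lib.recfunctions.drop_fields behaves on a plain structured array. The sketch is NumPy-only and does not call drop_columns; the field names are made up for illustration.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

base = np.array(
    [(1, 0.5, 10), (2, 1.5, 20)],
    dtype=[("a", "i8"), ("b", "f8"), ("c", "i8")],
)

# Drop a single field by name.
without_b = rfn.drop_fields(base, "b", usemask=False)

# Drop several fields at once by passing a list of names.
only_a = rfn.drop_fields(base, ["b", "c"], usemask=False)
```

The input array is left unchanged; each call returns a new array with the reduced dtype.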