HDF5 data format

This chapter describes the HDF5 file format used for Trident datasets. The format is ordinarily read and written by the Hdf5DataStore component.

Layout

The root level of the file contains one group per type of data object. Inside each of these groups is an arbitrary hierarchy of groups, with the innermost group representing one data resource. The datasets and attributes found in that innermost group depend on the type of data resource. The following table shows the relationship between the topmost group name and the data resource type.

Group name     DataResourceType

ts             TIME_SERIES
xy             XY_CURVE
xyts           XY_TIME_SERIES
ndarrayts      ND_ARRAY_TS
string_list    STRING_LIST

Each data resource must have a path that corresponds to a known resource path in Trident. Those paths can be found under API Reference > Python API > Modules.
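
As a rough illustration of the layout, the sketch below maps the topmost group name of an HDF5 group path to its data resource type using the table above. The helper name classify_group_path and the example paths are hypothetical, and the remainder of the path is returned unchanged because its exact relation to a Trident resource path is not specified here.

```python
# Mapping taken from the table above: topmost group name -> DataResourceType.
GROUP_TO_RESOURCE_TYPE = {
    "ts": "TIME_SERIES",
    "xy": "XY_CURVE",
    "xyts": "XY_TIME_SERIES",
    "ndarrayts": "ND_ARRAY_TS",
    "string_list": "STRING_LIST",
}


def classify_group_path(hdf5_path):
    """Split an HDF5 group path into (resource type, remaining path).

    Hypothetical helper: the remaining path is returned as-is, since how it
    maps to a Trident resource path is an assumption left to the caller.
    """
    parts = [p for p in hdf5_path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"expected '/<group>/<path>', got {hdf5_path!r}")
    top, rest = parts[0], parts[1:]
    if top not in GROUP_TO_RESOURCE_TYPE:
        raise ValueError(f"unknown topmost group: {top!r}")
    return GROUP_TO_RESOURCE_TYPE[top], "/".join(rest)
```

For example, classify_group_path("/ts/plant/unit1/power") yields the type TIME_SERIES together with the remainder "plant/unit1/power" (a made-up path used only for illustration).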

Data resource types

TIME_SERIES data resource

This represents a TimeSeries object.

  • data dataset

    A composite table with two members:

    • t: i64 - Microseconds since the UNIX epoch

    • v: f64 - Value

  • interpolation attribute

    This is an integer value which represents the time series interpolation type. Possible values are:

    • INSTANT = 0 - Values continue until the next point (stepwise curve)

    • LINEAR = 1 - Values between points are linearly interpolated
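
The difference between the two interpolation values can be sketched as follows. This is a minimal illustration, not the Trident implementation: the function name value_at is hypothetical, the rows are assumed to be sorted by t, and returning None outside the covered time range is an assumption of this sketch.

```python
from bisect import bisect_right

# Interpolation values as listed above.
INSTANT = 0
LINEAR = 1


def value_at(points, interpolation, t_us):
    """Evaluate a time series at t_us (microseconds since the UNIX epoch).

    points: list of (t, v) rows sorted by t, as stored in the data dataset.
    Returns None before the first or after the last point (sketch assumption).
    """
    if interpolation not in (INSTANT, LINEAR):
        raise ValueError(f"unknown interpolation value: {interpolation!r}")
    if not points or t_us < points[0][0] or t_us > points[-1][0]:
        return None
    times = [t for t, _ in points]
    i = bisect_right(times, t_us) - 1  # last point with t <= t_us
    t0, v0 = points[i]
    if interpolation == INSTANT or i == len(points) - 1:
        # Stepwise curve, or t_us falls exactly on the last point.
        return v0
    t1, v1 = points[i + 1]  # t1 > t_us >= t0, so t1 - t0 is never zero
    return v0 + (v1 - v0) * (t_us - t0) / (t1 - t0)
```

With the rows (0, 10.0) and (1000000, 20.0), a query at 500000 microseconds gives 10.0 for INSTANT and 15.0 for LINEAR.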

XY_CURVE data resource

This represents an XYCurve object.

  • x and y datasets

    These represent the combined x and y coordinates for all the XY curves. Note that the values for all curves are combined into a 1-dimensional dataset. The number of values per curve is found in the s dataset, which is needed when interpreting the data.

  • interpolation attribute

    This is an integer value which represents the curve's interpolation type. Possible values are:

    • INSTANT = 0 - Values continue until the next point (stepwise curve)

    • LINEAR = 1 - Values between points are linearly interpolated

XY_TIME_SERIES data resource

This represents an XYTimeSeries object.

  • t dataset

    A 1-dimensional dataset of integers representing microseconds since the UNIX epoch.

  • s dataset

    A 1-dimensional dataset with integers representing the number of points per curve. When converting the data to a series of curves, these values must be used to extract the x and y values.

  • i dataset

    A 1-dimensional dataset of integers, which represent the interpolation type for each curve. Possible values are:

    • INSTANT = 0 - Values continue until the next point (stepwise curve)

    • LINEAR = 1 - Values between points are linearly interpolated

  • x and y datasets

    These represent the combined x and y coordinates for all the XY curves. Note that the values for all curves are combined into a 1-dimensional dataset. The number of values per curve is found in the s dataset, which is needed when interpreting the data.
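
The way the s dataset is used to cut the flat x and y datasets into individual curves can be sketched as below. The function name split_curves and the dictionary keys are hypothetical, and the sketch assumes that t, s and i each hold one entry per curve, in the same order.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def split_curves(t, s, i, x, y):
    """Rebuild individual curves from the flat t, s, i, x and y datasets."""
    if not (len(t) == len(s) == len(i)):
        raise ValueError("t, s and i must have one entry per curve")
    if len(x) != len(y) or sum(int(n) for n in s) != len(x):
        raise ValueError("x and y must each hold sum(s) values")
    curves = []
    offset = 0  # start of the current curve inside x and y
    for t_us, count, interp in zip(t, s, i):
        count = int(count)
        curves.append({
            # t is stored as microseconds since the UNIX epoch
            "time": EPOCH + timedelta(microseconds=int(t_us)),
            "interpolation": int(interp),  # 0 = INSTANT, 1 = LINEAR
            "x": list(x[offset:offset + count]),
            "y": list(y[offset:offset + count]),
        })
        offset += count
    return curves
```

With s = [2, 3], the first curve takes the first two values of x and y, and the second curve takes the following three.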

ND_ARRAY_TS data resource

This represents an NDArrayTS object; that is, a time series of N-dimensional arrays.

  • t dataset

    A 1-dimensional dataset of integers representing microseconds since the UNIX epoch.

  • s dataset

    A 1-dimensional strided array of size information. For each time series element, the following values are included:

    • ndims: The number of dimensions (1 value)

    • dims: The dimension sizes (ndims values)

  • v dataset

    A 1-dimensional strided dataset containing all the values. They are packed one after another, and each NDArray's values can be located by interpreting the s dataset.
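
Walking the s and v datasets together can be sketched as follows. The function name unpack_ndarrays is hypothetical, and the sketch assumes that each array occupies as many entries in v as the product of its dimension sizes. The values are returned flat together with their shape, since the element order within an array is not stated here.

```python
from math import prod


def unpack_ndarrays(s, v):
    """Return one (shape, flat_values) pair per time series element."""
    arrays = []
    s_pos = 0  # read position in the size dataset
    v_pos = 0  # read position in the value dataset
    while s_pos < len(s):
        ndims = int(s[s_pos])
        shape = tuple(int(d) for d in s[s_pos + 1:s_pos + 1 + ndims])
        if len(shape) != ndims:
            raise ValueError("s dataset ends in the middle of a size record")
        count = prod(shape)  # assumption: number of values = product of dims
        if v_pos + count > len(v):
            raise ValueError("v dataset holds fewer values than s describes")
        arrays.append((shape, list(v[v_pos:v_pos + count])))
        s_pos += 1 + ndims
        v_pos += count
    if v_pos != len(v):
        raise ValueError("v dataset holds more values than s describes")
    return arrays
```

For s = [2, 2, 3, 1, 4], the first element is a 2 by 3 array that takes the first six entries of v, and the second is a 1-dimensional array of four values that takes the next four.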

STRING_LIST data resource

This represents a std::vector<std::string> object in C++ and a list[str] object in Python.

  • v dataset

    A 1-dimensional dataset of C-string values.