Hdf5 data format

This chapter describes the Hdf5 dataset format used by Trident. The format is ordinarily read and written by the Hdf5DataStore component.

Layout

The root level of the file contains one group per type of data object. Inside each group is an arbitrary hierarchy of groups, with the innermost group representing one data resource. The datasets and attributes found in that innermost group depend on the type of data resource. The following table shows the relationship between the topmost group name and the data resource type.

Group name	DataResourceType
`ts`	`TIME_SERIES`
`xy`	`XY_CURVE`
`xyts`	`XY_TIME_SERIES`
`ndarrayts`	`ND_ARRAY_TS`
`string_list`	`STRING_LIST`
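As a rough illustration of the table above, the following sketch maps the topmost group of an HDF5 path to its data resource type. The helper name `resource_type_for` and the example path are hypothetical and not part of Trident; only the group names and type names come from the table.

```python
# Mapping copied from the table above.
GROUP_TO_RESOURCE_TYPE = {
    "ts": "TIME_SERIES",
    "xy": "XY_CURVE",
    "xyts": "XY_TIME_SERIES",
    "ndarrayts": "ND_ARRAY_TS",
    "string_list": "STRING_LIST",
}


def resource_type_for(hdf5_path: str) -> str:
    """Return the DataResourceType name for an HDF5 group path (hypothetical helper)."""
    parts = [p for p in hdf5_path.split("/") if p]
    if not parts:
        raise ValueError("path has no top-level group")
    top = parts[0]
    if top not in GROUP_TO_RESOURCE_TYPE:
        raise ValueError(f"unknown top-level group: {top!r}")
    return GROUP_TO_RESOURCE_TYPE[top]


# The path below is made up for illustration only.
print(resource_type_for("/ts/plant/unit1/power"))  # TIME_SERIES
```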

Each data resource must have a path that corresponds to a known resource path in Trident. Those paths can be found under API Reference > Python API > Modules.

Data resource types

`TIME_SERIES` data resource

This represents a TimeSeries object.

data dataset

A composite table with two members:
- t: i64 - Microseconds since the UNIX epoch
- v: f64 - Value

interpolation attribute

This is an integer value which represents the time series interpolation type. Possible values are:
- INSTANT = 0 - Values continue until the next point (stepwise curve)
- LINEAR = 1 - Values between points are linearly interpolated
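To make the two interpolation types concrete, here is a minimal sketch that evaluates a list of (t, v) rows at a given time. It assumes the rows are already loaded into Python tuples and sorted by strictly increasing t. The names `to_micros` and `value_at` are hypothetical, and the behaviour outside the data range (None before the first point, last value held after the last point) is an assumption of this sketch, not something the format specifies.

```python
from bisect import bisect_right
from datetime import datetime, timedelta, timezone

INSTANT = 0  # value holds until the next point (stepwise curve)
LINEAR = 1   # values between points are linearly interpolated

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_micros(when: datetime) -> int:
    """Convert a timezone-aware datetime to microseconds since the UNIX epoch."""
    return (when - EPOCH) // timedelta(microseconds=1)


def value_at(rows, when, interpolation):
    """Evaluate (t, v) rows at `when`; rows must be sorted by increasing t."""
    if interpolation not in (INSTANT, LINEAR):
        raise ValueError(f"unknown interpolation type: {interpolation}")
    t = to_micros(when)
    times = [row[0] for row in rows]
    k = bisect_right(times, t) - 1  # index of the last point with t_k <= t
    if k < 0:
        return None  # assumption: undefined before the first point
    t0, v0 = rows[k]
    if interpolation == INSTANT or k == len(rows) - 1:
        return v0  # assumption: the last value is held after the last point
    t1, v1 = rows[k + 1]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


rows = [(0, 10.0), (60_000_000, 20.0)]  # two points, one minute apart
half_minute = EPOCH + timedelta(seconds=30)
print(value_at(rows, half_minute, INSTANT))  # 10.0
print(value_at(rows, half_minute, LINEAR))   # 15.0
```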

`XY_CURVE` data resource

This represents an XYCurve object.

x and y datasets

These datasets hold the combined x and y coordinates for all the XY curves. Note that the values for all curves are combined into a 1-dimensional dataset. The number of values per curve is found in the s dataset, which is needed when interpreting the data.

interpolation attribute

This is an integer value which represents the interpolation type of the curve. Possible values are:
- INSTANT = 0 - Values continue until the next point (stepwise curve)
- LINEAR = 1 - Values between points are linearly interpolated

`XY_TIME_SERIES` data resource

This represents an XYTimeSeries object.

t dataset

A 1-dimensional dataset of integers representing microseconds since the UNIX epoch.

s dataset

A 1-dimensional dataset of integers representing the number of points per curve. When converting the data to a series of curves, these values must be used to extract the x and y values.

i dataset

A 1-dimensional dataset of integers representing the interpolation type of each curve. Possible values are:
- INSTANT = 0 - Values continue until the next point (stepwise curve)
- LINEAR = 1 - Values between points are linearly interpolated

x and y datasets

These datasets hold the combined x and y coordinates for all the XY curves. Note that the values for all curves are combined into a 1-dimensional dataset. The number of values per curve is found in the s dataset, which is needed when interpreting the data.
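The way the s dataset is used to cut the flat x and y datasets into individual curves can be sketched as follows. This is only an illustration: it assumes the five datasets have already been read into plain Python sequences, and that t, s and i each hold one entry per curve. The function name `split_curves` and the dictionary layout are hypothetical.

```python
def split_curves(t, s, i, x, y):
    """Split flat x/y arrays into per-curve records using the point counts in s."""
    if not (len(t) == len(s) == len(i)):
        raise ValueError("t, s and i must have one entry per curve")
    if len(x) != len(y) or sum(s) != len(x):
        raise ValueError("x and y must both contain sum(s) values")

    curves = []
    start = 0
    for time_us, count, interpolation in zip(t, s, i):
        end = start + count
        curves.append({
            "t": time_us,                    # microseconds since the UNIX epoch
            "interpolation": interpolation,  # 0 = INSTANT, 1 = LINEAR
            "x": list(x[start:end]),
            "y": list(y[start:end]),
        })
        start = end
    return curves


# Two curves: the first with 2 points, the second with 3 points.
curves = split_curves(
    t=[0, 1_000_000],
    s=[2, 3],
    i=[1, 0],
    x=[0.0, 1.0, 0.0, 0.5, 1.0],
    y=[5.0, 6.0, 7.0, 8.0, 9.0],
)
print(curves[1]["x"])  # [0.0, 0.5, 1.0]
print(curves[1]["y"])  # [7.0, 8.0, 9.0]
```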

`ND_ARRAY_TS` data resource

This represents an NDArrayTS object; that is, a time series of N-dimensional arrays.

t dataset

A 1-dimensional dataset of integers representing microseconds since the UNIX epoch.

s dataset

A 1-dimensional strided array of size information. For each time series element, the following values are included:
- ndims: The number of dimensions (1 value)
- dims: The dimension sizes (ndims values)

v dataset

A 1-dimensional strided dataset containing all the values. They are packed one after another, and each NDArray's values can be located by interpreting the s dataset.
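Walking the s and v datasets together might look like the sketch below. It assumes both datasets have been read into plain Python sequences and that each array occupies exactly the product of its dimension sizes in v. The function name `unpack_nd_arrays` is hypothetical. The element order within each array (for example row-major) is not stated here, so the sketch returns the values flat rather than reshaping them.

```python
from math import prod


def unpack_nd_arrays(s, v):
    """Return a list of (dims, flat_values) pairs, one per time series element."""
    arrays = []
    s_pos = 0  # read position in the size dataset
    v_pos = 0  # read position in the value dataset
    while s_pos < len(s):
        ndims = s[s_pos]
        dims = tuple(s[s_pos + 1:s_pos + 1 + ndims])
        if len(dims) != ndims:
            raise ValueError("s dataset ends in the middle of a size record")
        s_pos += 1 + ndims

        count = prod(dims)  # number of values belonging to this array
        flat = list(v[v_pos:v_pos + count])
        if len(flat) != count:
            raise ValueError("v dataset is shorter than the sizes in s require")
        v_pos += count
        arrays.append((dims, flat))

    if v_pos != len(v):
        raise ValueError("v dataset contains values not described by s")
    return arrays


# Two elements: a 2x3 array followed by a 1-dimensional array of length 4.
s = [2, 2, 3, 1, 4]
v = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
for dims, flat in unpack_nd_arrays(s, v):
    print(dims, flat)
# (2, 3) [0, 1, 2, 3, 4, 5]
# (4,) [6, 7, 8, 9]
```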

`STRING_LIST` data resource

This represents a std::vector<std::string> object in C++ and a list[str] object in Python.

v dataset

A 1-dimensional dataset of C-string values.
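Depending on the reader, C-string values may arrive as raw bytes, possibly padded with NUL characters. The sketch below shows one way to turn such values into a Python list[str]. It assumes the bytes are UTF-8 encoded, which is not stated by the format, and the helper name `decode_c_string` is hypothetical.

```python
def decode_c_string(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode a C string, stopping at the first NUL terminator if present."""
    end = raw.find(b"\x00")
    if end != -1:
        raw = raw[:end]
    return raw.decode(encoding)  # assumption: UTF-8 unless told otherwise


# Made-up raw values: one NUL-padded, one without a terminator.
raw_values = [b"alpha\x00\x00\x00", b"beta"]
values = [decode_c_string(raw) for raw in raw_values]
print(values)  # ['alpha', 'beta']
```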
