DUNE-DAQ
DUNE Trigger and Data Acquisition software
The hdf5libs repository contains the classes used to interface between dunedaq data applications (writers and readers) and the HighFive library for reading and writing HDF5 files.
There are two main classes in use: HDF5RawDataFile and HDF5FileLayout.
More details on those classes are in the sections below, but some important interface points:

- HDF5RawDataFile handles only one file at a time. It opens the file on construction and closes it on deletion. If writing multiple files, one will need multiple HDF5RawDataFile objects.
- The general structure of written files is as follows (an illustrative sketch is shown just after this list).
- Note that the names of datasets and groups are not literally those shown in the sketch; they are configurable on writing, and later determined by the "filelayout_params" attribute.
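The sketch below is only a rough illustration of such a layout, assuming a file written with TriggerRecord groups and TPC/PDS detector subgroups; the actual group and dataset names come from the configured file layout.

```
<file root>                      (file attributes: run number, file index, timestamps, ...)
├── TriggerRecord000001
│   ├── TriggerRecordHeader      (dimension-1 char dataset)
│   ├── TPC
│   │   └── APA000
│   │       ├── Link00           (dimension-1 char dataset, one Fragment)
│   │       └── Link01
│   └── PDS
│       └── ...
└── TriggerRecord000002
    └── ...
```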
There are example programs in app, HDF5LIBS_TestWriter and HDF5LIBS_TestReader, that show how to use these classes in simple C++ applications. HDF5LIBS_TestReader.py shows how to read files using HDF5RawDataFile from a Python interface.
The HDF5FileLayout class defines the file layout of dunedaq raw data files. It receives an hdf5filelayout::FileLayoutParams object for configuration, specified in JSON.
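A rough, non-authoritative illustration of such a configuration is shown below. The field names here (a trigger record name prefix, digit paddings, and a per-detector path_param_list) are assumptions about the hdf5filelayout schema rather than a verbatim copy of it; consult the schema itself for the exact parameter names.

```json
{
    "trigger_record_name_prefix": "TriggerRecord",
    "digits_for_trigger_number": 6,
    "digits_for_sequence_number": 4,
    "path_param_list": [
        {
            "detector_group_type": "TPC",
            "detector_group_name": "TPC",
            "region_name_prefix": "APA",
            "digits_for_region_number": 3,
            "element_name_prefix": "Link",
            "digits_for_element_number": 2
        },
        {
            "detector_group_type": "PDS",
            "detector_group_name": "PDS",
            "region_name_prefix": "Region",
            "digits_for_region_number": 3,
            "element_name_prefix": "Element",
            "digits_for_element_number": 2
        }
    ]
}
```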
Under such a configuration, the general file structure follows the sketch shown earlier, with the group and dataset names taken from the configured parameters.
The configuration information for the file layout is written to the attribute "filelayout_params" as a JSON-formatted std::string. When a file is later opened to be read, the file layout parameters are automatically extracted from that attribute and used to populate an HDF5FileLayout member of the HDF5RawDataFile. If no such attributes exist, a set of defaults is currently used.
The constructor for creating a new HDF5RawDataFile for writing takes the output file name along with the run and file metadata and the file layout parameters described below.
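As a minimal sketch only (namespace qualifiers omitted; the argument names, order, and any optional parameters such as a compression level are assumptions, and layout_params stands for a configured hdf5filelayout::FileLayoutParams object; see HDF5RawDataFile.hpp for the authoritative declaration), construction for writing might look like this:

```cpp
// Illustrative only: parameter order and optional arguments are assumptions.
HDF5RawDataFile h5file(
  "swtest_run000123_0000_writer.hdf5", // output file name (hypothetical)
  123,                                 // run number (daqdataformats::run_number_t)
  0,                                   // file index (size_t)
  "writer",                            // application name
  layout_params);                      // hdf5filelayout::FileLayoutParams object
```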
Upon opening the file, at object construction, the following attributes are written:

- "run_number" (daqdataformats::run_number_t)
- "file_index" (size_t)
- "creation_timestamp" (std::string, string translation of the number of milliseconds since epoch)
- "application_name" (std::string)
- "compression_level" (unsigned, zlib compression level 0-9, default: 0)

alongside the file layout parameters as described above.
Upon closing the file, at object destruction, the following attributes are written:

- "recorded_size" (size_t, number of bytes written in datasets)
- "uncompressed_raw_data_size" (size_t, uncompressed number of bytes contained in all TriggerRecord or TimeSlice objects)
- "total_file_size" (size_t, total number of bytes in the file, including datasets, metadata, and unallocated space)
- "closing_timestamp" (std::string, string translation of the number of milliseconds since epoch)

The key interface for writing is the HDF5RawDataFile::write(const daqdataformats::TriggerRecord& tr) member, which takes a TriggerRecord, creates a group in the HDF5 file for it, and then writes all of the underlying data (the TriggerRecordHeader and Fragments) to appropriate datasets and subgroups. All data are written as dimension-1 char arrays, with no change to the input TriggerRecord object.
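As a small usage sketch, assuming h5file is the writable HDF5RawDataFile constructed above and tr is a fully assembled daqdataformats::TriggerRecord received from upstream:

```cpp
// Write one record: a new group is created for this TriggerRecord, and its
// TriggerRecordHeader and Fragments are stored as dimension-1 char datasets.
h5file.write(tr);
// Closing attributes are written and the file is closed when h5file is destroyed.
```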
The constructor for creating a new HDF5RawDataFile for reading needs only an existing file to open; there is no need to provide file layout parameters, as these are read from the file's attributes.
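As a minimal sketch (namespace qualifiers omitted; the single-argument form is an assumption consistent with the statement above that no layout parameters are needed):

```cpp
// Open an existing raw data file for reading; the file layout is recovered
// from the "filelayout_params" attribute stored in the file.
HDF5RawDataFile h5file("swtest_run000123_0000_writer.hdf5");
```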
dunedaq raw data files can be interrogated with any HDF5 reading utilities, and the data payloads for each dataset are simple dimension-1 byte (char) arrays. However, there are a number of useful accessors included in HDF5RawDataFile to aid in file interrogation, traversal, and data extraction:
- get_dataset_paths(std::string top_level_group_name = "") returns all dataset paths (std::vector<std::string>) located beneath the specified group (defaulting to the whole file), including the full list of datasets in any subgroups of the specified group;
- get_all_record_ids() returns an std::set of record IDs (std::pair of record number and sequence number) located in the file;
- get_trigger_record_header_dataset_paths(int max_trigger_records = -1) returns all dataset paths (up to a maximum number of desired trigger records, default is all) for TriggerRecordHeader objects;
- get_all_fragment_dataset_paths(int max_trigger_records = -1) returns all dataset paths (up to a maximum number of desired trigger records, default is all) for Fragment objects of any system type;
- get_trh_ptr(...) members return a unique pointer to a TriggerRecordHeader, with inputs either being a full path as you may get from get_trigger_record_header_dataset_paths(), or an input specifying the desired trigger number;
- get_frag_ptr(...) members return a unique pointer to a Fragment, with inputs either being a full path as you would get from get_all_fragment_dataset_paths(), or the trigger number and GeoID of the desired data (or the elements of the GeoID).
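A short sketch of typical read access using the accessors above (the header path and namespace are assumptions, and the exact overloads, for example whether get_trh_ptr also needs a sequence number, should be checked against HDF5RawDataFile.hpp):

```cpp
#include "hdf5libs/HDF5RawDataFile.hpp" // header path is an assumption
#include <iostream>
#include <string>

void dump_file(const std::string& file_name)
{
  dunedaq::hdf5libs::HDF5RawDataFile h5file(file_name); // read-mode constructor

  // Loop over every (record number, sequence number) pair in the file.
  for (const auto& record_id : h5file.get_all_record_ids()) {
    std::cout << "Record " << record_id.first << "." << record_id.second << "\n";
    auto trh_ptr = h5file.get_trh_ptr(record_id.first); // TriggerRecordHeader by trigger number
  }

  // Path-based access to every Fragment dataset in the file.
  for (const auto& path : h5file.get_all_fragment_dataset_paths()) {
    auto frag_ptr = h5file.get_frag_ptr(path);
    std::cout << path << " : " << frag_ptr->get_size() << " bytes\n";
  }
}
```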
The version 3 updates include:

- the compression_level parameter;
- the uncompressed_raw_data_size and total_file_size attributes;
- the get_compression_level, get_uncompressed_raw_data_size, and get_total_file_size accessors;
- the creation_timestamp and closing_timestamp HDF5 file attributes were changed from string to integer;
- the TriggerRecordHeaderData structure changed (support for additional trigger types was added), and it seemed prudent to update the file_layout_version to reflect this data format change.

Extensive notes on the version 3 updates can be found here.
Version 2 is the initial version of hdf5libs after significant restructuring of many of the existing utilities, including the introduction of the HDF5FileLayout class and the separation of the HDF5RawDataFile class from dfmodules.
Version 1 refers to files that were written before the introduction of the HDF5FileLayout class. Currently, when such a file is read and no file layout attributes are found in it, a default set of file layout parameters is assumed.
Note that for all previously written files, get_dataset_paths() will work to retrieve a list of all proper dataset paths, and get_trh_ptr(path_name) and get_frag_ptr(path_name) will work for data access to TriggerRecordHeaders and Fragments, respectively, where path_name is the full dataset path name. Other accessors for retrieving the underlying data will throw an exception.
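As a short sketch of such path-based access for an old-format file (h5file is an HDF5RawDataFile opened on that file, and the literal path values below are hypothetical examples of entries returned by get_dataset_paths()):

```cpp
// Full dataset paths picked out of the get_dataset_paths() listing (hypothetical values).
std::string trh_path  = "/TriggerRecord00001/TriggerRecordHeader";
std::string frag_path = "/TriggerRecord00001/TPC/APA000/Link00";

auto trh_ptr  = h5file.get_trh_ptr(trh_path);   // path-based access works for old files
auto frag_ptr = h5file.get_frag_ptr(frag_path); // path-based access works for old files
// Accessors that rely on the file layout (e.g. lookups by trigger number) would throw here.
```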