1 change: 1 addition & 0 deletions README.md
@@ -92,6 +92,7 @@ cmake --install build
- **Rank-7** array support
- **Expanded attribute** type coverage
- **FAPL-scoped worker pool** — `h5::create(..., h5::threads{N} | h5::backpressure{M})` opts the file into parallel filter compression; all chunked datasets opened on that file (and pt_t built from them) inherit the pool with async-pipelined dispatch
- **Async-mode scaffold** — `h5::async::fd_t fd = h5::async::create(...)` returns a descriptor whose `operator ::hid_t()` is `= delete`'d so accidental raw-C-API calls fail at compile time; per-fd executor thread serializes HDF5 calls. Operation overloads land in the next PR.
- **HDF5 1.12.2 ceiling** — tested and verified; `H5Dvlen_reclaim` / reference API compatibility
- **Windows MSVC** in the CI matrix
- **ASan + UBSan + TSan** clean on Clang 20
45 changes: 44 additions & 1 deletion docs/filtering-pipeline-rework-report.md
@@ -193,4 +193,47 @@ Back-pressure is bounded by `h5::backpressure{M}`: the producer blocks on the fr

The legacy per-pt_t `h5::filter::threads{N}` constructor from #241 is removed in this cycle (see [Phase 1.4 commit message]). Two parallel threading paths in the pipeline invite contention bugs and confuse the surface; the FAPL pool fully subsumes it.

Phase II (compile-time C-API blocking on `async_fd_t`, full async mode) is tracked separately.
Phase II (compile-time C-API blocking on `h5::async::fd_t`, full async mode) is tracked separately.

## Status — Phase II PR-A (#252, async descriptors + executor scaffold)

Phase II is delivered in two PRs. PR-A (this work) lands the descriptor types, FAPL executor property, and executor thread; PR-B will wire the concept-constrained `h5::write` / `h5::read` / `h5::create(fd, "ds", …)` overloads that branch on `is_async_v<FD>`.
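
The branch on `is_async_v<FD>` can be sketched without HDF5. The snippet below is only an illustration under stated assumptions: `sketch::descriptor`, `sketch::write`, and the returned strings are hypothetical stand-ins, and it uses `if constexpr` where PR-B is described as using concept-constrained overloads. It shows how a single call site can pick the executor path or the direct path from the descriptor type alone.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

namespace sketch {
    // Hypothetical stand-ins: the real descriptors are specializations of
    // h5::impl::detail::hid_t, not this simplified template.
    template <bool Async> struct descriptor { long long handle = -1; };
    using fd_t       = descriptor<false>;  // plays the role of h5::fd_t
    using async_fd_t = descriptor<true>;   // plays the role of h5::async::fd_t

    // Primary template: anything that is not an async descriptor.
    template <class T> struct is_async : std::false_type {};
    // Specialization that matches only the async flavor.
    template <> struct is_async<descriptor<true>> : std::true_type {};

    // decay_t strips references and cv-qualifiers before the lookup.
    template <class T>
    inline constexpr bool is_async_v = is_async<std::decay_t<T>>::value;

    // One entry point; the branch is selected at compile time from FD.
    template <class FD>
    std::string write(const FD& /*fd*/) {
        if constexpr (is_async_v<FD>)
            return "executor";  // would hand the HDF5 call to the fd's executor
        else
            return "direct";    // would call HDF5 on the calling thread
    }

    inline void usage() {
        assert(write(fd_t{}) == "direct");
        assert(write(async_fd_t{}) == "executor");
    }
}
```

Because the mode is carried by the type, the caller never passes an "async" flag; changing `h5::create` to `h5::async::create` would be the only edit needed at the call site.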

### User-visible surface (PR-A)

```cpp
// Namespace shape: nested h5::async::*, parallel to the classic h5::*.
namespace h5::async {
using fd_t = impl::async_hid_t<impl::fd_t, H5Fclose>;
using ds_t = impl::async_did_t<impl::ds_t, H5Dclose>;
using gr_t = impl::async_aid_t<impl::gr_t, H5Gclose>;
using at_t = impl::async_aid_t<impl::at_t, H5Aclose>;

fd_t create(const std::string& path, unsigned flags,
const h5::fcpl_t& fcpl = h5::default_fcpl,
const h5::fapl_t& fapl = h5::default_fapl);
fd_t open (const std::string& path, unsigned flags,
const h5::fapl_t& fapl = h5::default_fapl);
}

// Mode is declared exactly once, at h5::async::create / open.
// Every downstream operation deduces async-ness from the FD type via
// template argument deduction (Phase II PR-B).
```

`h5::async::fd_t` has `operator ::hid_t() = delete`, so a stray `H5Gcreate2(async_fd, …)` is a clean compile error ("use of deleted function") rather than a silent thread-safety hazard.
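
A minimal, HDF5-free sketch of the deleted-conversion technique follows. `raw_id`, `c_api_call`, `classic_fd`, and `async_fd` are hypothetical names standing in for `::hid_t`, a raw C API function, and the two descriptor flavors; the real wrappers carry more state.

```cpp
#include <cassert>
#include <type_traits>

namespace sketch {
    using raw_id = long long;                           // stand-in for ::hid_t
    inline raw_id c_api_call(raw_id id) { return id; }  // stand-in for a raw C API function

    struct classic_fd {
        raw_id handle;
        operator raw_id() const { return handle; }  // implicit hand-off is allowed
    };

    struct async_fd {
        raw_id handle;
        // Overload resolution still finds this operator, then rejects the
        // call because the function is deleted.
        operator raw_id() const = delete;
    };

    // The difference is observable at compile time.
    static_assert( std::is_convertible_v<classic_fd, raw_id>);
    static_assert(!std::is_convertible_v<async_fd,  raw_id>);

    inline void usage() {
        classic_fd fd{42};
        async_fd   afd{42};
        assert(c_api_call(fd) == 42);          // implicit conversion
        // c_api_call(afd);                    // would not compile: deleted function
        assert(c_api_call(afd.handle) == 42);  // internal code reads `handle` explicitly
    }
}
```

Deleting the operator, rather than simply omitting it, keeps the diagnostic specific: the compiler names the deleted conversion instead of reporting a generic "no matching function".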

### Mechanism (PR-A)

- `h5cpp/H5executor.hpp` — single worker thread per async fd, `std::packaged_task` in a `shared_ptr` wrapped in `std::function<void()>`, `submit_and_wait<Fn>(Fn&&)` blocking the caller via the future. Exceptions propagate back through `future::get`; same-thread re-entry runs the callable inline.
- `h5cpp/H5Pfapl_async.hpp` — FAPL executor slot using the same `H5Pinsert2` + shared-ptr pattern as Phase I. Kept as a defensive utility; the primary code path does not depend on it.
- `h5cpp/H5async.hpp` — the two factories. Each constructs an `executor_t`, then calls `H5Fcreate` / `H5Fopen`, then returns `h5::async::fd_t{raw_hid, exec}`.
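
The executor described in the first bullet can be approximated as follows. This is a sketch under assumptions, not the contents of `H5executor.hpp`: the class name `sketch::executor`, its queue, and its shutdown logic are hypothetical. It does follow the stated shape: one worker thread, a `std::packaged_task` held in a `shared_ptr` so the job fits in a copyable `std::function<void()>`, a blocking `submit_and_wait`, exception propagation through `future::get`, and inline execution on same-thread re-entry.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <stdexcept>
#include <thread>
#include <type_traits>
#include <utility>

namespace sketch {
class executor {
public:
    executor() : worker_([this] { run(); }) {}
    ~executor() {
        { std::lock_guard<std::mutex> lk(m_); stop_ = true; }
        cv_.notify_one();
        worker_.join();  // drains remaining jobs, then joins
    }
    executor(const executor&) = delete;
    executor& operator=(const executor&) = delete;

    template <class Fn>
    auto submit_and_wait(Fn&& fn) -> std::invoke_result_t<Fn&> {
        using R = std::invoke_result_t<Fn&>;
        // Re-entry from the worker itself: run inline, otherwise the worker
        // would wait on a job that only it can execute.
        if (std::this_thread::get_id() == worker_.get_id())
            return fn();
        // packaged_task is move-only; the shared_ptr makes the closure copyable.
        auto task = std::make_shared<std::packaged_task<R()>>(std::forward<Fn>(fn));
        std::future<R> fut = task->get_future();
        { std::lock_guard<std::mutex> lk(m_); q_.push([task] { (*task)(); }); }
        cv_.notify_one();
        return fut.get();  // blocks; rethrows anything the callable threw
    }

private:
    void run() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lk(m_);
                cv_.wait(lk, [this] { return stop_ || !q_.empty(); });
                if (q_.empty()) return;  // stop requested and queue drained
                job = std::move(q_.front());
                q_.pop();
            }
            job();
        }
    }
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> q_;
    bool stop_ = false;
    std::thread worker_;  // declared last so the members above exist before it starts
};

inline int demo() {
    executor ex;
    int a = ex.submit_and_wait([] { return 40; });
    // A nested submit issued from the worker thread runs inline.
    int b = ex.submit_and_wait([&ex] { return ex.submit_and_wait([] { return 2; }); });
    return a + b;
}

inline bool demo_rethrows() {
    executor ex;
    try {
        ex.submit_and_wait([]() -> int { throw std::runtime_error("boom"); });
    } catch (const std::runtime_error&) {
        return true;  // the worker's exception surfaced on the calling thread
    }
    return false;
}
}
```

Serializing every call through one thread trades throughput for a simple invariant: no two HDF5 calls for the same file can overlap, whatever the callers do.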

The executor lives **directly on the wrapper** (a `std::shared_ptr<executor_t> exec` field on the `false,false` `hid_t` specialization), not retrieved from the file's FAPL via `H5Fget_access_plist`. HDF5 1.10.9 reconstructs the retrieved FAPL from standard properties only — `H5Pinsert2`-installed properties do not survive the round-trip. (This also affects Phase I's pool resolution path; tracked as a follow-up.)
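
Direct storage can be illustrated with a small sketch. All names here (`sketch::async_handle`, `open_child`, `create`, the `executor` body) are hypothetical and no HDF5 call is made; the point is that a `std::shared_ptr` member works with a forward-declared executor type, and that a child descriptor can inherit the executor by copying the pointer instead of querying a property list.

```cpp
#include <cassert>
#include <memory>

namespace sketch {
    struct executor;           // forward declaration only; incomplete at this point
    using raw_id = long long;  // stand-in for ::hid_t

    // Hypothetical async wrapper: the executor travels with the handle.
    struct async_handle {
        raw_id handle = -1;
        std::shared_ptr<executor> exec;  // legal with an incomplete type
    };

    // Hypothetical mode-transitive factory: the child shares the parent's
    // executor by copying the shared_ptr (reference count goes up by one).
    inline async_handle open_child(const async_handle& parent, raw_id child_id) {
        return async_handle{child_id, parent.exec};
    }

    // The complete type is required only where the executor is created.
    struct executor { int id = 7; };

    inline async_handle create(raw_id file_id) {
        return async_handle{file_id, std::make_shared<executor>()};
    }

    inline void usage() {
        async_handle fd = create(1);
        async_handle ds = open_child(fd, 2);
        assert(ds.exec == fd.exec);         // same executor object
        assert(fd.exec.use_count() == 2);   // one reference per wrapper
    }
}
```

The executor's lifetime then follows the wrappers: it is destroyed when the last descriptor holding a reference goes away, independent of what HDF5 does with the FAPL.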

### Out of scope for PR-A (lands in PR-B)

- Concept-constrained overloads of `h5::write`, `h5::read`, `h5::append`, `h5::flush`, `h5::create(fd, "ds", …)`.
- Mode-transitive factory pattern (async fd → async ds → async at).
- `h5::pt_t` as a class template `template <class DS = h5::ds_t>` deduced via CTAD.
- Performance benchmarks vs. HDF5 `--enable-threadsafe`.
176 changes: 171 additions & 5 deletions h5cpp/H5Iall.hpp
@@ -10,6 +10,7 @@
#include <string>
#include <vector>
#include <tuple>
#include <memory> /* std::shared_ptr — async descriptor exec field */
#include <initializer_list>

#ifdef H5CPP_CONVERSION_IMPLICIT
@@ -32,6 +33,18 @@ namespace h5::impl {
};
//forward declarations
struct at_t;

// Phase II — async descriptors carry a shared_ptr<executor_t> field
// directly on the wrapper. Why direct storage and not the FAPL slot
// pattern from Phase I: HDF5 1.10.9's H5Fget_access_plist returns a
// synthetic FAPL reconstructed from standard properties only; user
// properties installed via H5Pinsert2 are dropped. Storing the
// executor inside the wrapper class lets operation overloads in
// Phase II PR-B reach it as `fd.exec` without round-tripping through
// HDF5's property machinery. std::shared_ptr's type-erased deleter
// makes the forward declaration sufficient — the complete type is
// only needed at h5::async::create / open (defined in H5async.hpp).
struct executor_t;
}

namespace h5::impl::detail {
@@ -110,13 +123,133 @@ namespace h5::impl::detail {
::hid_t handle;
};

// disable from CAPI and TOCAPI conversions
//conversion ctor to packet table enabled, used for h5::impl::ds_t
// Phase II async-mode specialization — operator ::hid_t() is = delete'd so
// user code that accidentally hands an async descriptor to a raw HDF5 C
// API fails to compile with a clear "use of deleted function" diagnostic.
// h5cpp internal code reaches the raw handle via the public `handle`
// field (see workplan §4.4); user code routes through h5::write / h5::read
// / etc. which detect the type via is_async_v<> and dispatch through the
// executor stored on the wrapper (`exec`).
template<class T, capi_close_t capi_close>
struct hid_t<T,capi_close, false,false,hdf5::any> : private hid_t<T,capi_close,true,true,hdf5::any> {
using parent = hid_t<T,capi_close,true,true,hdf5::any>;
struct hid_t<T,capi_close, false,false,hdf5::any> {
using hidtype = T;

// from CAPI — mirrors the true,true ctor; explicit so an accidental
// implicit promotion from ::hid_t doesn't slip an async wrapper in.
H5CPP__EXPLICIT hid_t( ::hid_t handle_ ) : handle( handle_ ){
if( H5Iis_valid( handle_ ) )
H5Iinc_ref( handle_ );
}

// Factory ctor — h5::async::create / open construct the executor
// during file creation and inject it here so operation overloads
// (Phase II PR-B) can reach it via `fd.exec`. Used by mode-
// transitive factories too (ds_t inherits parent fd's executor).
hid_t( ::hid_t handle_, std::shared_ptr<h5::impl::executor_t> e ) noexcept
: handle( handle_ ), exec( std::move(e) ) {}

// TO CAPI — DELETED. Async descriptors must not be implicitly
// converted back to ::hid_t; doing so would let user code call
// HDF5 directly and bypass the executor thread. Internal code
// reads the raw value from `handle` directly.
operator ::hid_t() const = delete;

// direct-initialization ctor; matches the classic shape — does not
// increment the refcount (caller owns the handle).
hid_t( std::initializer_list<::hid_t> fd ) : handle( *fd.begin() ){}

hid_t() : handle(H5I_UNINIT) {}

hid_t( const hid_t& ref ){
handle = ref.handle;
if( H5Iis_valid( handle ) )
H5Iinc_ref( handle );
exec = ref.exec; // shared_ptr copy bumps refcount
}
hid_t& operator=( const hid_t& ref ){
if( this == &ref ) return *this;
if( H5Iis_valid( handle ) )
capi_close( handle );
handle = ref.handle;
if( H5Iis_valid( handle ) )
H5Iinc_ref( handle );
exec = ref.exec;
return *this;
}
hid_t( hid_t&& ref ) noexcept {
handle = ref.handle;
ref.handle = H5I_UNINIT;
exec = std::move(ref.exec);
}
hid_t& operator=( hid_t&& ref ) noexcept {
if( this == &ref ) return *this;
if( H5Iis_valid( handle ) )
capi_close( handle );
handle = ref.handle;
ref.handle = H5I_UNINIT;
exec = std::move(ref.exec);
return *this;
}
~hid_t(){
if( H5Iis_valid( handle ) )
capi_close( handle );
}

// Public so internal h5cpp code (the executor, dispatch lambdas)
// can read the raw id without invoking the deleted conversion.
// User code is expected to use h5::write / h5::read / h5::async::*
// factories rather than touch this field directly.
::hid_t handle;

// Phase II — shared_ptr to the executor that owns this descriptor's
// HDF5 lifetime. Populated by h5::async::create / open at the
// file-level, then propagated to derived descriptors (async ds,
// async at, etc.) by mode-transitive factories. May be null on
// default-constructed async wrappers (un-initialized state).
std::shared_ptr<h5::impl::executor_t> exec;
};

// Phase II — async dataset id. Mirrors hdf5::dataset (line above) but
// with conversion to ::hid_t deleted. Adds the `dapl` field and the
// attribute subscript operator the classic ds_t exposes.
template<class T, capi_close_t capi_close>
struct hid_t<T,capi_close, false,false,hdf5::dataset>
: public hid_t<T,capi_close,false,false,hdf5::any> {
using parent = hid_t<T,capi_close,false,false,hdf5::any>;
using parent::parent;
using parent::handle;
using hidtype = T;
hid_t( std::initializer_list<::hid_t> fd ) : parent( fd ){}
using at_t = hid_t<h5::impl::at_t,H5Aclose,false,false,hdf5::attribute>;

hid_t(){
this->handle = H5I_UNINIT;
this->dapl = H5I_UNINIT;
}
at_t operator[]( const char arg[] );

::hid_t dapl;
};

// Phase II — async attribute id.
template<class T, capi_close_t capi_close>
struct hid_t<T,capi_close, false,false,hdf5::attribute>
: public hid_t<T,capi_close,false,false,hdf5::any> {
using parent = hid_t<T,capi_close,false,false,hdf5::any>;
using parent::parent;
using parent::handle;
using hidtype = T;
using at_t = hid_t<h5::impl::at_t,H5Aclose,false,false,hdf5::attribute>;

hid_t(){
this->handle = H5I_UNINIT;
this->ds = H5I_UNINIT;
}

template <class V> at_t operator=( V arg );
template <class V> at_t operator=( const std::initializer_list<V> args ){ return at_t{H5I_UNINIT}; }

::hid_t ds;
std::string name;
};
/*property id*/
template<class T, capi_close_t capi_close>
@@ -176,6 +309,14 @@ namespace h5::impl {
template <class T, capi_close_t capi_call> using hid_t = detail::hid_t<T,capi_call, true,true,detail::hdf5::any>;
template <class T, capi_close_t capi_call> using pid_t = detail::hid_t<T,capi_call, true,true,detail::hdf5::property>;
template <class T, capi_close_t capi_call> using did_t = detail::hid_t<T,capi_call, true,true,detail::hdf5::dataset>;

// Phase II — async-mode variants. Same shape as the classic aliases
// above but with operator ::hid_t() = delete'd at the type level.
// Users opt in by calling h5::async::create / h5::async::open; everything
// downstream deduces these types through template argument deduction.
template <class T, capi_close_t capi_call> using async_aid_t = detail::hid_t<T,capi_call, false,false,detail::hdf5::attribute>;
template <class T, capi_close_t capi_call> using async_hid_t = detail::hid_t<T,capi_call, false,false,detail::hdf5::any>;
template <class T, capi_close_t capi_call> using async_did_t = detail::hid_t<T,capi_call, false,false,detail::hdf5::dataset>;
}

/*hide gory details, and stamp out descriptors */
@@ -203,4 +344,29 @@ namespace h5 {
#undef H5CPP__defaid_t
#undef H5CPP__defpid_t
#undef H5CPP__defhid_t

// Phase II — async-mode descriptor type aliases. Parallel to the
// classic h5::fd_t / h5::ds_t / h5::gr_t / h5::at_t above; the
// underlying class template is the false,false specialization of
// impl::hid_t so any attempt to pass one of these to a raw HDF5
// C-API call fails with "use of deleted function".
namespace async {
using fd_t = impl::async_hid_t<impl::fd_t, H5Fclose>;
using ds_t = impl::async_did_t<impl::ds_t, H5Dclose>;
using at_t = impl::async_aid_t<impl::at_t, H5Aclose>;
using gr_t = impl::async_aid_t<impl::gr_t, H5Gclose>;
using ob_t = impl::async_hid_t<impl::ob_t, H5Oclose>;
}

// Phase II type-trait: is_async_v<T> answers "is T one of the
// h5::async::* descriptors?". Used by concept-constrained operation
// overloads (Phase II PR-B) to pick the executor dispatch branch.
template <class T>
struct is_async : std::false_type {};

template <class T, impl::capi_close_t C, int K>
struct is_async< impl::detail::hid_t<T,C,false,false,K> > : std::true_type {};

template <class T>
inline constexpr bool is_async_v = is_async<std::decay_t<T>>::value;
}
91 changes: 91 additions & 0 deletions h5cpp/H5Pfapl_async.hpp
@@ -0,0 +1,91 @@
/*
* Copyright (c) 2018-2026 Steven Varga, Toronto,ON Canada
* Author: Varga, Steven <steven@vargaconsulting.ca>
*/
#pragma once

// Phase II FAPL executor property — same shared_ptr-in-slot pattern proven
// by Phase I's worker pool (H5Pthreads.hpp). An executor_t lives behind
// a shared_ptr stored in a heap-allocated holder whose address occupies
// the H5Pinsert2 value slot; copy_cb clones the slot but aliases the same
// executor (refcount +1), close_cb drops one slot (refcount -1). When
// the last fd dies, the executor destructor joins the worker thread.
//
// h5::async::create / h5::async::open install this property in addition
// to (and consuming) any h5::threads{N} pool the user chained in. If no
// h5::threads{N} is present, fapl_async_set auto-installs a default-sized
// pool so the executor always has a pool reference.

#include "H5Pthreads.hpp" // worker_pool_t + fapl_threads_set + resolve_worker_pool
#include "H5executor.hpp" // executor_t

#include <hdf5.h>

#include <memory>

namespace h5::impl {

#define H5CPP_FAPL_EXECUTOR "h5cpp_fapl_executor"

// Heap-allocated holder, parallel to worker_pool_slot_t in H5Pthreads.hpp.
struct executor_slot_t {
std::shared_ptr<executor_t> exec;
};

// Copy: HDF5 memcpy'd the slot pointer into the destination. Allocate a
// fresh holder whose shared_ptr aliases the same executor (++refcount).
inline herr_t fapl_exec_copy_cb(const char* /*name*/, size_t /*size*/, void* value) {
auto** slot_loc = static_cast<executor_slot_t**>(value);
*slot_loc = new executor_slot_t{(*slot_loc)->exec};
return 0;
}

// Close: delete one holder; shared_ptr drops one reference. The
// executor_t destructor (which joins the worker thread) runs when the
// last reference is released.
inline herr_t fapl_exec_close_cb(const char* /*name*/, size_t /*size*/, void* ptr) {
delete *static_cast<executor_slot_t**>(ptr);
return 0;
}

// Setter invoked by h5::async::create / h5::async::open. Idempotent —
// if the FAPL already has the property installed, leaves it alone.
// Auto-installs a default-sized worker_pool_t (n=0) when no h5::threads{N}
// was chained into the FAPL, so the executor always has a pool to hand
// to compression callbacks (Phase II PR-B wires that connection).
inline herr_t fapl_async_set(::hid_t fapl) {
if (H5Pexist(fapl, H5CPP_FAPL_EXECUTOR) > 0) return 0; // htri_t: only a positive value means "present"

auto pool = resolve_worker_pool(fapl);
if (!pool) {
// No h5::threads{N} in the chain — install a default-sized pool
// so the executor has a compression backend available.
fapl_threads_set(fapl, 0); // 0 → hardware_concurrency()
pool = resolve_worker_pool(fapl);
}

auto* slot = new executor_slot_t{
std::make_shared<executor_t>(std::move(pool))
};
return H5Pinsert2(fapl, H5CPP_FAPL_EXECUTOR,
sizeof(executor_slot_t*), &slot,
nullptr, // set
nullptr, // get
nullptr, // prp_del
fapl_exec_copy_cb,
nullptr, // compare
fapl_exec_close_cb);
}

// Consumer-site: given a FAPL id, retrieve the executor shared_ptr if one
// is installed. Returns nullptr when the property is absent (classic-mode
// FAPL — caller should not have been routed here).
inline std::shared_ptr<executor_t> resolve_executor(::hid_t fapl_id) noexcept {
if (fapl_id < 0 || H5Iis_valid(fapl_id) <= 0) return nullptr;
if (H5Pexist(fapl_id, H5CPP_FAPL_EXECUTOR) <= 0) return nullptr; // absent, or the query itself failed
executor_slot_t* slot = nullptr;
if (H5Pget(fapl_id, H5CPP_FAPL_EXECUTOR, &slot) < 0) return nullptr; // do not trust slot after a failed read
return slot ? slot->exec : nullptr;
}

} // namespace h5::impl