
Network Storage (NetworkIO) Implementation for SST Firefly NIC #2643

Open
RishankPratikHPLabs wants to merge 2 commits into sstsimulator:devel from RishankPratikHPLabs:nwStorage-feature

@RishankPratikHPLabs

Network Storage (NetworkIO) Implementation for SST Firefly NIC

What this does

Adds network-attached SSD storage simulation to the Firefly NIC. Compute nodes can now issue async read/write operations to storage nodes over the simulated network, with a SimpleSSD model handling the storage-side latency and bandwidth.

We needed this to study I/O performance in distributed storage setups — how network latency, SSD throughput, node placement, and access patterns interact — without needing real hardware. This plugs into the existing Ember/Firefly/Hermes stack, so existing MPI simulations are unaffected.


How it works

[Figure: Architecture Design]

The diagram above shows a read operation end-to-end. The client application calls `nwio_read()`, which goes through RDMA over the Slingshot network model to reach the SSD server. On the server side, the request gets scheduled into one of N queue lanes (round-robin), processed by the SSD with a bandwidth+latency delay, and the response is sent back over the network to the client's RDMA memory.

Here's what each layer does:

Hermes API (networkIOapi.h) — defines two calls: networkIORead(dest, offset, length, callback) and networkIOWrite(offset, src, length, callback). Both are async — you pass a callback that fires when the op completes.

Hades (hadesNetworkIO) — takes the global byte offset and figures out which storage node to hit. Uses a simple modulo scheme: nodeIndex = (offset / storageNodeCapacity) % numStorageNodes. Then hands it down to the NIC.
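To make the mapping concrete, here is a minimal Python sketch of the formula above. It is an illustration only, not the Hades code: the function name `storage_node_index` and the capacity and node-count values are hypothetical.

```python
def storage_node_index(offset: int, storage_node_capacity: int, num_storage_nodes: int) -> int:
    """Map a global byte offset to a storage node index.

    Implements nodeIndex = (offset / storageNodeCapacity) % numStorageNodes
    with integer division.
    """
    if offset < 0:
        raise ValueError("offset must be non-negative")
    if storage_node_capacity <= 0 or num_storage_nodes <= 0:
        raise ValueError("capacity and node count must be positive")
    return (offset // storage_node_capacity) % num_storage_nodes


# Hypothetical values chosen only to keep the arithmetic easy to follow.
CAPACITY = 1024  # bytes per storage node
NODES = 2        # number of storage nodes

mapping = [storage_node_index(off, CAPACITY, NODES) for off in (0, 1023, 1024, 2048, 3072)]
print(mapping)
```

With this formula, each contiguous block of `storageNodeCapacity` bytes lands on one node, and offsets past the combined capacity wrap around to node 0 again.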

Firefly NIC (nicNetworkIO, nicNetworkIOSendEntry, nicNetworkIOStream) — this is where most of the work happens:

  • On the compute side: the NIC creates a send entry (read or write), generates a response key (respKey) with a callback attached, and queues a small request packet: [MsgHdr::NetworkIO][Read|Write][offset][addr][length][respKey]. The respKey gets stored so the NIC can match the ACK later.

  • On the storage side: a NetworkIOStream receives the packet, extracts the op type and respKey, and passes it to SimpleSSD (or falls back to DMA if SimpleSSD isn't loaded). When the SSD finishes, the stream creates a tiny ACK packet containing just the respKey and sends it back.

  • Back on the compute side: another NetworkIOStream receives the ACK, pulls out the respKey, looks up the stored callback via getRespKeyValue(), and invokes it. This completes the async operation.
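The three steps above can be sketched as a small Python model. This is not the Firefly code: the class names `ComputeNic` and `StorageNic` are hypothetical, packets are plain lists rather than wire-format buffers, and removing the key once the callback fires is an assumption (the description only says the callback is looked up via getRespKeyValue()).

```python
import itertools
from typing import Callable, Dict, List


class ComputeNic:
    """Toy compute-side NIC: issues requests and matches ACKs by respKey."""

    def __init__(self) -> None:
        self._next_key = itertools.count(1)
        self._pending: Dict[int, Callable[[], None]] = {}

    def build_request(self, op: str, offset: int, addr: int, length: int,
                      callback: Callable[[], None]) -> List:
        resp_key = next(self._next_key)
        self._pending[resp_key] = callback  # stored so the ACK can be matched later
        # Field order follows the packet layout given in the description.
        return ["NetworkIO", op, offset, addr, length, resp_key]

    def on_ack(self, resp_key: int) -> None:
        # Assumption: the entry is dropped once its callback has been invoked.
        callback = self._pending.pop(resp_key)
        callback()


class StorageNic:
    """Toy storage-side NIC: consumes a request and returns the ACK payload."""

    def handle(self, packet: List) -> int:
        hdr, op, offset, addr, length, resp_key = packet
        if hdr != "NetworkIO" or op not in ("Read", "Write"):
            raise ValueError("not a NetworkIO request")
        # The SSD (or DMA fallback) work would happen here.
        return resp_key  # the ACK carries only the respKey


completed: List[str] = []
compute, storage = ComputeNic(), StorageNic()
request = compute.build_request("Read", 4096, 0x1000, 1000,
                                lambda: completed.append("read done"))
compute.on_ack(storage.handle(request))
print(completed)
```

Keeping the ACK down to just the key means the compute side owns all completion state, which is why the key has to be stored before the request leaves the NIC.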

NetworkIO traffic stays separated from MPI — it uses MsgHdr::NetworkIO (vs MsgHdr::Msg), its own NetworkIOStream (vs MsgStream), and vNic 0.

Ember (TestNetworkIO motif) — the workload layer. Issues configurable read/write calls with messageSize, iterations, op type, and fileSize params. Randomizes offsets within the file range.
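As a rough illustration of the offset randomization, the sketch below draws one offset per iteration. The helper name `random_offsets` and the seed are hypothetical, and keeping each access fully inside the file (offset + messageSize <= fileSize) is an assumption; the motif may bound or align offsets differently.

```python
import random
from typing import List


def random_offsets(file_size: int, message_size: int, iterations: int, seed: int = 0) -> List[int]:
    """Return one random start offset per iteration, each inside the file range."""
    if message_size > file_size:
        raise ValueError("messageSize must not exceed fileSize")
    rng = random.Random(seed)             # seeded so a run is reproducible
    last_start = file_size - message_size  # assumption: accesses never run past the file end
    return [rng.randint(0, last_start) for _ in range(iterations)]


offsets = random_offsets(file_size=1_000_000, message_size=1000, iterations=5)
print(offsets)
```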

SimpleSSD — sits on each storage node as a subcomponent. Has a multi-lane queue structure (nSSDsPerNode × queuesCountPerSSD lanes). Requests go round-robin across lanes. Each request gets a delay of overheadLatency + (bytes / bandwidth), scheduled via a self-link timer. When the delay fires, the completion callback runs and the ACK gets sent.
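The round-robin lane selection can be sketched as follows. `LaneScheduler` is a hypothetical name, not a class from this PR, and the sketch covers only lane choice, not the self-link timer or completion callback.

```python
class LaneScheduler:
    """Hands out queue lanes in round-robin order across all SSD x queue lanes."""

    def __init__(self, n_ssds_per_node: int = 1, queues_count_per_ssd: int = 4) -> None:
        # Total lanes = nSSDsPerNode x queuesCountPerSSD, as in the description.
        self.num_lanes = n_ssds_per_node * queues_count_per_ssd
        if self.num_lanes <= 0:
            raise ValueError("need at least one lane")
        self._next = 0

    def next_lane(self) -> int:
        lane = self._next
        self._next = (self._next + 1) % self.num_lanes
        return lane


scheduler = LaneScheduler()  # defaults: 1 SSD x 4 queues = 4 lanes
lanes = [scheduler.next_lane() for _ in range(6)]
print(lanes)
```

With the default parameters the fifth request wraps back to lane 0, regardless of how busy that lane is.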


SimpleSSD model

firefly.SimpleSSD is a subcomponent with these params:

| Parameter | Default | What it does |
|---|---|---|
| nSSDsPerNode | 1 | SSDs per storage node |
| queuesCountPerSSD | 4 | Queue lanes per SSD |
| readBandwidthPerSSD_GBps | 6.25 | Read bandwidth (GB/s) |
| writeBandwidthPerSSD_GBps | 6.25 | Write bandwidth (GB/s) |
| readOverheadLatency_ns | 500 | Fixed read overhead (ns) |
| writeOverheadLatency_ns | 500 | Fixed write overhead (ns) |

Delay = overhead + (bytes / bandwidth). Requests go round-robin across all SSD×queue lanes.
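A worked example of the delay formula with the default parameters, as a minimal sketch. The function name `ssd_delay_ns` is hypothetical, and it assumes decimal units (1 GB/s = 10^9 bytes/s, i.e. 1 byte/ns); the model's actual unit handling may differ.

```python
def ssd_delay_ns(num_bytes: int, bandwidth_gbps: float = 6.25, overhead_ns: float = 500.0) -> float:
    """Delay = overhead + (bytes / bandwidth), returned in nanoseconds."""
    if bandwidth_gbps <= 0:
        raise ValueError("bandwidth must be positive")
    bytes_per_ns = bandwidth_gbps  # assumption: 1 GB/s == 1 byte/ns (decimal GB)
    return overhead_ns + num_bytes / bytes_per_ns


delay = ssd_delay_ns(1000)  # a 1000-byte request with the default parameters
print(delay)                # 500 ns overhead + 160 ns transfer = 660.0
```

This is the SSD-side delay only; network transfer time is added separately by the network model.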


Node setup

Some nodes are compute, some are storage. Configured in loadNetworkIO:

  • Compute nodes run workloads (TestNetworkIO) and issue I/O
  • Storage nodes run Null (idle) and serve requests through SimpleSSD
  • The split uses SSD_START_NODE and SSD_NODES vars in the load file
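One way to picture the split is the sketch below. It assumes storage nodes occupy a contiguous range of SSD_NODES ids starting at SSD_START_NODE and that every other node is compute; the load file may interpret these vars differently, and `node_roles` is a hypothetical helper, not part of emberLoad.py.

```python
from typing import List


def node_roles(total_nodes: int, ssd_start_node: int, ssd_nodes: int) -> List[str]:
    """Label each node id as 'compute' or 'storage'."""
    if ssd_start_node + ssd_nodes > total_nodes:
        raise ValueError("storage range exceeds the node count")
    roles = []
    for node in range(total_nodes):
        # Assumption: storage nodes are the ids in [start, start + count).
        is_storage = ssd_start_node <= node < ssd_start_node + ssd_nodes
        roles.append("storage" if is_storage else "compute")
    return roles


# Values from the 8-node layout: TOTAL_NODES=8, SSD_START_NODE=6, SSD_NODES=2.
roles = node_roles(total_nodes=8, ssd_start_node=6, ssd_nodes=2)
print(roles)
```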

Files

New files (20):

  • hermes/networkIOapi.h — NetworkIO API
  • firefly/hadesNetworkIO.{cc,h} — offset-to-node mapping
  • firefly/nicNetworkIO.{cc,h} — NIC command handling
  • firefly/nicNetworkIOSendEntry.h — packet construction
  • firefly/nicNetworkIOStream.{cc,h} — incoming packet processing
  • firefly/storageModel/simpleSSD.{cc,h} — SSD simulation
  • ember/libs/emberNetworkIOLib.h — library wrapper
  • ember/libs/networkIOEvents/emberNetworkIO{Event,ReadEvent,WriteEvent}.h — events
  • ember/networkIO/emberNetworkIOGen.{cc,h} — generator base
  • ember/networkIO/motifs/emberTestNetworkIO.{cc,h} — test motif
  • ember/test/loadNetworkIO — test config
  • ember/test/networkIOParams.py — platform params

Modified files (16):

  • firefly/nic.{cc,h} — NetworkIO handler + SimpleSSD setup
  • firefly/nicEvents.h — new event types
  • firefly/nicRecvCtx.cc, nicRecvMachine.h — stream dispatch
  • firefly/nicSendEntry.h, nicSendMachine.{cc,h} — send support
  • firefly/nicVirtNic.h, virtNic.{cc,h} — delegation methods
  • firefly/Makefile.am, hermes/Makefile.am, ember/Makefile.am — build updates
  • ember/.gitignore — ignore patterns
  • CONTRIBUTORS.TXT — added HPE entry

Testing

Tested with three configs, all pass:

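| Test | Topology | Compute nodes | SSD nodes | messageSize (bytes) | Sim time |
|---|---|---|---|---|---|
| 1 | torus 2x2 (4 nodes) | 2 | 2 | 1000 | 2.697 us |
| 2 | torus 4x2 (8 nodes) | 6 | 2 | 1000 | 3.661 us |
| 3 | torus 4x2 (8 nodes) | 2 | 6 | 1000 | 1.731 us |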

Built with SST Core + Elements from devel, GCC 14.


How to run

Build:

git clone https://github.com/RishankPratikHPLabs/nwStorage_sst-elements.git
cd nwStorage_sst-elements && git checkout nwStorage-feature
./autogen.sh
CC=gcc-14 CXX=g++-14 ./configure --prefix=$SST_INSTALL_DIR --with-sst-core=$SST_CORE_HOME
make -j$(nproc) && make install

Verify SimpleSSD registered:

sst-info firefly.SimpleSSD

Run the default test (2 compute + 2 SSD):

cd src/sst/elements/ember/test
sst emberLoad.py -- --topo=torus --shape=2x2 --numNodes=4 --numCores=1 --platform=networkIO --loadFile=loadNetworkIO

To change the node layout, edit loadNetworkIO:

[VAR] TOTAL_NODES=8
[VAR] COMPUTE_NODES=6
[VAR] SSD_START_NODE=6
[VAR] SSD_NODES=2

Then run with a matching topology:

sst emberLoad.py -- --topo=torus --shape=4x2 --numNodes=8 --numCores=1 --platform=networkIO --loadFile=loadNetworkIO

Motif params in loadNetworkIO: messageSize=<bytes>, iterations=<N>, op=read|write, fileSize=<bytes>.


Co-authored-by: Rishank Pratik <rishank.pratik@hpe.com>
Co-authored-by: Pawan Kumar <pawan.kumar4@hpe.com>
Co-authored-by: Sumant Kalra <sumant.kalra@hpe.com>
Co-authored-by: Shridhar Joshi <shridhar@hpe.com>

SSD module integration with Firefly NIC for realistic storage I/O modeling.
End-to-end I/O path simulation: compute nodes -> network -> SSD storage nodes.
Distributed address mapping with round-robin striping across storage devices.
Direct integration with Ember motif framework for HPC workload simulation.

New components:
- NetworkIO API layer in Hermes interface
- HadesNetworkIO with distributed address mapping
- NIC-level packet handling for NetworkIO (read, write, ACK)
- SimpleSSD storage device model (configurable bandwidth, latency, multi-queue bus)
- Ember NetworkIO motif and test configurations

Co-authored-by: Rishank Pratik <rishank.pratik@hpe.com>
Co-authored-by: Pawan Kumar <pawan.kumar4@hpe.com>
Co-authored-by: Sumant Kalra <sumant.kalra@hpe.com>
Co-authored-by: Shridhar Joshi <shridhar@hpe.com>
@sst-autotester
Contributor

Status Flag 'Pre-Test Inspection' - - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging
NO INSPECTION HAS BEEN PERFORMED ON THIS PULL REQUEST! - This PR must be inspected by setting label 'AT: PRE-TEST INSPECTED'.

Add SPDX-FileCopyrightText and SPDX-License-Identifier (BSD-3-Clause)
headers to all new and modified files per HPE OSRB attribution
requirements. Update CONTRIBUTORS.TXT with HPE entry. Fix corrupted
copyright line in nicEvents.h.
@sst-autotester
Contributor

Status Flag 'Pre-Test Inspection' - - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging
NO INSPECTION HAS BEEN PERFORMED ON THIS PULL REQUEST! - This PR must be inspected by setting label 'AT: PRE-TEST INSPECTED'.
