Add VRAM max bandwidth metric to GPU status#51

Merged
sarat-k merged 1 commit into ROCm:main from bhatturu:feature/add-vram-max-bandwidth on Mar 3, 2026

@bhatturu commented Feb 25, 2026

  • Add vram_max_bandwidth as field 31 of the GPUStatus proto (after ProcessStatus)
  • Collect the value from gpu_metrics in the poll path (smi_gpu_fill_status)
  • The mock returns 3276800 GB/s; a real MI300X returns 5325 GB/s
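
For readers without the diff at hand, the poll-path collection described above might look roughly like the following. This is a minimal sketch, not the code in this PR: `gpu_metrics_t`, `gpu_status_t`, `fill_status`, and `mock_gpu_metrics` are hypothetical stand-ins for the SMI metrics snapshot, the status struct, `smi_gpu_fill_status`, and the mock in `smi_api_mock.cc`. Only the mock value 3276800 is taken from the description.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the SMI gpu_metrics snapshot; the real layout
// comes from the SMI library and is not shown in this PR.
struct gpu_metrics_t {
    uint64_t vram_max_bandwidth;  // GB/s
};

// Hypothetical stand-in for the status struct filled on every poll.
struct gpu_status_t {
    uint64_t vram_max_bandwidth = 0;  // GB/s; 0 means "not reported"
};

// Value the PR description gives for the mock backend.
constexpr uint64_t kMockVramMaxBandwidth = 3276800;

// Copy the metric only when a metrics snapshot is available, so a failed
// read leaves the field at its default rather than overwriting it.
inline void fill_status(const gpu_metrics_t* metrics, gpu_status_t* status) {
    assert(status != nullptr);
    if (metrics != nullptr) {
        status->vram_max_bandwidth = metrics->vram_max_bandwidth;
    }
}

// Mock backend: always reports the fixed value.
inline gpu_metrics_t mock_gpu_metrics() {
    return gpu_metrics_t{kMockVramMaxBandwidth};
}
```

Passing a null metrics pointer models a failed metrics read. In that case the status field keeps its default, so a consumer can tell "not reported" apart from a real reading.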

Motivation

Technical Details

Test Plan

Test Result

bhatturu@quanta-ccs-aus-e15-19:~/device-metrics-exporter$ ./gpuctl show gpu statistics -i "36ff74a5-0000-1000-8052-3c26e867b349"

GPU id                                 : 36ff74a5-0000-1000-8052-3c26e867b349 (6)
Current graphics power (in Watts)      : 148
Temperature information:
  Junction temperature (in C)          : 47.0
  VRAM temperature (in C)              : 48.0
Current GPU usage:
  GFX utilization                      : 0% 0% 0% 0% 0% 0% 0% 0% 
  VCN utilization                      : 0% 0% 0% 0% 
  JPEG utilization                     : 0% 0% 0% 0% 0% 0% 0% 0% 
                                         0% 0% 0% 0% 0% 0% 0% 0% 
                                         0% 0% 0% 0% 0% 0% 0% 0% 
                                         0% 0% 0% 0% 0% 0% 0% 0% 
                                         N/A N/A N/A N/A N/A N/A N/A N/A 

PCIe statistics:
  Recovery count                       : 1
  Bidirectional bandwidth (in GB/s)    : 540391652
VRAM usage:
  Total VRAM (in MB)                   : 262128
  Used VRAM (in MB)                    : 283
  Free VRAM (in MB)                    : 261845
  Total visible VRAM (in MB)           : 262128
  Used visible VRAM (in MB)            : 283
  Free visible VRAM (in MB)            : 261845
  Total GTT (in MB)                    : 1160760
  Used GTT (in MB)                     : 24
  Free GTT (in MB)                     : 1160736
Accumulated energy consumed (in uJ)    : 4550317572096.00
Power usage (in Watts)                 : 148
GFX activity accumulated               : 175
Link 2 data written (in KB)            : 1
Link 3 data written (in KB)            : 1
Link 4 data read (in KB)               : 1
Link 4 data written (in KB)            : 1
Link 5 data read (in KB)               : 1
Link 5 data written (in KB)            : 1
Link 6 data written (in KB)            : 1
Link 7 data read (in KB)               : 1
Link 7 data written (in KB)            : 1
Link 8 data written (in KB)            : 1
Current accumulated counter            : 30700743
Processor hot residency accumulated    : 0
PPT residency accumulated              : 17872
Socket thermal residency accumulated   : 0
VR thermal residency accumulated       : 0
HBM thermal residency accumulated      : 0
Processor hot residency percentage     : 0%
PPT residency percentage               : 0%
Socket thermal residency percentage    : 0%
VR thermal residency percentage        : 0%
HBM thermal residency percentage       : 0%
VRAM max bandwidth (in GB/s)           : 6144

------------------------------------------------------------------------------------------

No. of gpus : 1

Submission Checklist

@bhatturu force-pushed the feature/add-vram-max-bandwidth branch 3 times, most recently from 2b7ceec to 3af3afa on February 28, 2026 20:53
@bhatturu changed the title from "Add VRAM max bandwidth metric to GPU statistics" to "Add VRAM max bandwidth metric to GPU status" on Feb 28, 2026
@bhatturu force-pushed the feature/add-vram-max-bandwidth branch 2 times, most recently from d9f69d2 to d3817fd on March 3, 2026 20:35
- Add max_bandwidth field to aga_gpu_vram_status_t struct in aga_gpu.hpp
- Add MaxBandwidth (field 4) to GPUVRAMStatus proto message
- Set vram_status.max_bandwidth in smi_gpu_fill_status (inside g_gpu_metrics block)
  and in smi_api_mock.cc
- Populate MaxBandwidth via aga_gpu_vram_status_to_proto in gpu_to_proto.hpp
- Display VRAM max bandwidth in printGPUStatus CLI alongside other VRAM fields
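
The struct-to-proto and display steps in the commit message above can be sketched as follows. This is an illustration under stated assumptions, not the merged code: `vram_status_t` is a trimmed stand-in for `aga_gpu_vram_status_t`, `VramStatusProto` is a hand-written stand-in for the generated `GPUVRAMStatus` class, and `vram_status_to_proto` and `format_row` are hypothetical helpers. The 39-column label width is inferred from the gpuctl output in the test result.

```cpp
#include <cassert>
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical, trimmed stand-in for aga_gpu_vram_status_t.
struct vram_status_t {
    uint64_t max_bandwidth;  // GB/s
};

// Hypothetical stand-in for the generated GPUVRAMStatus message class.
class VramStatusProto {
public:
    void set_maxbandwidth(uint64_t v) { max_bandwidth_ = v; }
    uint64_t maxbandwidth() const { return max_bandwidth_; }
private:
    uint64_t max_bandwidth_ = 0;
};

// Conversion step: copy the struct field into the message.
inline void vram_status_to_proto(VramStatusProto* proto,
                                 const vram_status_t* status) {
    assert(proto != nullptr && status != nullptr);
    proto->set_maxbandwidth(status->max_bandwidth);
}

// Render one CLI row with the label left-aligned and padded to 39 columns,
// followed by ": " and the value.
inline std::string format_row(const std::string& label, uint64_t value) {
    std::ostringstream out;
    out << std::left << std::setw(39) << label << ": " << value;
    return out.str();
}
```

Calling `format_row("VRAM max bandwidth (in GB/s)", proto.maxbandwidth())` after the conversion produces a row in the same label-then-colon layout as the other statistics lines.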
@bhatturu force-pushed the feature/add-vram-max-bandwidth branch from d3817fd to aa6f6da on March 3, 2026 20:56
@sarat-k merged commit 1aec881 into ROCm:main on Mar 3, 2026