Quality: GPU configuration crash on CPU-only systems#629

Open
bachdev wants to merge 2 commits into microsoft:main from
bachdev:contribai/improve/quality/gpu-configuration-crash-on-cpu-only-syst

Conversation

@bachdev bachdev commented Mar 31, 2026

Problem

The code attempts to process the gpus string by splitting it, but gpus can become None if torch.cuda.is_available() returns False. This will lead to an AttributeError when gpus.split(',') is called, causing the application to crash on systems without a GPU or where CUDA is not detected. This makes the application unusable in CPU-only environments.

Severity: critical
File: PW_FT_classification/main.py

Solution

Explicitly handle the case where gpus is None: either assign an empty list or a CPU-specific device identifier, or perform the split only when gpus is actually a string.
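
The guard described above can be sketched as a small pure function. The names `parse_gpu_devices`, `gpus`, and `cuda_available` are illustrative, not the repository's actual API; `cuda_available` stands in for the result of `torch.cuda.is_available()`:

```python
def parse_gpu_devices(gpus, cuda_available):
    """Return a list of GPU IDs for the Trainer, or None for CPU-only runs.

    `gpus` is the comma-separated ID string from the config (may be None);
    `cuda_available` mirrors torch.cuda.is_available(). Hypothetical helper,
    shown only to illustrate the guarded split.
    """
    if cuda_available and isinstance(gpus, str):
        # Only split when we actually have a string, avoiding the
        # AttributeError on None that crashes CPU-only systems
        return [int(i) for i in gpus.split(',')]
    return None
```

Splitting only behind the `isinstance` check means the function is safe whether `gpus` is a string, None, or absent entirely.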

Changes

  • PW_FT_classification/main.py (modified)

Testing

  • Existing tests pass
  • Manual review completed
  • No new warnings/errors introduced


Signed-off-by: BachDEV <1437214+bachdev@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 31, 2026 19:06
Contributor

Copilot AI left a comment


Pull request overview

Fixes a crash in the classification training entrypoint when running on CPU-only systems by avoiding calling .split() on a None GPU string.

Changes:

  • Adjust GPU parsing logic to only split/cast GPU IDs when CUDA is available.
  • Explicitly set gpus = None when CUDA is unavailable.


Comment on lines +58 to +63
```python
if torch.cuda.is_available():
    gpus = [int(i) for i in gpus.split(',')]
else:
    # If no CUDA devices are available, set gpus to None to indicate CPU usage
    # PyTorch Lightning Trainer will default to CPU if devices is None
    gpus = None
```

Copilot AI Mar 31, 2026


On CPU-only systems this still looks likely to crash because the Trainer is configured with accelerator='gpu' unconditionally (line 162). Setting gpus=None will not make Lightning fall back to CPU when the accelerator is explicitly GPU. Consider deriving both accelerator and devices from torch.cuda.is_available() (e.g., accelerator='cpu', devices=1 when CUDA is unavailable, or use accelerator='auto'/devices='auto'). Also, the inline comment about defaulting to CPU when devices is None is misleading in the current configuration.
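
The reviewer's suggestion of deriving both values together could look like the following sketch. `select_trainer_backend` is a hypothetical helper, not code from this PR, and it assumes the same comma-separated GPU ID string used in main.py:

```python
def select_trainer_backend(gpus, cuda_available):
    """Pick an (accelerator, devices) pair for pl.Trainer.

    Hypothetical helper illustrating the review suggestion: choose the
    accelerator and the device list together, from one CUDA check,
    instead of hardcoding accelerator='gpu'.
    """
    if cuda_available and isinstance(gpus, str):
        return 'gpu', [int(i) for i in gpus.split(',')]
    # No CUDA: run a single CPU process rather than passing devices=None
    # to a Trainer that is still configured with accelerator='gpu'
    return 'cpu', 1
```

The pair would then be passed straight through, e.g. `accelerator, devices = select_trainer_backend(gpus, torch.cuda.is_available())` followed by `Trainer(accelerator=accelerator, devices=devices, ...)`, so the two settings can never disagree.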

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
