Foundry Toolkit: qwen3.5-9b-cuda-gpu cannot be loaded — config/runtime version mismatch (+ a related bin/ copy bug)
This report covers two issues found while running local models with the Foundry Toolkit (formerly AI Toolkit) on Windows + CUDA. Issue 1 is the blocker.
Issue 1 (BLOCKER): Shipped qwen3.5-9b-cuda-gpu model is incompatible with the bundled onnxruntime-genai runtime
Summary
The Toolkit's catalog offers qwen3.5-9b-cuda-gpu, but the bundled onnxruntime-genai runtime cannot load it. The model's genai_config.json uses a schema/architecture newer than the runtime shipped with the extension. This reproduces on a clean re-download with the latest extension (1.4.2), so it is not a corrupted-download or stale-cache problem — the catalog and the runtime are simply out of sync.
Errors (two symptoms, same root cause)
First, a strict JSON parse error on an unknown vision field:
Failed loading model qwen3.5-9b-cuda-gpu:2.
Error encountered while parsing
'C:\Users\<user>\.aitk\models\microsoft\qwen3.5-9b-cuda-gpu-2\v2\genai_config.json'
JSON Error: model:vision: Unknown value "patch_size" at line 68 index 23
Then, after removing patch_size to get past the parser, the real wall — the runtime has no implementation for the architecture:
Failed loading model qwen3.5-9b-cuda-gpu:2.
Unsupported model_type in config.json: qwen3_5
Root cause
- genai_config.json declares "type": "qwen3_5" and a vision block containing patch_size.
- The bundled runtime (libonnxruntime_cuda_windows 0.0.7) does not recognize the qwen3_5 model type, and its strict parser rejects the unknown vision.patch_size key.
- The patch_size parse error is just the first symptom; Unsupported model_type: qwen3_5 is the definitive blocker. No config edit can fix this: the architecture support must exist in the runtime.
Repro steps
- Install Foundry Toolkit 1.4.2 (latest).
- From the catalog, download qwen3.5-9b-cuda-gpu (CUDA execution provider).
- Load the model.
- Loading fails with the patch_size parse error; if that field is removed, it fails with Unsupported model_type in config.json: qwen3_5.
- Deleting and re-downloading the model does not help — same result.
Relevant config snippet (genai_config.json)
Impact
- A model advertised in the catalog is impossible to run with the runtime shipped in the same extension version. Users hit this only after a multi-GB download.
Suggested fixes (product side)
- Sync the catalog with runtime capability: don't list/allow a model whose model_type (e.g. qwen3_5) and config schema aren't supported by the runtime bundled in that extension build.
- Bump the bundled onnxruntime-genai to a version that implements qwen3_5 and the current vision schema (incl. patch_size), and pair it with the model download.
- Fail fast with a clear message: e.g. "This model requires onnxruntime-genai ≥ X.Y.Z; your runtime is 0.0.7" instead of a late-stage JSON parse / unsupported-type error.
- Consider forward-compatible parsing: tolerate unknown optional config keys (warn instead of hard-fail) so a schema addition like patch_size doesn't block loading on its own.
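The first, third, and fourth suggestions could be combined into a single pre-flight check that runs before the model is loaded (ideally before it is downloaded). The following is only a sketch of that idea, not the Toolkit's or onnxruntime-genai's actual code: RuntimeCapabilities, CompatibilityReport, and checkModelCompatibility are hypothetical names, and the placement of the type and vision keys under a top-level model object is an assumption based on the model:vision path in the parse error above.

```typescript
// Hypothetical description of what a bundled runtime can load.
// None of these fields come from the Toolkit or from onnxruntime-genai.
interface RuntimeCapabilities {
  version: string;               // e.g. "0.0.7"
  supportedModelTypes: string[]; // architectures the runtime implements
  knownVisionKeys: string[];     // keys its config parser accepts under model.vision
}

interface CompatibilityReport {
  loadable: boolean;
  errors: string[];   // hard blockers, e.g. unsupported architecture
  warnings: string[]; // tolerated problems, e.g. unknown optional keys
}

// Assumed shape of the relevant part of genai_config.json.
interface GenaiConfigSubset {
  model?: { type?: string; vision?: Record<string, unknown> };
}

function checkModelCompatibility(
  config: GenaiConfigSubset,
  runtime: RuntimeCapabilities,
): CompatibilityReport {
  const errors: string[] = [];
  const warnings: string[] = [];

  // An unsupported architecture is a hard failure: no config edit can fix it.
  const modelType = config.model?.type;
  if (!modelType) {
    errors.push("genai_config.json does not declare model.type");
  } else if (runtime.supportedModelTypes.indexOf(modelType) === -1) {
    errors.push(
      `Model type "${modelType}" is not supported by bundled runtime ${runtime.version}`,
    );
  }

  // Unknown optional keys are reported as warnings instead of aborting the load.
  for (const key of Object.keys(config.model?.vision ?? {})) {
    if (runtime.knownVisionKeys.indexOf(key) === -1) {
      warnings.push(`Ignoring unknown key model.vision.${key}`);
    }
  }

  return { loadable: errors.length === 0, errors, warnings };
}
```

With a runtime descriptor that lacks qwen3_5, the report would carry one error for the architecture and one warning for patch_size, so the user sees the real blocker first rather than the parse error.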
Workarounds (for users, limited)
- Editing genai_config.json (removing patch_size) only advances to the Unsupported model_type: qwen3_5 error, so it is not a real fix.
- Until the runtime supports qwen3_5, use a model the current runtime can run (e.g. a Qwen2.5 / Phi CUDA build) to keep working.
Environment
- OS: Windows
- Extension: Foundry Toolkit ms-windows-ai-studio.windows-ai-studio 1.4.2 (win32-x64)
- Execution provider: CUDA
- Bundled runtime: libonnxruntime_cuda_windows 0.0.7
- Model: qwen3.5-9b-cuda-gpu (model_type: qwen3_5, multimodal/vision)
Issue 2 (separate, lower severity): CUDA DLL copy fails with ENOENT when extension bin/ folder is missing
Summary
When switching a local model to the CUDA execution provider, the Toolkit fails to copy onnxruntime-genai-cuda.dll into the extension's bin/ folder. The thrown ENOENT points at the source DLL, which is misleading — the source exists and is fully downloaded. The real cause is that the destination bin/ directory does not exist, and Node's fs.copyFile throws ENOENT when the target directory is missing.
Error
Sorry, your request failed. Please try again.
Reason: ENOENT: no such file or directory, copyfile
'C:\Users\<user>\.aitk\bin\libonnxruntime_cuda_windows\0.0.7\onnxruntime-genai-cuda.dll'
-> 'c:\Users\<user>\.vscode\extensions\ms-windows-ai-studio.windows-ai-studio-1.4.2-win32-x64\bin\onnxruntime-genai-cuda.dll'
Root cause
- The source file exists and is complete: ...\.aitk\bin\libonnxruntime_cuda_windows\0.0.7\onnxruntime-genai-cuda.dll (~85 MB) ✅
- The destination directory does not exist: ...\ms-windows-ai-studio.windows-ai-studio-1.4.2-win32-x64\bin\ ❌
fs.copyFile(src, dest) raises ENOENT when the parent directory of dest is missing. The error string only names the two file paths, so it reads as if the source is missing — sending people to debug the wrong path.
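This behavior can be checked in isolation, outside the extension. The sketch below uses a temporary directory and placeholder file names (source.dll, bin); none of these come from the Toolkit. It copies an existing file into a directory that was never created and returns the resulting error code. The synchronous API is used only to keep the demonstration short.

```typescript
import * as fs from "fs";
import * as os from "os";
import * as path from "path";

// Copies an existing file into a directory that does not exist and
// returns the error code Node reports (undefined if the copy succeeds).
function copyIntoMissingDir(): string | undefined {
  const root = fs.mkdtempSync(path.join(os.tmpdir(), "copy-enoent-"));
  const src = path.join(root, "source.dll");
  fs.writeFileSync(src, "placeholder"); // the source file really exists
  const dest = path.join(root, "bin", "source.dll"); // "bin" is never created
  try {
    fs.copyFileSync(src, dest);
    return undefined;
  } catch (err) {
    return (err as NodeJS.ErrnoException).code;
  } finally {
    // Clean up the temporary directory in every case.
    fs.rmSync(root, { recursive: true, force: true });
  }
}

console.log(copyIntoMissingDir()); // prints "ENOENT"
```

The source file is present the whole time, yet the code is ENOENT, which matches the misleading error seen in the extension.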
Repro steps
- Fresh install of the extension where the bin/ folder was not created (or was removed/cleaned).
- Load a local model and select the CUDA execution provider.
- Toolkit downloads libonnxruntime_cuda_windows (0.0.7) to ~/.aitk/bin/....
- Toolkit attempts to copy the CUDA DLLs into <extension>\bin\.
- Copy fails with the ENOENT above.
Workaround
Manually create the missing folder, then retry:
New-Item -ItemType Directory -Force `
"$env:USERPROFILE\.vscode\extensions\ms-windows-ai-studio.windows-ai-studio-1.4.2-win32-x64\bin"
Proposed fix
Ensure the destination directory exists before copying:
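import * as fs from 'fs';
import * as path from 'path';
// src and dest are the two paths shown in the error above.
// Create <extension>\bin (and any missing parents); this is a no-op if it already exists.
await fs.promises.mkdir(path.dirname(dest), { recursive: true });
await fs.promises.copyFile(src, dest);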
Additional improvements:
- Wrap the copy in a try/catch that distinguishes source-missing vs dest-dir-missing, and surface an actionable message (e.g. "runtime download incomplete" vs "extension bin folder missing").
- Optionally verify the source DLL size/hash before copy to catch interrupted downloads.
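The first improvement might look roughly like the following. This is a sketch under assumptions, not the extension's code: classifyCopyError, copyRuntimeDll, and the CopyFailure labels are hypothetical names, and the messages are the example wording from the bullet above. It inspects both paths after an ENOENT to work out which side is actually missing.

```typescript
import * as fs from "fs";
import * as path from "path";

// Hypothetical labels for the two ENOENT cases this report distinguishes.
type CopyFailure = "source-missing" | "dest-dir-missing" | "other";

// Decides which path an ENOENT from copyFile actually refers to.
function classifyCopyError(
  err: NodeJS.ErrnoException,
  src: string,
  dest: string,
): CopyFailure {
  if (err.code !== "ENOENT") {
    return "other";
  }
  if (!fs.existsSync(src)) {
    return "source-missing";
  }
  if (!fs.existsSync(path.dirname(dest))) {
    return "dest-dir-missing";
  }
  return "other";
}

// Copies the runtime DLL and rethrows ENOENT with an actionable message.
async function copyRuntimeDll(src: string, dest: string): Promise<void> {
  try {
    await fs.promises.copyFile(src, dest);
  } catch (err) {
    const kind = classifyCopyError(err as NodeJS.ErrnoException, src, dest);
    if (kind === "source-missing") {
      throw new Error(`Runtime download incomplete: ${src} not found`);
    }
    if (kind === "dest-dir-missing") {
      throw new Error(`Extension bin folder missing: ${path.dirname(dest)}`);
    }
    throw err; // anything else is passed through unchanged
  }
}
```

If the mkdir call from the proposed fix is in place, the dest-dir-missing branch should rarely fire; it remains useful as a diagnostic when directory creation itself is not possible.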
Environment
- OS: Windows
- Extension: ms-windows-ai-studio.windows-ai-studio-1.4.2-win32-x64
- Execution provider: CUDA
- Runtime package: libonnxruntime_cuda_windows 0.0.7