Fix: Resolve multimodal BOA/EOA tokens dynamically from config.json#104
Conversation
Thanks for the PR! I've added a missing test file.
Pull request overview
Updates SwiftLM’s multimodal audio token handling to avoid hardcoded BOA/EOA token IDs by resolving them from a model’s config.json (with audio_config fallback), improving compatibility with newer multimodal models.
Changes:
- Replace hardcoded BOA/EOA token IDs in model factory wiring with values dynamically extracted from config.json.
- Expand config parsing from “num audio embeddings only” to “num audio + BOA/EOA tokens” via a new helper.
- Add unit tests covering defaults, top-level config extraction, and audio_config fallback extraction.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| Sources/SwiftLM/Server.swift | Uses dynamically extracted BOA/EOA + audio embedding counts when constructing ALM/Omni processors; introduces `extractMultimodalTokens`. |
| tests/SwiftLMTests/MultimodalTokenExtractionTests.swift | Adds coverage for token extraction defaults and config-driven overrides. |
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Description
Currently, the `boaToken` and `eoaToken` values are hardcoded to `255010` and `255011` respectively. This causes compatibility issues with newer multimodal models (like Qwen 2-VL) that use different vocab IDs for their vision encoders.

This PR fixes this by dynamically extracting `boa_token_id` and `eoa_token_id` from the model's `config.json` or its `audio_config` fallback, gracefully defaulting to the old values if not present.

Motivation
Improves compatibility with dynamic and diverse vision-language models without requiring hardcoded token updates.
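The fallback chain described above (top-level `config.json` keys, then `audio_config`, then the legacy hardcoded values) can be sketched as follows. `extractMultimodalTokens` is the helper named in Server.swift, but the signature, the `MultimodalTokens` struct, and the default constants shown here are illustrative assumptions, not the PR's exact code:

```swift
import Foundation

// Legacy defaults, matching the previously hardcoded token IDs.
let defaultBOAToken = 255010
let defaultEOAToken = 255011

struct MultimodalTokens {
    let boaToken: Int
    let eoaToken: Int
}

// Resolve BOA/EOA token IDs from a decoded config.json dictionary.
// Checks the top level first, then falls back to the nested
// "audio_config" object, and finally to the legacy defaults.
func extractMultimodalTokens(from config: [String: Any]) -> MultimodalTokens {
    func lookup(_ key: String) -> Int? {
        if let value = config[key] as? Int { return value }
        if let audioConfig = config["audio_config"] as? [String: Any],
           let value = audioConfig[key] as? Int {
            return value
        }
        return nil
    }
    return MultimodalTokens(
        boaToken: lookup("boa_token_id") ?? defaultBOAToken,
        eoaToken: lookup("eoa_token_id") ?? defaultEOAToken
    )
}
```

With an empty dictionary this resolves to the legacy `255010`/`255011` pair, which corresponds to the "defaults" case the unit tests cover.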
Testing
Verified with `swift build -c release`.