Enable pipeline model parallelism for Evo2 inference #1478

Open

kjaniknvidia wants to merge 3 commits into NVIDIA:main from kjaniknvidia:feat/pp_infer

Conversation

@kjaniknvidia (Collaborator)

Remove the PP > 1 guard, argparse choices=[1] restriction, and hardcoded pre_process/post_process=True so the model provider auto-detects pipeline stage. Tested with PP=1, PP=2, and PP=5.

Description

For the most part I just removed the guards that force PP=1. There's only one functional line change.

  1. Line 257 — Removed the if pipeline_model_parallel_size != 1: raise ValueError(...) guard (3 lines deleted)
  2. Line 334 — Changed model_provider.provide(pre_process=True, post_process=True) to model_provider.provide() so each pipeline stage auto-detects whether it needs embedding/output layers
  3. Line 508 — Removed choices=[1] from the --pipeline-model-parallel-size argparse argument
  4. Lines 245, 553 — Updated docstrings removing "(must be 1)"
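The auto-detection in change 2 typically keys off the process's pipeline rank: only the first stage needs the embedding layer (`pre_process`) and only the last stage needs the output head (`post_process`). A minimal sketch of that logic, with plain rank arithmetic standing in for Megatron's `parallel_state.is_pipeline_first_stage()` / `is_pipeline_last_stage()` helpers (the function name here is illustrative, not from the PR):

```python
def detect_stage_flags(pipeline_rank: int, pipeline_size: int) -> tuple[bool, bool]:
    """Return (pre_process, post_process) for one pipeline stage.

    Only the first stage owns the embedding (pre_process) and only the
    last stage owns the output head (post_process). With pipeline_size == 1
    both are True, matching the previously hardcoded behavior.
    """
    pre_process = pipeline_rank == 0
    post_process = pipeline_rank == pipeline_size - 1
    return pre_process, post_process


# PP=1: the single stage does everything (old hardcoded case).
assert detect_stage_flags(0, 1) == (True, True)
# PP=2: stage 0 embeds, stage 1 produces logits.
assert detect_stage_flags(0, 2) == (True, False)
assert detect_stage_flags(1, 2) == (False, True)
```

With `provide()` called without arguments, each rank can derive these flags itself, which is why the hardcoded `pre_process=True, post_process=True` had to go.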

Usage

torchrun --nproc-per-node 2 /workspace/bionemo/src/bionemo/evo2/run/infer.py \
    --ckpt-dir /workspace/bionemo/evo2_1b_8k_bf16_mbridge \
    --prompt "ATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCG" \
    --max-new-tokens 10 \
    --top-k 1 \
    --temperature 1.0 \
    --pipeline-model-parallel-size 2

torchrun --nproc-per-node 5 /workspace/bionemo/src/bionemo/evo2/run/infer.py \
    --ckpt-dir /workspace/bionemo/evo2_1b_8k_bf16_mbridge \
    --prompt "ATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCG" \
    --max-new-tokens 10 \
    --top-k 1 \
    --temperature 1.0 \
    --pipeline-model-parallel-size 5

PP=1 inference (1 GPU)   PASS   ATCGATCGAT
PP=2 inference (2 GPUs)  PASS   ATCGATCGAT
PP=5 inference (5 GPUs)  PASS   ATCGATCGAT
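Change 3 amounts to dropping the `choices=[1]` restriction so argparse no longer rejects multi-stage settings at parse time. A sketch of the loosened flag (flag name from the PR; default and help text are illustrative):

```python
import argparse

parser = argparse.ArgumentParser()
# Before: choices=[1] made argparse reject any value other than 1.
# After: any positive integer parses; whether it is actually valid
# (enough GPUs, layer divisibility) is checked later during model setup.
parser.add_argument(
    "--pipeline-model-parallel-size",
    type=int,
    default=1,
    help="Number of pipeline model parallel stages.",
)

args = parser.parse_args(["--pipeline-model-parallel-size", "2"])
print(args.pipeline_model_parallel_size)
```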

Type of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Refactor
  • Documentation update
  • Other (please describe):

CI Pipeline Configuration

Configure CI behavior by applying the relevant labels. By default, only basic unit tests are run.

Unit tests marked as @pytest.mark.multi_gpu or @pytest.mark.distributed are not run in the PR pipeline.

For more details, see CONTRIBUTING

Note

Add appropriate labels to enable additional test coverage.

Authorizing CI Runs

We use copy-pr-bot to manage authorization of CI
runs on NVIDIA's compute resources.

  • If a pull request is opened by a trusted user and contains only trusted changes, the pull request's code will
    automatically be copied to a pull-request/ prefixed branch in the source repository (e.g. pull-request/123)
  • If a pull request is opened by an untrusted user or contains untrusted changes, an NVIDIA org member must leave an
    /ok to test comment on the pull request to trigger CI. This will need to be done for each new commit.

Triggering CodeRabbit AI Review

To trigger a code review from CodeRabbit, comment on a pull request with one of these commands:

See https://docs.coderabbit.ai/reference/review-commands for a full list of commands.

Pre-submit Checklist

  • I have tested these changes locally
  • I have updated the documentation accordingly
  • I have added/updated tests as needed
  • All existing tests pass successfully

@copy-pr-bot

copy-pr-bot bot commented Feb 20, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@coderabbitai
Contributor

coderabbitai bot commented Feb 20, 2026

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


Remove the PP > 1 guard, argparse choices=[1] restriction, and
hardcoded pre_process/post_process=True so the model provider
auto-detects pipeline stage. Tested with PP=1, PP=2, and PP=5.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Ken Janik <kjanik@nvidia.com>
@jstjohn (Collaborator) left a comment:

Approve with one comment to address.
