Conversation
ca7cba3 to 07ddfbe (Compare)
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here.
Important: Review skipped. Auto incremental reviews are disabled on this repository. Please check the settings in the CodeRabbit UI.
📝 Walkthrough

Updates the speculative decoding training pipeline to support loading offline pre-computed data.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant LaunchScript as launch_train.sh
    participant TrainingScript as main.py
    participant DataModule as make_eagle_supervised_data_module()
    participant OfflineDataset as OfflineSupervisedDataset
    participant DataCollator as EagleOfflineDataCollator
    participant Trainer
    User->>LaunchScript: Provide --draft_vocab_cache path
    LaunchScript->>TrainingScript: Pass draft_vocab_cache arg
    TrainingScript->>DataModule: Call with offline_data_path & train_len
    DataModule->>OfflineDataset: Create with dumped_files from cache
    OfflineDataset->>OfflineDataset: Load .pt files on initialization
    DataModule->>DataCollator: Create with train_len parameter
    TrainingScript->>Trainer: Initialize with dataset & collator
    Trainer->>OfflineDataset: Request batch samples
    OfflineDataset-->>Trainer: Return tensor dict items
    Trainer->>DataCollator: Call with feature list
    DataCollator->>DataCollator: Pad/truncate tensors to train_len
    DataCollator-->>Trainer: Return batched dict
    Trainer->>Trainer: Train on batched data
```
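For a concrete picture of the offline path sketched above, here is a minimal, hedged Python sketch. The class names echo the walkthrough (`OfflineSupervisedDataset`, `EagleOfflineDataCollator`, `train_len`), but the suffixed names, tensor layout, and padding rule below are assumptions for illustration, not the repository's actual implementation:

```python
from pathlib import Path

import torch
from torch.utils.data import Dataset


class OfflineSupervisedDatasetSketch(Dataset):
    """Loads pre-dumped .pt files, one tensor dict per sample (assumed layout)."""

    def __init__(self, dump_dir: str):
        # Sort for a reproducible ordering across file systems.
        self.files = sorted(Path(dump_dir).glob("*.pt"))

    def __len__(self) -> int:
        return len(self.files)

    def __getitem__(self, i: int) -> dict:
        # weights_only=True: the dumps are expected to contain plain tensors only.
        return torch.load(self.files[i], weights_only=True)


class EagleOfflineCollatorSketch:
    """Pads or truncates each tensor in the batch to a fixed train_len (assumed rule)."""

    def __init__(self, train_len: int):
        self.train_len = train_len

    def __call__(self, features: list) -> dict:
        batch = {}
        for key in features[0]:
            padded = []
            for feat in features:
                t = feat[key][: self.train_len]  # truncate along the sequence dim
                missing = self.train_len - t.shape[0]
                if missing > 0:
                    # Zero-pad up to train_len; the real collator may use other fill values.
                    t = torch.cat([t, t.new_zeros((missing, *t.shape[1:]))], dim=0)
                padded.append(t)
            batch[key] = torch.stack(padded)
        return batch
```

The real collator may also handle extra keys such as aux_hidden_states and use ignore indices rather than zeros when padding labels.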
Estimated code review effort: 🎯 4 (Complex), ⏱️ ~60 minutes

🚥 Pre-merge checks: ✅ 2, ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Codecov Report

❌ Patch coverage is

Additional details and impacted files

```
@@            Coverage Diff             @@
##             main     #668      +/-   ##
==========================================
- Coverage   73.73%   73.72%   -0.01%
==========================================
  Files         199      199
  Lines       21165    21176      +11
==========================================
+ Hits        15606    15612       +6
- Misses       5559     5564       +5
```

☔ View full report in Codecov by Sentry.
d61f05e to 6197b58 (Compare)
Actionable comments posted: 12
🤖 Fix all issues with AI agents
In `@examples/speculative_decoding/eagle_utils.py`:
- Around line 135-150: The collators are created with return_labels=True but due
to upstream bugs labels never reach the collated batch: LanguageDataCollator
computes labels in _process_chat_sample but does not add them to the returned
dict, and VisionLanguageDataCollator.__init__ does not forward return_labels to
its parent; update the collator implementations so that
LanguageDataCollator._process_chat_sample attaches the computed labels to the
output dict (e.g., output["labels"] = labels) and modify
VisionLanguageDataCollator.__init__ to accept return_labels and pass it to
super().__init__(..., return_labels=return_labels) so that constructing these
classes in eagle_utils.py actually yields batches with labels.
- Around line 53-86: In OfflineSupervisedDataset.__getitem__, the torch.load
call relies on the default behavior changed in PyTorch 2.6; update the
torch.load invocation in __getitem__ (where offline_data is loaded) to
explicitly pass weights_only=True (e.g., torch.load(self.dumped_files[i],
weights_only=True)) so the .pt tensor dictionary is loaded safely and the intent
is clear.
In `@examples/speculative_decoding/main.py`:
- Around line 200-211: The config passed to mtsp.convert when training_args.mode
== "eagle" omits the user-provided draft vocab cache, so add the DataArguments
value into the config before calling mtsp.convert: include a key that maps
EagleConfig.draft_vocab_cache (e.g., "eagle_draft_vocab_cache") to
data_args.draft_vocab_cache alongside the existing "eagle_decoder_type",
"eagle_offline", and "eagle_architecture_config" entries so mtsp.convert(model,
[("eagle", config)]) receives the draft vocab cache.
In `@examples/speculative_decoding/README.md`:
- Line 247: Fix the typo in the README sentence: change "To user 2-layer eagle
with 8192 intermediate size for MLP, set `eagle_config.json` to:" to "To use
2-layer eagle with 8192 intermediate size for MLP, set `eagle_config.json` to:"
in examples/speculative_decoding/README.md so the sentence reads correctly;
search for the sentence fragment "To user 2-layer eagle" to locate the exact
place to edit.
In `@modelopt/torch/speculative/eagle/conversion.py`:
- Around line 42-48: The current dict lookup using config.eagle_decoder_type can
raise a raw KeyError; update conversion logic around the lookup that builds
default_arch_config to validate config.eagle_decoder_type first (accepting
"llama" or "kimik2"), and if it is unsupported raise a clear ValueError with a
descriptive message including the invalid value and allowed options; reference
the existing symbols eagle3_default_config, kimik2_eagle_default_config and
config.eagle_architecture_config so the check sits before merging
({**default_arch_config, **custom_config}) and uses those defaults when valid (a
standalone sketch of this validation pattern follows this list).
In `@modelopt/torch/speculative/plugins/transformers.py`:
- Around line 847-853: The offline/pre-computed branch that checks for
"base_model_outputs" in kwargs never assigns base_input_embeds, causing a
NameError when later referencing inputs_embeds = base_input_embeds.roll(-1, 1);
fix this by computing base_input_embeds from
self._base_model_embeddings(input_ids) inside that branch as a fallback (ensure
you respect input_ids device/dtype and any attention_mask or position handling
consistent with the online path), so base_input_embeds is always defined before
the roll operation.
In `@modelopt/torch/utils/plugins/transformers_dataset.py`:
- Around line 107-112: When num_streaming_samples is set, the current logic
appends every accessed streamed item into self._raw_samples and builds
self._stream_iterator = itertools.cycle(self._stream_samples), which makes
memory grow unboundedly and causes duplicate returns on subsequent cycles;
change the behavior so streaming mode does not retain all seen items: when
self.num_streaming_samples is not None, do not append streamed items into
self._raw_samples (remove or stop the append in __getitem__/iteration), keep
self._stream_samples as the original shard or an iterator, and build
self._stream_iterator from that non-caching iterator (e.g., use
itertools.islice/shard iterator or a bounded collections.deque(maxlen=...) if a
small cache is needed); ensure the code paths that reference _raw_samples and
_stream_iterator (look for __getitem__, _raw_samples, _stream_samples,
_stream_iterator) are updated so cycles do not iterate over a growing list and
streaming respects memory limits.
- Around line 185-189: The branch guarded by return_labels computes a labels
tensor but never attaches it to tokenized_examples, so callers never receive
labels; modify the block in transformer's dataset preprocessing (the code around
return_labels, tokenized_examples, labels, and IGNORE_TOKEN_ID) to assign the
constructed labels into tokenized_examples (e.g., tokenized_examples["labels"] =
labels) before returning so the caller gets the labels when return_labels is
True.
- Around line 229-250: The __init__ accepts return_labels but doesn't forward it
to the parent, so update VisionLanguageDataCollator/constructor to both store it
(self.return_labels = return_labels) and pass it into the super call (add
return_labels=return_labels in the super().__init__(...) argument list) so the
parent sees the intended value.
- Around line 167-172: _post_process_chat_template currently calls
self.tokenizer.chat_template.replace(...) which raises AttributeError if
chat_template is None; guard the operation by checking
self.tokenizer.chat_template is not None/falsey before calling replace (or
default it to an empty string) and only perform the replace when chat_template
is a str. Update the method so it safely handles a missing chat_template
(reference: _post_process_chat_template, self.tokenizer.chat_template,
REMOVE_THINK_CHAT_TEMPLATE) and preserves existing behavior when chat_template
is present.
- Around line 201-223: The __call__ method currently builds batch items as
either plain strings or message lists but always calls _process_chat_sample;
change __call__ (in the LanguageDataCollator class) to detect when all items in
batch are plain text (strings) and route those to _process_text_sample instead
of _process_chat_sample; keep existing logic that converts ShareGPT
conversations to OpenAI messages via _sharegpt_to_openai_messages and only call
_process_chat_sample when batch contains any message lists (to avoid passing raw
strings into tokenizer.apply_chat_template).
- Around line 86-93: The __getitem__ method currently divides the incoming index
by self.num_shards (index = index // self.num_shards), which double-shards the
already shard-local dataset; remove that division so __getitem__ uses the raw
shard-local index directly, i.e., make __getitem__ use the incoming index to
access self._raw_samples (and keep the streaming fill loop that advances
self._stream_iterator when index >= len(self._raw_samples)) so items aren’t
duplicated or skipped.
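As a companion to the `conversion.py` item above (which asks for explicit validation of `config.eagle_decoder_type`), here is a minimal sketch of that validation pattern. The default dicts and function name are placeholders standing in for `eagle3_default_config` and `kimik2_eagle_default_config`, not the repository's code:

```python
# Placeholder defaults standing in for eagle3_default_config / kimik2_eagle_default_config.
EAGLE_DEFAULTS = {
    "llama": {"decoder": "llama"},
    "kimik2": {"decoder": "kimik2"},
}


def resolve_arch_config(decoder_type: str, custom_config: dict | None) -> dict:
    """Validate the decoder type before merging defaults with user overrides."""
    if decoder_type not in EAGLE_DEFAULTS:
        raise ValueError(
            f"Unsupported eagle_decoder_type '{decoder_type}'. "
            f"Expected one of {sorted(EAGLE_DEFAULTS)}."
        )
    # Same merge shape as the prompt above: user overrides win over defaults.
    return {**EAGLE_DEFAULTS[decoder_type], **(custom_config or {})}
```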
🧹 Nitpick comments (7)
examples/speculative_decoding/main.py (2)
201-203: Unclosed file handle.
`open(eagle_args.eagle_config)` is never explicitly closed. Use a `with` statement or `pathlib` to avoid resource leaks.

Proposed fix
```diff
-    custom_config = (
-        json.load(open(eagle_args.eagle_config)) if eagle_args.eagle_config else {}
-    )
+    if eagle_args.eagle_config:
+        with open(eagle_args.eagle_config) as f:
+            custom_config = json.load(f)
+    else:
+        custom_config = {}
```
163-169: Model type detection heuristic is fragile.

The substring check `"vl" in model_config.model_type.lower()` could match unintended model types (e.g., a hypothetical model type containing "vl" as part of another word). Consider matching against known VLM model types explicitly.

modelopt/torch/utils/plugins/transformers_dataset.py (2)
36-53: Unhandled role values will raise an unhelpful `KeyError`.

If a conversation entry has a `role` not present in `role_mapping` (e.g., `"tool"`, `"function"`, or a typo), line 50 raises a raw `KeyError` with no context. Consider using `.get()` with a fallback or raising a descriptive `ValueError`.

Suggested improvement

```diff
     for msg in conversations:
-        role = role_mapping[msg["role"]]
+        raw_role = msg["role"]
+        role = role_mapping.get(raw_role)
+        if role is None:
+            raise ValueError(
+                f"Unknown role '{raw_role}' in conversation. "
+                f"Supported roles: {list(role_mapping.keys())}"
+            )
         content = msg["content"]
```
267-301: `VisionLanguageDataCollator.__call__` does not handle plain-text samples, duplicating validation logic.

The `__call__` method duplicates the message-format validation logic from the parent class (lines 272-280 are nearly identical to lines 212-220 in `LanguageDataCollator.__call__`). Consider extracting a shared `_normalize_messages` helper to DRY up the conversion logic.

examples/speculative_decoding/eagle_utils.py (2)
89-127: Typo in comment and minor observation on collator.

Line 104: "consturct" → "construct".

Fix

```diff
-    # consturct copy slice
+    # construct copy slice
```
154-162: Use a conditional + `raise` instead of `assert` for runtime validation; sort dumped files for reproducibility.

`assert` can be stripped with `python -O`. Also, `glob("*.pt")` returns files in filesystem-dependent order, which may differ across runs/machines.

Proposed fix

```diff
-    assert not data_args.vlm_processor, "Offline data is not supported for VLM."
+    if data_args.vlm_processor:
+        raise ValueError("Offline data is not supported for VLM.")
     offline_data_path = Path(data_args.offline_data_path)
-    dumped_files = [str(p) for p in offline_data_path.glob("*.pt")]
+    dumped_files = sorted(str(p) for p in offline_data_path.glob("*.pt"))
```

modelopt/torch/speculative/plugins/transformers.py (1)
244-262: `torch.load` without `weights_only`: security and deprecation concern.

Line 250: `torch.load(draft_vocab_cache)` should specify `weights_only=True` since the expected content is a single tensor. This avoids arbitrary code execution from untrusted files and suppresses the deprecation warning in newer PyTorch versions.

Proposed fix

```diff
-    d2t = torch.load(draft_vocab_cache)
+    d2t = torch.load(draft_vocab_cache, weights_only=True)
```
```python
if self.num_streaming_samples is not None:
    self._raw_samples = []
    self._stream_samples = shard
    self._stream_iterator = itertools.cycle(self._stream_samples)
else:
    self._raw_samples = shard
```
Streaming dataset grows unboundedly in memory, defeating the purpose of streaming.
When num_streaming_samples is not None, every accessed item is appended to self._raw_samples (line 91) and kept forever. Over the course of an epoch this accumulates all streamed data in memory, which undermines the memory benefit of using streaming mode. Also, itertools.cycle will restart the iterable from the beginning after exhaustion, but since _raw_samples retains all seen items, the second pass through the cycle will trigger next() calls that return already-cached items, creating duplicates in the list.
Consider using an approach that doesn't retain all streamed items, or document this as intentional behavior for datasets that fit in memory.
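One possible shape for such an approach, as a hedged sketch only (the class and method names here are invented for illustration and are not the module's API): keep at most a small bounded cache instead of appending every streamed item to a growing list.

```python
from collections import deque


class BoundedStreamSketch:
    """Illustrative non-caching streaming access: restart the shard iterator on exhaustion."""

    def __init__(self, make_shard_iter, cache_size: int = 8):
        self._make_shard_iter = make_shard_iter  # callable that returns a fresh iterator
        self._iter = make_shard_iter()
        self._recent = deque(maxlen=cache_size)  # bounded, so memory stays flat

    def next_sample(self):
        try:
            sample = next(self._iter)
        except StopIteration:
            # Re-create the iterator instead of cycling over a cached, growing list.
            self._iter = self._make_shard_iter()
            sample = next(self._iter)
        self._recent.append(sample)
        return sample


# Usage with a toy in-memory "shard"; a real shard would be a streaming dataset split.
stream = BoundedStreamSketch(lambda: iter(range(5)), cache_size=2)
samples = [stream.next_sample() for _ in range(7)]  # wraps around after 5 items
```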
🤖 Prompt for AI Agents
In `@modelopt/torch/utils/plugins/transformers_dataset.py` around lines 107 - 112,
When num_streaming_samples is set, the current logic appends every accessed
streamed item into self._raw_samples and builds self._stream_iterator =
itertools.cycle(self._stream_samples), which makes memory grow unboundedly and
causes duplicate returns on subsequent cycles; change the behavior so streaming
mode does not retain all seen items: when self.num_streaming_samples is not
None, do not append streamed items into self._raw_samples (remove or stop the
append in __getitem__/iteration), keep self._stream_samples as the original
shard or an iterator, and build self._stream_iterator from that non-caching
iterator (e.g., use itertools.islice/shard iterator or a bounded
collections.deque(maxlen=...) if a small cache is needed); ensure the code paths
that reference _raw_samples and _stream_iterator (look for __getitem__,
_raw_samples, _stream_samples, _stream_iterator) are updated so cycles do not
iterate over a growing list and streaming respects memory limits.
```python
def _post_process_chat_template(self):
    # [WAR]: For DeepSeek-V3/R1 tokenizer, we modify the chat_template such that the <think>
    # tokens are preserved for supervised learning.
    self.tokenizer.chat_template = self.tokenizer.chat_template.replace(
        REMOVE_THINK_CHAT_TEMPLATE, ""
    )
```
_post_process_chat_template crashes with AttributeError when the tokenizer has no chat template.
If the user doesn't supply a chat_template and self.tokenizer.chat_template is None, calling .replace() on None at line 170 raises AttributeError: 'NoneType' object has no attribute 'replace'. This occurs before the friendlier check on line 153–154.
Proposed fix
```diff
 def _post_process_chat_template(self):
+    if self.tokenizer.chat_template is None:
+        return
     # [WAR]: For DeepSeek-V3/R1 tokenizer, we modify the chat_template such that the <think>
     # tokens are preserved for supervised learning.
     self.tokenizer.chat_template = self.tokenizer.chat_template.replace(
         REMOVE_THINK_CHAT_TEMPLATE, ""
     )
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
def _post_process_chat_template(self):
    if self.tokenizer.chat_template is None:
        return
    # [WAR]: For DeepSeek-V3/R1 tokenizer, we modify the chat_template such that the <think>
    # tokens are preserved for supervised learning.
    self.tokenizer.chat_template = self.tokenizer.chat_template.replace(
        REMOVE_THINK_CHAT_TEMPLATE, ""
    )
```
🤖 Prompt for AI Agents
In `@modelopt/torch/utils/plugins/transformers_dataset.py` around lines 167 - 172,
_post_process_chat_template currently calls
self.tokenizer.chat_template.replace(...) which raises AttributeError if
chat_template is None; guard the operation by checking
self.tokenizer.chat_template is not None/falsey before calling replace (or
default it to an empty string) and only perform the replace when chat_template
is a str. Update the method so it safely handles a missing chat_template
(reference: _post_process_chat_template, self.tokenizer.chat_template,
REMOVE_THINK_CHAT_TEMPLATE) and preserves existing behavior when chat_template
is present.
b64548b to 7612abe (Compare)
```python
    )
    return new_examples


class OfflineSupervisedDataset(Dataset):
```
why not move this to modelopt.torch.utils.plugins.transformers_dataset as well?
I think these two classes are specific to eagle3 (e.g., they have "aux_hidden_states") and are not useful to other modules. The contents of torch.utils.plugins.transformers_dataset.py are algorithm-agnostic and are intended to be reused in the future.
eeed7bf to 3698f17 (Compare)
Signed-off-by: h-guo18 <67671475+h-guo18@users.noreply.github.com>
cb67d63 to 027ee36 (Compare)
Signed-off-by: h-guo18 <67671475+h-guo18@users.noreply.github.com>
What does this PR do?
Type of change: Refactor
Overview:
Jira ticket: https://jirasw.nvidia.com/browse/OMNIML-2955
Main changes:

- Consolidate Eagle data loading with @ChenhanYu's implementation of `transformers_dataset.py`.
- Refactor: moved the following logic from `example/main.py` into `modelopt/torch` for a cleaner example entry point:
- Implementation refactor: in the HF workflow, reuse the base model's input hidden states as the input embedding instead of calculating it from `input_ids` (a short hedged sketch of this idea follows this list). This has two main benefits:
- Deprecate eagle1 in the example. It is still available by setting a custom config.
- Other minor fixes and README updates.
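A hedged illustration of the embedding-reuse idea above, consistent with the `base_input_embeds.roll(-1, 1)` expression quoted in the review comments; the function name and tensor layout here are assumptions, not the repository's code:

```python
import torch


def draft_inputs_from_base(base_input_embeds: torch.Tensor) -> torch.Tensor:
    """Reuse the base model's (batch, seq, hidden) embeddings for the draft model.

    The sequence dimension is shifted left by one position, mirroring the
    roll(-1, 1) pattern cited in the review discussion, instead of re-embedding
    input_ids with a separate embedding lookup.
    """
    return base_input_embeds.roll(shifts=-1, dims=1)


# Dummy shapes only, to show the call.
embeds = torch.randn(2, 8, 16)
assert draft_inputs_from_base(embeds).shape == embeds.shape
```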
Usage
```python
# Add a code snippet demonstrating how to use this
```

Testing
Tested that training curves after the changes (both online and offline) are identical to the original branch:

Before your PR is "Ready for review"
Additional Information
Summary by CodeRabbit
Release Notes
New Features
- `--draft_vocab_cache` parameter
- `--log_steps` configuration to training launcher

Documentation
Refactor
- `--input-data` replaces `--input-file`