Skip to content

error handling for empty kv list#331

Merged
jiashuy merged 5 commits intoNVIDIA:mainfrom
gameofdimension:patch-2
Mar 24, 2026
Merged

error handling for empty kv list#331
jiashuy merged 5 commits intoNVIDIA:mainfrom
gameofdimension:patch-2

Conversation

@gameofdimension
Copy link
Copy Markdown
Contributor

Description

When the table is empty, the key/value list will also be empty, resulting in a RuntimeError: torch.cat(): expected a non-empty list of Tensors

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Mar 19, 2026

Greptile Summary

This PR fixes a RuntimeError: torch.cat(): expected a non-empty list of Tensors that occurred when export_keys_values was called on a BatchedDynamicEmbeddingTablesV2 instance whose backing table had never been written to. The fix adds an early-return guard before the two torch.cat calls and returns correctly-typed empty tensors (torch.int64 keys, 2D values).

Previous review concerns about dtype and tensor dimensionality have been addressed. One remaining issue worth considering:

  • Silent failure on invalid table names (P1): The len(keys_list) == 0 guard is true in two distinct situations — the table exists but is empty (the intended fix) and the table name is not found at all. In the latter case the function now silently returns empty tensors instead of signalling a programming error, which can mask bugs in callers that pass a stale or misspelled table name.
  • Test coverage gap (P2): The new test asserts values.shape[0] == 0 but does not check values.ndim == 2, so a regression back to a 1-D empty tensor (or a shape with the wrong embedding width) would go undetected.

Confidence Score: 3/5

  • The core crash fix is correct, but the guard silently swallows invalid table-name errors, which could mask bugs in production callers.
  • Prior dtype and shape concerns from earlier rounds have been properly addressed. The remaining P1 concern — the early-return guard conflating "table not found" with "table is empty" — is a real regression in error-visibility that could hide programming mistakes in callers. It warrants a targeted fix before merge, keeping the score at 3.
  • batched_dynamicemb_tables.py — the export_keys_values early-return guard needs to distinguish a truly empty table from an unrecognised table name.

Important Files Changed

Filename Overview
corelib/dynamicemb/dynamicemb/batched_dynamicemb_tables.py Adds an early-return guard for an empty keys_list in export_keys_values; correctly addresses dtype (int64) and 2D shape for the empty case, but the guard conflates "table not found" with "table is empty", silently returning empty tensors for invalid table names.
corelib/dynamicemb/test/test_batched_dynamic_embedding_tables_v2.py New test test_export_keys_values_empty_table correctly exercises the empty-table code path and validates keys shape and dtype, but stops short of asserting values.ndim == 2 or the correct embedding dimension.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[export_keys_values called\ntable_name, device, batch_size] --> B[flush embeddings]
    B --> C{Iterate table_names\n& tables}
    C -->|name mismatch| C
    C -->|name match| D[local_max_rows = table.size]
    D --> E{Iterate\nexport_keys_values\nbatches}
    E -->|batch| F[Append keys & embeddings\nto keys_list / values_list]
    F --> E
    E -->|done| G{accumulated_counts\n== local_max_rows?}
    G -->|No| H[raise ValueError]
    G -->|Yes| C
    C -->|loop done| I{len keys_list == 0?}
    I -->|Yes — empty table\nOR invalid name| J["Return\n(empty int64 keys,\nempty 0×0 values)"]
    I -->|No| K["Return\ntorch.cat(keys_list),\ntorch.cat(values_list, dim=0)"]

    style H fill:#f66,color:#fff
    style J fill:#ffa,color:#000
Loading

Comments Outside Diff (1)

  1. corelib/dynamicemb/dynamicemb/batched_dynamicemb_tables.py, line 1211-1214 (link)

    Silent failure on invalid table name

    keys_list remains empty in two distinct situations: (1) the matching table was found and truly has zero entries (the intended fix), and (2) no table with the given table_name exists at all (a programming error). Both paths now return empty tensors silently.

    Before this PR, an invalid name would raise a RuntimeError from torch.cat([]) — a noisy signal that something was wrong. After this change, a caller passing a typo'd or stale table name will receive (tensor([]), tensor(...)) and continue running, potentially propagating silent data-loss bugs downstream.

    Consider distinguishing the two cases:

    table_found = False
    for dynamic_table_name, dynamic_table in zip(self.table_names, self.tables):
        ...
        if table_name != dynamic_table_name:
            continue
        table_found = True
        ...
    
    if not table_found:
        raise ValueError(f"Table '{table_name}' not found in {self.table_names}")
    
    if len(keys_list) == 0:
        return torch.empty(0, dtype=torch.int64, device=device), torch.empty(
            0, 0, device=device
        )
    return torch.cat(keys_list), torch.cat(values_list, dim=0)

Reviews (5): Last reviewed commit: "test: add unit test for export_keys_valu..." | Re-trigger Greptile

gameofdimension and others added 2 commits March 19, 2026 20:15
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
@shijieliu
Copy link
Copy Markdown
Collaborator

hi @jiashuy could you help review it?

Copy link
Copy Markdown
Collaborator

@jiashuy jiashuy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jiashuy
Copy link
Copy Markdown
Collaborator

jiashuy commented Mar 20, 2026

hi @jiashuy could you help review it?

Yes, LGTM, and I will also check if it resolved in #313

@jiashuy
Copy link
Copy Markdown
Collaborator

jiashuy commented Mar 20, 2026

Hi @gameofdimension, is the PR ready for merging?

@gameofdimension
Copy link
Copy Markdown
Contributor Author

Hi @gameofdimension, is the PR ready for merging?

yes

Signed-off-by: felix01.yu <felix01.yu@vipshop.com>
@gameofdimension
Copy link
Copy Markdown
Contributor Author

Hi @gameofdimension, is the PR ready for merging?

done formatting

@jiashuy
Copy link
Copy Markdown
Collaborator

jiashuy commented Mar 23, 2026

Hi @gameofdimension , we are doing some refactor work, so can you provide a test to the "error handling for empty kv list"? so that we can keep it correct after the refactor.
The test can be added into here.

Thanks for your contribution!

Covers the empty keys_list early-return fix in export_keys_values,
ensuring it returns empty tensors instead of crashing on torch.cat([]).
@gameofdimension
Copy link
Copy Markdown
Contributor Author

Hi @gameofdimension , we are doing some refactor work, so can you provide a test to the "error handling for empty kv list"? so that we can keep it correct after the refactor. The test can be added into here.

Thanks for your contribution!

done

@gameofdimension
Copy link
Copy Markdown
Contributor Author

Is there anything else I need to complete? @jiashuy

Copy link
Copy Markdown
Collaborator

@jiashuy jiashuy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jiashuy jiashuy merged commit 0823d62 into NVIDIA:main Mar 24, 2026
@jiashuy
Copy link
Copy Markdown
Collaborator

jiashuy commented Mar 24, 2026

Merged, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants