feat: add num_return_sequences support for rec beam search. by DragonFive · Pull Request #1289 · jd-opensource/xllm

DragonFive · 2026-04-16T01:13:21Z

Summary

add num_return_sequences plumbing for REC beam search across proto, pybind, c_api, and cc_api
align REC multi-round fast-path beam scores with full-logsoftmax semantics used by xllm_rec
expand OneRec beams before the final prune when num_return_sequences > beam_width
keep REC top-k outputs sorted and add targeted tests for beam expansion and fast-path scoring
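
The scoring and expansion items above can be illustrated with a minimal sketch. It is not the xllm implementation: `log_softmax`, `expand_and_prune`, and the tuple layout of `beams` are hypothetical names chosen for illustration. The sketch assumes each surviving beam still has its final-step logits available, scores every candidate as the parent's cumulative logprob plus the full log-softmax of the chosen token, and prunes only after all beams have been expanded.

```python
import heapq
import math


def log_softmax(logits):
    """Numerically stable log-softmax over the full vocabulary."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def expand_and_prune(beams, num_return_sequences, top_k):
    """Expand each beam with its top_k next tokens, then keep the best candidates.

    beams: list of (tokens, cumulative_logprob, final_step_logits).
    This layout is an assumption for illustration only.
    Returns up to num_return_sequences (tokens, score) pairs, best first.
    """
    candidates = []
    for tokens, base_logprob, logits in beams:
        logprobs = log_softmax(logits)
        # Indices of the top_k most likely next tokens for this beam.
        best = heapq.nlargest(top_k, range(len(logprobs)), key=logprobs.__getitem__)
        for tok in best:
            candidates.append((tokens + [tok], base_logprob + logprobs[tok]))
    # Sort once, globally, so the returned sequences stay ordered by score.
    candidates.sort(key=lambda c: c[1], reverse=True)
    return candidates[:num_return_sequences]


# Two beams (beam_width = 2) but three sequences requested.
example_beams = [
    ([1], -0.1, [2.0, 1.0, 0.0]),
    ([2], -1.0, [0.0, 0.0, 3.0]),
]
for tokens, score in expand_and_prune(example_beams, num_return_sequences=3, top_k=2):
    print(tokens, round(score, 4))
```

Expanding before pruning matters here because pruning to `beam_width` first would leave only two candidates, fewer than the three requested.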

Testing

git diff --check
python3 -m py_compile xllm/pybind/llm.py xllm/pybind/params.py
added targeted unit coverage in batch_test.cpp and rec_sampler_test.cpp

Notes

branch: feat/rec-num-return-sequences
target: team/main

gemini-code-assist

Code Review

This pull request introduces the num_return_sequences parameter across the C++, Python, and Protobuf APIs, allowing beam search to return more sequences than the beam width. The implementation adds logic to Batch and SequencesGroup to expand results using top-k logprobs and updates the RecSampler for accurate logprob calculation. Feedback identifies a critical runtime bug where the beam_base_logprobs tensor must be flattened to 1-D to prevent a shape mismatch crash during batch processing. Additionally, an include path in batch.cpp requires correction to follow the project's root-relative path style guide.
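
The shape issue raised in the review can be sketched in plain Python. This is an assumption-laden illustration rather than the actual fix: it supposes that per-beam base scores are stored as a `[batch][beam]` structure while step logprobs are stored as `batch * beam` rows, so the base scores need to be 1-D before they are added row by row. `flatten_base_logprobs` and `add_base_scores` are hypothetical helpers, not xllm functions.

```python
def flatten_base_logprobs(beam_base_logprobs):
    """Flatten a [batch][beam] nested list into a 1-D list of length batch * beam."""
    return [score for row in beam_base_logprobs for score in row]


def add_base_scores(step_logprobs, base_flat):
    """Add one base score to every entry of the matching row.

    step_logprobs: [batch * beam][top_k] nested list.
    base_flat: 1-D list with exactly one score per row of step_logprobs.
    """
    if any(isinstance(x, (list, tuple)) for x in base_flat):
        raise ValueError("base scores must be 1-D")
    if len(base_flat) != len(step_logprobs):
        raise ValueError("expected one base score per row")
    return [[base + lp for lp in row] for base, row in zip(base_flat, step_logprobs)]


# Two requests with two beams each: four rows of step logprobs.
base = [[-0.5, -1.0], [-0.25, -2.0]]
step = [[-0.5, -1.0], [-0.25, -0.5], [-0.25, -1.0], [-1.0, -2.0]]
scores = add_base_scores(step, flatten_base_logprobs(base))
print(scores)
```

Passing the unflattened `base` directly would fail the 1-D check, which mirrors the kind of mismatch the review describes.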

DragonFive · 2026-04-16T03:19:12Z

/gemini review

gemini-code-assist

Code Review

This pull request introduces the num_return_sequences parameter across the C++, C, Python, and Protobuf APIs, allowing users to specify the number of sequences returned from beam search. The feedback identifies violations of the repository style guide regarding member functions in structs and suggests logic adjustments to correctly support sequence pruning when num_return_sequences is less than the beam width.
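
For the pruning case mentioned in this review, where `num_return_sequences` is smaller than the beam width, a minimal sketch might look like the following. `prune_beams` is a hypothetical helper and the `(tokens, cumulative_logprob)` layout is assumed for illustration; the actual logic in the PR may differ.

```python
import heapq


def prune_beams(beams, num_return_sequences):
    """Keep the best num_return_sequences beams, ordered best first.

    beams: list of (tokens, cumulative_logprob) pairs (assumed layout).
    """
    if num_return_sequences <= 0:
        raise ValueError("num_return_sequences must be positive")
    # heapq.nlargest returns results sorted from largest to smallest key.
    return heapq.nlargest(num_return_sequences, beams, key=lambda b: b[1])


# beam_width = 4, but only the two best sequences are requested.
finished = [([7, 3], -2.5), ([7, 1], -0.5), ([4, 9], -1.5), ([4, 2], -3.0)]
print(prune_beams(finished, num_return_sequences=2))
```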

DragonFive added 8 commits April 15, 2026 17:19

feat: add num_return_sequences for rec beam search.

b350733

bugfix: expand final beam outputs for num_return_sequences.

5f07944

bugfix: fix const beam output helper build.

4027fdc

bugfix: remove stray const beam output pointer.

8c4ff83

bugfix: align rec fast-path beam logprobs.

d5f0957

bugfix: fix rec sampler incomplete type build.

51bf8ba

bugfix: keep rec topk outputs sorted.

1f26b16

bugfix: expand onerec beams before final prune.

4552613

DragonFive requested review from DongheJin, JimHsiung, RobbieLeung, XuZhang99, liutongxuan, walsonyang and yq33victor as code owners April 16, 2026 01:13

gemini-code-assist bot reviewed Apr 16, 2026


Comment thread xllm/core/framework/batch/batch.cpp Outdated

Comment thread xllm/core/runtime/rec_worker_impl.cpp Outdated

Comment thread xllm/core/framework/batch/batch.cpp Outdated

bugfix: align beam base tensor and include path.

d2186a8