[None][fix] Optimize TorchSampler process_logprobs #13380
7468dd6 to 33fe7e1
📝 Walkthrough
The logprob processing in the sampler is refactored to precompute request index partitions in Python, separating non-beam-search and beam-search requests upfront. The top-k operation now uses the precomputed maximum logprobs value instead of computing it from group indices, and beam-search logprob scattering uses the precomputed partition lists directly.
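The partitioning described in the walkthrough might look roughly like the sketch below. It is not the code from sampler.py: the `Request` class, its `num_logprobs` and `beam_width` fields, and the `partition_requests` helper are hypothetical names chosen only to illustrate the idea of building the index lists and the maximum logprobs count in a single plain-Python pass.

```python
from dataclasses import dataclass


@dataclass
class Request:
    # Hypothetical stand-in for the sampler's request type.
    num_logprobs: int
    beam_width: int = 1


def partition_requests(requests):
    """Split request indices into non-beam and beam lists in one pass.

    Also returns the largest num_logprobs seen, so a later top-k call
    can use it directly instead of deriving it from tensor indices.
    """
    non_beam, beam = [], []
    max_logprobs = 0
    for idx, req in enumerate(requests):
        if req.num_logprobs <= 0:
            continue  # this request did not ask for logprobs
        max_logprobs = max(max_logprobs, req.num_logprobs)
        if req.beam_width > 1:
            beam.append(idx)
        else:
            non_beam.append(idx)
    return non_beam, beam, max_logprobs


requests = [Request(2), Request(5, beam_width=4), Request(0), Request(3)]
non_beam, beam, max_logprobs = partition_requests(requests)
```

Because everything here is ordinary Python data, the loop never touches a tensor; the resulting lists can then be handed to tensor operations in bulk.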
Estimated code review effort: 4 (Complex) | ~60 minutes
Pre-merge checks: 4 passed, 1 failed (1 warning)
/bot run
33fe7e1 to 9f7cb9f
9f7cb9f to 5464f4e
/bot run
PR_Github #45345 [ run ] triggered by Bot. Commit:
PR_Github #45345 [ run ] completed with state
Signed-off-by: Yuan Tong <[email protected]>
5464f4e to 2087607
/bot run
PR_Github #45625 [ run ] triggered by Bot. Commit:
PR_Github #45625 [ run ] completed with state
/bot run
PR_Github #45719 [ run ] triggered by Bot. Commit:
PR_Github #45719 [ run ] completed with state
Description
The old code (TensorRT-LLM/tensorrt_llm/_torch/pyexecutor/sampler.py, line 4174 in 54c3915) accesses the torch tensor group_req_indices in a for-each-request loop, which is slow. This PR rewrites the logic to avoid that. It also merges 3 loops into one and simplifies the logic.
Test Coverage
No functional change; behavior is guarded by the existing logprobs tests.
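To illustrate the access pattern the description calls slow, here is a minimal sketch, assuming a small CPU tensor as a stand-in for group_req_indices. It is not the code from sampler.py. Indexing a tensor element by element inside a Python loop creates a tensor object per access (and, for a CUDA tensor, forces a device synchronization each time), whereas `Tensor.tolist()` converts everything in one call.

```python
import torch

# Stand-in values; the real tensor's contents and device are not shown in this PR.
group_req_indices = torch.tensor([3, 0, 2, 1])

# Per-element access: one tensor indexing operation per loop iteration.
slow = [int(group_req_indices[i]) for i in range(group_req_indices.numel())]

# Bulk conversion: a single call, after which the loop works on plain ints.
fast = group_req_indices.tolist()
```

Both forms produce the same list of Python integers; only the number of tensor operations differs, which is why moving the bookkeeping into plain Python lists avoids the per-request cost.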
PR Checklist
Please review the following before submitting your PR:
PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
Test cases are provided for new code paths (see test instructions)
Any new dependencies have been scanned for license and vulnerabilities
CODEOWNERS updated if ownership changes
Documentation updated as needed
Update tava architecture diagram if there is a significant design change in PR.
The reviewers assigned automatically/manually are appropriate for the PR.
Please check this after reviewing the above items as appropriate for this PR.
GitHub Bot Help
To see a list of available CI bot commands, please comment /bot help.