Skip to content

[None][fix] always sync sampler_event in update_requests#12585

Merged
Funatiq merged 1 commit into
NVIDIA:mainfrom
Funatiq:dev/fix/perf_metrics
Apr 1, 2026
Merged

[None][fix] always sync sampler_event in update_requests#12585
Funatiq merged 1 commit into
NVIDIA:mainfrom
Funatiq:dev/fix/perf_metrics

Conversation

@Funatiq
Copy link
Copy Markdown
Collaborator

@Funatiq Funatiq commented Mar 30, 2026

Summary by CodeRabbit

  • Bug Fixes
    • Fixed sampler synchronization to ensure proper event handling even when processing empty request lists, preventing potential race conditions.

Description

  • Always synchronize the sampler_event at the beginning of update_requests, regardless of whether there are sampling requests or not.
  • This fixes performance metrics, as the GPU events need to be completed for the performance measurements to work.
  • This may introduce a small overhead when there are no sampling requests, but it ensures that the performance metrics are accurate and consistent.

Test Coverage

PR Checklist

Please review the following before submitting your PR:

  • PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.

  • PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.

  • Test cases are provided for new code paths (see test instructions)

  • Any new dependencies have been scanned for license and vulnerabilities

  • CODEOWNERS updated if ownership changes

  • Documentation updated as needed

  • Update tava architecture diagram if there is a significant design change in PR.

  • The reviewers assigned automatically/manually are appropriate for the PR.

  • Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

- Always synchronize the sampler_event at the beginning of update_requests, regardless of whether there are sampling requests or not.
- This fixes performance metrics, as the GPU events need to be completed for the performance measurements to work.
- This may introduce a small overhead when there are no sampling requests, but it ensures that the performance metrics are accurate and consistent.

Signed-off-by: Robin Kobus <[email protected]>
@Funatiq
Copy link
Copy Markdown
Collaborator Author

Funatiq commented Mar 30, 2026

/bot run

@Funatiq Funatiq changed the title [fix] always sync sampler_event in update_requests [None][fix] always sync sampler_event in update_requests Mar 30, 2026
@Funatiq Funatiq requested a review from ixlmar March 30, 2026 08:05
@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #40695 [ ] completed with state FAILURE. Commit: ``

Link to invocation

@Funatiq
Copy link
Copy Markdown
Collaborator Author

Funatiq commented Mar 30, 2026

/bot run

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #40719 [ run ] triggered by Bot. Commit: fe572dd Link to invocation

Copy link
Copy Markdown
Collaborator

@ixlmar ixlmar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be good to also add a test case to prevent regression.

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #40719 [ run ] completed with state SUCCESS. Commit: fe572dd
/LLM/main/L0_MergeRequest_PR pipeline #31745 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@Funatiq
Copy link
Copy Markdown
Collaborator Author

Funatiq commented Mar 30, 2026

/bot run

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #40773 [ run ] triggered by Bot. Commit: fe572dd Link to invocation

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #40773 [ run ] completed with state SUCCESS. Commit: fe572dd
/LLM/main/L0_MergeRequest_PR pipeline #31789 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@Funatiq
Copy link
Copy Markdown
Collaborator Author

Funatiq commented Mar 31, 2026

/bot run --disable-fail-fast

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #40912 [ run ] triggered by Bot. Commit: fe572dd Link to invocation

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #40912 [ run ] completed with state SUCCESS. Commit: fe572dd
/LLM/main/L0_MergeRequest_PR pipeline #31913 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@Funatiq
Copy link
Copy Markdown
Collaborator Author

Funatiq commented Mar 31, 2026

/bot run --disable-fail-fast

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #40969 [ run ] triggered by Bot. Commit: fe572dd Link to invocation

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #40969 [ run ] completed with state SUCCESS. Commit: fe572dd
/LLM/main/L0_MergeRequest_PR pipeline #31955 completed with status: 'SUCCESS'

CI Report

Link to invocation

@Funatiq Funatiq marked this pull request as ready for review April 1, 2026 07:09
@Funatiq Funatiq requested a review from a team as a code owner April 1, 2026 07:09
@Funatiq Funatiq requested a review from lancelly April 1, 2026 07:09
@Funatiq Funatiq merged commit d036f74 into NVIDIA:main Apr 1, 2026
7 of 10 checks passed
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 1, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 93e82757-dd32-4a1c-810f-0396386f49eb

📥 Commits

Reviewing files that changed from the base of the PR and between 58d6975 and fe572dd.

📒 Files selected for processing (1)
  • tensorrt_llm/_torch/pyexecutor/sampler.py

📝 Walkthrough

Walkthrough

The early-exit condition for empty request lists in update_requests methods is repositioned to occur after synchronizing the sampler event, ensuring proper event synchronization even when no requests are present.

Changes

Cohort / File(s) Summary
Event Synchronization Reordering
tensorrt_llm/_torch/pyexecutor/sampler.py
Moved early-exit condition if not state.requests: return to occur after state.sampler_event synchronization in both TorchSampler.update_requests and TRTLLMSampler.update_requests to ensure CUDA event synchronization happens regardless of request list emptiness.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@Funatiq Funatiq deleted the dev/fix/perf_metrics branch April 1, 2026 07:11
karen-sy pushed a commit to karen-sy/TensorRT-LLM that referenced this pull request Apr 7, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants