[TRTLLM-12520][perf] Reduce host overhead during scheduling and sampling by tongyuantongyu · Pull Request #13843 · NVIDIA/TensorRT-LLM

tongyuantongyu · 2026-05-07T07:39:32Z

Summary by CodeRabbit

Release Notes

Improvements
- Reduced the overhead of reading the current beam width.
- Enhanced speculative decoding with improved draft token management.
- Optimized sampling and logprob extraction in the generation pipeline.

Description

Removed some high-overhead code:

- `.sampling_config.beam_width` costs two binding property accesses and creates a temporary wrapper object on every read. Cache the value in `.py_beam_width` instead.
- Avoid computing values that are never used.
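
The caching pattern described above can be sketched as follows. This is only an illustration: `FakeSamplingConfig` and `FakeRequest` are hypothetical pure-Python stand-ins, not the TensorRT-LLM bindings, and the counter merely models the assumption that each uncached read builds a fresh wrapper object.

```python
class FakeSamplingConfig:
    """Hypothetical stand-in for the wrapper object a binding returns."""

    created = 0  # counts how many wrapper objects have been built

    def __init__(self, beam_width: int) -> None:
        FakeSamplingConfig.created += 1
        self.beam_width = beam_width


class FakeRequest:
    """Hypothetical stand-in for a request exposed through a binding."""

    def __init__(self, beam_width: int) -> None:
        self._beam_width = beam_width
        # Read through the "binding" once and keep a plain Python int.
        self.py_beam_width = int(self.sampling_config.beam_width)

    @property
    def sampling_config(self) -> FakeSamplingConfig:
        # Each access builds a new temporary wrapper, as a binding might.
        return FakeSamplingConfig(self._beam_width)


request = FakeRequest(beam_width=4)
wrappers_after_init = FakeSamplingConfig.created

# Hot loop using the cached attribute: no further wrappers are created.
cached_total = sum(request.py_beam_width for _ in range(1000))
wrappers_after_cached_loop = FakeSamplingConfig.created

# Same loop through the property chain: one wrapper per iteration.
uncached_total = sum(request.sampling_config.beam_width for _ in range(1000))
wrappers_after_uncached_loop = FakeSamplingConfig.created
```

In this toy model the cached loop allocates nothing extra, while the uncached loop allocates one wrapper per read; the real cost in the bindings may differ.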

Test Coverage

Covered by current tests

PR Checklist

Please review the following before submitting your PR:

PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
Test cases are provided for new code paths (see test instructions)
Any new dependencies have been scanned for license and vulnerabilities
CODEOWNERS updated if ownership changes
Documentation updated as needed
Update tava architecture diagram if there is a significant design change in PR.
The reviewers assigned automatically/manually are appropriate for the PR.
Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

coderabbitai · 2026-05-07T07:51:20Z

📝 Walkthrough

Walkthrough

This PR migrates beam-width accesses across five pyexecutor modules from direct sampling_config.beam_width reads to a cached py_beam_width property introduced in LlmRequest. The sampler module undergoes significant refactoring in logits selection and request processing to support this change while improving clarity.

Changes

Unified Beam Width Property Migration

| Layer / File(s) | Summary |
| --- | --- |
| Request Property Definition `tensorrt_llm/_torch/pyexecutor/llm_request.py` | Introduce `py_beam_width` as an `int` property cached in `LlmRequest.__init__` by casting `sampling_config.beam_width`. Update `create_response` streaming logprob condition to use this property. |
| Sampler Base Methods `tensorrt_llm/_torch/pyexecutor/sampler.py` | Update `Sampler.beam_width()` property to return cached `py_beam_width` instead of casting `sampling_config.beam_width`. |
| Finish Reason Handling `tensorrt_llm/_torch/pyexecutor/sampler.py` | Update `_handle_finish_reasons()` and `_handle_first_finish_reasons()` to derive beam width from `py_beam_width`. |
| Logprobs Storage & Extraction `tensorrt_llm/_torch/pyexecutor/sampler.py` | Update `_store_logprobs_list_to_request()` and `handle_logprobs()` to reference `py_beam_width` for topk tensor handling and beam-dependent logic. |
| Beam History & Finalization `tensorrt_llm/_torch/pyexecutor/sampler.py` | Update `_prepare_beam_history()` and `_finalize_beam()` to compute beam dimensions from `py_beam_width`. |
| Beam Search Logic & Completion `tensorrt_llm/_torch/pyexecutor/sampler.py` | Update `_check_beam_search_stop_criteria()` and `update_requests()` to use `py_beam_width > 1` for beam-search gating and draft-path selection. |
| Logits Selection & Slicing `tensorrt_llm/_torch/pyexecutor/sampler.py` | Refactor `_select_generated_logits()` to explicitly append context-finished and generation requests in two passes, track context-return-context-logits requirement, and gate logits slicing on this flag. Update `_process_logprobs()` to split requests by `py_beam_width == 1`. |
| Post-Processing `tensorrt_llm/_torch/pyexecutor/sampler.py` | Update `TRTLLMSampler._post_process_request()` to derive `beam_width` and `log_probs_offset` from `py_beam_width`. |
| Downstream Module Integration `tensorrt_llm/_torch/pyexecutor/model_engine.py`, `py_executor.py`, `resource_manager.py` | Update generation expansion, disaggregated initialization, draft-token assignment, and context-request scheduling to read `py_beam_width` from requests. |
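
As a rough illustration of the "split requests by `py_beam_width == 1`" step mentioned in the table, the sketch below partitions a batch using the cached attribute. `FakeRequest` and `split_by_beam_width` are hypothetical names; the actual `_process_logprobs()` implementation is not shown in this thread and may be organized differently.

```python
from dataclasses import dataclass


@dataclass
class FakeRequest:
    """Hypothetical request carrying only the fields this sketch needs."""

    request_id: int
    py_beam_width: int  # cached plain int, read without touching a binding


def split_by_beam_width(requests):
    """Partition requests into (single-beam, multi-beam), preserving order."""
    single_beam, multi_beam = [], []
    for request in requests:
        if request.py_beam_width == 1:
            single_beam.append(request)
        else:
            multi_beam.append(request)
    return single_beam, multi_beam


batch = [FakeRequest(0, 1), FakeRequest(1, 4), FakeRequest(2, 1)]
single_beam, multi_beam = split_by_beam_width(batch)
single_ids = [r.request_id for r in single_beam]
multi_ids = [r.request_id for r in multi_beam]
```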

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Description check | ❓ Inconclusive | The PR description explains the rationale (reducing host overhead by caching beam_width) but lacks detail about which specific code paths benefit and why this change is necessary. | Add more context about the performance impact and which specific high-overhead code paths are being optimized. Clarify how caching py_beam_width reduces binding property access overhead. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and specifically describes the main objective: reducing host overhead during scheduling and sampling, which aligns with all file changes that optimize beam_width access patterns. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.

coderabbitai

🧹 Nitpick comments (1)

tensorrt_llm/_torch/pyexecutor/llm_request.py (1)
691-691: 💤 Low value

Consider int() instead of cast() to guarantee a Python-native int at runtime.

typing.cast is a type-checker annotation that is a no-op at runtime — it does not convert the value. All other py_* cached attributes (e.g., py_min_length, py_prompt_len) use plain assignment without cast. pybind11 ordinarily maps C++ integral types to Python int, so this works in practice, but the inconsistency is worth noting. If the binding ever returns a pybind11 integer wrapper instead of a Python int, downstream code using py_beam_width in arithmetic or isinstance checks could see unexpected behaviour.

Using int(self.sampling_config.beam_width) is a small change, guarantees a true Python int at runtime, and is self-documenting.
♻️ Suggested alternative
-        self.py_beam_width = cast(int, self.sampling_config.beam_width)
+        self.py_beam_width: int = int(self.sampling_config.beam_width)
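
The runtime difference between `typing.cast` and `int()` can be checked with a small sketch. `WrappedInt` is a hypothetical integer-like class used only to stand in for a value that is not already a Python `int`; it is not something the bindings are known to return.

```python
from typing import cast


class WrappedInt:
    """Hypothetical integer-like wrapper, standing in for a non-int value."""

    def __init__(self, value: int) -> None:
        self._value = value

    def __int__(self) -> int:
        return self._value


raw = WrappedInt(4)

# typing.cast returns its argument unchanged; it only informs type checkers.
via_cast = cast(int, raw)

# int() performs a real conversion through __int__.
via_int = int(raw)

cast_is_noop = via_cast is raw
cast_yields_int = isinstance(via_cast, int)
int_yields_int = isinstance(via_int, int)
```

If the binding already returns a Python `int`, both forms yield the same value and the difference is only in what is guaranteed.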
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tensorrt_llm/_torch/pyexecutor/llm_request.py` at line 691, The cached
attribute py_beam_width is using typing.cast which is a no-op at runtime;
replace the cast usage with an actual conversion by assigning py_beam_width =
int(self.sampling_config.beam_width) so it becomes a native Python int at
runtime (mirror how other cached scalars like py_min_length are set) — update
the assignment in the llm_request class where py_beam_width is initialized to
use int(...) instead of cast(int, ...).

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 4d866904-c5e6-4599-954b-9dec27767e40

📥 Commits

Reviewing files that changed from the base of the PR and between cbfb02a and d220f6b.

📒 Files selected for processing (5)

tensorrt_llm/_torch/pyexecutor/llm_request.py
tensorrt_llm/_torch/pyexecutor/model_engine.py
tensorrt_llm/_torch/pyexecutor/py_executor.py
tensorrt_llm/_torch/pyexecutor/resource_manager.py
tensorrt_llm/_torch/pyexecutor/sampler.py

tongyuantongyu · 2026-05-08T02:59:49Z

/bot run

tensorrt-cicd · 2026-05-08T03:06:22Z

PR_Github #47295 [ run ] triggered by Bot. Commit: d220f6b Link to invocation

tensorrt-cicd · 2026-05-08T05:51:30Z

PR_Github #47295 [ run ] completed with state SUCCESS. Commit: d220f6b
/LLM/main/L0_MergeRequest_PR pipeline #37236 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

Please check the failed tests and fix your PR
If you cannot view the failures, ask the CI triggerer to share details
Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

tongyuantongyu · 2026-05-08T08:17:56Z

/bot run

tensorrt-cicd · 2026-05-08T08:24:14Z

PR_Github #47361 [ run ] triggered by Bot. Commit: d220f6b Link to invocation

tensorrt-cicd · 2026-05-08T16:05:01Z

PR_Github #47361 [ run ] completed with state SUCCESS. Commit: d220f6b
/LLM/main/L0_MergeRequest_PR pipeline #37295 completed with status: 'SUCCESS'
Pipeline passed with automatic retried tests. Check the rerun report for details.

CI Report

Link to invocation

tongyuantongyu · 2026-05-13T03:08:42Z

/bot run

tensorrt-cicd · 2026-05-13T03:13:51Z

PR_Github #48084 [ run ] triggered by Bot. Commit: 04e21ea Link to invocation

SimengLiu-nv

LGTM.

eopXD

The Python-level cache will silently go stale if setBeamWidth is called after the value has already been cached here. Pushing this caching down to the C++ level would make it foolproof and protect us from a silent bug.
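
The hazard raised in this comment can be illustrated with a toy model. `FakeConfig`, `FakeRequest`, and `set_beam_width` are hypothetical stand-ins; whether the real `setBeamWidth` is ever called after an `LlmRequest` is constructed is exactly the open question in this thread.

```python
class FakeConfig:
    """Hypothetical mutable config, standing in for the bound sampling config."""

    def __init__(self, beam_width: int) -> None:
        self.beam_width = beam_width


class FakeRequest:
    """Hypothetical request that snapshots beam_width at construction time."""

    def __init__(self, beam_width: int) -> None:
        self.sampling_config = FakeConfig(beam_width)
        self.py_beam_width = int(self.sampling_config.beam_width)

    def set_beam_width(self, beam_width: int) -> None:
        # Mutates only the underlying config; the snapshot is not refreshed.
        self.sampling_config.beam_width = beam_width


request = FakeRequest(beam_width=1)
request.set_beam_width(4)

source_of_truth = request.sampling_config.beam_width
cached_snapshot = request.py_beam_width
cache_is_stale = cached_snapshot != source_of_truth
```

No exception is raised in this model; the two values simply disagree, which is what makes the failure silent.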

tongyuantongyu · 2026-05-13T06:13:17Z

> The caching at Python level will fail silently if setBeamWidth is called and you have already cached it here. Push this caching to C++ level will fool-proof this and prevent us from a silent bug.

Are we calling setBeamWidth after creating LlmRequest currently? We can't afford the accumulated overhead of such frequent access to a binding property. This is the reason we have this long list of py_ prefixed properties:

TensorRT-LLM/tensorrt_llm/_torch/pyexecutor/llm_request.py

Lines 630 to 740 in 04e21ea

    
        self.py_sampling_strategy: "Strategy | None" = None
        self.py_logits_post_processors = kwargs.pop("py_logits_post_processors",
                                                    None)
        self.py_lora_path: str | None = kwargs.pop("py_lora_path", None)
        # Multimodal data
        self.py_multimodal_data = kwargs.pop("py_multimodal_data", None)
        if llm_request is not None:
            super().__init__(llm_request)
        else:
            super().__init__(
                *args,
                client_id=client_id,
                return_log_probs=return_log_probs,
                return_context_logits=False,
                return_generation_logits=False,
                return_perf_metrics=return_perf_metrics,
                stop_words_list=torch.tensor(stop_words_list, dtype=torch.int32)
                if stop_words_list else None,
                **kwargs)
        self.py_client_id = client_id
        self.py_request_id = self.request_id
        self.py_llm_request_type = self.llm_request_type
        self.py_end_id = self.end_id
        self.py_prompt_len = self.prompt_len
        self.py_orig_prompt_len = self.orig_prompt_len
        self.py_max_new_tokens = self.max_new_tokens
        self.py_min_length = self.sampling_config.min_length
        # `seqlen_this_rank_cp`, `total_input_len_cp`, and `py_helix_is_inactive_rank` are relevant to helix parallelism.
        self.seqlen_this_rank_cp = self.prompt_len
        self.total_input_len_cp = self.prompt_len
        self.py_helix_is_inactive_rank = False
        self.py_batch_idx = None
        self.py_draft_pages_allocated = 0
        self.py_rewind_len = 0
        self.py_draft_tokens = [] if self.draft_tokens is None else self.draft_tokens
        self.py_last_context_chunk = (None, None)
        self.py_draft_logits = None
        self.py_target_probs = None
        self.py_last_draft_tokens = None
        self.py_num_accepted_draft_tokens = 0
        self.py_num_accepted_draft_tokens_indices = []
        self.py_rewind_draft_token_separate_adjustment = 0
        self.py_decoding_iter = 0
        self.is_attention_dp_dummy = False
        self.is_cuda_graph_dummy = False
        self.py_kv_transfer_start_time = None
        self.py_kv_transfer_timed_out = False
        # Performance timing info (step metrics, GPU events, context GPU timing)
        # Lazily created only when return_perf_metrics is enabled to avoid
        # overhead for every request.
        self.py_perf_timing: Optional[PerfTimingInfo] = None
        self.py_num_logprobs = num_logprobs
        self.py_return_log_probs = return_log_probs
        self.py_return_context_logits = return_context_logits
        self.py_return_generation_logits = return_generation_logits
        self.py_return_logits_device_memory = return_logits_device_memory
        self.py_additional_outputs = additional_outputs
        self.py_beam_width = cast(int, self.sampling_config.beam_width)
        self.py_is_draft = is_draft
        # The request's sequence slot ID, an index between 0 (inclusive) and max_batch_size (exclusive).
        self.py_seq_slot = seq_slot
        # If the request is a draft request, target_seq_slot is the sequence slot ID of its target request.
        self.py_target_seq_slot = target_seq_slot
        self.use_draft_model = is_draft
        self._cached_tokens = 0
        self._cached_tokens_set = False
        # Whether the request is for the first forward of the draft model.
        self.py_is_first_draft = is_first_draft
        self.d2t = None
        self.py_draft_use_greedy_sampling = False
        self.py_disable_speculative_decoding = False
        # Chunked logits parameters
        self.py_use_chunked_generation_logits = use_chunked_generation_logits
        self.py_logits_chunk_size = logits_chunk_size if not self.streaming else 1
        # TODO: remove this when use DynamicDecodeOp in pytorch flow.
        # currently, keep py_stop_words_list as python list, rather than tensor.
        self.py_stop_words_list = stop_words_list
        self.py_logprobs_mode = LogprobMode(
            logprobs_mode)  # handle passed a raw string
        self.py_disaggregated_params = None
        self.py_num_connector_matched_tokens = 0
        self.py_result = PyResult(
            prompt_len=self.py_prompt_len,
            max_new_tokens=self.py_max_new_tokens,
            use_device_memory=return_logits_device_memory,
            streaming=self.streaming,
            return_log_probs=return_log_probs,
            return_context_logits=return_context_logits,
            return_generation_logits=return_generation_logits,
            exclude_last_generation_logits=exclude_last_generation_logits,
            use_chunked_generation_logits=self.py_use_chunked_generation_logits,
            chunk_size=self.py_logits_chunk_size,
            additional_outputs=additional_outputs)
        self.child_requests = []
        self._py_embedding_bias_1d: Optional[torch.Tensor] = None
        if hasattr(self, 'embedding_bias') and self.embedding_bias is not None:
            # Pre-squeeze to 1D if needed (remove batch dimension)
            if self.embedding_bias.dim() > 1:
                self._py_embedding_bias_1d = self.embedding_bias.squeeze(0)
            else:
                self._py_embedding_bias_1d = self.embedding_bias

tensorrt-cicd · 2026-05-13T06:14:41Z

PR_Github #48084 [ run ] completed with state SUCCESS. Commit: 04e21ea
/LLM/main/L0_MergeRequest_PR pipeline #37915 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

Please check the failed tests and fix your PR
If you cannot view the failures, ask the CI triggerer to share details
Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

tongyuantongyu · 2026-05-13T06:48:11Z

/bot run

tensorrt-cicd · 2026-05-13T06:54:17Z

PR_Github #48122 [ run ] triggered by Bot. Commit: 04e21ea Link to invocation

tensorrt-cicd · 2026-05-13T07:24:13Z

PR_Github #48122 [ run ] completed with state SUCCESS. Commit: 04e21ea
/LLM/main/L0_MergeRequest_PR pipeline #37949 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

Please check the failed tests and fix your PR
If you cannot view the failures, ask the CI triggerer to share details
Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

Signed-off-by: Yuan Tong <[email protected]>

tongyuantongyu · 2026-05-14T06:24:07Z

/bot run --disable-fail-fast

github-actions · 2026-05-14T06:29:50Z

👎 Promotion blocked, new vulnerability found

Vulnerability report

| Component | Vulnerability | Description | Severity |
| --- | --- | --- | --- |
| python-multipart | CVE-2024-53981 | python-multipart is a streaming multipart parser for Python. When parsing form data, python-multipart skips line breaks (CR \r or LF \n) in front of the first boundary and any trailing bytes after the last boundary. This happens one byte at a time and emits a log event each time, which may cause excessive logging for certain inputs. An attacker could abuse this by sending a malicious request with lots of data before the first or after the last boundary, causing high CPU load and stalling the processing thread for a significant amount of time. In case of ASGI application, this could stall the event loop and prevent other requests from being processed, resulting in a denial of service (DoS). This vulnerability is fixed in 0.0.18. | HIGH |

tongyuantongyu · 2026-05-14T07:37:59Z

/bot run --disable-fail-fast

tensorrt-cicd · 2026-05-14T07:44:39Z

PR_Github #48328 [ run ] triggered by Bot. Commit: 6fe4c73 Link to invocation

tensorrt-cicd · 2026-05-14T18:13:50Z

PR_Github #48328 [ run ] completed with state SUCCESS. Commit: 6fe4c73
/LLM/main/L0_MergeRequest_PR pipeline #38136 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

Please check the failed tests and fix your PR
If you cannot view the failures, ask the CI triggerer to share details
Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

tongyuantongyu · 2026-05-15T08:05:51Z

/bot run --disable-fail-fast

tensorrt-cicd · 2026-05-15T08:12:52Z

PR_Github #48561 [ run ] triggered by Bot. Commit: 6fe4c73 Link to invocation

tensorrt-cicd · 2026-05-15T22:42:50Z

PR_Github #48561 [ run ] completed with state FAILURE. Commit: 6fe4c73
/LLM/main/L0_MergeRequest_PR pipeline #38350 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

Please check the failed tests and fix your PR
If you cannot view the failures, ask the CI triggerer to share details
Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

tongyuantongyu · 2026-05-18T02:53:33Z

/bot run --disable-fail-fast

tensorrt-cicd · 2026-05-18T02:59:00Z

PR_Github #48814 [ run ] triggered by Bot. Commit: 6fe4c73 Link to invocation

tongyuantongyu · 2026-05-18T06:46:55Z

/bot run --disable-fail-fast

tensorrt-cicd · 2026-05-18T06:53:06Z

PR_Github #48850 [ run ] triggered by Bot. Commit: 6fe4c73 Link to invocation

tensorrt-cicd · 2026-05-18T06:56:48Z

PR_Github #48814 [ run ] completed with state ABORTED. Commit: 6fe4c73

Link to invocation

tensorrt-cicd · 2026-05-18T07:16:19Z

PR_Github #48850 [ run ] completed with state FAILURE. Commit: 6fe4c73
/LLM/main/L0_MergeRequest_PR pipeline #38604 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

Please check the failed tests and fix your PR
If you cannot view the failures, ask the CI triggerer to share details
Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

tongyuantongyu · 2026-05-19T09:01:28Z

/bot run --disable-fail-fast

tensorrt-cicd · 2026-05-19T09:07:03Z

PR_Github #49160 [ run ] triggered by Bot. Commit: 6fe4c73 Link to invocation

tensorrt-cicd · 2026-05-19T11:21:57Z

PR_Github #49160 [ run ] completed with state SUCCESS. Commit: 6fe4c73
/LLM/main/L0_MergeRequest_PR pipeline #38841 completed with status: 'SUCCESS'

CI Report

Link to invocation

tongyuantongyu requested review from a team as code owners May 7, 2026 07:39

tongyuantongyu requested a review from joyang-nv May 7, 2026 07:39

github-actions Bot assigned tongyuantongyu May 7, 2026

coderabbitai Bot reviewed May 7, 2026

longlee0622 requested a review from hyukn May 7, 2026 22:47

Funatiq reviewed May 11, 2026

Comment thread tensorrt_llm/_torch/pyexecutor/sampler.py Outdated

Funatiq approved these changes May 11, 2026

tongyuantongyu force-pushed the ytong/exec-host-opt branch from d220f6b to 04e21ea Compare May 12, 2026 09:34

SimengLiu-nv approved these changes May 13, 2026

eopXD reviewed May 13, 2026

[TRTLLM-12520][perf] Reduce host overhead during scheduling and sampling

6fe4c73

Signed-off-by: Yuan Tong <[email protected]>

tongyuantongyu force-pushed the ytong/exec-host-opt branch from 04e21ea to 6fe4c73 Compare May 14, 2026 06:23

tongyuantongyu requested a review from eopXD May 14, 2026 07:38

tongyuantongyu merged commit 4a58dc3 into NVIDIA:main May 20, 2026
7 checks passed

xxi-nv pushed a commit to xxi-nv/TensorRT-LLM that referenced this pull request May 22, 2026

[TRTLLM-12520][perf] Reduce host overhead during scheduling and sampl…

2a6b3a6

…ing (NVIDIA#13843) Signed-off-by: Yuan Tong <[email protected]>
