[None][chore] Align perf benchmark output format by yingguo-trt · Pull Request #12067 · NVIDIA/TensorRT-LLM

yingguo-trt · 2026-03-10T05:47:02Z

Summary by CodeRabbit

Tests & Benchmarking
- Enhanced benchmark failure reporting to display total failed requests alongside existing metrics.
- Improved reporting logic to support multiple benchmark output formats with flexible extraction strategies.
- Refined request count aggregation for more accurate failure tracking across benchmarks.

Description

Test Coverage

PR Checklist

Please review the following before submitting your PR:

PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
Test cases are provided for new code paths (see test instructions)
Any new dependencies have been scanned for license and vulnerabilities
CODEOWNERS updated if ownership changes
Documentation updated as needed
Update tava architecture diagram if there is a significant design change in PR.
The reviewers assigned automatically/manually are appropriate for the PR.
Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

Signed-off-by: yingguo-trt <[email protected]>

… in disagg perf tests Signed-off-by: yingguo-trt <[email protected]>

Signed-off-by: yingguo-trt <[email protected]>

…rk backend Signed-off-by: yingguo-trt <[email protected]>

… log reporter Signed-off-by: yingguo-trt <[email protected]>

Signed-off-by: yingguo-trt <[email protected]>

coderabbitai · 2026-03-10T05:52:23Z

📝 Walkthrough

Walkthrough

This PR enhances benchmark failure reporting in disaggregated scenarios by introducing failure request counting capabilities. The shell script outputs failure metrics from JSON results, while the reporting module adds configurable extraction logic to parse failures from either JSON or log sources based on a benchmark configuration flag.

Changes

Cohort / File(s)	Summary
Benchmark Failure Reporting `examples/disaggregated/slurm/benchmark/run_benchmark_nv_sa.sh`	Adds post-benchmark logic to load result.json and output total failed request count calculated as `num_prompts - completed`.
Report Parsing Logic `tests/integration/defs/perf/disagg/reporting/report.py`	Introduces configurable request-count extraction with two new helper methods: log-based and JSON-based parsers. Updated parse() to select extraction strategy based on benchmark.use_nv_sa_benchmark flag. Refined ResultSaver to write headers only on first write and improved docstring consistency.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Description check	⚠️ Warning	The PR description is empty; it contains only the template with unfilled sections and no actual explanation of the changes.	Fill in the Description section explaining what changes were made and why. Provide a Test Coverage section listing relevant tests. Complete the PR Checklist appropriately.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title '[None][chore] Align perf benchmark output format' clearly and concisely describes the main change: aligning performance benchmark output format across multiple benchmark scripts and test reporting logic.
Docstring Coverage	✅ Passed	Docstring coverage is 90.00% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@examples/disaggregated/slurm/benchmark/run_benchmark_nv_sa.sh`:
- Around line 194-201: The failure-suppression at the end of the Python here-doc
(the "2>/dev/null || true" attached to the command `python -
"${output_dir}/result.json" <<-'PYEOF' ... PYEOF`) hides missing or malformed
result.json files; remove the `2>/dev/null || true` and either allow the Python
process to surface errors (so the script fails visibly) or modify the Python
block to catch exceptions, print a clear error message including the bad
filename, and exit with a non‑zero status; keep the existing JSON logic but
ensure the here-doc (`PYEOF`) returns a non-zero exit on parse/open failures so
missing/malformed result.json is not treated as a successful run.

In `@tests/integration/defs/perf/disagg/reporting/report.py`:
- Around line 199-207: The concurrency suffix used when building base_test_name
no longer matches the pattern expected by _get_network_name, so update the
suffix to match the parser: change the construction of base_test_name (the line
that currently uses f"{test_name}_con:{concurrency}") to use the same "-con-<n>"
format that _get_network_name expects (e.g., f"{test_name}-con-{concurrency}"),
ensuring network_name is parsed correctly by _get_network_name.
- Around line 102-114: The code currently only catches json.JSONDecodeError when
reading result_json_path so filesystem races or permission errors during open()
will raise and abort parsing; wrap the open()+json.load() in a try that first
catches OSError (or more specific FileNotFoundError, PermissionError) to log and
continue, and then separately catch json.JSONDecodeError to log and continue,
ensuring you only broaden the except to those specific exceptions and still
update/skip total_requests and failed_requests correctly when data is available;
use the existing symbols result_json_path, logger, data, total_requests, and
failed_requests to locate and implement the changes.
- Around line 137-146: The NV-SA branch is being chosen only when self.config is
a dict and then relies solely on raw_results' "concurrency" entries, causing
valid NV-SA runs (e.g., when self.config is a BenchmarkConfig instance or when
metric extraction returned no matches) to fall back to 0/0; update the gating
and extraction: determine use_nv_sa by checking the NV-SA flag in either a dict
(self.config.get("benchmark", {}).get("use_nv_sa_benchmark")) or a
BenchmarkConfig instance via getattr(self.config, "benchmark", None) and look
for a truthy use_nv_sa_benchmark there, then call
_extract_request_counts_from_json(concurrencies) but if that returns
empty/invalid counts, implement a fallback that scans raw_results for
concurrency_*/result.json files (or their JSON payloads) and parse them to
compute failed_requests and total_requests before falling back to
_extract_request_counts_from_log(log_content); reference variables/functions:
use_nv_sa, self.config, raw_results, _extract_request_counts_from_json, and
_extract_request_counts_from_log.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: a2d79d8f-4e76-4ff1-b31e-edb9b681fdc6

📥 Commits

Reviewing files that changed from the base of the PR and between a20de88 and 58d0bb6.

📒 Files selected for processing (2)

examples/disaggregated/slurm/benchmark/run_benchmark_nv_sa.sh
tests/integration/defs/perf/disagg/reporting/report.py

Signed-off-by: yingguo-trt <[email protected]>

yingguo-trt · 2026-03-10T06:55:42Z

/bot skip --comment "Not cover in CI pipelines"

fredricz-20070104

It's okay for me. Please ask for Kaiyu, Xie's review.

tensorrt-cicd · 2026-03-10T07:01:07Z

PR_Github #38399 [ skip ] triggered by Bot. Commit: 5622a58 Link to invocation

tensorrt-cicd · 2026-03-10T08:09:04Z

PR_Github #38399 [ skip ] completed with state SUCCESS. Commit: 5622a58
Skipping testing for commit 5622a58

Link to invocation

Signed-off-by: yingguo-trt <[email protected]>

yingguo-trt added 8 commits March 10, 2026 13:37

test stress test use_nv_sa_benchmark

4b70c72

Signed-off-by: yingguo-trt <[email protected]>

[None][fix] Fix request failure detection for nv_sa_benchmark backend…

a35ab59

… in disagg perf tests Signed-off-by: yingguo-trt <[email protected]>

recover stress test not use_nv_sa_benchmark

afbde9e

Signed-off-by: yingguo-trt <[email protected]>

[None][fix] Print failed request count in bench log for nv_sa_benchma…

81bfc37

…rk backend Signed-off-by: yingguo-trt <[email protected]>

[None][fix] Fix incorrect failed request count parsing in disagg perf…

ac9cac9

… log reporter Signed-off-by: yingguo-trt <[email protected]>

test

5bf5af8

Signed-off-by: yingguo-trt <[email protected]>

recover stress parameter

bdaf013

Signed-off-by: yingguo-trt <[email protected]>

fix pre-commit error

58d0bb6

Signed-off-by: yingguo-trt <[email protected]>

yingguo-trt requested review from a team as code owners March 10, 2026 05:47

yingguo-trt requested review from Tabrizian, chang-l, chuangz0 and kaiyux March 10, 2026 05:47

github-actions Bot assigned yingguo-trt Mar 10, 2026

yingguo-trt removed request for Tabrizian, chang-l and chuangz0 March 10, 2026 05:48

coderabbitai Bot reviewed Mar 10, 2026

View reviewed changes

Comment thread examples/disaggregated/slurm/benchmark/run_benchmark_nv_sa.sh

Comment thread tests/integration/defs/perf/disagg/reporting/report.py

Comment thread tests/integration/defs/perf/disagg/reporting/report.py

Comment thread tests/integration/defs/perf/disagg/reporting/report.py

Use try/except instead of suppressing all errors

5622a58

Signed-off-by: yingguo-trt <[email protected]>

fredricz-20070104 approved these changes Mar 10, 2026

View reviewed changes

kaiyux approved these changes Mar 10, 2026

View reviewed changes

yingguo-trt enabled auto-merge (squash) March 10, 2026 07:28

yingguo-trt merged commit 81350b7 into NVIDIA:main Mar 10, 2026
5 checks passed

yingguo-trt deleted the feat/align-perf-benchmark-output-format branch March 11, 2026 02:24

bmarimuthu-nv pushed a commit to nv-auto-deploy/TensorRT-LLM that referenced this pull request Mar 12, 2026

[None][chore] Align perf benchmark output format (NVIDIA#12067)

af68f96

Signed-off-by: yingguo-trt <[email protected]>

tianyuz-nv pushed a commit to wanqian-nv/TensorRT-LLM that referenced this pull request Mar 19, 2026

[None][chore] Align perf benchmark output format (NVIDIA#12067)

8047f16

Signed-off-by: yingguo-trt <[email protected]>

limin2021 pushed a commit to limin2021/TensorRT-LLM that referenced this pull request Mar 19, 2026

[None][chore] Align perf benchmark output format (NVIDIA#12067)

921d72a

Signed-off-by: yingguo-trt <[email protected]>

longcheng-nv pushed a commit to longcheng-nv/TensorRT-LLM that referenced this pull request Mar 31, 2026

[None][chore] Align perf benchmark output format (NVIDIA#12067)

320558b

Signed-off-by: yingguo-trt <[email protected]>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[None][chore] Align perf benchmark output format#12067

[None][chore] Align perf benchmark output format#12067
yingguo-trt merged 9 commits into
NVIDIA:mainfrom
yingguo-trt:feat/align-perf-benchmark-output-format

yingguo-trt commented Mar 10, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Mar 10, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

yingguo-trt commented Mar 10, 2026

Uh oh!

fredricz-20070104 left a comment

Uh oh!

tensorrt-cicd commented Mar 10, 2026

Uh oh!

tensorrt-cicd commented Mar 10, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

yingguo-trt commented Mar 10, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Description

Test Coverage

PR Checklist

GitHub Bot Help

Uh oh!

coderabbitai Bot commented Mar 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

yingguo-trt commented Mar 10, 2026

Uh oh!

fredricz-20070104 left a comment

Choose a reason for hiding this comment

Uh oh!

tensorrt-cicd commented Mar 10, 2026

Uh oh!

tensorrt-cicd commented Mar 10, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

yingguo-trt commented Mar 10, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Mar 10, 2026 •

edited

Loading