Skip to content

[TRTLLM-11246][feat] Add tool parser support for GLM-4 models#11986

Merged
Superjomn merged 2 commits into
NVIDIA:mainfrom
JunyiXu-nv:dev-junyix-glm-4.6-tool-parser
Mar 10, 2026
Merged

[TRTLLM-11246][feat] Add tool parser support for GLM-4 models#11986
Superjomn merged 2 commits into
NVIDIA:mainfrom
JunyiXu-nv:dev-junyix-glm-4.6-tool-parser

Conversation

@JunyiXu-nv
Copy link
Copy Markdown
Collaborator

@JunyiXu-nv JunyiXu-nv commented Mar 6, 2026

Summary by CodeRabbit

  • New Features

    • Added GLM-4 tool calling support with real-time streaming XML-to-JSON conversion
    • Implemented type-aware argument extraction with robust fallback parsing strategies
    • Added automatic type inference from tool definitions for improved argument handling and validation
  • Tests

    • Introduced comprehensive test suite for GLM-4 tool parser covering detection, streaming, multi-tool scenarios, and edge cases

Description

Test Coverage

PR Checklist

Please review the following before submitting your PR:

  • PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.

  • PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.

  • Test cases are provided for new code paths (see test instructions)

  • Any new dependencies have been scanned for license and vulnerabilities

  • CODEOWNERS updated if ownership changes

  • Documentation updated as needed

  • Update tava architecture diagram if there is a significant design change in PR.

  • The reviewers assigned automatically/manually are appropriate for the PR.

  • Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

@JunyiXu-nv JunyiXu-nv requested a review from a team as a code owner March 6, 2026 13:52
@JunyiXu-nv JunyiXu-nv requested a review from Superjomn March 6, 2026 13:52
@JunyiXu-nv JunyiXu-nv requested a review from LinPoly March 6, 2026 14:03
@JunyiXu-nv
Copy link
Copy Markdown
Collaborator Author

/bot run

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #38034 [ run ] triggered by Bot. Commit: b08fe7e Link to invocation

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #38034 [ run ] completed with state FAILURE. Commit: b08fe7e
/LLM/main/L0_MergeRequest_PR pipeline #29460 completed with status: 'FAILURE'

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

Signed-off-by: Junyi Xu <[email protected]>
@JunyiXu-nv
Copy link
Copy Markdown
Collaborator Author

/bot run

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Mar 9, 2026

📝 Walkthrough

Walkthrough

Introduces a new GLM-4 tool parser module implementing streaming XML-to-JSON parsing for GLM-4 tool calls. The implementation includes a state machine for incremental parsing, type inference from tool definitions, and argument parsing with multiple fallback strategies integrated into the existing parser factory.

Changes

Cohort / File(s) Summary
GLM-4 Tool Parser Implementation
tensorrt_llm/serve/tool_parser/glm4_parser.py
New module implementing Glm4ToolParser class with StreamState enum and helper functions. Core features: streaming XML-to-JSON conversion via state machine, type inference from tool definitions, argument parsing with JSON/ast.literal_eval/quoting fallbacks, and both one-shot and streaming detection/parsing modes.
Parser Factory Registration
tensorrt_llm/serve/tool_parser/tool_parser_factory.py
Registers new Glm4ToolParser in ToolParserFactory parsers mapping under "glm4" key via import and factory extension.
Type Inference Utilities
tensorrt_llm/serve/tool_parser/utils.py
Adds infer_type_from_json_schema() function to extract parameter types from JSON Schema definitions, supporting type, anyOf/oneOf, enum, allOf, properties, and items structures.
Test Suite
tests/unittest/llmapi/apps/test_tool_parsers.py
Adds comprehensive TestGlm4ToolParser test class covering streaming parsing, multi-tool handling, parameter parsing, number type coercion, format compliance, and structural tagging verification. (Note: test suite appears duplicated in diff.)

Sequence Diagram

sequenceDiagram
    participant Stream as Text Stream
    participant Parser as Glm4ToolParser
    participant State as StreamState
    participant Tools as Tool Definitions
    participant Output as Parsed Result

    Stream->>Parser: parse_streaming_increment(new_text)
    
    loop For Each Character
        Parser->>State: Update state (INIT → BETWEEN → IN_KEY → WAITING_VALUE → IN_VALUE)
        Parser->>Parser: Accumulate XML fragments
    end
    
    alt Tool Call Complete
        Parser->>Parser: Extract func_name & arguments
        Parser->>Tools: get_argument_type(func_name, arg_key)
        Tools-->>Parser: Expected type (string/number/integer/boolean)
        Parser->>Parser: parse_arguments(value, inferred_type) with fallbacks
        Parser->>Output: Emit ToolCallItem with parsed args
    else Partial Data
        Parser->>Output: Return intermediate state
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 78.57% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Description check ⚠️ Warning The PR description is incomplete—it only contains the template structure without filling in the required Description and Test Coverage sections. Fill in the Description section with details about the GLM-4 tool parser implementation and the Test Coverage section listing relevant tests that validate the changes.
✅ Passed checks (1 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and specifically describes the main change: adding tool parser support for GLM-4 models, with appropriate format including JIRA ticket and feature type.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (3)
tensorrt_llm/serve/tool_parser/glm4_parser.py (2)

477-480: Minor: Duplicate logic branches.

Lines 477-478 and 479-480 have identical logic. Consider consolidating:

Suggested simplification
             elif arg_type == "string":
                 if isinstance(parsed_value, str):
                     arguments[arg_key] = parsed_value
                 elif isinstance(parsed_value, (dict, list)):
                     arguments[arg_key] = json.dumps(parsed_value, ensure_ascii=False)
                 else:
                     arguments[arg_key] = str(parsed_value)
-            elif arg_type is None:
-                arguments[arg_key] = parsed_value if is_good_json else arg_value
             else:
+                # Covers arg_type is None and all other types
                 arguments[arg_key] = parsed_value if is_good_json else arg_value
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tensorrt_llm/serve/tool_parser/glm4_parser.py` around lines 477 - 480, The
two branches with identical bodies can be collapsed: remove the duplicate
elif/else and replace with a single assignment that sets arguments[arg_key] =
parsed_value if is_good_json else arg_value (using the existing
arg_type/arg_key/parsed_value/is_good_json/arg_value variables) so the logic is
not repeated in the function containing this code; ensure any needed handling
for arg_type being None is preserved by performing the assignment once in place
of both branches.

162-164: Consider narrowing exception handling.

Per coding guidelines, avoid broad exception handling. While catching all exceptions at parsing boundaries prevents crashes from malformed model output, consider catching more specific exceptions (e.g., re.error, json.JSONDecodeError, KeyError, AttributeError) to avoid masking unexpected bugs.

Suggested narrowing
-        except Exception as e:
+        except (re.error, json.JSONDecodeError, KeyError, AttributeError, IndexError) as e:
             logger.error(f"Error in detect_and_parse: {e}")
             return StreamingParseResult(normal_text=text)

The same applies to lines 440-441 and 455-457.

tensorrt_llm/serve/tool_parser/utils.py (1)

5-5: Consider using built-in generics for Python 3.10+.

Per codebase conventions (Python 3.10+), you can use dict and Optional from builtins instead of importing from typing. However, since Any still requires the typing import, this is a minor nit.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tensorrt_llm/serve/tool_parser/utils.py` at line 5, The import in utils.py
currently pulls Dict and Optional from typing; update it to use built-in
generics and PEP 604 union syntax: import only Any from typing (e.g., "from
typing import Any"), then change any type annotations that used Dict[...] to use
built-in dict[...] and change Optional[T] to T | None (or T | None with Any
where applicable) so you no longer import Dict/Optional from typing; keep Any
imported from typing as needed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tensorrt_llm/serve/tool_parser/glm4_parser.py`:
- Line 1: This file (glm4_parser.py) is missing the required NVIDIA copyright
header and Apache-2.0 license block; add the standard NVIDIA copyright header at
the top of the file (replacing or preceding the existing attribution comment),
include the correct year for the latest meaningful modification, and ensure the
full Apache License 2.0 notice and SPDX identifier are present so the file
header matches other TensorRT-LLM source files.

---

Nitpick comments:
In `@tensorrt_llm/serve/tool_parser/glm4_parser.py`:
- Around line 477-480: The two branches with identical bodies can be collapsed:
remove the duplicate elif/else and replace with a single assignment that sets
arguments[arg_key] = parsed_value if is_good_json else arg_value (using the
existing arg_type/arg_key/parsed_value/is_good_json/arg_value variables) so the
logic is not repeated in the function containing this code; ensure any needed
handling for arg_type being None is preserved by performing the assignment once
in place of both branches.

In `@tensorrt_llm/serve/tool_parser/utils.py`:
- Line 5: The import in utils.py currently pulls Dict and Optional from typing;
update it to use built-in generics and PEP 604 union syntax: import only Any
from typing (e.g., "from typing import Any"), then change any type annotations
that used Dict[...] to use built-in dict[...] and change Optional[T] to T | None
(or T | None with Any where applicable) so you no longer import Dict/Optional
from typing; keep Any imported from typing as needed.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: bf922efa-dc3e-4fbd-8698-4b98195daa1a

📥 Commits

Reviewing files that changed from the base of the PR and between 5b0c956 and 59b1f11.

📒 Files selected for processing (4)
  • tensorrt_llm/serve/tool_parser/glm4_parser.py
  • tensorrt_llm/serve/tool_parser/tool_parser_factory.py
  • tensorrt_llm/serve/tool_parser/utils.py
  • tests/unittest/llmapi/apps/test_tool_parsers.py

Comment thread tensorrt_llm/serve/tool_parser/glm4_parser.py
@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #38177 [ run ] triggered by Bot. Commit: 59b1f11 Link to invocation

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #38177 [ run ] completed with state SUCCESS. Commit: 59b1f11
/LLM/main/L0_MergeRequest_PR pipeline #29576 completed with status: 'FAILURE'

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@JunyiXu-nv
Copy link
Copy Markdown
Collaborator Author

/bot run

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #38228 [ run ] triggered by Bot. Commit: 59b1f11 Link to invocation

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #38228 [ run ] completed with state SUCCESS. Commit: 59b1f11
/LLM/main/L0_MergeRequest_PR pipeline #29616 completed with status: 'SUCCESS'
Pipeline passed with automatic retried tests. Check the rerun report for details.

Link to invocation

Copy link
Copy Markdown
Collaborator

@Superjomn Superjomn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@Superjomn Superjomn merged commit b411e14 into NVIDIA:main Mar 10, 2026
7 checks passed
bmarimuthu-nv pushed a commit to nv-auto-deploy/TensorRT-LLM that referenced this pull request Mar 12, 2026
tianyuz-nv pushed a commit to wanqian-nv/TensorRT-LLM that referenced this pull request Mar 19, 2026
limin2021 pushed a commit to limin2021/TensorRT-LLM that referenced this pull request Mar 19, 2026
longcheng-nv pushed a commit to longcheng-nv/TensorRT-LLM that referenced this pull request Mar 31, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants