[TRTLLM-11246][feat] Add tool parser support for GLM-4 models#11986
Conversation
Signed-off-by: Junyi Xu <[email protected]>
|
/bot run |
|
PR_Github #38034 [ run ] triggered by Bot. Commit: |
|
PR_Github #38034 [ run ] completed with state
|
Signed-off-by: Junyi Xu <[email protected]>
|
/bot run |
📝 WalkthroughWalkthroughIntroduces a new GLM-4 tool parser module implementing streaming XML-to-JSON parsing for GLM-4 tool calls. The implementation includes a state machine for incremental parsing, type inference from tool definitions, and argument parsing with multiple fallback strategies integrated into the existing parser factory. Changes
Sequence DiagramsequenceDiagram
participant Stream as Text Stream
participant Parser as Glm4ToolParser
participant State as StreamState
participant Tools as Tool Definitions
participant Output as Parsed Result
Stream->>Parser: parse_streaming_increment(new_text)
loop For Each Character
Parser->>State: Update state (INIT → BETWEEN → IN_KEY → WAITING_VALUE → IN_VALUE)
Parser->>Parser: Accumulate XML fragments
end
alt Tool Call Complete
Parser->>Parser: Extract func_name & arguments
Parser->>Tools: get_argument_type(func_name, arg_key)
Tools-->>Parser: Expected type (string/number/integer/boolean)
Parser->>Parser: parse_arguments(value, inferred_type) with fallbacks
Parser->>Output: Emit ToolCallItem with parsed args
else Partial Data
Parser->>Output: Return intermediate state
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes 🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (3)
tensorrt_llm/serve/tool_parser/glm4_parser.py (2)
477-480: Minor: Duplicate logic branches.Lines 477-478 and 479-480 have identical logic. Consider consolidating:
Suggested simplification
elif arg_type == "string": if isinstance(parsed_value, str): arguments[arg_key] = parsed_value elif isinstance(parsed_value, (dict, list)): arguments[arg_key] = json.dumps(parsed_value, ensure_ascii=False) else: arguments[arg_key] = str(parsed_value) - elif arg_type is None: - arguments[arg_key] = parsed_value if is_good_json else arg_value else: + # Covers arg_type is None and all other types arguments[arg_key] = parsed_value if is_good_json else arg_value🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tensorrt_llm/serve/tool_parser/glm4_parser.py` around lines 477 - 480, The two branches with identical bodies can be collapsed: remove the duplicate elif/else and replace with a single assignment that sets arguments[arg_key] = parsed_value if is_good_json else arg_value (using the existing arg_type/arg_key/parsed_value/is_good_json/arg_value variables) so the logic is not repeated in the function containing this code; ensure any needed handling for arg_type being None is preserved by performing the assignment once in place of both branches.
162-164: Consider narrowing exception handling.Per coding guidelines, avoid broad exception handling. While catching all exceptions at parsing boundaries prevents crashes from malformed model output, consider catching more specific exceptions (e.g.,
re.error,json.JSONDecodeError,KeyError,AttributeError) to avoid masking unexpected bugs.Suggested narrowing
- except Exception as e: + except (re.error, json.JSONDecodeError, KeyError, AttributeError, IndexError) as e: logger.error(f"Error in detect_and_parse: {e}") return StreamingParseResult(normal_text=text)The same applies to lines 440-441 and 455-457.
tensorrt_llm/serve/tool_parser/utils.py (1)
5-5: Consider using built-in generics for Python 3.10+.Per codebase conventions (Python 3.10+), you can use
dictandOptionalfrom builtins instead of importing fromtyping. However, sinceAnystill requires thetypingimport, this is a minor nit.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tensorrt_llm/serve/tool_parser/utils.py` at line 5, The import in utils.py currently pulls Dict and Optional from typing; update it to use built-in generics and PEP 604 union syntax: import only Any from typing (e.g., "from typing import Any"), then change any type annotations that used Dict[...] to use built-in dict[...] and change Optional[T] to T | None (or T | None with Any where applicable) so you no longer import Dict/Optional from typing; keep Any imported from typing as needed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tensorrt_llm/serve/tool_parser/glm4_parser.py`:
- Line 1: This file (glm4_parser.py) is missing the required NVIDIA copyright
header and Apache-2.0 license block; add the standard NVIDIA copyright header at
the top of the file (replacing or preceding the existing attribution comment),
include the correct year for the latest meaningful modification, and ensure the
full Apache License 2.0 notice and SPDX identifier are present so the file
header matches other TensorRT-LLM source files.
---
Nitpick comments:
In `@tensorrt_llm/serve/tool_parser/glm4_parser.py`:
- Around line 477-480: The two branches with identical bodies can be collapsed:
remove the duplicate elif/else and replace with a single assignment that sets
arguments[arg_key] = parsed_value if is_good_json else arg_value (using the
existing arg_type/arg_key/parsed_value/is_good_json/arg_value variables) so the
logic is not repeated in the function containing this code; ensure any needed
handling for arg_type being None is preserved by performing the assignment once
in place of both branches.
In `@tensorrt_llm/serve/tool_parser/utils.py`:
- Line 5: The import in utils.py currently pulls Dict and Optional from typing;
update it to use built-in generics and PEP 604 union syntax: import only Any
from typing (e.g., "from typing import Any"), then change any type annotations
that used Dict[...] to use built-in dict[...] and change Optional[T] to T | None
(or T | None with Any where applicable) so you no longer import Dict/Optional
from typing; keep Any imported from typing as needed.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: bf922efa-dc3e-4fbd-8698-4b98195daa1a
📒 Files selected for processing (4)
tensorrt_llm/serve/tool_parser/glm4_parser.pytensorrt_llm/serve/tool_parser/tool_parser_factory.pytensorrt_llm/serve/tool_parser/utils.pytests/unittest/llmapi/apps/test_tool_parsers.py
|
PR_Github #38177 [ run ] triggered by Bot. Commit: |
|
PR_Github #38177 [ run ] completed with state
|
|
/bot run |
|
PR_Github #38228 [ run ] triggered by Bot. Commit: |
|
PR_Github #38228 [ run ] completed with state |
…#11986) Signed-off-by: Junyi Xu <[email protected]>
…#11986) Signed-off-by: Junyi Xu <[email protected]>
…#11986) Signed-off-by: Junyi Xu <[email protected]>
…#11986) Signed-off-by: Junyi Xu <[email protected]>
Summary by CodeRabbit
New Features
Tests
Description
Test Coverage
PR Checklist
Please review the following before submitting your PR:
PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
Test cases are provided for new code paths (see test instructions)
Any new dependencies have been scanned for license and vulnerabilities
CODEOWNERS updated if ownership changes
Documentation updated as needed
Update tava architecture diagram if there is a significant design change in PR.
The reviewers assigned automatically/manually are appropriate for the PR.
Please check this after reviewing the above items as appropriate for this PR.
GitHub Bot Help
To see a list of available CI bot commands, please comment
/bot help.