[None][fix] LlavaNext dtype fallback when text_config.torch_dtype is None#12169
Conversation
Signed-off-by: Indrajit Bhosale <[email protected]>
📝 WalkthroughWalkthroughThe changes add fallback dtype initialization logic to three components in the LlaVA Next model: the input processor, vision model, and LLM model initialization. Each now uses a secondary dtype source (config.torch_dtype) when the primary source is unavailable. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~8 minutes 🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Comment Tip You can disable sequence diagrams in the walkthrough.Disable the |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
tensorrt_llm/_torch/models/modeling_llava_next.py (1)
63-64: Fallback logic is correct, but consider a guard for the double-None edge case.The
orfallback handles the primary scenario (HF omittingtorch_dtypefromtext_config). However, if bothconfig.text_config.torch_dtypeandconfig.torch_dtypeareNone,self._dtypewill remainNoneand could still trigger downstream errors.Consider adding a defensive check or raising an informative error if the final dtype is
None:💡 Optional: Add explicit None guard
self._dtype = (self.config.text_config.torch_dtype or self.config.torch_dtype) + if self._dtype is None: + raise ValueError( + "Could not determine model dtype: both text_config.torch_dtype " + "and config.torch_dtype are None" + )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tensorrt_llm/_torch/models/modeling_llava_next.py` around lines 63 - 64, The assignment to self._dtype uses a fallback (self.config.text_config.torch_dtype or self.config.torch_dtype) but can still become None; update the initialization in the class (where self._dtype is set) to guard this double-None case by either raising a clear ValueError indicating both config.text_config.torch_dtype and config.torch_dtype are unset, or by assigning a safe default (e.g., torch.float32); reference the symbols self._dtype, config.text_config.torch_dtype, and config.torch_dtype and apply the check immediately after the current fallback assignment.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tensorrt_llm/_torch/models/modeling_llava_next.py`:
- Around line 536-542: The code assigns llm_model_config.pretrained_config =
model_config.pretrained_config.text_config which creates a shared reference and
causes the later mutation (setting torch_dtype) to modify the original
model_config; to fix, assign a deep copy of the text sub-config instead (e.g.,
use copy.deepcopy on model_config.pretrained_config.text_config) before setting
torch_dtype so changes only affect llm_model_config.pretrained_config and do not
mutate model_config used elsewhere (refer to llm_model_config, model_config,
pretrained_config, text_config).
---
Nitpick comments:
In `@tensorrt_llm/_torch/models/modeling_llava_next.py`:
- Around line 63-64: The assignment to self._dtype uses a fallback
(self.config.text_config.torch_dtype or self.config.torch_dtype) but can still
become None; update the initialization in the class (where self._dtype is set)
to guard this double-None case by either raising a clear ValueError indicating
both config.text_config.torch_dtype and config.torch_dtype are unset, or by
assigning a safe default (e.g., torch.float32); reference the symbols
self._dtype, config.text_config.torch_dtype, and config.torch_dtype and apply
the check immediately after the current fallback assignment.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 6b870d6e-f75d-462d-8f1d-dd456f211c06
📒 Files selected for processing (1)
tensorrt_llm/_torch/models/modeling_llava_next.py
|
/bot run |
|
@chang-l @pengbowang-nv any update on this |
|
/bot run --disable-fail-fast |
|
PR_Github #39302 [ run ] triggered by Bot. Commit: |
Signed-off-by: Indrajit Bhosale <[email protected]>
|
PR_Github #39302 [ run ] completed with state
|
|
/bot run --disable-fail-fast |
|
/bot run --disable-fail-fast |
|
PR_Github #39489 [ run ] triggered by Bot. Commit: |
|
PR_Github #39489 [ run ] completed with state |
|
@chang-l @Wanli-Jiang Can we merge this CI is green? |
…None (NVIDIA#12169) Signed-off-by: Indrajit Bhosale <[email protected]>
…None (NVIDIA#12169) Signed-off-by: Indrajit Bhosale <[email protected]>
Summary by CodeRabbit
Release Notes
[https://nvbugspro.nvidia.com/bug/5790851][fix] Standalone Encoder EPD broken with Llava model
Description
A recent HuggingFace commit (2424fdd) removed
torch_dtype: "bfloat16"from thetext_configsection ofllava-hf/llava-v1.6-mistral-7b-hf'sconfig.json.This causes
text_config.torch_dtypeto resolve toNone, which propagates through model initialization and ultimately triggers aKeyError: Nonein the KV cache manager when it tries to convertNoneto a dtype string viatorch_dtype_to_str().Fix: Fall back to the top-level
config.torch_dtypewhenevertext_config.torch_dtypeisNonein three places:-LlavaNextInputProcessor.__init__—self._dtype-LlavaNextVisionModel.__init__—self.dtype-LlavaNextModel.__init__— propagatetorch_dtypeto the LLM sub-config before constructing the language modelThis is a defensive fix: HF model configs don't always replicate
torch_dtypeinto every sub-config, and TRT-LLM should gracefully inherit from the parent config.Test Coverage
PR Checklist
Please review the following before submitting your PR:
PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
Test cases are provided for new code paths (see test instructions)
Any new dependencies have been scanned for license and vulnerabilities
CODEOWNERS updated if ownership changes
Documentation updated as needed
Update tava architecture diagram if there is a significant design change in PR.
The reviewers assigned automatically/manually are appropriate for the PR.
Please check this after reviewing the above items as appropriate for this PR.
GitHub Bot Help
To see a list of available CI bot commands, please comment
/bot help.