Skip to content

[None][fix] LlavaNext dtype fallback when text_config.torch_dtype is None#12169

Merged
chang-l merged 4 commits into
NVIDIA:mainfrom
indrajit96:fix/llava-next-dtype-fallback
Mar 18, 2026
Merged

[None][fix] LlavaNext dtype fallback when text_config.torch_dtype is None#12169
chang-l merged 4 commits into
NVIDIA:mainfrom
indrajit96:fix/llava-next-dtype-fallback

Conversation

@indrajit96
Copy link
Copy Markdown
Collaborator

@indrajit96 indrajit96 commented Mar 12, 2026

Summary by CodeRabbit

Release Notes

  • Bug Fixes
    • Enhanced model initialization with improved handling of data type configurations. The system now implements fallback mechanisms to ensure consistent behavior across input processors, vision models, and language model components. This prevents initialization errors and configuration-related issues, making the model loading process more reliable and stable across different deployment environments and hardware setups.

[https://nvbugspro.nvidia.com/bug/5790851][fix] Standalone Encoder EPD broken with Llava model

Description

A recent HuggingFace commit (2424fdd) removed torch_dtype: "bfloat16" from the text_config section of llava-hf/llava-v1.6-mistral-7b-hf's config.json.
This causes text_config.torch_dtype to resolve to None, which propagates through model initialization and ultimately triggers a KeyError: None in the KV cache manager when it tries to convert None to a dtype string via torch_dtype_to_str().

Fix: Fall back to the top-level config.torch_dtype whenever text_config.torch_dtype is None in three places:- LlavaNextInputProcessor.__init__self._dtype- LlavaNextVisionModel.__init__self.dtype- LlavaNextModel.__init__ — propagate torch_dtype to the LLM sub-config before constructing the language model

This is a defensive fix: HF model configs don't always replicate torch_dtype into every sub-config, and TRT-LLM should gracefully inherit from the parent config.

Test Coverage

PR Checklist

Please review the following before submitting your PR:

  • PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.

  • PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.

  • Test cases are provided for new code paths (see test instructions)

  • Any new dependencies have been scanned for license and vulnerabilities

  • CODEOWNERS updated if ownership changes

  • Documentation updated as needed

  • Update tava architecture diagram if there is a significant design change in PR.

  • The reviewers assigned automatically/manually are appropriate for the PR.

  • Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

@indrajit96 indrajit96 requested a review from a team as a code owner March 12, 2026 20:52
@indrajit96 indrajit96 requested a review from Wanli-Jiang March 12, 2026 20:52
@indrajit96
Copy link
Copy Markdown
Collaborator Author

CC @chang-l @2ez4bz for review

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Mar 12, 2026

📝 Walkthrough

Walkthrough

The changes add fallback dtype initialization logic to three components in the LlaVA Next model: the input processor, vision model, and LLM model initialization. Each now uses a secondary dtype source (config.torch_dtype) when the primary source is unavailable.

Changes

Cohort / File(s) Summary
LlaVA Next Model Dtype Initialization
tensorrt_llm/_torch/models/modeling_llava_next.py
Added dtype fallback logic to input processor, vision model, and LLM model initialization. Each component now checks a secondary dtype source when the primary text_config.torch_dtype is unavailable.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description check ✅ Passed The PR description is well-structured with all required sections completed, including title format, detailed explanation of the issue and fix, and checklist verification.
Title check ✅ Passed The title clearly and concisely summarizes the main change: adding dtype fallback logic for LlavaNext when text_config.torch_dtype is None, which matches the core fix described in the changeset.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
📝 Coding Plan
  • Generate coding plan for human review comments

Comment @coderabbitai help to get the list of available commands and usage tips.

Tip

You can disable sequence diagrams in the walkthrough.

Disable the reviews.sequence_diagrams setting to disable sequence diagrams in the walkthrough.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
tensorrt_llm/_torch/models/modeling_llava_next.py (1)

63-64: Fallback logic is correct, but consider a guard for the double-None edge case.

The or fallback handles the primary scenario (HF omitting torch_dtype from text_config). However, if both config.text_config.torch_dtype and config.torch_dtype are None, self._dtype will remain None and could still trigger downstream errors.

Consider adding a defensive check or raising an informative error if the final dtype is None:

💡 Optional: Add explicit None guard
         self._dtype = (self.config.text_config.torch_dtype
                        or self.config.torch_dtype)
+        if self._dtype is None:
+            raise ValueError(
+                "Could not determine model dtype: both text_config.torch_dtype "
+                "and config.torch_dtype are None"
+            )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tensorrt_llm/_torch/models/modeling_llava_next.py` around lines 63 - 64, The
assignment to self._dtype uses a fallback (self.config.text_config.torch_dtype
or self.config.torch_dtype) but can still become None; update the initialization
in the class (where self._dtype is set) to guard this double-None case by either
raising a clear ValueError indicating both config.text_config.torch_dtype and
config.torch_dtype are unset, or by assigning a safe default (e.g.,
torch.float32); reference the symbols self._dtype,
config.text_config.torch_dtype, and config.torch_dtype and apply the check
immediately after the current fallback assignment.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tensorrt_llm/_torch/models/modeling_llava_next.py`:
- Around line 536-542: The code assigns llm_model_config.pretrained_config =
model_config.pretrained_config.text_config which creates a shared reference and
causes the later mutation (setting torch_dtype) to modify the original
model_config; to fix, assign a deep copy of the text sub-config instead (e.g.,
use copy.deepcopy on model_config.pretrained_config.text_config) before setting
torch_dtype so changes only affect llm_model_config.pretrained_config and do not
mutate model_config used elsewhere (refer to llm_model_config, model_config,
pretrained_config, text_config).

---

Nitpick comments:
In `@tensorrt_llm/_torch/models/modeling_llava_next.py`:
- Around line 63-64: The assignment to self._dtype uses a fallback
(self.config.text_config.torch_dtype or self.config.torch_dtype) but can still
become None; update the initialization in the class (where self._dtype is set)
to guard this double-None case by either raising a clear ValueError indicating
both config.text_config.torch_dtype and config.torch_dtype are unset, or by
assigning a safe default (e.g., torch.float32); reference the symbols
self._dtype, config.text_config.torch_dtype, and config.torch_dtype and apply
the check immediately after the current fallback assignment.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 6b870d6e-f75d-462d-8f1d-dd456f211c06

📥 Commits

Reviewing files that changed from the base of the PR and between 5cc0ccd and 268d64b.

📒 Files selected for processing (1)
  • tensorrt_llm/_torch/models/modeling_llava_next.py

Comment thread tensorrt_llm/_torch/models/modeling_llava_next.py
@nv-yna nv-yna requested a review from chang-l March 12, 2026 21:09
@svc-trtllm-gh-bot svc-trtllm-gh-bot added the Community want to contribute PRs initiated from Community label Mar 12, 2026
@pengbowang-nv pengbowang-nv removed the Community want to contribute PRs initiated from Community label Mar 13, 2026
@svc-trtllm-gh-bot svc-trtllm-gh-bot added the Community want to contribute PRs initiated from Community label Mar 13, 2026
@indrajit96
Copy link
Copy Markdown
Collaborator Author

/bot run

@indrajit96
Copy link
Copy Markdown
Collaborator Author

@chang-l @pengbowang-nv any update on this

@chang-l
Copy link
Copy Markdown
Collaborator

chang-l commented Mar 17, 2026

/bot run --disable-fail-fast

@chang-l chang-l changed the title fix: LlavaNext dtype fallback when text_config.torch_dtype is None [None][fix] LlavaNext dtype fallback when text_config.torch_dtype is None Mar 17, 2026
@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #39302 [ run ] triggered by Bot. Commit: 268d64b Link to invocation

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #39302 [ run ] completed with state SUCCESS. Commit: 268d64b
/LLM/main/L0_MergeRequest_PR pipeline #30552 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@indrajit96
Copy link
Copy Markdown
Collaborator Author

/bot run --disable-fail-fast

@indrajit96
Copy link
Copy Markdown
Collaborator Author

/bot run --disable-fail-fast

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #39489 [ run ] triggered by Bot. Commit: 70ba420 Link to invocation

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #39489 [ run ] completed with state SUCCESS. Commit: 70ba420
/LLM/main/L0_MergeRequest_PR pipeline #30715 completed with status: 'SUCCESS'

CI Report

Link to invocation

@indrajit96
Copy link
Copy Markdown
Collaborator Author

@chang-l @Wanli-Jiang Can we merge this CI is green?

@chang-l chang-l merged commit 58ec688 into NVIDIA:main Mar 18, 2026
5 checks passed
@chang-l chang-l removed the Community want to contribute PRs initiated from Community label Mar 18, 2026
limin2021 pushed a commit to limin2021/TensorRT-LLM that referenced this pull request Mar 19, 2026
longcheng-nv pushed a commit to longcheng-nv/TensorRT-LLM that referenced this pull request Mar 31, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants