fix: handle mm_token_type_ids in collator and packing tests by markmochi200 · Pull Request #10397 · hiyouga/LlamaFactory

markmochi200 · 2026-04-16T04:24:59Z

What does this PR do?

Fixes an issue where mm_token_type_ids is missing during training or RoPE computation with newer multimodal models (e.g., Gemma 4, Qwen2VL), causing runtime errors such as:

mm_token_type_ids is required as a model input when training

Changes

- preserve and correctly pad mm_token_type_ids from processor outputs in the collator
- synthesize zero mm_token_type_ids for Gemma 4 text-only batches when missing
- propagate mm_token_type_ids through packed RoPE position-id computation
- fix packed per-sample slicing for RoPE computation
- update the packing test helper to include mm_token_type_ids when calling get_rope_index() directly
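To illustrate the padding change listed above, here is a minimal sketch. It is not the collator code from this PR: plain Python lists stand in for tensors, the helper name `pad_mm_token_type_ids` and its parameters are hypothetical, and padding with 0 (treating padded positions as text) is an assumption.

```python
def pad_mm_token_type_ids(type_ids_batch, padding_side="right", pad_value=0):
    """Pad each sample's type ids to the longest sample in the batch.

    Hypothetical helper; pad_value=0 assumes padded positions count as text.
    """
    if padding_side not in ("left", "right"):
        raise ValueError("padding_side must be 'left' or 'right'")
    if not type_ids_batch:
        return []
    max_len = max(len(type_ids) for type_ids in type_ids_batch)
    padded = []
    for type_ids in type_ids_batch:
        pad = [pad_value] * (max_len - len(type_ids))
        # Keep the padding on the same side as input_ids so positions align.
        if padding_side == "left":
            padded.append(pad + list(type_ids))
        else:
            padded.append(list(type_ids) + pad)
    return padded
```

The point of the sketch is that the type ids must be padded on the same side and to the same length as input_ids, otherwise the two fields no longer line up token by token.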

Before submitting

Did you read the [contributor guideline](https://github.com/hiyouga/LLaMA-Factory/blob/main/.github/CONTRIBUTING.md)?
Did you write any new necessary tests?

gemini-code-assist

Code Review

This pull request improves the handling of multi-modal token type IDs in the data collator, specifically for models like Qwen2VL and Gemma4. Key changes include robust extraction of mm_token_type_ids, updated slicing logic for packed sequences, and improved padding alignment. The test suite was also updated to reflect these changes. Review feedback recommends using dictionary comprehensions for feature slicing to enhance code maintainability and conciseness.
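The dictionary-comprehension suggestion for feature slicing could look roughly like the sketch below. It assumes one packed row in which every feature is a flat sequence and sample boundaries are given as (start, end) pairs; the function name `slice_packed_features` and the boundary format are illustrative and not taken from the PR.

```python
def slice_packed_features(features, boundaries, keys=("input_ids", "mm_token_type_ids")):
    """Split one packed row into per-sample feature dicts.

    `boundaries` is an assumed list of (start, end) index pairs, one per
    packed sample. Keys absent from `features` are skipped.
    """
    return [
        {key: features[key][start:end] for key in keys if key in features}
        for start, end in boundaries
    ]
```

Slicing every per-token field with the same boundaries keeps input_ids and mm_token_type_ids aligned for each sample.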

markmochi200 · 2026-05-06T10:21:08Z

@Kuangdd01 Please address this PR asap. Thanks

Kuangdd01 · 2026-05-06T15:58:14Z

Thanks! Please resolve these conflicts.

Kuangdd01 · 2026-05-07T17:35:28Z

-        if "mm_token_type_ids" in inspect.signature(self.get_rope_func).parameters:
-            image_token_id = getattr(self.model.config, "image_token_id", None)
-            video_token_id = getattr(self.model.config, "video_token_id", None)
-            if image_token_id is not None or video_token_id is not None:
-                mm_token_type_ids = torch.zeros_like(features["input_ids"])
-                if image_token_id is not None:
-                    mm_token_type_ids[features["input_ids"] == image_token_id] = 1
-                if video_token_id is not None:
-                    mm_token_type_ids[features["input_ids"] == video_token_id] = 2
-                rope_index_kwargs["mm_token_type_ids"] = mm_token_type_ids


Why was this if-condition block removed?
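For readers following the thread, the logic of the removed block can be restated as a standalone sketch. Plain lists stand in for tensors, and the function names `build_mm_token_type_ids` and `rope_index_kwargs_for` are hypothetical; only the signature check and the 0/1/2 encoding come from the diff above.

```python
import inspect


def build_mm_token_type_ids(input_ids, image_token_id=None, video_token_id=None):
    """Return 0 for text tokens, 1 for image tokens, 2 for video tokens."""

    def token_type(token):
        # Video is checked first so it wins if both ids coincide, matching
        # the assignment order in the removed block.
        if video_token_id is not None and token == video_token_id:
            return 2
        if image_token_id is not None and token == image_token_id:
            return 1
        return 0

    return [[token_type(token) for token in row] for row in input_ids]


def rope_index_kwargs_for(get_rope_func, input_ids, image_token_id=None, video_token_id=None):
    """Build the extra kwarg only when the RoPE function accepts it."""
    kwargs = {}
    accepts_type_ids = "mm_token_type_ids" in inspect.signature(get_rope_func).parameters
    has_mm_token = image_token_id is not None or video_token_id is not None
    if accepts_type_ids and has_mm_token:
        kwargs["mm_token_type_ids"] = build_mm_token_type_ids(
            input_ids, image_token_id, video_token_id
        )
    return kwargs
```

Checking the signature lets the same call site work with RoPE functions that do and do not take mm_token_type_ids.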

Kuangdd01 · 2026-05-07T17:37:44Z

+        elif model_type == "gemma4":
+            # Gemma 4 text-only batches still require the field.
+            features["mm_token_type_ids"] = torch.zeros_like(features["input_ids"])
+
+        # Keep token_type_ids present as well for Gemma 4 text-only robustness.
+        if model_type == "gemma4" and "token_type_ids" not in features:
+            features["token_type_ids"] = torch.zeros_like(features["input_ids"])


I'm confused about these changes.

markmochi200 · 2026-05-08

Train gemma4-31b on a text-only JSONL file at the pt stage and you will reproduce the issue.

Kuangdd01 · 2026-05-08

Oh, I just remembered: might huggingface/transformers#45454 already fix this issue in the model's forward pass?
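The Gemma 4 fallback discussed in this thread can be sketched as follows. This is an illustration under assumptions, not the PR's code: lists stand in for tensors, the helper name `ensure_gemma4_type_ids` is hypothetical, and both fields are filled only when missing (the diff guards token_type_ids this way; the condition preceding the mm_token_type_ids branch is not shown in the excerpt).

```python
def ensure_gemma4_type_ids(features, model_type):
    """Add zero-filled type-id fields to a text-only Gemma 4 batch."""
    if model_type != "gemma4":
        return features
    filled = dict(features)  # shallow copy so the caller's dict is untouched
    for key in ("mm_token_type_ids", "token_type_ids"):
        if key not in filled:
            # All zeros marks every token as text.
            filled[key] = [[0] * len(row) for row in filled["input_ids"]]
    return filled
```

Filling only missing keys means type ids already produced by the processor for multimodal batches are left alone.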

fix: handle mm_token_type_ids in collator and packing tests

0bd00da

gemini-code-assist Bot reviewed Apr 16, 2026

Comment thread src/llamafactory/data/collator.py

Comment thread src/llamafactory/data/collator.py

markmochi200 temporarily deployed to docker April 27, 2026 02:32 — with GitHub Actions Inactive

Kuangdd01 added the pending label (This problem is yet to be addressed) Apr 27, 2026

Merge branch 'main' into dev_fix_gemma4_mm_token_type_ids

eb43147

hiyouga requested a review from Kuangdd01 May 6, 2026 16:35

Merge branch 'main' into dev_fix_gemma4_mm_token_type_ids

c7bf0c7

markmochi200 had a problem deploying to docker May 7, 2026 17:32 — with GitHub Actions Failure

markmochi200 temporarily deployed to docker May 7, 2026 17:32 — with GitHub Actions Inactive

Kuangdd01 requested changes May 7, 2026

markmochi200 wants to merge 3 commits into hiyouga:main from markmochi200:dev_fix_gemma4_mm_token_type_ids
