fix: replace mutable default input_kwargs={} with None in HuggingfaceEngine #10477
kuishou68 wants to merge 1 commit
…ceEngine (Closes hiyouga#10476) Co-authored-by: lingxiu58 <86288566+lingxiu58@users.noreply.github.com> Signed-off-by: Cocoon-Break <54054995+kuishou68@users.noreply.github.com>
Code Review
This pull request replaces mutable default arguments for input_kwargs with None and adds support for the minicpmv4_6 model type. The review feedback highlights that input_kwargs should be copied to prevent side effects from .pop() calls mutating the caller's dictionary. Additionally, the reviewer suggests making downsample_mode configurable per request via input_kwargs rather than relying on a global environment variable.
    if getattr(model.config, "model_type", None) == "minicpmv4_6":
        gen_kwargs["downsample_mode"] = os.getenv("DOWNSAMPLE_MODE", "16x")
There are two issues with this new block:
- Using os.getenv for a generation parameter like downsample_mode makes it globally fixed for the process, preventing per-request configuration. It would be better to allow this to be passed via input_kwargs with the environment variable as a fallback.
- If minicpmv4_6 is a multimodal model architecture similar to minicpmv, it likely requires the same input_ids and tokenizer keys in gen_kwargs that are set in the preceding block (lines 202-208). By keeping this check separate, those necessary configurations are skipped, which may lead to runtime errors or incorrect model behavior.
Suggested change:
-    if getattr(model.config, "model_type", None) == "minicpmv4_6":
-        gen_kwargs["downsample_mode"] = os.getenv("DOWNSAMPLE_MODE", "16x")
+    if getattr(model.config, "model_type", None) == "minicpmv4_6":
+        gen_kwargs["downsample_mode"] = input_kwargs.pop("downsample_mode", os.getenv("DOWNSAMPLE_MODE", "16x"))
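The lookup order proposed in this suggestion (per-request value first, then the environment variable, then "16x") can be sketched in isolation. This is a minimal illustration, not the engine's code: the helper name resolve_downsample_mode is hypothetical, and the values "4x" and "1x" in the usage note are placeholders that may not be valid modes for the model.

```python
import os
from typing import Any, Optional


def resolve_downsample_mode(input_kwargs: Optional[dict[str, Any]] = None) -> str:
    """Hypothetical helper: pick downsample_mode for a single request.

    Precedence: value in input_kwargs, then the DOWNSAMPLE_MODE
    environment variable, then the hard-coded default "16x".
    """
    # Work on a copy so the caller's dictionary is left untouched.
    kwargs = dict(input_kwargs) if input_kwargs is not None else {}
    return kwargs.pop("downsample_mode", os.getenv("DOWNSAMPLE_MODE", "16x"))
```

With DOWNSAMPLE_MODE unset, resolve_downsample_mode() falls back to "16x"; with the variable set to "4x" it returns "4x"; and resolve_downsample_mode({"downsample_mode": "1x"}) returns the per-request value regardless of the environment.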
    if input_kwargs is None:
        input_kwargs = {}
While this change correctly addresses the shared mutable default argument issue, it still allows the input_kwargs dictionary to be mutated if one is provided by the caller. Since this method calls .pop() on the dictionary later (e.g., lines 123-133), it will have the side effect of draining the caller's dictionary. It is safer to create a local copy to avoid these side effects.
Suggested change:
    input_kwargs = input_kwargs.copy() if input_kwargs is not None else {}
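The side effect described in this comment is easy to reproduce outside the engine. The sketch below uses two hypothetical functions, process_draining and process_copying, and an assumed "temperature" key; neither function is taken from HuggingfaceEngine.

```python
from typing import Any, Optional


def process_draining(input_kwargs: Optional[dict[str, Any]] = None) -> Any:
    """None default only: safe for omitted arguments, but pops from the caller's dict."""
    if input_kwargs is None:
        input_kwargs = {}
    return input_kwargs.pop("temperature", None)


def process_copying(input_kwargs: Optional[dict[str, Any]] = None) -> Any:
    """None default plus a shallow copy: the caller's dict is never modified."""
    input_kwargs = input_kwargs.copy() if input_kwargs is not None else {}
    return input_kwargs.pop("temperature", None)


caller_kwargs = {"temperature": 0.7}
process_copying(caller_kwargs)   # caller_kwargs still holds "temperature"
process_draining(caller_kwargs)  # caller_kwargs is now empty
```

A shallow copy is enough here because only top-level keys are popped; nested mutable values would still be shared with the caller.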
Closes #10476
Problem
Four static methods in HuggingfaceEngine use input_kwargs: Optional[dict[str, Any]] = {} as a default parameter. This is a Python anti-pattern: the same dict object is shared across all invocations that omit the argument. Since each method calls .pop() on it, the shared dict is drained after the first call; subsequent callers relying on the default receive an already-mutated (empty) dict.

Affected methods:
- _process_args
- _chat
- _stream_chat
- _get_scores

Fix
Change the default to None and initialise input_kwargs = {} at the top of each function body, following the standard Python idiom.
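
The sharing behaviour behind this idiom can be shown with a small, self-contained sketch. The functions record_shared and record_fresh are hypothetical and unrelated to HuggingfaceEngine; the example writes to the default dict to make the sharing visible, since a default that starts empty only changes once something stores into it.

```python
from typing import Optional


def record_shared(key: str, store: dict = {}) -> dict:
    """Anti-pattern: the default dict is created once, when the function is defined."""
    store[key] = True
    return store


def record_fresh(key: str, store: Optional[dict] = None) -> dict:
    """Idiom: default to None and build a new dict on every call."""
    if store is None:
        store = {}
    store[key] = True
    return store


shared_first = record_shared("a")
shared_second = record_shared("b")  # same object as shared_first, now holds "a" and "b"

fresh_first = record_fresh("a")
fresh_second = record_fresh("b")    # independent dict holding only "b"
```

Because default values are evaluated once at definition time, every call to record_shared that omits store operates on the same dictionary, while record_fresh starts from an empty one each time.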