Support int8-quanto quantized fine-tuning for Qwen-Image-Edit-2509 (fine-tunes on 48 GB VRAM 🎉); add a result_image_field_name parameter to the training script. #1101
Conversation
Summary of Changes

Hello @Deng-Xian-Sheng, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! The core goal of this pull request is to optimize the fine-tuning process for the Qwen-Image-Edit model so that it demands less from the hardware. By introducing int8-quanto quantization and providing flexible configuration options, users can now fine-tune the model on a GPU with 48 GB of VRAM, greatly widening the range of usable hardware. The training script has also been improved with support for naming the dataset's result-image field, improving usability and flexibility.

Highlights
Code Review
This PR adds int8-quanto quantized fine-tuning support for the Qwen-Image-Edit model. It is a very practical feature that significantly reduces VRAM usage and lets more users fine-tune on consumer hardware. In the implementation, the quantization logic is nicely modularized into the diffsynth/utils/quantisation package, and the peft/quanto compatibility issue is resolved with a monkey-patch; the approach is clear. The new result_image_field_name parameter also improves the script's flexibility and readability.

The overall code quality is good. My suggestions mostly target maintainability and robustness, such as simplifying redundant code and reducing coupling between modules; I hope they help you polish the code further. Thanks for your contribution!
```python
models = [
    (
        model,
        {
            "quant_fn": get_quant_fn(args.base_model_precision),
            "model_precision": args.base_model_precision,
            "quantize_activations": getattr(args, "quantize_activations", False),
        },
    ),
    (
        controlnet,
        {
            "quant_fn": get_quant_fn(args.base_model_precision),
            "model_precision": args.base_model_precision,
            "quantize_activations": getattr(args, "quantize_activations", False),
        },
    ),
    (
        te1,
        {
            "quant_fn": get_quant_fn(args.text_encoder_1_precision),
            "model_precision": args.text_encoder_1_precision,
            "base_model_precision": args.base_model_precision,
        },
    ),
    (
        te2,
        {
            "quant_fn": get_quant_fn(args.text_encoder_2_precision),
            "model_precision": args.text_encoder_2_precision,
            "base_model_precision": args.base_model_precision,
        },
    ),
    (
        te3,
        {
            "quant_fn": get_quant_fn(args.text_encoder_3_precision),
            "model_precision": args.text_encoder_3_precision,
            "base_model_precision": args.base_model_precision,
        },
    ),
    (
        te4,
        {
            "quant_fn": get_quant_fn(args.text_encoder_4_precision),
            "model_precision": args.text_encoder_4_precision,
            "base_model_precision": args.base_model_precision,
        },
    ),
    (
        ema,
        {
            "quant_fn": get_quant_fn(args.base_model_precision),
            "model_precision": args.base_model_precision,
            "quantize_activations": getattr(args, "quantize_activations", False),
        },
    ),
]
```
The way the models list is built contains a lot of duplicated code, which makes it hard to read and maintain. It also relies on the caller (train.py) to supply an args object carrying every text_encoder_*_precision attribute, creating unnecessary tight coupling (see the fake_args construct in train.py).

Consider refactoring this into a configuration list plus a loop, and using getattr with a default value to make the function more robust, decoupling this utility function from the specific training script.
```python
model_definitions = [
    {"model": model, "precision_key": "base_model_precision", "quantize_activations": True},
    {"model": controlnet, "precision_key": "base_model_precision", "quantize_activations": True},
    {"model": te1, "precision_key": "text_encoder_1_precision"},
    {"model": te2, "precision_key": "text_encoder_2_precision"},
    {"model": te3, "precision_key": "text_encoder_3_precision"},
    {"model": te4, "precision_key": "text_encoder_4_precision"},
    {"model": ema, "precision_key": "base_model_precision", "quantize_activations": True},
]
models = []
for definition in model_definitions:
    m = definition["model"]
    precision_key = definition["precision_key"]
    # Use getattr with a default to make this function more robust
    model_precision = getattr(args, precision_key, "no_change")
    qargs = {
        "quant_fn": get_quant_fn(model_precision),
        "model_precision": model_precision,
        "base_model_precision": args.base_model_precision,
    }
    if definition.get("quantize_activations"):
        qargs["quantize_activations"] = getattr(args, "quantize_activations", False)
    models.append((m, qargs))
```

```python
fake_args = SimpleNamespace(
    base_model_precision=base_model_precision,
    text_encoder_1_precision=text_encoder_1_precision,
    text_encoder_2_precision="no_change",
    text_encoder_3_precision="no_change",
    text_encoder_4_precision="no_change",
    quantize_activations=quantize_activations,
)
```
```python
te1 = te2 = te3 = te4 = None
if text_encoders is not None:
    if len(text_encoders) > 0:
        te1 = text_encoders[0]
    if len(text_encoders) > 1:
        te2 = text_encoders[1]
    if len(text_encoders) > 2:
        te3 = text_encoders[2]
    if len(text_encoders) > 3:
        te4 = text_encoders[3]
```
```python
new_text_encoders = []
if te1 is not None:
    new_text_encoders.append(te1)
if te2 is not None:
    new_text_encoders.append(te2)
if te3 is not None:
    new_text_encoders.append(te3)
if te4 is not None:
    new_text_encoders.append(te4)
if len(new_text_encoders) == 0:
    new_text_encoders = None
```
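For reference, the unpack-and-rebuild pattern above can be collapsed into a padding step plus a comprehension. This is a hedged standalone sketch (the helper name is invented, not code from the PR):

```python
def split_and_rebuild(text_encoders):
    """Hypothetical compact equivalent of the te1..te4 unpack/rebuild above."""
    # Pad to four slots so te1..te4 always exist, even for short lists.
    padded = (list(text_encoders) + [None] * 4)[:4] if text_encoders else [None] * 4
    te1, te2, te3, te4 = padded
    # Drop the None slots again; an empty result becomes None, as in the PR code.
    rebuilt = [te for te in (te1, te2, te3, te4) if te is not None] or None
    return (te1, te2, te3, te4), rebuilt
```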
```python
if self.disable_adapters:
    if self.merged:
        self.unmerge()
    result = self.base_layer(x, *args, **kwargs)
```
```python
use_quanto = (
    (base_model_precision is not None and "quanto" in base_model_precision.lower())
    or (text_encoder_1_precision is not None and "quanto" in text_encoder_1_precision.lower())
)
```
The use_quanto check can be written more concisely. Using any() with a generator expression makes the code more readable.
Suggested change:

```python
use_quanto = any(
    p is not None and "quanto" in p.lower()
    for p in [base_model_precision, text_encoder_1_precision]
)
```
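As a quick sanity check (a standalone sketch, not code from the PR), the any()-based form behaves the same as the original boolean expression, including for None values and mixed case:

```python
def use_quanto(base_model_precision, text_encoder_1_precision):
    # True if any configured precision string mentions "quanto".
    return any(
        p is not None and "quanto" in p.lower()
        for p in [base_model_precision, text_encoder_1_precision]
    )
```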
Support int8-quanto quantized fine-tuning for Qwen-Image-Edit-2509 (fine-tunes on 48 GB VRAM 🎉); add a result_image_field_name parameter to the training script.
Here is an example launch command:
The new parameters are:

--result_image_field_name specifies a field name to use in place of the previous image field, whose meaning is not very clear when fine-tuning the Qwen-Image-Edit model. You can still use image as the field name for the model's generated result image, changing:

to:

This matches the author's original example. That said, I'd suggest writing a small script to rename the field in your JSON files, since, as you can see, the old field name is somewhat ambiguous. This doesn't affect functionality; it's entirely a matter of the readability you prefer.
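A rename script along those lines might look like this. This is a hedged sketch: the file layout (a JSON file containing a list of records) and the new field name `result_image` are assumptions, so adjust them to your metadata format:

```python
import json

def rename_field(path, old="image", new="result_image"):
    """Rename a metadata field in every record of a JSON list file, in place."""
    with open(path, "r", encoding="utf-8") as f:
        records = json.load(f)
    for record in records:
        if old in record:
            record[new] = record.pop(old)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(records, f, ensure_ascii=False, indent=2)
```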
- --base_model_precision quantizes the DiT, likely the most VRAM-hungry part.
- --text_encoder_1_precision quantizes the text encoder (the Qwen2.5-VL part).
- --quantize_activations when using quanto, quantizes activations in addition to weights.
- --quantize_vae quantizes the VAE; very fine textures may degrade slightly (usually acceptable).

I tested on a 4090 48G with the following configuration:
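For illustration, the new flags could be declared roughly like this. This is a hypothetical sketch of the argparse wiring; the exact choice lists and defaults in the PR may differ:

```python
import argparse

def build_parser():
    parser = argparse.ArgumentParser(description="Qwen-Image-Edit fine-tuning (sketch)")
    # Field name for the result image in the dataset metadata.
    parser.add_argument("--result_image_field_name", type=str, default="image")
    # Quantization precisions; "no_change" keeps the original dtype.
    parser.add_argument("--base_model_precision", type=str, default="no_change",
                        choices=["no_change", "int8-quanto"])
    parser.add_argument("--text_encoder_1_precision", type=str, default="no_change",
                        choices=["no_change", "int8-quanto"])
    parser.add_argument("--quantize_activations", action="store_true")
    parser.add_argument("--quantize_vae", action="store_true")
    return parser
```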
Since this was a test run, I set --num_epochs and --dataset_repeat to 1.

It successfully produced ./checkpoint_one/epoch-0.safetensors, and inference with that checkpoint worked fine. (Note that inference under 48 GB of VRAM requires either quantized inference or CPU offload plus pipe.vae.enable_slicing().) I also tested the following extreme combination:

It ran for about 30 steps with no problems.
The int8-quanto fine-tuning code was inspired by the SimpleTuner project; if you're interested, take a look:
https://github.com/bghira/SimpleTuner. Personally, I think a good configuration is:

The DiT seems quite large, and the text encoder (the Qwen2.5-VL part) is also big, taking 16 GB of VRAM on its own, while quantizing the VAE and the activations seems less necessary to me.
Quantization is not just about getting things to run; it is also about balancing resolution against rank. Larger resolutions and ranks need more VRAM, so it all depends on how much VRAM you have left and what resolution and rank you want. When you want a larger resolution, you have to squeeze VRAM out from somewhere else.
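As a back-of-envelope illustration of that trade-off (the 20B parameter count below is a made-up placeholder, not a measurement of Qwen-Image-Edit):

```python
def weight_vram_gb(num_params, bytes_per_param):
    """Rough VRAM needed just to hold the model weights, in GiB."""
    return num_params * bytes_per_param / (1024 ** 3)

# Hypothetical 20B-parameter model: bf16 weights (2 bytes) vs int8 weights (1 byte).
bf16 = weight_vram_gb(20e9, 2)  # roughly 37 GiB
int8 = weight_vram_gb(20e9, 1)  # roughly 19 GiB, freeing VRAM for resolution/rank
```

This counts weights only; optimizer state, activations, and gradients add more on top, which is where resolution and rank eat into the remaining budget.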
Finally, my knowledge here is limited. It runs, but whether the implementation is sound still needs review from more experienced folks. Thank you 🙏.

We can finally escape this evil error 🎉🎉🍺🍺:

If you want to run on 24 GB, you can try:

I haven't tried it, but it should work.

Clearly, quantization is one philosophy among several ways to save resources. Anything that tries to generate images in fewer steps, the Z-Image model included, is also a saving strategy, but not without cost: image detail suffers a bit. The same goes for approaches that use reinforcement learning to optimize image generation; the parts that the RL training never reached will perform worse.