arXiv submission date: 2026-05-26
📄 Abstract - Tail-Aware HiFloat4: W4A4 Post-Training Quantization for Wan2.2

This report describes Tail-Aware HiFloat4, our submission to the low-bit text-to-video generation quantization challenge. Our method adapts the public ViDiT-Q post-training quantization pipeline to Wan2.2 under the HiFloat4 numerical format. We quantize the main linear layers in both Wan2.2 transformer modules with W4A4 HiFloat4 fake quantization, keep numerically sensitive boundary modules in high precision, and introduce an activation-tail-aware percentile calibration module for channel-mask construction. Together with compact PTQ-state restoration, this design reduces the influence of rare calibration outliers while keeping the runtime HiFloat4 arithmetic and sampling pipeline unchanged.
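To illustrate why a percentile-based range can be less sensitive to rare calibration outliers than a max-based one, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the 99th-percentile setting, the `ratio` threshold, the "flag channels wider than a multiple of the median range" mask rule, and the synthetic activations are all assumptions made for illustration, and the HiFloat4 format itself is not modeled.

```python
# Hypothetical sketch: per-channel percentile calibration and a channel mask.
# All names, thresholds, and data below are illustrative assumptions.

def percentile(values, q):
    """Return the q-th percentile (q in [0, 100]) with linear interpolation."""
    xs = sorted(values)
    if not xs:
        raise ValueError("percentile() needs at least one value")
    pos = (len(xs) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    frac = pos - lo
    return xs[lo] * (1.0 - frac) + xs[hi] * frac

def channel_ranges(acts, q):
    """Per-channel q-th percentile of |activation| over all calibration tokens."""
    n_channels = len(acts[0])
    return [percentile([abs(row[c]) for row in acts], q) for c in range(n_channels)]

def channel_mask(ranges, ratio=4.0):
    """Flag channels whose range exceeds `ratio` times the median channel range."""
    median = percentile(ranges, 50.0)
    return [r > ratio * median for r in ranges]

# Synthetic calibration activations: 200 tokens x 4 channels, values in [-1, 1].
acts = [[((t * 37 + c * 11) % 21 - 10) / 10.0 for c in range(4)] for t in range(200)]
for row in acts:
    row[2] *= 10.0      # channel 2 is consistently wide
acts[0][0] = 100.0      # channel 0 has a single rare outlier

ranges_max = channel_ranges(acts, 100.0)  # max-based calibration
ranges_p99 = channel_ranges(acts, 99.0)   # percentile-based calibration

mask_max = channel_mask(ranges_max)
mask_p99 = channel_mask(ranges_p99)

print("max ranges:", ranges_max, "mask:", mask_max)
print("p99 ranges:", ranges_p99, "mask:", mask_p99)
```

In this toy setup, the max-based range lets the single outlier in channel 0 dominate the mask, whereas the 99th-percentile range ignores it and flags the consistently wide channel 2 instead. How the actual method builds and uses its channel mask is described only at a high level in the abstract, so this should be read as one plausible interpretation.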

Top-level tags: machine learning, aigc, model training
Detailed tags: post-training quantization, text-to-video, activation calibration, low-bit quantization, wan2.2

Tail-Aware HiFloat4: W4A4 Post-Training Quantization for Wan2.2


1️⃣ One-Sentence Summary

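The paper proposes a low-bit quantization scheme for the Wan2.2 text-to-video model. By introducing a tail-aware percentile calibration module and keeping boundary modules in high precision, it compresses the model's weights and activations to 4-bit precision while suppressing the influence of rare calibration outliers and preserving inference efficiency.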

Source: arXiv: 2605.26628