UniQ-ViT: Optimization-driven uniform quantization for vision transformer acceleration

  • Zhendong Yu
  • Wenqiang Zhou
  • Chenwei Tang*
  • Miqing Li
  • Liangli Zhen
  • Jiancheng Lv

  *Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Post-Training Quantization (PTQ) offers a data-efficient approach for compressing neural networks, making it attractive for deployment. While recent PTQ methods for Vision Transformers (ViTs) leverage non-uniform quantizers to preserve accuracy, they incur significant computational costs and deployment complexity. In this study, we introduce UniQ-ViT, a novel optimization-driven PTQ framework for Vision Transformers that achieves superior performance while maintaining full uniformity in quantization. Our framework incorporates two complementary optimization mechanisms: Adaptive Quantization Optimization (AQO) and Scale Reparameterization Optimization (SRO). The AQO component progressively mitigates outlier-induced quantization errors through block-wise parameter refinement. It first establishes locally optimal quantization ranges to initialize parameters. Then it jointly fine-tunes both quantization parameters and weights to restore model performance. Concurrently, SRO addresses the critical challenge of substantial inter-channel variations in post-LayerNorm activations through a decoupling-based two-stage optimization process that significantly reduces quantization error propagation. Extensive empirical evaluations across diverse ViT architectures and multiple computer vision tasks—including image classification, object detection, and instance segmentation—demonstrate that UniQ-ViT consistently outperforms state-of-the-art PTQ methods while maintaining deployment-friendly uniform quantization. Code is available at https://github.com/Dexter-Yu/UniQ-ViT.
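
As a rough illustration of the range-initialization idea mentioned in the abstract, the following minimal sketch applies a plain uniform quantizer and picks a clipping range by grid search on reconstruction error. It is not the authors' AQO procedure: the function names `uniform_quantize` and `search_range`, the shrink-the-min/max candidate grid, and the MSE criterion are all assumptions made for illustration only.

```python
import numpy as np


def uniform_quantize(x, scale, zero_point, n_bits=8):
    """Fake-quantize x with a uniform (evenly spaced) asymmetric quantizer."""
    qmin, qmax = 0, 2 ** n_bits - 1
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax)
    return (q - zero_point) * scale


def search_range(x, n_bits=4, n_candidates=100):
    """Hypothetical range search: shrink [min, max] and keep the lowest-MSE range.

    Returns (mse, scale, zero_point) for the best candidate. The last
    candidate (fraction 1.0) is the plain min/max range, so the result is
    never worse than min/max calibration.
    """
    x_min, x_max = float(x.min()), float(x.max())
    best = (np.inf, None, None)
    for i in range(1, n_candidates + 1):
        frac = i / n_candidates
        lo, hi = frac * x_min, frac * x_max
        scale = (hi - lo) / (2 ** n_bits - 1)
        if scale <= 0:
            continue  # degenerate range (for example, a constant tensor)
        zero_point = np.round(-lo / scale)
        err = np.mean((x - uniform_quantize(x, scale, zero_point, n_bits)) ** 2)
        if err < best[0]:
            best = (err, scale, zero_point)
    return best


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    acts = rng.normal(size=4096)
    acts[0] = 40.0  # a single outlier stretches the min/max range
    mse, scale, zero_point = search_range(acts, n_bits=4)
    print(f"best MSE={mse:.5f}, scale={scale:.4f}, zero_point={zero_point:.0f}")
```

Because every quantization level stays evenly spaced, the result remains a uniform quantizer; only the clipping range changes. A block-wise refinement stage of the kind the abstract describes would then start from such an initialization.
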
Original language: English
Article number: 132072
Number of pages: 10
Journal: Neurocomputing
Volume: 663
Early online date: 7 Nov 2025
Publication status: Published - 28 Jan 2026
