📝 Publications

(* denotes equal contribution.)

ICCV 2025

First Author

HQCLIP: Leveraging Vision-Language Models to Create High-Quality Image-Text Datasets and CLIP Models (arXiv coming soon)

Zhixiang Wei*, Guangting Wang*, Xiaoxiao Ma, et al.

GitHub coming soon

  • We generated detailed, bidirectional long-text descriptions for 1.3 billion images and pretrained/fine-tuned CLIP on this dataset. Building on this foundation, we propose a novel CLIP training framework that combines bidirectional supervision with label classification losses. This framework achieves SOTA results on zero-shot classification, retrieval, and other tasks among models trained at the same data scale.
CVPR 2024

First Author

Stronger, Fewer, & Superior: Harnessing Vision Foundation Models for Domain Generalized Semantic Segmentation
Zhixiang Wei*, Lin Chen*, Yi Jin*, Xiaoxiao Ma, et al.

[Project page]

  • We propose the Reins framework, which efficiently fine-tunes vision foundation models for the domain generalized semantic segmentation (DGSS) task with just 1% trainable parameters, surprisingly surpassing full-parameter fine-tuning. Reins sets a new SOTA on various DGSS benchmarks.
ICCV 2023

First Author

Disentangle then Parse: Night-time Semantic Segmentation with Illumination Disentanglement
Zhixiang Wei*, Lin Chen*, et al.

  • We propose a novel night-time semantic segmentation paradigm, i.e., disentangle then parse (DTP), which explicitly disentangles night-time images into light-invariant reflectance and light-specific illumination components and then recognizes semantics based on their adaptive fusion.
NeurIPS 2022 (Spotlight)

Co-First Author

Deliberated Domain Bridging for Domain Adaptive Semantic Segmentation
Lin Chen*, Zhixiang Wei*, Xin Jin*, et al.

  • We leverage the complementary characteristics of coarse-wise and fine-wise data mixing techniques to progressively transfer knowledge from the source domain to the target domain.
NeurIPS 2024

Co-First Author

Masked Pre-trained Model Enables Universal Zero-shot Denoiser
Xiaoxiao Ma*, Zhixiang Wei*, et al.

  • MPI is a zero-shot denoising pipeline designed for many types of noise degradations.
CVPR 2022

Co-Author

Reusing the Task-specific Classifier as a Discriminator: Discriminator-free Adversarial Domain Adaptation
Lin Chen, Zhixiang Wei, Xin Jin, Enhong Chen.

  • We reuse the category classifier as a discriminator to form a discriminator-free adversarial learning framework.