SANER: Annotation-free Societal Attribute Neutralizer for Debiasing CLIP

Yusuke Hirota¹,²,*, Min-Hung Chen¹, Chien-Yi Wang¹, Yuta Nakashima², Yu-Chiang Frank Wang¹,³, Ryo Hachiuma¹

¹ NVIDIA, ² Osaka University, ³ National Taiwan University

ICLR 2025

* Work done during the internship.

Our debiasing method, SANER, overcomes two limitations of existing methods: (a) attribute information is retained after debiasing, and (b) protected attribute annotations are not required for debiasing.

Abstract

Large-scale vision-language models, such as CLIP, are known to contain societal bias regarding protected attributes (e.g., gender, age). This paper aims to address the problem of societal bias in CLIP. Although previous studies have proposed to mitigate societal bias through adversarial learning or test-time projection, our comprehensive study of these works identifies two critical limitations: 1) loss of attribute information when it is explicitly disclosed in the input and 2) reliance on attribute annotations during the debiasing process. To mitigate societal bias in CLIP and overcome these limitations simultaneously, we introduce a simple yet effective debiasing method called SANER (societal attribute neutralizer) that eliminates attribute information from CLIP text features only for attribute-neutral descriptions. Experimental results show that SANER, which does not require attribute annotations and preserves original information for attribute-specific descriptions, demonstrates stronger debiasing ability than existing methods. Additionally, we observe that SANER does not require retraining CLIP from scratch with the original dataset. Moreover, the debiased model can be directly applied to text-to-image generation models by simply replacing the text encoder.
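To make the distinction between attribute-neutral and attribute-specific descriptions concrete, here is a minimal sketch in Python. It is not the paper's implementation: the word list `GENDER_TERMS` and the helpers `is_attribute_neutral` and `split_descriptions` are hypothetical names introduced only for illustration, and the example covers gender alone. It shows the routing idea stated in the abstract: only descriptions that do not disclose the attribute would have attribute information removed from their text features, while attribute-specific descriptions would be left untouched.

```python
import re

# Hypothetical, deliberately tiny list of gender terms (an assumption made
# for this illustration; it is not taken from the paper).
GENDER_TERMS = ["man", "woman", "men", "women", "boy", "girl",
                "male", "female", "he", "she", "his", "her"]

# Word-boundary matching so that "man" does not match inside "woman"
# and "male" does not match inside "female".
_PATTERN = re.compile(
    r"\b(?:" + "|".join(map(re.escape, GENDER_TERMS)) + r")\b",
    re.IGNORECASE,
)


def is_attribute_neutral(description: str) -> bool:
    """Return True if the description contains none of the listed gender terms."""
    return _PATTERN.search(description) is None


def split_descriptions(descriptions):
    """Split descriptions into (neutral, specific) lists.

    In this sketch, only the `neutral` list would be passed to a debiasing
    step; the `specific` list would keep its original text features.
    """
    neutral = [d for d in descriptions if is_attribute_neutral(d)]
    specific = [d for d in descriptions if not is_attribute_neutral(d)]
    return neutral, specific


if __name__ == "__main__":
    prompts = ["a photo of a doctor", "a photo of a female doctor"]
    neutral, specific = split_descriptions(prompts)
    print("debias:", neutral)
    print("keep as-is:", specific)
```

A keyword check like this is only a stand-in for however a real system decides that a description discloses a protected attribute; the point is that the split needs no attribute annotations on the images or captions themselves.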

BibTeX

@inproceedings{hirota2024saner,
  title={SANER: Annotation-free Societal Attribute Neutralizer for Debiasing CLIP},
  author={Hirota, Yusuke and Chen, Min-Hung and Wang, Chien-Yi and Nakashima, Yuta and Wang, Yu-Chiang Frank and Hachiuma, Ryo},
  booktitle={The Thirteenth International Conference on Learning Representations},
  year={2025},
  url={https://arxiv.org/pdf/2408.10202}
}