Combining Hard Negative Sampling with Supervised Contrastive Learning

Current image models typically use a two-stage approach: pre-training on large datasets and then fine-tuning using cross-entropy loss. However, studies have shown that cross-entropy loss may not provide optimal generalization and stability. Although supervised contrastive loss addresses some limitations of cross-entropy by focusing on intra-class similarities and inter-class differences, it overlooks the importance of hard negative mining. We propose that models can improve performance by assigning weights to negative samples based on their dissimilarity to positive counterparts. In this paper, we introduce a new supervised contrastive learning objective called SCHaNe, which incorporates hard negative sampling during the fine-tuning phase. Our experimental results show that SCHaNe outperforms the strong baseline BEiT-3 in Top-1 accuracy across various benchmarks, with significant gains of up to 3.32% in few-shot learning settings and 3.41% in full dataset fine-tuning. Notably, our proposed objective achieves a new state-of-the-art accuracy of 86.14% for base models on ImageNet-1k, without requiring specialized architectures, additional data, or extra computational resources. Furthermore, we demonstrate that the proposed objective produces better embeddings and provides an explanation for the observed improvements in our experiments.
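The weighting idea in the abstract can be illustrated with a short sketch. This is not the SCHaNe objective itself, whose exact form the abstract does not give. It is a minimal supervised contrastive loss in which each negative is reweighted by its similarity to the anchor, so that harder negatives count more. The function name `hard_negative_supcon_loss`, the temperature `tau`, the hardness exponent `beta`, and the choice of weighting function are all assumptions made for illustration.

```python
import math


def hard_negative_supcon_loss(embeddings, labels, tau=0.5, beta=1.0):
    """Supervised contrastive loss with similarity-weighted negatives (sketch).

    embeddings: list of equal-length float vectors, one per sample.
    labels:     list of class labels, one per sample.
    tau:        temperature (assumed hyperparameter).
    beta:       hardness exponent (assumed); beta = 0 gives uniform weights.
    """
    # L2-normalize so that dot products are cosine similarities.
    unit = []
    for v in embeddings:
        norm = math.sqrt(sum(x * x for x in v))
        unit.append([x / norm for x in v])

    n = len(unit)
    total, anchors = 0.0, 0
    for i in range(n):
        sim = [sum(a * b for a, b in zip(unit[i], unit[j])) / tau
               for j in range(n)]
        pos = [j for j in range(n) if j != i and labels[j] == labels[i]]
        neg = [j for j in range(n) if labels[j] != labels[i]]
        if not pos:
            continue  # an anchor without positives contributes nothing

        # Hardness weights: negatives closer to the anchor weigh more.
        # They are rescaled to average 1 over the anchor's negatives.
        raw = [math.exp(beta * sim[j]) for j in neg]
        scale = len(neg) / sum(raw) if neg else 0.0
        neg_term = sum(scale * r * math.exp(sim[j]) for r, j in zip(raw, neg))

        # Average -log(exp(s_ip) / (exp(s_ip) + neg_term)) over positives.
        loss_i = sum(math.log(math.exp(sim[p]) + neg_term) - sim[p]
                     for p in pos)
        total += loss_i / len(pos)
        anchors += 1

    return total / anchors if anchors else 0.0
```

With `beta = 0` every negative receives weight 1 and the sketch reduces to a uniformly weighted supervised contrastive loss. Increasing `beta` shifts weight toward negatives that lie close to the anchor in embedding space, which is the sense in which they are "hard".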
