Abstract
In real-world scenarios, data distributions are often long-tailed: a few categories (head classes) contain the majority of samples, while the remaining categories (tail classes) suffer from severe data scarcity. Because tail classes have so few samples, their observed distributions often fail to capture their true underlying distributions, which substantially degrades generalization performance. We observe a strong negative correlation between the distance separating the feature centers of two categories and the similarity of their covariance matrices in the feature space: the closer the centers, the more similar the covariances. Motivated by this finding, we propose estimating the true distributional shapes of tail classes by leveraging the covariance structures of head classes. Moreover, our experiments reveal that the observed feature centers of tail classes are frequently biased away from their true centers. Interestingly, the direction of this bias has high cosine similarity with the direction pointing toward the feature center of the nearest head class. Building on this observation, we predict and apply such bias shifts to better approximate the true feature distributions, thereby refining the decision boundaries. This approach substantially improves the generalization performance of tail classes.
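The two ideas in the abstract (borrowing covariance structure from a nearby head class and shifting the tail-class center along the direction of the nearest head-class center) can be illustrated with a minimal NumPy sketch. This is not the paper's method: the abstract does not give the estimator, so the linear covariance blend, the fixed step size, and the names `class_stats`, `calibrate_tail`, `blend`, and `step` are all assumptions made for illustration. In particular, the sign of `step` is left to the caller, because the abstract does not state whether the correction moves toward or away from the head class.

```python
import numpy as np


def class_stats(features, labels):
    """Return {class_id: (mean, covariance)} for a feature matrix of shape (n, d)."""
    stats = {}
    for c in np.unique(labels):
        x = features[labels == c]
        mean = x.mean(axis=0)
        if len(x) > 1:
            # rowvar=False: rows are samples, columns are feature dimensions.
            cov = np.atleast_2d(np.cov(x, rowvar=False))
        else:
            # A single sample carries no covariance information.
            cov = np.zeros((x.shape[1], x.shape[1]))
        stats[int(c)] = (mean, cov)
    return stats


def calibrate_tail(stats, head_classes, tail_class, blend=0.5, step=0.1):
    """Hypothetical calibration of one tail class from its nearest head class.

    blend: weight of the head-class covariance (assumed linear interpolation).
    step:  signed shift length along the unit vector pointing from the tail
           center to the nearest head center (sign convention is an assumption).
    """
    mean_t, cov_t = stats[tail_class]

    # Nearest head class by Euclidean distance between feature centers.
    nearest = min(head_classes,
                  key=lambda h: np.linalg.norm(stats[h][0] - mean_t))
    mean_h, cov_h = stats[nearest]

    # Borrow distributional shape from the nearest head class.
    cov_new = (1.0 - blend) * cov_t + blend * cov_h

    # Shift the observed center along the tail-to-head direction.
    direction = mean_h - mean_t
    norm = np.linalg.norm(direction)
    unit = direction / norm if norm > 0 else direction
    mean_new = mean_t + step * unit

    return nearest, mean_new, cov_new
```

With `blend=1.0` the tail class simply adopts the head-class covariance, and with `step=0.0` the center is left where it was observed, so each idea can be tried separately.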
| Original language | English |
|---|---|
| Article number | 131805 |
| Journal | Neurocomputing |
| Volume | 659 |
| Publication status | Published - 1 Jan 2026 |
Keywords
- Class imbalance
- Knowledge transfer
- Long-tailed classification
- Representational learning