This repository provides a simple implementation of the ICML 2025 paper
Addressing Concept Mislabeling in Concept Bottleneck Models Through Preference Optimization
TL;DR: We propose a preference-optimization-based objective for Concept Bottleneck Models (CBMs) called Concept Preference Optimization (CPO). We show that the proposed objective leads to better concept AUC, task accuracy, and intervention performance for CBMs on standard benchmarks, especially when the concept labels are noisy.
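In brief, and matching the implementation below (this is a per-concept restatement of the code, not a transcription from the paper): given a concept logit $z$, temperature $\tau$, ground-truth label $c \in \{0,1\}$, and a hard Gumbel-softmax sample $\hat{c}$, the loss prefers the labeled concept over the model's own sample:

$$
p = \sigma(z/\tau), \qquad
\log p_\theta(c) = c \log p + (1-c)\log(1-p), \qquad
\mathcal{L}_{\text{CPO}} = -\log \sigma\big(\beta\,[\log p_\theta(c) - \log p_\theta(\hat{c})]\big).
$$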
If you just want to use the loss, feel free to copy-paste the implementation below.
Note that it depends on inverse_sigmoid, which can be found in models_and_dataloader/cpo.py.
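The repo's `inverse_sigmoid` is not reproduced here; if you only want the loss function, a minimal sketch of what it presumably computes (the standard logit function, with an assumed `eps` clamp for numerical safety) is:

```python
import torch

def inverse_sigmoid(p, eps=1e-8):
    # Logit function: maps probabilities in (0, 1) back to logit space.
    # eps clamping is an assumption here, added to avoid log(0).
    p = p.clamp(eps, 1 - eps)
    return torch.log(p) - torch.log(1 - p)
```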
```python
import torch
import torch.nn.functional as F

from models_and_dataloader.cpo import inverse_sigmoid


def cpo_loss_fn(c_sem, c, temperature=1, beta=1, reduction='mean'):
    """
    Compute the Concept Preference Optimization (CPO) loss.

    Args:
        c_sem: Semantic concept predictions
        c: Ground-truth concept labels
        temperature: Temperature parameter for the Gumbel softmax
        beta: Preference strength parameter
        reduction: Reduction method ('mean' or None)

    Returns:
        Computed CPO loss
    """
    # Ensure c_sem is in logit space: values that all lie in [0, 1] are
    # treated as probabilities and mapped back to logits.
    if torch.all(c_sem >= 0) and torch.all(c_sem <= 1):
        c_sem = inverse_sigmoid(c_sem)

    # Stack positive- and negative-class logits for the Gumbel softmax
    booleans = torch.concat([
        c_sem.unsqueeze(-1),
        inverse_sigmoid(1 - torch.sigmoid(c_sem.unsqueeze(-1)))
    ], dim=-1)

    # Draw hard (one-hot) samples from the Gumbel softmax
    gumbel_softmax_samples = F.gumbel_softmax(booleans, tau=temperature, hard=True)
    gumbel_softmax_samples_pos = gumbel_softmax_samples[..., 0]

    # Temperature-scaled sigmoid probabilities
    sigmoid_probs = torch.sigmoid(c_sem / temperature)

    # Log-probabilities of the sampled concepts
    sampled_probs = (
        gumbel_softmax_samples_pos * torch.log(sigmoid_probs + 1e-8) +
        (1 - gumbel_softmax_samples_pos) * torch.log((1 - sigmoid_probs) + 1e-8)
    )

    # Log-probabilities of the ground-truth labels
    label_probs = (
        c * torch.log(sigmoid_probs + 1e-8) +
        (1 - c) * torch.log((1 - sigmoid_probs) + 1e-8)
    )

    # Preference loss: the labeled concepts should be preferred over the samples
    preference_loss = -torch.log(
        torch.sigmoid(beta * (label_probs - sampled_probs)) + 1e-8
    )

    # Apply reduction
    if reduction == 'mean':
        return preference_loss.mean()
    return preference_loss
```

This repository is a simplified version of the Concept Embedding Model codebase, so any new models/datasets added to that codebase should be directly executable here as well.
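As a side note on the sampling step inside cpo_loss_fn: since logit(1 - p) = -logit(p), the negative-class logits equal the negated positive-class logits, and the Gumbel-softmax trick can be sketched in isolation (the tensors here are made up for illustration):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
# Hypothetical batch of 4 examples with 3 concept logits each
logits = torch.randn(4, 3)

# Stack positive/negative class logits, as cpo_loss_fn does;
# -logits is equivalent to inverse_sigmoid(1 - sigmoid(logits))
booleans = torch.stack([logits, -logits], dim=-1)

# hard=True yields exact one-hot samples while keeping gradients
# via the straight-through estimator
samples = F.gumbel_softmax(booleans, tau=1.0, hard=True)
hard_concepts = samples[..., 0]  # 1.0 where the positive class was sampled

print(hard_concepts.shape)  # torch.Size([4, 3])
```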
You can either run the model directly in the cub_cbm_comparison.ipynb notebook, or execute run.py.
Support for the intervention experiments is coming soon.
```bibtex
@article{penaloza2025concept,
  title   = {Addressing Concept Mislabeling in Concept Bottleneck Models Through Preference Optimization},
  author  = {Emiliano Penaloza and Tianyue H. Zhan and Laurent Charlin and Mateo Espinosa Zarlenga},
  journal = {Proceedings of the 42nd International Conference on Machine Learning (ICML)},
  year    = {2025},
  month   = apr,
  url     = {https://icml.cc/},
  note    = {To appear},
}
```