This repository provides a simple implementation of the ICML 2025 paper
Addressing Concept Mislabeling in Concept Bottleneck Models Through Preference Optimization
TL;DR: We propose a preference-optimization-based objective for Concept Bottleneck Models (CBMs) called Concept Preference Optimization (CPO). We show that the proposed objective leads to better concept AUC, task accuracy, and intervention performance for CBMs on standard benchmarks, especially when the concept labels are noisy.
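In brief, and matching the implementation below (this is a per-concept restatement of the code, not a transcription from the paper): given a concept logit $z$, temperature $\tau$, ground-truth label $c \in \{0,1\}$, and a hard Gumbel-softmax sample $\hat{c}$, the loss prefers the labeled concept over the model's own sample:

$$
p = \sigma(z/\tau), \qquad
\log p_\theta(c) = c \log p + (1-c)\log(1-p), \qquad
\mathcal{L}_{\text{CPO}} = -\log \sigma\big(\beta\,[\log p_\theta(c) - \log p_\theta(\hat{c})]\big).
$$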
If you just want to use the loss, feel free to copy-paste the implementation below.
Note that it depends on inverse_sigmoid, which can be found in models_and_dataloader/cpo.py.
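The repo's `inverse_sigmoid` is not reproduced here; if you only want the loss function, a minimal sketch of what it presumably computes (the standard logit function, with an assumed `eps` clamp for numerical safety) is:

```python
import torch

def inverse_sigmoid(p, eps=1e-8):
    # Logit function: maps probabilities in (0, 1) back to logit space.
    # eps clamping is an assumption here, added to avoid log(0).
    p = p.clamp(eps, 1 - eps)
    return torch.log(p) - torch.log(1 - p)
```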
```python
import torch
import torch.nn.functional as F

from models_and_dataloader.cpo import inverse_sigmoid


def cpo_loss_fn(c_sem, c, temperature=1, beta=1, reduction='mean'):
    """
    Compute the Concept Preference Optimization (CPO) loss.

    Args:
        c_sem: Semantic concept predictions
        c: Ground-truth concept labels
        temperature: Temperature parameter for the Gumbel softmax
        beta: Preference strength parameter
        reduction: Reduction method ('mean' or None)

    Returns:
        Computed CPO loss
    """
    # Ensure c_sem is in logit space: values that all lie in [0, 1] are
    # treated as probabilities and mapped back to logits.
    if torch.all(c_sem >= 0) and torch.all(c_sem <= 1):
        c_sem = inverse_sigmoid(c_sem)

    # Stack positive- and negative-class logits for the Gumbel softmax
    booleans = torch.concat([
        c_sem.unsqueeze(-1),
        inverse_sigmoid(1 - torch.sigmoid(c_sem.unsqueeze(-1)))
    ], dim=-1)

    # Draw hard (one-hot) samples from the Gumbel softmax
    gumbel_softmax_samples = F.gumbel_softmax(booleans, tau=temperature, hard=True)
    gumbel_softmax_samples_pos = gumbel_softmax_samples[..., 0]

    # Temperature-scaled sigmoid probabilities
    sigmoid_probs = torch.sigmoid(c_sem / temperature)

    # Log-probabilities of the sampled concepts
    sampled_probs = (
        gumbel_softmax_samples_pos * torch.log(sigmoid_probs + 1e-8) +
        (1 - gumbel_softmax_samples_pos) * torch.log((1 - sigmoid_probs) + 1e-8)
    )

    # Log-probabilities of the ground-truth labels
    label_probs = (
        c * torch.log(sigmoid_probs + 1e-8) +
        (1 - c) * torch.log((1 - sigmoid_probs) + 1e-8)
    )

    # Preference loss: the labeled concepts should be preferred over the samples
    preference_loss = -torch.log(
        torch.sigmoid(beta * (label_probs - sampled_probs)) + 1e-8
    )

    # Apply reduction
    if reduction == 'mean':
        return preference_loss.mean()
    return preference_loss
```

This repository is a simplified version of the Concept Embedding Model codebase, so any new models/datasets added to that codebase should be directly executable here as well.
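As a side note on the sampling step inside cpo_loss_fn: since logit(1 - p) = -logit(p), the negative-class logits equal the negated positive-class logits, and the Gumbel-softmax trick can be sketched in isolation (the tensors here are made up for illustration):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
# Hypothetical batch of 4 examples with 3 concept logits each
logits = torch.randn(4, 3)

# Stack positive/negative class logits, as cpo_loss_fn does;
# -logits is equivalent to inverse_sigmoid(1 - sigmoid(logits))
booleans = torch.stack([logits, -logits], dim=-1)

# hard=True yields exact one-hot samples while keeping gradients
# via the straight-through estimator
samples = F.gumbel_softmax(booleans, tau=1.0, hard=True)
hard_concepts = samples[..., 0]  # 1.0 where the positive class was sampled

print(hard_concepts.shape)  # torch.Size([4, 3])
```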
You can either run the model directly in the cub_cbm_comparison.ipynb notebook, or execute run.py.
Support for the intervention experiments is coming soon.
```bibtex
@article{penaloza2025concept,
  title   = {Addressing Concept Mislabeling in Concept Bottleneck Models Through Preference Optimization},
  author  = {Emiliano Penaloza and Tianyue H. Zhan and Laurent Charlin and Mateo Espinosa Zarlenga},
  journal = {Proceedings of the 42nd International Conference on Machine Learning (ICML)},
  year    = {2025},
  month   = apr,
  url     = {https://icml.cc/},
  note    = {To appear},
}
```