Selected Publications


WARM: On the Benefits of Weight Averaged Reward Models

We introduce a new strategy for reward modeling in alignment via RLHF: we merge multiple reward models into a single one that is more reliable and robust, thus mitigating reward hacking.

Diverse and Efficient Ensembling of Deep Networks

During my PhD, I analyzed how ensembling via weight averaging can improve out-of-distribution generalization and alignment.

Beyond task performance: evaluating and reducing the flaws of large multimodal models with in-context learning

We investigate large multimodal models and their limitations, such as hallucinations and a lack of explainability. We then show that multimodal in-context learning can reduce some of these flaws.

Rewarded soups: towards Pareto-optimal alignment by interpolating weights fine-tuned on diverse rewards

We introduce rewarded soup, a new strategy to trade off between multiple rewards when fine-tuning foundation models with RLHF; we first learn one network for each reward, and then linearly interpolate their weights.
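The interpolation step can be sketched as follows. This is a minimal illustration, not the paper's implementation: weights are represented as plain dicts of floats (a real version would operate on framework state dicts), and all names are hypothetical.

```python
def interpolate_weights(w1, w2, lam):
    """Linearly interpolate two weight dicts: (1 - lam) * w1 + lam * w2.

    Assumes both networks share the same architecture, so the dicts
    have identical keys.
    """
    return {k: (1 - lam) * w1[k] + lam * w2[k] for k in w1}

# Two hypothetical networks, each fine-tuned on a different reward.
w_reward_a = {"layer.weight": 1.0, "layer.bias": -2.0}
w_reward_b = {"layer.weight": 3.0, "layer.bias": 4.0}

# Sweeping lam in [0, 1] traces a family of models trading off
# the two rewards without any retraining.
midpoint = interpolate_weights(w_reward_a, w_reward_b, 0.5)
```

Because only the final weights are combined, the trade-off coefficient can be chosen after training, at deployment time.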

UnIVAL: Unified Model for Image, Video, Audio and Language Tasks

UnIVAL is a 0.25B-parameter unified model, multitask pretrained on image- and video-text data, that targets image-, video- and audio-text downstream tasks.

Model Ratatouille: Recycling Diverse Models for Out-of-Distribution Generalization

We propose a new fine-tuning strategy that improves OOD generalization in computer vision by recycling and averaging weights specialized on diverse auxiliary tasks.

Diverse Weight Averaging for Out-of-Distribution Generalization

To improve out-of-distribution generalization on DomainBed, we average diverse weights obtained from different training runs; this strategy is motivated by an extension of the bias-variance theory to weight averaging.
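The averaging itself can be sketched in a few lines. This is an illustrative simplification, assuming the runs share one architecture; weights are plain dicts of floats rather than real framework state dicts, and the names are hypothetical.

```python
def average_weights(weight_dicts):
    """Uniformly average the weights of several independent training runs.

    Assumes all runs share the same architecture, so every dict has
    identical keys.
    """
    n = len(weight_dicts)
    return {k: sum(w[k] for w in weight_dicts) / n for k in weight_dicts[0]}

# Three hypothetical runs of the same model with different seeds
# or hyperparameters.
runs = [
    {"layer.weight": 1.0},
    {"layer.weight": 3.0},
    {"layer.weight": 5.0},
]
averaged = average_weights(runs)
```

In bias-variance terms, averaging diverse runs reduces the variance term much like a prediction ensemble, while keeping the inference cost of a single network.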

DyTox: Transformers for Continual Learning with DYnamic TOken eXpansion

We propose a new dynamic transformer architecture for continual learning with state-of-the-art performance.

Fishr: Invariant Gradient Variances for Out-of-Distribution Generalization

We introduce and motivate a new regularization that enforces invariance of the domain-level gradient variances across training domains, in order to improve out-of-distribution generalization.

MixMo: Mixing Multiple Inputs for Multiple Outputs via Deep Subnetworks

We introduce a new generalized framework for learning multi-input multi-output subnetworks and study how to best mix the inputs. We obtain state-of-the-art results on CIFAR and Tiny ImageNet by better leveraging the expressiveness of large networks.


PRAIRIE Artificial Intelligence Summer School: Key Takeaways   Medium
Semi-supervised Learning for Multilingual Sentence Representation   PDF   Video


Deep Learning for Computer Vision
Deep Learning