Selected Publications


WARM: On the Benefits of Weight Averaged Reward Models

We introduce a new strategy for reward modeling in alignment via RLHF: we merge multiple reward models into a single one that is more reliable and robust, thus mitigating reward hacking.

Diverse and Efficient Ensembling of Deep Networks

During my PhD, I analyzed how ensembling via weight averaging can improve out-of-distribution generalization and alignment.

Beyond task performance: evaluating and reducing the flaws of large multimodal models with in-context learning

We investigate large multimodal models and their limitations, such as hallucinations and a lack of explainability. We then show that multimodal in-context learning can reduce some of these flaws.

Rewarded soups: towards Pareto-optimal alignment by interpolating weights fine-tuned on diverse rewards

We introduce rewarded soup, a new strategy to trade off between multiple rewards when fine-tuning foundation models with RLHF; we first learn one network for each reward, and then linearly interpolate their weights.
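The interpolation step can be sketched as follows. This is a minimal illustration, not the paper's implementation: weights are represented as plain dicts of floats (a real version would operate on framework state dicts), and all names are hypothetical.

```python
def interpolate_weights(w1, w2, lam):
    """Linearly interpolate two weight dicts: (1 - lam) * w1 + lam * w2.

    Assumes both networks share the same architecture, so the dicts
    have identical keys.
    """
    return {k: (1 - lam) * w1[k] + lam * w2[k] for k in w1}

# Two hypothetical networks, each fine-tuned on a different reward.
w_reward_a = {"layer.weight": 1.0, "layer.bias": -2.0}
w_reward_b = {"layer.weight": 3.0, "layer.bias": 4.0}

# Sweeping lam in [0, 1] traces a family of models trading off
# the two rewards without any retraining.
midpoint = interpolate_weights(w_reward_a, w_reward_b, 0.5)
```

Because only the final weights are combined, the trade-off coefficient can be chosen after training, at deployment time.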

UnIVAL: Unified Model for Image, Video, Audio and Language Tasks

UnIVAL is a 0.25B-parameter unified model, multitask pretrained on image- and video-text data, that targets image-, video- and audio-text downstream tasks.

Model Ratatouille: Recycling Diverse Models for Out-of-Distribution Generalization

We propose a new fine-tuning strategy that improves OOD generalization in computer vision by recycling and averaging weights specialized on diverse auxiliary tasks.

Diverse Weight Averaging for Out-of-Distribution Generalization

To improve out-of-distribution generalization on DomainBed, we average diverse weights obtained from different training runs; this strategy is motivated by an extension of the bias-variance theory to weight averaging.
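The averaging itself can be sketched in a few lines. This is an illustrative simplification, assuming the runs share one architecture; weights are plain dicts of floats rather than real framework state dicts, and the names are hypothetical.

```python
def average_weights(weight_dicts):
    """Uniformly average the weights of several independent training runs.

    Assumes all runs share the same architecture, so every dict has
    identical keys.
    """
    n = len(weight_dicts)
    return {k: sum(w[k] for w in weight_dicts) / n for k in weight_dicts[0]}

# Three hypothetical runs of the same model with different seeds
# or hyperparameters.
runs = [
    {"layer.weight": 1.0},
    {"layer.weight": 3.0},
    {"layer.weight": 5.0},
]
averaged = average_weights(runs)
```

In bias-variance terms, averaging diverse runs reduces the variance term much like a prediction ensemble, while keeping the inference cost of a single network.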

DyTox: Transformers for Continual Learning with DYnamic TOken eXpansion

We propose a new dynamic transformer architecture for continual learning with state-of-the-art performance.

Fishr: Invariant Gradient Variances for Out-of-Distribution Generalization

We introduce and motivate a new regularization that enforces invariance of the domain-level gradient variances across training domains, in order to improve out-of-distribution generalization.

MixMo: Mixing Multiple Inputs for Multiple Outputs via Deep Subnetworks

We introduce a new generalized framework for learning multi-input multi-output subnetworks and study how to best mix the inputs. We obtain state-of-the-art results on CIFAR and Tiny ImageNet by better leveraging the expressiveness of large networks.


PRAIRIE Artificial Intelligence Summer School: Key Takeaways   Medium
Semi-supervised Learning for Multilingual Sentence Representation   PDF   Video


Deep Learning for Computer Vision
Deep Learning