WARP: On the Benefits of Weight Averaged Rewarded Policies
**Alexandre Ramé**, Johan Ferret, Nino Vieillard, Robert Dadashi, Léonard Hussenot, Pierre-Louis Cedoz, Pier Giuseppe Sessa, Sertan Girgin, Arthur Douillard, Olivier Bachem
24 June, 2024