Guilherme Penedo

ML Research Engineer at Hugging Face

Guilherme was previously a member of the Falcon team, where he was in charge of creating RefinedWeb, the pretraining dataset for the first iteration of the Falcon LLM. He is now a member of the Hugging Face Science Team, where he works on improving pretraining datasets and led the FineWeb and FineWeb2 projects, two large-scale datasets for LLM pretraining. More recently, he has been involved in Open-R1, Hugging Face's fully open effort to replicate the DeepSeek-R1 model.

Sessions

May 6

10:30 - 11:10

Open-R1: A Fully Open Reproduction of DeepSeek-R1

AI Model