The recipe behind OpenAI’s reasoning models has been a well-kept secret. That is, until earlier this year, when DeepSeek released their DeepSeek-R1 model and promptly broke the internet. Although a detailed technical report was published, many open questions remain, chief among them the training data, which was not released. Open-R1 is Hugging Face’s fully open effort to replicate DeepSeek-R1, with a strong focus on reasoning data curation.