Schedule: May 7

  • May 7

    9:30 - 10:00

    Keynote

    PyTorch Day France
    GOSIM Keynote
  • May 7

    10:00 - 10:30

    Morning Coffee

    PyTorch Day France
  • May 7

    10:30 - 11:10

    Multilingualism of Qwen: From Foundation Model to Applications

    AI Model
    Multilingual and cross-lingual capabilities significantly boost the flexibility and usefulness of large language models (LLMs). Using Qwen as an example, we'll explore methods to enhance multilingual performance in LLMs, including pre-training, post-training, and evaluation strategies. Additionally, we'll examine the real-world applications of these advancements, demonstrating how multilingual capabilities can create practical solutions that overcome language barriers and promote smooth communication.
  • May 7

    10:30 - 11:10

    AI Open Source for Good: Inclusive Access, Equitable Data, and Accessible Compute

    AI Infra
    This talk unveils how open source technologies act as catalysts for equitable AI across three pillars. First, inclusive access: we open-source voice datasets tailored for underrepresented groups, such as children and the elderly, to ensure multimodal AI systems understand diverse linguistic patterns and bridge generational divides. Second, equitable data: we have released nearly 100 globally accessible datasets, amassing over 680,000 downloads and empowering developers from any country to innovate freely. Third, accessible compute: we present FlagOS, open-source system software that facilitates AI development and deployment across diverse hardware ecosystems, including legacy GPUs and emerging accelerators, while significantly lowering the cost barrier to AI innovation. Collectively, these open-source efforts transform 'AI for Good' into a shared mission: breaking barriers of age, location, and resources to empower anyone to create and benefit from AI.
  • May 7

    10:30 - 11:10

    Finding the Scaling Law of Agents

    AI Apps
    This talk explores the potential of building scalable techniques to facilitate autonomous cooperation among communicative agents. To address the challenges of achieving autonomous cooperation, we propose a novel communicative agent framework named CAMEL. We showcase how role-playing can be used to generate conversational data for studying the behaviors and capabilities of a society of agents. In particular, we conduct comprehensive studies on cooperation in multi-agent settings. Our contributions include introducing a novel communicative agent framework, offering a scalable approach for studying the cooperative behaviors and capabilities of multi-agent systems, and open-sourcing our library to support research on communicative agents and beyond: https://github.com/camel-ai/camel.
  • May 7

    10:30 - 11:10

    How to Build Your Humanoid

    Embodied AI
    In 2021, a desperate need for human connection led me to create a 16-DOF data glove. Open-sourcing it brought the glove to Rob Knight, creator of the 16-DOF DexHand, who had just begun developing a humanoid with Rémi Cadène's LeRobot.
  • May 7

    10:30 - 10:50

    Welcome & Opening Remarks

    PyTorch Day France
  • May 7

    11:10 - 11:50

    Open Foundation Models: Scaling Laws and Generalization

    AI Model
    To study transferable learning and generalization, deriving reproducible scaling laws is crucial. We explain why open foundation models and datasets are essential for this research and highlight the challenges in properly measuring generalization.
  • May 7

    11:10 - 11:50

    The Best Practice of Training and Inferencing on Ascend CANN

    AI Infra
    The AI-oriented, heterogeneous Compute Architecture for Neural Networks (CANN) is a key platform for improving the computing efficiency of Ascend AI processors. It serves as a bridge between upper-layer AI frameworks and lower-layer AI processors and programming. This talk will focus on the open-source ecosystem around CANN and show how CANN helps AI software, such as PyTorch and vLLM, run efficiently on Ascend.
  • May 7

    11:10 - 11:50

    OpenManus: Empowering LLM-based Agent Applications Via Framework and Capability Evolution

    AI Apps
    We introduce OpenManus, a lightweight and versatile LLM-based multi-agent framework evolved from MetaGPT, designed to enhance adaptability, autonomy, and scalability through advanced reasoning, planning, and effective cross-environment operation.
  • May 7

    11:10 - 11:50

    G1 Open Source Dataset and Humanoid Robot from Unitree Robotics

    Embodied AI
    Artificial intelligence technology has moved very fast over the past two years, and humanoid robots have become one of the most important forms for realizing embodied AI and AGI. Unitree has been working on legged robots for more than 8 years and on humanoid robots for 1 year. Realizing AGI depends on three key parts: algorithms, data, and computing capability. All three ultimately run on physical robots, so we believe that building a robust physical humanoid robot system is key to this ecosystem, and that the World Large-Scale Model (what most people call a foundation model) is the key to bringing embodied AI to humanoid robots. We will share the most important progress made in industry and research over the past year, and the new progress we expect, and are excited to see, in the next few years. To promote the development of the global embodied AI industry, the Unitree G1 robot operation dataset has been open-sourced, adapted to a variety of open-source solutions, and is continuously updated.
  • May 7

    11:30 - 11:50

    Scaling LLM Inference with vLLM: Multi‑Accelerator Serving and Quantized LLMs

    PyTorch Day France
    vLLM has become the community-standard engine for low-latency LLM inference, achieving a 10× increase in usage in 2024 and surpassing 100,000 daily installs by January 2025. Supported by hundreds of contributors and productized through Red Hat AI, vLLM provides a vendor-neutral solution for serving cutting-edge models at scale. This talk outlines a practical blueprint for scaling LLM inference using vLLM, integrating both system-level techniques and model-level optimizations. We begin by addressing the challenges of deploying LLMs with chain-of-thought reasoning in production. Leveraging vLLM’s engine architecture, multi-accelerator deployments using tensor parallelism, paged attention scheduling, and prefill–decode disaggregation demonstrate how a single node can efficiently drive multiple AI accelerators, enhancing throughput without compromising latency. The second optimization layer focuses on quantization. Based on over 500,000 evaluations across language and vision-language models, we examine the accuracy–speed trade-offs of weight and activation quantization. We introduce new pathways that significantly reduce memory usage while maintaining model quality. Attendees will leave with data-driven insights and ready-to-use configurations for deploying state-of-the-art quantized models in scalable enterprise inference pipelines.
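To make the weight-quantization trade-off mentioned in this abstract concrete, here is a minimal sketch of symmetric per-tensor int8 quantization in plain Python. It is only an illustration of the general idea and is not vLLM's implementation or API; the function names `quantize_int8` and `dequantize_int8` and the sample weights are hypothetical.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    # One scale for the whole tensor; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize_int8(q, scale):
    """Recover approximate float weights from the integers and the scale."""
    return [v * scale for v in q]


# Hypothetical weights, chosen only for illustration.
weights = [0.5, -1.27, 0.003, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)

# The rounding error per weight is bounded by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

Each weight is stored in one byte instead of four (for float32), at the cost of a rounding error of at most half a step; small weights such as 0.003 can collapse to zero, which is one source of the accuracy loss that such evaluations measure.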
  • May 7

    12:00 - 14:00

    Lunch Break

    PyTorch Day France
  • May 7

    14:00 - 14:40

    Automated Proof Generation for Rust Code Via Self-Evolution

    AI Model
    Ensuring correctness is crucial for code generation. Formal verification offers a definitive assurance of correctness, but demands substantial human effort in proof construction and hence raises a pressing need for automation. The primary obstacle lies in the severe lack of data: there are far fewer proofs than code snippets for Large Language Models (LLMs) to train upon. In this paper, we introduce SAFE, a framework that overcomes the lack of human-written proofs to enable automated proof generation of Rust code. SAFE establishes a self-evolving cycle where data synthesis and fine-tuning collaborate to enhance the model capability, leveraging the definitive power of a symbolic verifier in telling correct proofs from incorrect ones. SAFE also re-purposes the large number of synthesized incorrect proofs to train the self-debugging capability of the fine-tuned models, empowering them to fix incorrect proofs based on the verifier’s feedback. SAFE demonstrates superior efficiency and precision compared to GPT-4o. Through tens of thousands of synthesized proofs and the self-debugging mechanism, we improve the capability of open-source models, initially unacquainted with formal verification, to automatically write proofs for Rust code. This advancement leads to a significant improvement in performance, achieving a 52.52% accuracy rate in a benchmark crafted by human experts, a significant leap over GPT-4o’s performance of 14.39%.
  • May 7

    14:00 - 14:40

    SGLang: Efficient LLM Serving Engine

    AI Infra
    SGLang is a fast serving engine for LLMs and VLMs. It's fully open-source, incubated by LMSYS Org, with 300+ contributors worldwide. In this talk, we will introduce the key features and performance optimizations in SGLang.
  • May 7

    14:00 - 14:40

    Pollen Robotics

    Embodied AI
    To Be Announced
  • May 7

    14:00 - 14:20

    Llama 4

    PyTorch Day France
    This presentation explores the development of Llama 4, a state-of-the-art foundation model designed to excel in various tasks. We will discuss its key features, including long-context and multimodal understanding. We will also examine Llama 4's potential uses in agentic settings, such as autonomous decision-making and human-AI collaboration, through real-world examples and case studies.
  • May 7

    14:40 - 15:20

    Demystifying LLM Training --- Towards Fully Open-source LLM from Pre-training to Reinforcement Learning

    AI Model
    Recently, Large Language Models (LLMs) have undergone a significant transformation, marked by a rapid rise in both their popularity and capabilities. Leading this evolution are proprietary LLMs like GPT-4 and GPT-o1, which have captured widespread attention in the AI community for their power and versatility. Simultaneously, open-source LLMs, such as LLaMA and Mistral, have made great contributions to the ever-increasing popularity of LLMs because they are easy to customize and deploy across various applications. Although LLMs offer unprecedented opportunities for research and innovation, their commercialization has raised concerns about transparency, reproducibility, and safety. Many open LLMs lack the necessary components (such as training code and data) for full understanding and reproducibility, and some use restrictive licenses whilst claiming to be “open-source”, which may hinder further innovations on LLMs. To mitigate this issue, we follow the Model Openness Framework (MOF), a ranked classification system that rates machine learning models based on their completeness and openness, following principles of open science, open source, open data, and open access. We present a truly open source LLM, Moxin 7B, and release pre-training code and configurations, training and fine-tuning data, and intermediate and final checkpoints, aiming to make continuous commitments to fully open-source LLMs. We also finetune the Moxin Base model with a SOTA post-training framework and instruction data to obtain the Moxin Instruct model. To improve the reasoning capability, we further finetune our model with chain-of-thought data distilled from DeepSeek R1, and then use Group Relative Policy Optimization, an efficient and effective reinforcement learning algorithm following DeepSeek R1, to finetune our model, leading to the Moxin Reasoning model.
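The core idea behind Group Relative Policy Optimization can be sketched in a few lines: several responses are sampled for the same prompt, and each response's advantage is its reward normalized against the group's mean and standard deviation, so no separate value model is needed. The snippet below is a simplified illustration, not the Moxin training code; the function name, the `eps` stabilizer, the use of the population standard deviation, and the sample rewards are all assumptions.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the other samples for the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std; eps avoids division by zero
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four sampled answers to one prompt
# (1.0 = correct final answer, 0.0 = incorrect).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Responses that score above the group average receive a positive advantage and are reinforced; those below receive a negative one. When every sample in a group gets the same reward, all advantages are zero and the group contributes no learning signal.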
  • May 7

    14:40 - 15:20

    Open-source Intelligent Computing Integrated Management and Utilization Foundational Software - SCOW and CraneSched

    AI Infra
    The Peking University Computing Center is dedicated to developing general foundational software for both supercomputing (HPC) and intelligent computing (AI computing). In the field of HPC and AI computing, it has developed several flagship foundational software systems, including SCOW and CraneSched. OpenSCOW (https://github.com/PKUHPC/OpenSCOW) provides a graphical user interface (GUI) that allows developers to flexibly manage supercomputing and AI computing resources for AI model training and inference. It has already been deployed across 56 computing centers, including 34 universities and 12 enterprises in China. CraneSched (https://github.com/PKUHPC/CraneSched) is a high-performance scheduling and orchestration system for HPC and AI computing tasks. It supports large-scale model training with exceptional performance and has been adopted by 8 universities and 1 enterprise in China.
  • May 7

    14:40 - 15:20

    OAKS: The Open Agentic AI Knowledge Stack

    AI Apps
    In this talk, we present an OSS AI architecture for Agentic AI+Knowledge. Encapsulating business knowledge is key for agents, and focusing on AI memory and scalable frameworks around Knowledge Graphs is a good foundation to build an OSS AI ecosystem for agents.
  • May 7

    14:40 - 15:20

    Learning from Human Demonstrations: A New Paradigm for Scalable Robot Data Acquisition

    Embodied AI
    Acquiring diverse and large-scale real-world robot data remains a critical bottleneck in training generalizable robotic action models. Efficient and scalable data collection has thus emerged as a key research focus in robotics. A widely used method is teleoperation, where humans either wear VR devices or operate a secondary robot to guide actions. While effective, these approaches are limited by hardware-specific constraints and require complex setups, hindering scalability. An emerging alternative is to learn directly from human demonstrations without relying on teleoperation hardware. This paradigm allows robots to acquire task-relevant motion data by observing or interpreting natural human movements, offering a more flexible and hardware-agnostic solution. In this talk, I will introduce a novel framework for robot data acquisition from human demonstrations. I will detail how it bypasses traditional teleoperation limitations and enables scalable learning across varied tasks and environments. By bridging the gap between human intent and robot execution, this method opens a promising direction for general-purpose robotic learning in the real world.
  • May 7

    14:40 - 15:00

    The Ultra-Scale Talk: Scaling Training to Thousands of GPUs

    PyTorch Day France
    Training large language models (LLMs) demands more than just raw compute—it requires infrastructure, strategy, and a deep understanding of parallelism. What begins as a single-GPU prototype must eventually scale across thousands of devices, each step introducing new complexity. This talk dives into the practicalities of ultra-scale training. We'll explore how 5D parallelism—spanning data, tensor, pipeline, context, and expert dimensions—makes it possible to stretch a single training run across massive GPU clusters. Along the way, we’ll cover performance tuning, communication patterns, and architecture choices that impact throughput and hardware efficiency. A key reference for this session is the Ultra-Scale Playbook, which distills best practices and hard-earned lessons from real-world LLM scaling efforts. We’ll walk through highlights of the playbook, tying them into case studies, benchmarks, and hands-on recommendations. Scaling isn’t just about size—it’s about doing more with what you have. This webinar offers a comprehensive look at what it really takes to train state-of-the-art models at scale, designed for engineers, researchers, and practitioners ready to move beyond “it fits on one GPU” toward infrastructure that powers trillion-parameter models—efficiently, and at speed.
  • May 7

    15:00 - 15:20

    Teaching Mistral to Reason: Post-Training with PyTorch and NVIDIA

    PyTorch Day France
    Post-training techniques have become essential as demand for Reasoning AI systems explodes. This talk provides a practical overview of how to enhance the reasoning capabilities of open-weight models—using Mistral as a working example. We’ll explore the full pipeline: sourcing high-quality reasoning datasets, selecting the right model checkpoints, and using tools that extend the functionality of PyTorch like NVIDIA NeMo and TensorRT-LLM. Whether you’re working on chatbots, agents, or task-specific models, you’ll leave with a clear understanding of the tools and workflows to take advantage of open models.
  • May 7

    15:20 - 15:40

    Afternoon Coffee

    PyTorch Day France
  • May 7

    15:40 - 16:20

    Small but Mighty: How MiniCPM Made Breakthroughs in the Global Open-Source AI Landscape

    AI Model
    MiniCPM, nicknamed 'ModelBest's Little Steel Cannon' (which includes the large language model MiniCPM and the multimodal large model MiniCPM-V), has gained widespread recognition in the global AI community due to its highly efficient and cost-effective nature, embodying the principle of 'punching above its weight'. These projects have cumulatively received over 26,000 stars on GitHub, with total downloads exceeding 7 million across the web, becoming benchmark works in the field of on-device AI.
  • May 7

    15:40 - 16:20

    Verl: Hybrid Controller-based RLHF System

    AI Infra
    verl is a flexible, efficient and production-ready RL training library for LLMs. This talk will share the ideas in designing a hybrid-controller system and the benefits of this system in efficient large-scale RL training.
  • May 7

    15:40 - 16:20

    Database for Agents Memory, The Right Way

    AI Apps
    In this session, we will explore best practices for leveraging serverless SQL databases to support the sophisticated memory requirements of AI agents. We will delve into essential technical requirements, including schema design considerations, efficient indexing strategies, consistency vs. availability trade-offs, handling real-time updates, and seamless integration with AI workflows. Additionally, we'll discuss common pitfalls, performance optimization techniques, and how to achieve cost-efficiency without sacrificing responsiveness or data integrity. Attendees will gain actionable insights into architecting robust, scalable memory storage solutions that enhance the capability, adaptability, and overall effectiveness of AI agents in production environments.
  • May 7

    15:40 - 16:20

    Human-AI Collaboration at the Edge

    Embodied AI
  • May 7

    15:40 - 16:00

    DeepSpeed – Efficient Training Scalability for Deep Learning Models

    PyTorch Day France
    Deep Learning (DL) is driving unprecedented progress across Artificial Intelligence domains, including natural language processing, vision, speech, and multimodal. Sustaining this rapid pace of AI revolution, however, requires practical solutions to the extreme demands of scaling on the compute, memory, communication, and storage components of modern computing hardware. To address this challenge, we created a deep learning optimization library called DeepSpeed to make distributed model training efficient, effective, and easy on commodity hardware. This talk will focus on DeepSpeed optimizations for improving compute, communication, and I/O of extreme-scale model training.
  • May 7

    16:00 - 16:20

    Advancing Mamba in PyTorch

    PyTorch Day France
    Mamba layers are efficient alternatives to standard attention: their training complexity is linear in sequence length, while inference is sequence-length-independent and only requires a small cache. I will discuss a selection of IBM's ongoing work in advancing the state of Mamba training in PyTorch, including: context-parallel training for long-sequence data, Mamba + mixture-of-experts support with expert parallelism, torch-native associative scan ops, and improved DTensor op support.
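As background for the associative-scan item, the sketch below shows why a linear recurrence of the form h_t = a_t * h_(t-1) + b_t can be computed with a scan: each step is an affine map, and composing affine maps is associative. This is a pure-Python illustration under simplifying assumptions (scalar state, zero initial state), not the torch-native op discussed in the talk; all function names are hypothetical.

```python
def combine(left, right):
    """Compose two affine steps h -> a*h + b, applying `left` first, then `right`."""
    a1, b1 = left
    a2, b2 = right
    return (a1 * a2, a2 * b1 + b2)


def sequential_recurrence(a, b, h0=0.0):
    """Reference implementation: one step at a time, O(n) sequential work."""
    h, out = h0, []
    for a_t, b_t in zip(a, b):
        h = a_t * h + b_t
        out.append(h)
    return out


def inclusive_scan(elems):
    """Hillis-Steele style inclusive scan: about log2(n) rounds.

    Within a round every position is updated independently, which is the
    part that parallel hardware can execute simultaneously.
    """
    elems = list(elems)
    step = 1
    while step < len(elems):
        nxt = list(elems)
        for i in range(step, len(elems)):
            nxt[i] = combine(elems[i - step], elems[i])
        elems = nxt
        step *= 2
    return elems


a = [0.5, 2.0, -1.0, 0.25, 3.0]
b = [1.0, 0.0, 2.0, -1.0, 0.5]

reference = sequential_recurrence(a, b)
# With h0 = 0, the hidden state is the offset term of each composed map.
scanned = [offset for _, offset in inclusive_scan(zip(a, b))]
print(reference, scanned)
```

Both paths produce the same hidden states; the scan version trades extra total work for a much shorter chain of dependent steps, which is what makes long sequences tractable on accelerators.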
  • May 7

    16:20 - 17:00

    Pre-training of Smol and Large LLM

    AI Model
    Explaining what's new in pre-training: optimization tricks, MoE, stability hacks, and handling long contexts—everything you need to build better LLMs.
  • May 7

    16:20 - 17:00

    Datasets and Infrastructure for DeepSeek-R1 Style Reinforcement Learning (GRPO)

    AI Infra
    We will walk through everything you need to know about the latest in reinforcement learning for LLMs, datasets and infrastructure, down to training your own small reasoning LLM that can write code locally.
  • May 7

    16:20 - 17:00

    Agentic Search

    AI Apps
    The talk covers basic concepts and use-cases of agentic search.
  • May 7

    16:20 - 17:00

    AI Empowers IoT Devices to Drive the Dual Engines of Industrial Transformation

    Embodied AI
    Amidst the contemporary surge of digital transformation, the symbiotic convergence of artificial intelligence (AI) and IoT devices has emerged as a pivotal catalyst for industrial evolution. AI's infusion of autonomous learning, intelligent decision-making, and seamless interaction capabilities into intelligent hardware has redefined the paradigm, elevating conventional tools to the status of sophisticated, intelligent collaborators. This technological metamorphosis is evident across a spectrum of applications, from the bespoke experiences delivered by smart home ecosystems to the pinpoint precision of operations within industrial automation frameworks. The ramifications of this fusion extend beyond mere enhancement; it has become a driving force propelling the digital reinvention of traditional industries and the emergence of new sectors. In this presentation, we will delve into the intricate dynamics of the integration trends between AI and IoT devices, explore groundbreaking technological innovations, examine a diverse array of application scenarios, and assess the profound and far-reaching impacts on industrial transformation. By doing so, we aim to peer into the future, where the potential for growth and innovation is boundless, and to chart a course that offers novel insights and strategic directions for the continued advancement of our industry.
  • May 7

    16:20 - 16:40

    Thunder: Supercharged PyTorch for Modern Hardware

    PyTorch Day France
    Modern GPUs like Hopper and Blackwell are fast, but only after careful optimization. Thunder compiles “education-style” PyTorch models into optimized, distributed PyTorch code. Through a composable plugin system, Thunder lets developers layer in kernel fusion, low-precision operations, memory optimizations, and flexible parallelism strategies, to achieve performance and scale while leaving the original PyTorch code unchanged. This talk will cover how Thunder bridges the gap between ease-of-use and peak performance, and enables teams to easily write custom code transformations to scale models efficiently, reduce GPU waste, and stay in control of their stack.
  • May 7

    16:40 - 17:00

    LLM Constrained Generation in PyTorch with Outlines

    PyTorch Day France
    Parsing errors, unexpected outputs. If you've felt the frustration of trying to wrangle LLMs into producing consistently formatted results, you've likely built complex post-processing pipelines and elaborate prompting schemes. What if there was a way to guarantee structured outputs without these workarounds? Enter structured outputs. In this talk, we'll explore how model outputs can be precisely constrained using formal specifications (e.g. JSON Schema), why this dramatically improves reliability, and how it reduces sensitivity to prompt engineering. We'll demonstrate advanced use cases using our open source library Outlines, which adds structured outputs to inference libraries such as `transformers` and `vllm`. By the end of the session, you'll understand how to implement these techniques in your applications today, enabling your models to generate flawless JSON with minimal latency overhead compared to unconstrained generation.
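The general mechanism behind constrained generation can be illustrated without any library: at each decoding step, tokens that would violate the format are masked out, and the model only chooses among the rest. The toy sketch below does not use the Outlines API; the vocabulary, the hand-written state machine, and the `toy_scores` stand-in for model logits are all hypothetical.

```python
import json

VOCAB = ["{", "}", '"ok"', ":", "true", "false", "hello"]

# Hand-written finite-state machine accepting {"ok":true} or {"ok":false}.
# Each state maps an allowed token to the next state.
TRANSITIONS = {
    0: {"{": 1},
    1: {'"ok"': 2},
    2: {":": 3},
    3: {"true": 4, "false": 4},
    4: {"}": 5},
}
FINAL_STATE = 5


def toy_scores(prefix):
    """Stand-in for model logits; it always prefers the off-format token 'hello'."""
    scores = {tok: 0.5 for tok in VOCAB}
    scores.update({"true": 1.0, "false": 2.0, "hello": 5.0})
    return scores


def constrained_greedy_decode():
    """Greedy decoding restricted to tokens the state machine allows."""
    state, tokens = 0, []
    while state != FINAL_STATE:
        allowed = TRANSITIONS[state]
        scores = toy_scores(tokens)
        # Masking step: only allowed tokens compete, whatever the raw scores say.
        best = max(allowed, key=lambda tok: scores[tok])
        tokens.append(best)
        state = allowed[best]
    return "".join(tokens)


output = constrained_greedy_decode()
parsed = json.loads(output)
print(output, parsed)
```

Unconstrained greedy decoding with these scores would emit "hello" at every step; with the mask applied, the result always parses as JSON. Real systems derive the state machine from a schema or grammar rather than writing it by hand.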
  • May 7

    17:00 - 18:00

    Spotlight Demos

    Embodied AI
    GOSIM AI Spotlight Finalists will present their projects in a short pitch.
  • May 7

    17:20 - 17:40

    To Be Announced

    PyTorch Day France
  • May 7

    17:40 - 18:00

    PyTorch x Transformers Journey: Pythonicity, Autodiff and Modularity Defining Modern AI

    PyTorch Day France
    The HuggingFace Transformers library is a flagship example of what makes PyTorch special: a dynamic, readable, and hackable framework that scales from quick experiments to production-ready architectures. It began as an implementation of BERT, continued to a "one model, one file" setup (ideal for iteration) and grew into a modular codebase now defining 315+ models. Transformers has become a reference implementation for the field: a source of truth for model architectures, behaviors, and pretraining conventions. Its evolution reflects PyTorch’s own: grounded in Pythonic values, but pragmatic enough to diverge when needed. PyTorch’s ecosystem has replaced entire toolchains. Scaling models has become simpler: torch.compile brings compiler-level speedups with minimal code changes, and new abstractions like DTensor offer serious performance gains without the low-level complexity. Both PyTorch and Transformers inherit Python’s spirit (clarity, flexibility, expressiveness) without being bound by it. PyTorch leans on ATen and C++ kernels under the hood; Transformers increasingly relies on optimized community kernels and hardware-aware implementations from the hub. Modularity and readability didn’t just improve maintainability; they grew the community. Lowering the barrier to entry encourages experimentation, contributions, and faster innovation. This talk tracks that journey, from how PyTorch enabled Transformers to how the virtuous cycle of design, performance, and pragmatism continues to shape the tools driving modern AI.
  • May 7

    18:00 - 21:00

    Social Gathering

    Embodied AI
  • May 7

    9:30 - 10:00

    Keynote

    AI Model
  • May 7

    10:00 - 10:30

    Morning Coffee

    AI Model
  • May 7

    10:30 - 11:10

    Multilingualism of Qwen: From Foundation Model to Applications

    AI Model
    Multilingual and cross-lingual capabilities significantly boost the flexibility and usefulness of large language models (LLMs). Using Qwen as an example, we'll explore methods to enhance multilingual performance in LLMs, including pre-training, post-training, and evaluation strategies. Additionally, we'll examine the real-world applications of these advancements, demonstrating how multilingual capabilities can create practical solutions that overcome language barriers and promote smooth communication.
  • May 7

    11:10 - 11:50

    Open Foundation Models: Scaling Laws and Generalization

    AI Model
    To study transferable learning and generalization, derivation of reproducible scaling laws is crucial. We highlight why open foundation models and datasets are essential for this research and highlight challenges in properly measuring generalization.
  • May 7

    12:00 - 14:00

    Lunch Break

    AI Model
  • May 7

    14:00 - 14:40

    Automated Proof Generation for Rust Code Via Self-Evolution

    AI Model
    Ensuring correctness is crucial for code generation. Formal verification offers a definitive assurance of correctness, but demands substantial human effort in proof construction and hence raises a pressing need for automation. The primary obstacle lies in the severe lack of data—there are much fewer proofs than code snippets for Large Language Models (LLMs) to train upon. In this paper, we introduce SAFE, a framework that overcomes the lack of human-written proofs to enable automated proof generation of Rust code. SAFE establishes a self-evolving cycle where data synthesis and fine-tuning collaborate to enhance the model capability, leveraging the definitive power of a symbolic verifier in telling correct proofs from incorrect ones. SAFE also re-purposes the large number of synthesized incorrect proofs to train the self-debugging capability of the fine-tuned models, empowering them to fix incorrect proofs based on the verifier’s feedback. SAFE demonstrates superior efficiency and precision compared to GPT-4o. Through tens of thousands of synthesized proofs and the self-debugging mechanism, we improve the capability of open-source models, initially unacquainted with formal verification, to automatically write proofs for Rust code. This advancement leads to a significant improvement in performance, achieving a 52.52% accuracy rate in a benchmark crafted by human experts, a significant leap over GPT-4o’s performance of 14.39%.
  • May 7

    14:40 - 15:20

    Demysifying LLM Training --- Towards Fully Open-source LLM from Pre-training to Reinforcement Learning

    AI Model
    Recently, Large Language Models (LLMs) have undergone a significant transformation, marked by a rapid rise in both their popularity and capabilities. Leading this evolution are proprietary LLMs like GPT-4 and GPT-o1, which have captured widespread attention in the AI community for their power and versatility. Simultaneously, open-source LLMs, such as LLaMA and Mistral, have made great contributions to the ever-increasing popularity of LLMs due to the ease to customize and deploy the models across various applications. Although LLMs offer unprecedented opportunities for research and innovation, its commercialization has raised concerns about transparency, reproducibility, and safety. Many open LLM models lack the necessary components (such as training code and data) for full understanding and reproducibility, and some use restrictive licenses whilst claiming to be “open-source”, which may hinder further innovations on LLMs. To mitigate this issue, we follow the Model Openness Framework (MOF), a ranked classification system that rates machine learning models based on their completeness and openness, following principles of open science, open source, open data, and open access. We present a truly open source LLM Moxin 7B and release pre-training code and configurations, training and fine-tuning data, and intermediate and final checkpoints, aiming to make continuous commitments to fully open-source LLMs. We also finetune the Moxin Base model with SOTA post-training framework and instruction data to obtain Moxin Instruct model. To improve the reasoning capability, we further finetune our model with chain-of-thought data distilled from DeepSeek R1, and then use Group Relative Policy Optimization, an efficient and effective reinforcement learning algorithm following DeepSeek R1, to finetune our model, leading to the Moxin Reasoning model.
  • May 7

    15:20 - 15:40

    Afternoon Coffee

    AI Model
  • May 7

    15:40 - 16:20

    Small but Mighty: How MiniCPM Made Breakthroughs in the Global Open-Source AI Landscape

    AI Model
    MiniCPM, nicknamed 'ModelBest's Little Steel Cannon'—which includes the large language model MiniCPM and the multimodal large model MiniCPM-V—has gained widespread recognition in the global AI community due to its highly efficient and cost-effective nature, embodying the principle of 'punching above its weight' These projects have cumulatively received over 26,000 stars on GitHub, with total downloads exceeding 7 million across the web, becoming benchmark works in the field of on-device AI.
  • May 7

    16:20 - 17:00

    Pre-training of Smol and Large LLM

    AI Model
    Explaining what's new in pre-training: optimization tricks, MoE, stability hacks, and handling long contexts—everything you need to build better LLMs.
  • May 7

    17:00 - 18:00

    Spotlight Demos

    AI Model
    GOSIM AI Spotlight Finalists will present their projects in a short pitch.
  • May 7

    18:00 - 21:00

    Social Gathering

    AI Model
  • May 7

    9:30 - 10:00

    Keynote

    AI Infra
  • May 7

    10:00 - 10:30

    Morning Coffee

    AI Infra
  • May 7

    10:30 - 11:10

    AI Open Source for Good: Inclusive Access, Equitable Data, and Accessible Compute

    AI Infra
    This talk unveils how open-source technologies act as catalysts for equitable AI across three pillars. First, inclusive access: we open-source voice datasets tailored for underrepresented groups, such as children and the elderly, to ensure multimodal AI systems understand diverse linguistic patterns and bridge generational divides. Second, equitable data: we have released nearly 100 globally accessible datasets, amassing over 680,000 downloads and empowering developers in any country to innovate freely. Third, accessible compute: we present FlagOS, open-source system software that facilitates AI development and deployment across diverse hardware ecosystems, including legacy GPUs and emerging accelerators, while significantly lowering the cost barrier to AI innovation. Collectively, these open-source efforts transform 'AI for Good' into a shared mission, breaking barriers of age, location, and resources to empower anyone to create and benefit from AI.
  • May 7

    11:10 - 11:50

    The Best Practice of Training and Inferencing on Ascend CANN

    AI Infra
    The AI-oriented, heterogeneous Compute Architecture for Neural Networks (CANN) is a key platform for improving the computing efficiency of Ascend AI processors. It serves as a bridge between upper-layer AI frameworks and lower-layer AI processors and programming. This talk focuses on the open-source ecosystem around CANN and shows how CANN helps AI software such as PyTorch and vLLM run efficiently on Ascend.
  • May 7

    12:00 - 14:00

    Lunch Break

    AI Infra
  • May 7

    14:00 - 14:40

    SGLang: Efficient LLM Serving Engine

    AI Infra
    SGLang is a fast serving engine for LLMs and VLMs. It's fully open-source, incubated by LMSYS Org, with 300+ contributors worldwide. In this talk, we will introduce the key features and performance optimizations in SGLang.
  • May 7

    14:40 - 15:20

    Open-source Intelligent Computing Integrated Management and Utilization Foundational Software - SCOW and CraneSched

    AI Infra
    The Peking University Computing Center is dedicated to developing general foundational software for both supercomputing (HPC) and intelligent computing (AI computing). In the field of HPC and AI computing, it has developed several flagship foundational software systems, including SCOW and CraneSched. OpenSCOW (https://github.com/PKUHPC/OpenSCOW) provides a graphical user interface (GUI) that allows developers to flexibly manage supercomputing and AI computing resources for AI model training and inference. It has already been deployed across 56 computing centers, including 34 universities and 12 enterprises in China. CraneSched (https://github.com/PKUHPC/CraneSched) is a high-performance scheduling and orchestration system for HPC and AI computing tasks. It supports large-scale model training with exceptional performance and has been adopted by 8 universities and 1 enterprise in China.
  • May 7

    15:20 - 15:40

    Afternoon Coffee

    AI Infra
  • May 7

    15:40 - 16:20

    Verl: Hybrid Controller-based RLHF System

    AI Infra
    verl is a flexible, efficient and production-ready RL training library for LLMs. This talk will share the ideas in designing a hybrid-controller system and the benefits of this system in efficient large-scale RL training.
  • May 7

    16:20 - 17:00

    Datasets and Infrastructure for DeepSeek-R1 Style Reinforcement Learning (GRPO)

    AI Infra
    We will walk through everything you need to know about the latest in reinforcement learning for LLMs, datasets and infrastructure, down to training your own small reasoning LLM that can write code locally.
  • May 7

    17:00 - 18:00

    Spotlight Demos

    AI Infra
    GOSIM AI Spotlight Finalists will present their projects in a short pitch.
  • May 7

    9:30 - 10:00

    Keynote

    AI Apps
  • May 7

    10:00 - 10:30

    Morning Coffee

    AI Apps
  • May 7

    10:30 - 11:10

    Finding the Scaling Law of Agents

    AI Apps
    This talk explores the potential of building scalable techniques to facilitate autonomous cooperation among communicative agents. To address the challenges of achieving autonomous cooperation, we propose a novel communicative agent framework named CAMEL. We showcase how role-playing can be used to generate conversational data for studying the behaviors and capabilities of a society of agents. In particular, we conduct comprehensive studies on cooperation in multi-agent settings. Our contributions include introducing a novel communicative agent framework, offering a scalable approach for studying the cooperative behaviors and capabilities of multi-agent systems, and open-sourcing our library to support research on communicative agents and beyond: https://github.com/camel-ai/camel.
  • May 7

    11:10 - 11:50

    OpenManus: Empowering LLM-based Agent Applications Via Framework and Capability Evolution

    AI Apps
    We introduce OpenManus, a lightweight and versatile LLM-based multi-agent framework evolved from MetaGPT, designed to enhance adaptability, autonomy, and scalability through advanced reasoning, planning, and effective cross-environment operation.
  • May 7

    12:00 - 14:00

    Lunch Break

    AI Apps
  • May 7

    14:00 - 14:40

    To Be Announced

    AI Apps
  • May 7

    14:40 - 15:20

    OAKS: The Open Agentic AI Knowledge Stack

    AI Apps
    In this talk, we present an OSS AI architecture for Agentic AI+Knowledge. Encapsulating business knowledge is key for agents, and focusing on AI memory and scalable frameworks around Knowledge Graphs is a good foundation to build an OSS AI ecosystem for agents.
  • May 7

    15:20 - 15:40

    Afternoon Coffee

    AI Apps
  • May 7

    15:40 - 16:20

    Database for Agents Memory, The Right Way

    AI Apps
    In this session, we will explore best practices for leveraging serverless SQL databases to support the sophisticated memory requirements of AI agents. We will delve into essential technical requirements, including schema design considerations, efficient indexing strategies, consistency vs. availability trade-offs, handling real-time updates, and seamless integration with AI workflows. Additionally, we'll discuss common pitfalls, performance optimization techniques, and how to achieve cost-efficiency without sacrificing responsiveness or data integrity. Attendees will gain actionable insights into architecting robust, scalable memory storage solutions that enhance the capability, adaptability, and overall effectiveness of AI agents in production environments.
  • May 7

    16:20 - 17:00

    Agentic Search

    AI Apps
    The talk covers basic concepts and use-cases of agentic search.
  • May 7

    17:00 - 18:00

    Spotlight Demos

    AI Apps
    GOSIM AI Spotlight Finalists will present their projects in a short pitch.
  • May 7

    18:00 - 21:00

    Social Gathering

    AI Apps
  • May 7

    9:30 - 10:00

    Keynote

    Embodied AI
  • May 7

    10:00 - 10:30

    Morning Coffee

    Embodied AI
  • May 7

    10:30 - 11:10

    How to Build Your Humanoid

    Embodied AI
    In 2021, a desperate need for human connection led me to create a 16-DOF data glove. After I open-sourced it, the glove found its way to Rob Knight, creator of the 16-DOF DexHand, who had just begun developing a humanoid with Rémi Cadène's LeRobot.
  • May 7

    11:10 - 11:50

    G1 Open Source Dataset and Humanoid Robot from Unitree Robotics

    Embodied AI
    With artificial intelligence technology advancing rapidly over the past two years, humanoid robots have become one of the most important forms for realizing embodied AI and AGI. Unitree has worked on legged robots for more than 8 years and on humanoid robots for 1 year. Realizing AGI depends on three key parts: algorithms, data, and computing capability. All three ultimately run on physical robots, so we believe that building a robust physical humanoid robot system is key to this ecosystem, and that the World Large-Scale Model (what most people call a foundation model) is key to bringing embodied AI to humanoid robots. We will share the most important progress made in industry and research over the past year, and look ahead to the progress we expect in the next few years. To promote the development of the global embodied AI industry, the Unitree G1 robot operation dataset has been open-sourced, adapted to a variety of open-source solutions, and is continuously updated.
  • May 7

    12:00 - 14:00

    Lunch Break

    Embodied AI
  • May 7

    14:00 - 14:40

    Pollen Robotics

    Embodied AI
    To Be Announced
  • May 7

    14:40 - 15:20

    Learning from Human Demonstrations: A New Paradigm for Scalable Robot Data Acquisition

    Embodied AI
    Acquiring diverse and large-scale real-world robot data remains a critical bottleneck in training generalizable robotic action models. Efficient and scalable data collection has thus emerged as a key research focus in robotics. A widely used method is teleoperation, where humans either wear VR devices or operate a secondary robot to guide actions. While effective, these approaches are limited by hardware-specific constraints and require complex setups, hindering scalability. An emerging alternative is to learn directly from human demonstrations without relying on teleoperation hardware. This paradigm allows robots to acquire task-relevant motion data by observing or interpreting natural human movements, offering a more flexible and hardware-agnostic solution. In this talk, I will introduce a novel framework for robot data acquisition from human demonstrations. I will detail how it bypasses traditional teleoperation limitations and enables scalable learning across varied tasks and environments. By bridging the gap between human intent and robot execution, this method opens a promising direction for general-purpose robotic learning in the real world.
  • May 7

    15:20 - 15:40

    Afternoon Coffee

    Embodied AI
  • May 7

    15:40 - 16:20

    Human-AI Collaboration at the Edge

    Embodied AI
  • May 7

    16:20 - 17:00

    AI Empowers IoT Devices to Drive the Dual Engines of Industrial Transformation

    Embodied AI
    Amidst the contemporary surge of digital transformation, the symbiotic convergence of artificial intelligence (AI) and IoT devices has emerged as a pivotal catalyst for industrial evolution. AI's infusion of autonomous learning, intelligent decision-making, and seamless interaction capabilities into intelligent hardware has redefined the paradigm, elevating conventional tools to the status of sophisticated, intelligent collaborators. This technological metamorphosis is evident across a spectrum of applications, from the bespoke experiences delivered by smart home ecosystems to the pinpoint precision of operations within industrial automation frameworks. The ramifications of this fusion extend beyond mere enhancement; it has become a driving force propelling the digital reinvention of traditional industries and the emergence of new sectors. In this presentation, we will delve into the intricate dynamics of the integration trends between AI and IoT devices, explore groundbreaking technological innovations, examine a diverse array of application scenarios, and assess the profound and far-reaching impacts on industrial transformation. By doing so, we aim to peer into the future, where the potential for growth and innovation is boundless, and to chart a course that offers novel insights and strategic directions for the continued advancement of our industry.
  • May 7

    17:00 - 18:00

    Spotlight Demos

    Embodied AI
    GOSIM AI Spotlight Finalists will present their projects in a short pitch.
  • May 7

    18:00 - 21:00

    Social Gathering

    Embodied AI
  • May 7

    9:30 - 10:00

    Keynote

    PyTorch Day France
    GOSIM Keynote
  • May 7

    10:00 - 10:30

    Morning Coffee

    PyTorch Day France
  • May 7

    10:30 - 10:50

    Welcome & Opening Remarks

    PyTorch Day France
  • May 7

    10:50 - 11:10

    To Be Announced

    PyTorch Day France
  • May 7

    11:10 - 11:30

    To Be Announced

    PyTorch Day France
  • May 7

    11:30 - 11:50

    Scaling LLM Inference with vLLM: Multi‑Accelerator Serving and Quantized LLMs

    PyTorch Day France
    vLLM has become the community-standard engine for low-latency LLM inference, achieving a 10× increase in usage in 2024 and surpassing 100,000 daily installs by January 2025. Supported by hundreds of contributors and productized through Red Hat AI, vLLM provides a vendor-neutral solution for serving cutting-edge models at scale. This talk outlines a practical blueprint for scaling LLM inference using vLLM, integrating both system-level techniques and model-level optimizations. We begin by addressing the challenges of deploying LLMs with chain-of-thought reasoning in production. Leveraging vLLM’s engine architecture, multi-accelerator deployments using tensor parallelism, paged attention scheduling, and prefill–decode disaggregation demonstrate how a single node can efficiently drive multiple AI accelerators, enhancing throughput without compromising latency. The second optimization layer focuses on quantization. Based on over 500,000 evaluations across language and vision-language models, we examine the accuracy–speed trade-offs of weight and activation quantization. We introduce new pathways that significantly reduce memory usage while maintaining model quality. Attendees will leave with data-driven insights and ready-to-use configurations for deploying state-of-the-art quantized models in scalable enterprise inference pipelines.
  • May 7

    12:00 - 14:00

    Lunch Break

    PyTorch Day France
  • May 7

    14:00 - 14:20

    Llama 4

    PyTorch Day France
    This presentation explores the development of Llama 4, a state-of-the-art foundation model designed to excel in various tasks. We will discuss its key features, including long-context and multimodal understanding. We will also examine Llama 4's potential uses in agentic settings, such as autonomous decision-making and human-AI collaboration, through real-world examples and case studies.
  • May 7

    14:20 - 14:40

    To Be Announced

    PyTorch Day France
  • May 7

    14:40 - 15:00

    The Ultra-Scale Talk: Scaling Training to Thousands of GPUs

    PyTorch Day France
    Training large language models (LLMs) demands more than just raw compute—it requires infrastructure, strategy, and a deep understanding of parallelism. What begins as a single-GPU prototype must eventually scale across thousands of devices, each step introducing new complexity. This talk dives into the practicalities of ultra-scale training. We'll explore how 5D parallelism—spanning data, tensor, pipeline, context, and expert dimensions—makes it possible to stretch a single training run across massive GPU clusters. Along the way, we’ll cover performance tuning, communication patterns, and architecture choices that impact throughput and hardware efficiency. A key reference for this session is the Ultra-Scale Playbook, which distills best practices and hard-earned lessons from real-world LLM scaling efforts. We’ll walk through highlights of the playbook, tying them into case studies, benchmarks, and hands-on recommendations. Scaling isn’t just about size—it’s about doing more with what you have. This webinar offers a comprehensive look at what it really takes to train state-of-the-art models at scale, designed for engineers, researchers, and practitioners ready to move beyond “it fits on one GPU” toward infrastructure that powers trillion-parameter models—efficiently, and at speed.
  • May 7

    15:00 - 15:20

    Teaching Mistral to Reason: Post-Training with PyTorch and NVIDIA

    PyTorch Day France
    Post-training techniques have become essential as demand for Reasoning AI systems explodes. This talk provides a practical overview of how to enhance the reasoning capabilities of open-weight models—using Mistral as a working example. We’ll explore the full pipeline: sourcing high-quality reasoning datasets, selecting the right model checkpoints, and using tools that extend the functionality of PyTorch like NVIDIA NeMo and TensorRT-LLM. Whether you’re working on chatbots, agents, or task-specific models, you’ll leave with a clear understanding of the tools and workflows to take advantage of open models.
  • May 7

    15:20 - 15:40

    Afternoon Coffee

    PyTorch Day France
  • May 7

    15:40 - 16:00

    DeepSpeed – Efficient Training Scalability for Deep Learning Models

    PyTorch Day France
    Deep Learning (DL) is driving unprecedented progress across Artificial Intelligence domains, including natural language processing, vision, speech, and multimodal. Sustaining this rapid pace of AI revolution, however, requires practical solutions to the extreme demands of scaling on the compute, memory, communication, and storage components of modern computing hardware. To address this challenge, we created a deep learning optimization library called DeepSpeed to make distributed model training efficient, effective, and easy on commodity hardware. This talk will focus on DeepSpeed optimizations for improving compute, communication, and I/O of extreme-scale model training.
  • May 7

    16:00 - 16:20

    Advancing Mamba in PyTorch

    PyTorch Day France
    Mamba layers are efficient alternatives to standard attention: their training complexity is linear in sequence length, while inference is sequence-length-independent and only requires a small cache. I will discuss a selection of IBM's ongoing work in advancing the state of Mamba training in PyTorch, including: context-parallel training for long-sequence data, Mamba + mixture-of-experts support with expert parallelism, torch-native associative scan ops, and improved DTensor op support.
  • May 7

    16:20 - 16:40

    Thunder: Supercharged PyTorch for Modern Hardware

    PyTorch Day France
    Modern GPUs like Hopper and Blackwell are fast, but only after careful optimization. Thunder compiles “education-style” PyTorch models into optimized, distributed PyTorch code. Through a composable plugin system, Thunder lets developers layer in kernel fusion, low-precision operations, memory optimizations, and flexible parallelism strategies, to achieve performance and scale while leaving the original PyTorch code unchanged. This talk will cover how Thunder bridges the gap between ease-of-use and peak performance, and enables teams to easily write custom code transformations to scale models efficiently, reduce GPU waste, and stay in control of their stack.
  • May 7

    16:40 - 17:00

    LLM Constrained Generation in PyTorch with Outlines

    PyTorch Day France
    Parsing errors, unexpected outputs. If you've felt the frustration of trying to wrangle LLMs into producing consistently formatted results, you've likely built complex post-processing pipelines and elaborate prompting schemes. What if there was a way to guarantee structured outputs without these workarounds? Enter structured outputs. In this talk, we'll explore how model outputs can be precisely constrained using formal specifications (e.g. JSON Schema), why this dramatically improves reliability, and how it reduces sensitivity to prompt engineering. We'll demonstrate advanced use cases using our open source library Outlines, which adds structured outputs to inference libraries such as `transformers` and `vllm`. By the end of the session, you'll understand how to implement these techniques in your applications today, enabling your models to generate flawless JSON with minimal latency overhead compared to unconstrained generation.
  • May 7

    17:00 - 17:20

    To Be Announced

    PyTorch Day France
  • May 7

    17:20 - 17:40

    To Be Announced

    PyTorch Day France
  • May 7

    17:40 - 18:00

    PyTorch x Transformers Journey: Pythonicity, Autodiff and Modularity Defining Modern AI

    PyTorch Day France
    The HuggingFace Transformers library is a flagship example of what makes PyTorch special: a dynamic, readable, and hackable framework that scales from quick experiments to production-ready architectures. It began as an implementation of BERT, continued to a "one model, one file" setup (ideal for iteration), and grew into a modular codebase now defining 315+ models. Transformers has become a reference implementation for the field: a source of truth for model architectures, behaviors, and pretraining conventions. Its evolution reflects PyTorch’s own: grounded in Pythonic values, but pragmatic enough to diverge when needed. PyTorch’s ecosystem has replaced entire toolchains. Scaling models has become simpler: torch.compile brings compiler-level speedups with minimal code changes, and new abstractions like DTensor offer serious performance gains without the low-level complexity. Both PyTorch and Transformers inherit Python’s spirit (clarity, flexibility, expressiveness) without being bound by it. PyTorch leans on ATen and C++ kernels under the hood; Transformers increasingly relies on optimized community kernels and hardware-aware implementations from the hub. Modularity and readability didn’t just improve maintainability; they grew the community. Lowering the barrier to entry encourages experimentation, contributions, and faster innovation. This talk tracks that journey, from how PyTorch enabled Transformers to how the virtuous cycle of design, performance, and pragmatism continues to shape the tools driving modern AI.

Paris, France

May 6-7, 2025

Paris, the City of Light, transforms into the City of Artificial Brilliance this May. GOSIM AI 2025 invites visionaries, disruptors, and pioneers to converge at Station F—a crucible of innovation—to shape the next frontier of AI.