
Schedule: May 6

  • May 6

    9:30 - 10:00

    Keynote

  • May 6

    10:00 - 10:30

    Morning Coffee

  • May 6

    10:30 - 11:10

    Open-R1: A Fully Open Reproduction of DeepSeek-R1

    AI Model
    The recipe behind OpenAI’s reasoning models has been a well-kept secret. That is, until earlier this year, when DeepSeek released their DeepSeek-R1 model and promptly broke the internet. While a detailed technical report was published, many open questions remain, chief among them the training data, which was not released. Open-R1 is Hugging Face's fully open effort to replicate DeepSeek-R1, with a strong focus on reasoning data curation.
  • May 6

    10:30 - 11:10

    Streamlining AI App Development with Docker: Models and AI Tools That Just Work

    AI Infra
    Discover how Docker’s Model Runner enables fast, local AI inference with GPU support, and how Docker makes it easy to integrate LLMs and agents using MCP, with no complex setup required.
  • May 6

    10:30 - 11:10

    AG2: The Open-Source AgentOS for Agentic AI

    AI Apps
    This presentation will examine the trend of agentic AI and the fundamental design considerations for agentic AI programming frameworks. It will introduce a pioneering initiative, AG2, and explain its primary concepts and applications across a diverse range of tasks and industries.
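
    To make the agent-oriented programming model concrete, here is a minimal two-agent sketch in the style of AG2's (formerly AutoGen) public examples; the model name and configuration keys are illustrative assumptions, and the exact API may differ across AG2 versions.

    ```python
    # Minimal two-agent conversation in the style of AG2 (formerly AutoGen).
    # The model name and config keys are illustrative; credentials are read
    # from the environment. Check the AG2 docs for your installed version.
    from autogen import ConversableAgent

    assistant = ConversableAgent(
        name="assistant",
        system_message="You are a helpful assistant.",
        llm_config={"config_list": [{"model": "gpt-4o-mini"}]},  # assumed model
    )
    user = ConversableAgent(
        name="user",
        human_input_mode="ALWAYS",  # a human supplies each user turn
        llm_config=False,           # this agent does not call an LLM itself
    )

    # Kick off the conversation; AG2 alternates turns between the two agents.
    user.initiate_chat(assistant, message="What is agentic AI in one sentence?")
    ```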
  • May 6

    10:30 - 11:10

    Mind, Body and Zenoh

    Embodied AI
    As Robotics and Artificial Intelligence continue to evolve at an unprecedented pace, a critical gap has emerged in their ability to scale and operate seamlessly across diverse environments: the absence of an efficient, intelligent "nervous system." Much like biological organisms rely on their nervous system to connect the body and the brain, autonomous systems require a foundational layer that enables real-time communication, adaptability, and distributed cognition. This talk introduces Zenoh, a cutting-edge Open Source protocol that is rapidly gaining traction in the robotics community and beyond. Zenoh bridges the traditional divide between data in motion, data at rest, and computation, enabling seamless data exchange from the edge to the cloud. Zenoh is the missing link that unifies sensing, actuation, and cognition. It is, in essence, the nervous system for the age of intelligent robots.
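
    For a flavor of what this "nervous system" looks like in code, here is a minimal publish/subscribe sketch using the zenoh Python bindings; the key expression is made up for illustration, and payload-handling details vary between zenoh releases.

    ```python
    # Minimal Zenoh pub/sub sketch (zenoh Python bindings). The key
    # expression "robot/camera/frame" is illustrative; payload decoding
    # details differ slightly across zenoh versions.
    import time
    import zenoh

    def on_sample(sample):
        # Print whatever arrives on the subscribed key expression.
        print(f"{sample.key_expr} => {sample.payload}")

    with zenoh.open(zenoh.Config()) as session:
        sub = session.declare_subscriber("robot/camera/frame", on_sample)
        pub = session.declare_publisher("robot/camera/frame")
        pub.put("frame-0")   # publish one sample
        time.sleep(1)        # give the subscriber time to receive it
    ```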
  • May 6

    11:10 - 11:50

    OpenSeek: Collaborative Innovation for Next-Gen Models

    AI Model
    OpenSeek aims to unite the global open-source community to drive collaborative innovation in algorithms, data, and systems to develop next-generation models that surpass DeepSeek.
  • May 6

    11:10 - 11:50

    Make Your LLMs Serverless

    AI Infra
    LLMs require GPUs, causing scarcity. Overprovisioning them is expensive and a waste. Google Cloud Run now offers serverless GPU support, enabling cost-effective LLM deployment. A live demo will compare Gemma model performance with and without GPUs.
  • May 6

    11:10 - 11:50

    CangjieMagic: New Choices for Developers in the Age of Large Models

    AI Apps
    With the surge in popularity of AI large models, the trend of agent-oriented development in large model applications has become increasingly evident, and Agents are gradually becoming a core element of such applications. This talk presents an Agent development framework for AI large models based on the Cangjie programming language. The framework supports Agent-oriented programming, providing developers with an efficient Agent DSL. Its main features include support for the MCP protocol to facilitate mutual invocation between Agents and tools, support for modular invocation, and support for intelligent task planning. It boosts developer efficiency in creating smart HarmonyOS applications, delivers an exceptional development experience, and explores new paradigms for future large model application development.
  • May 6

    11:10 - 11:50

    Distributed Dataflows in Dora Using Zenoh

    Embodied AI
    The Dora framework makes it easy to create dataflows for robotics and AI. In this talk, we look into distributed dataflows that run on multiple machines and communicate through the network. Dora supports complex network topologies by using the Zenoh protocol. This way, it is possible to split Dora dataflows across private networks and cloud machines with minimal configuration.
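
    For context, a Dora node in Python is essentially a loop over input events, and the same node code runs whether the dataflow is local or distributed over Zenoh. The sketch below follows the event-loop pattern from dora-rs examples; the input/output ids are illustrative and the API may differ by version.

    ```python
    # A Dora node following the event-loop pattern from dora-rs examples.
    # The ids "image" and "detections" are illustrative: they are wired up
    # in the dataflow YAML, not in the node code itself.
    import pyarrow as pa
    from dora import Node

    node = Node()
    for event in node:
        if event["type"] == "INPUT" and event["id"] == "image":
            # ... run inference on event["value"] (an Arrow array) ...
            node.send_output("detections", pa.array([0]))  # placeholder result
    ```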
  • May 6

    12:00 - 14:00

    Lunch Break

  • May 6

    14:00 - 14:40

    Decode DeepSeek: the Technological Innovation and Its Influence on AI Ecosystem

    AI Model
    Recently, DeepSeek has attracted a great deal of attention with its outstanding technological innovations and is set to have a profound and extensive impact on the AI industry. This talk is divided into two parts. In the first part, we will walk through DeepSeek’s technological innovations top-down, including the paradigm shift in inference computing led by its open-source reinforcement learning solution, innovations in model architecture such as MLA and MoE, and performance optimizations in system engineering. In the second part, we will explore the transformations DeepSeek has triggered across the global AI ecosystem, including its key influences on AI applications, AI agents, the computing power landscape, and AI open-source initiatives.
  • May 6

    14:00 - 14:40

    Kubetorch: A Modern Kubernetes-Native Approach to ML Execution and Production

    AI Infra
    Mature organizations run ML workloads on Kubernetes, but implementations vary widely, and ML engineers rarely enjoy the streamlined development and deployment experiences that platform engineering teams provide for software engineers. Making small changes takes an hour to test, and moving from research to production frequently takes multiple weeks; these unergonomic and inefficient processes are unthinkable for software, but standard in ML. This talk introduces Kubetorch, a novel Kubernetes-native compute platform that offers developers a fast, iterable, and debuggable interface to powerful compute, without introducing new pitfalls of brittle infrastructure or long deployment times.
  • May 6

    14:00 - 14:40

    TONGYI Lingma: from Coding Copilot to Coding Agent Based on Qwen Models

    AI Apps
    This presentation will take the perspective of intelligent development in software engineering to introduce the latest technological advancements and product applications of Code LLMs, Coding Copilots, and Coding Agents, and to analyze and forecast future development trends.
  • May 6

    14:00 - 14:40

    Adversarial Safety-Critical Scenario Generation for Autonomous Driving

    Embodied AI
    Evaluating the decision-making system is indispensable in developing autonomous vehicles, and realistic, challenging safety-critical test scenarios play a crucial role. Obtaining these scenarios is non-trivial due to the long-tailed distribution, sparsity, and rarity of such events in real-world data sets. To tackle this problem, we introduce a natural adversarial scenario generation solution using naturalistic human driving priors and reinforcement learning. Our experiments on public data sets demonstrate that our proposed model can generate realistic safety-critical test scenarios that combine naturalness and adversariality, with a 44% efficiency gain over the baseline model.
  • May 6

    14:40 - 15:20

    Linear Next: The Evolution of LLM Architecture

    AI Model
    The Transformer architecture, despite its popularity, suffers from quadratic computational complexity. Recent advances in computing hardware, such as the progression from V100 to H200 GPUs, have temporarily alleviated this issue, reducing the immediate need for alternatives in the industry. Linear-complexity solutions for large models are still in the research phase and lack widespread validation in practical applications, so Transformer remains the preferred choice. However, as improvements in computing power slow down, the demand for architectures that surpass Transformer in efficiency will grow. Our team has developed Lightning Attention, a novel mechanism based on linear attention. By rearranging the QKV multiplication order (Q(KV)), Lightning Attention achieves linear computational complexity relative to sequence length. Experiments show it significantly outperforms the latest Transformers in both efficiency and performance, validated on a 456B MoE model (MiniMax 01). This innovation paves the way for more efficient large language models, offering new possibilities for future development.
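
    The complexity claim rests on the associativity of matrix multiplication: computing (QKᵀ)V costs O(n²d), while Q(KᵀV) costs O(nd²), which is linear in sequence length n. The tiny NumPy check below illustrates this (ignoring softmax, which linear attention replaces with a kernel feature map so that the reordering becomes valid).

    ```python
    # Associativity behind linear attention: Q @ (K^T @ V) equals
    # (Q @ K^T) @ V, but avoids materializing the n x n attention matrix,
    # so the cost is linear rather than quadratic in sequence length n.
    import numpy as np

    n, d = 1024, 64  # sequence length, head dimension
    rng = np.random.default_rng(0)
    Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))

    quadratic = (Q @ K.T) @ V   # O(n^2 d): n x n intermediate
    linear = Q @ (K.T @ V)      # O(n d^2): only a d x d intermediate

    assert np.allclose(quadratic, linear)
    ```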
  • May 6

    14:40 - 15:20

    RAGFlow: Leading the Open-Source Revolution in Enterprise-Grade Retrieval-Augmented Generation

    AI Infra
    RAGFlow tackles core RAG challenges—data quality, semantic gaps, low hit rates—with open-source solutions. This talk demonstrates enhanced retrieval, reasoning, and multimodal capabilities for robust enterprise AI applications.
  • May 6

    14:40 - 15:20

    Unifying AI Integration with Model Context Protocol

    AI Apps
    The Model Context Protocol (MCP) standardizes how AI models interact with external tools and resources through a structured client-server architecture, facilitating robust agent development. The engineering community worldwide keeps sharing MCP servers that enable client interactions that open truly remarkable and innovative applications. This talk delves into MCP's core capabilities and how the MCP Java SDK combined with Spring AI MCP can integrate AI with your existing resources and applications. Today's intelligent agents can understand context, guide decisions, and integrate seamlessly with external services. Through live coding and practical examples, we will illustrate how to implement both client and server components. By attending this session, you will gain a practical understanding of MCP's standardized interfaces and architectural best practices, empowering you to build and extend AI-powered applications with agent-like capabilities. Whether you're developing new AI-driven solutions or enhancing existing systems, this talk will equip you with the tools and strategies needed to leverage MCP effectively.
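
    For orientation, MCP messages are plain JSON-RPC 2.0. A client invokes a server-side tool with the `tools/call` method; the payload below shows the shape of such a request (the tool name and arguments are illustrative).

    ```python
    # Shape of an MCP tool-call request (JSON-RPC 2.0). "tools/call" is the
    # MCP method for invoking a tool exposed by a server; the tool name and
    # arguments here are illustrative.
    import json

    request = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "tools/call",
        "params": {
            "name": "get_weather",           # illustrative tool name
            "arguments": {"city": "Paris"},  # illustrative arguments
        },
    }
    print(json.dumps(request, indent=2))
    ```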
  • May 6

    14:40 - 15:20

    Integrating Feedback and Learning in Closed Loop: Toward General-Purpose Embodied AI Systems

    Embodied AI
    A crucial piece missing from the foundation model for Embodied AI (EAI) is plasticity—the ability to learn continually without human intervention. While the emergence of In-Context Learning (ICL) has been pivotal to the success of Large Language Models (LLMs), its limitations and underlying mechanisms remain underexplored. This talk illuminates the potential of large-scale meta-training, which prioritizes acquiring general-purpose ICL capabilities over mastering specific skills. We believe this technique could form a cornerstone of the next generation of general-purpose foundation models for EAI. Additionally, we introduce two open-source projects designed to advance the development of these foundation models.
  • May 6

    15:20 - 15:40

    Afternoon Coffee

  • May 6

    15:40 - 16:20

    The Curse of Depth in Large Language Models

    AI Model
    Large Language Models (LLMs) have demonstrated impressive achievements. However, recent research has shown that their deeper layers often contribute minimally, with effectiveness diminishing as layer depth increases. This pattern presents significant opportunities for model compression. In the first part of this seminar, we will explore how this phenomenon can be harnessed to improve the efficiency of LLM compression. Despite these opportunities, the underutilization of deeper layers leads to inefficiencies, wasting resources that could be better used to enhance model performance. The second part of the talk will address the root cause of this ineffectiveness in deeper layers and propose a solution. We identify the issue as stemming from the prevalent use of Pre-Layer Normalization (Pre-LN) and introduce LayerNorm Scaling to solve this issue.
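
    Based on the abstract, LayerNorm Scaling counteracts the variance growth that Pre-LN causes in deeper layers by damping each layer's normalization output with a depth-dependent factor. The sketch below assumes the 1/sqrt(layer_index) scaling described in the associated paper; treat it as an illustration rather than the authors' reference implementation.

    ```python
    # Sketch of LayerNorm Scaling: scale each Pre-LN layer norm's output by
    # 1 / sqrt(layer_index) to damp variance growth with depth. The exact
    # scaling rule is an assumption based on the paper's description.
    import math
    import torch
    import torch.nn as nn

    class ScaledLayerNorm(nn.Module):
        def __init__(self, hidden_size: int, layer_index: int):
            super().__init__()
            self.norm = nn.LayerNorm(hidden_size)
            self.scale = 1.0 / math.sqrt(layer_index)  # deeper => stronger damping

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return self.scale * self.norm(x)

    x = torch.randn(2, 16, 512)
    print(ScaledLayerNorm(512, layer_index=1)(x).std().item())   # ~1.0
    print(ScaledLayerNorm(512, layer_index=16)(x).std().item())  # ~0.25
    ```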
  • May 6

    15:40 - 16:20

    Khronos in the World of Open Source and Machine Learning

    AI Infra
  • May 6

    15:40 - 16:20

    Using AI to Vibe Code Rust UIs for Mobile, Web and Mixed Reality

    AI Apps
    In this talk I will show vibe-coding Makepad UIs and UI shaders with Makepad Studio and an LLM. Makepad Studio is our visual design and code environment, and the vision is to bring back Visual Basic, but now for a modern language: Rust.
  • May 6

    15:40 - 16:20

    Building Robotic Applications with Open-source VLA Models

    Embodied AI
    Ville shares the successes and challenges in using open-source Vision-Language-Action (VLA) models on robots, and provides a full-stack "starter guide" for building VLA-powered robotic applications in 2025.
  • May 6

    16:20 - 17:00

    Going Beyond Tokens for Code Large Language Models

    AI Model
    Tokenization in LLMs is the last bit of clunkiness in an otherwise elegant, highly-optimized architecture. This talk presents interesting avenues in tokenizer-free architecture to go "beyond tokens" in order to reduce latency and improve performance.
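
    One tokenizer-free direction is to operate directly on bytes: the model consumes the raw UTF-8 byte stream, so there is no vocabulary to train or maintain, at the cost of longer sequences (which is where the latency work comes in). A minimal illustration:

    ```python
    # Byte-level input: no learned vocabulary, just the UTF-8 byte stream.
    # Every string maps to ids in [0, 255], at the cost of longer sequences
    # for non-ASCII text.
    text = "def add(a, b): return a + b"
    byte_ids = list(text.encode("utf-8"))
    print(len(text), len(byte_ids))  # equal here: pure ASCII
    print(byte_ids[:8])              # [100, 101, 102, 32, 97, 100, 100, 40]
    ```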
  • May 6

    16:20 - 17:00

    WGML: the Story of Building a New High-performance, Cross-platform, On-device Inference Framework

    AI Infra
    Ever dreamed of writing a new low-level LLM inference library? Check out the making of WGML, an open-source high-performance cross-platform GPU inference framework using Rust and WebGPU. We will cover tips for discovering and implementing LLM inference.
  • May 6

    16:20 - 17:00

    Tech Together, Powered by Dify

    AI Apps
    Dify is a next-generation AI-native application development platform that bridges cutting-edge technology with real-world business value. Built on a robust open-source foundation, Dify integrates modern tech stacks—including LLM orchestration, RAG (Retrieval-Augmented Generation), fine-tuning tools, and multi-agent workflows—to simplify the creation, deployment, and scaling of AI applications. Our global developer community has become a hub for innovation, with thousands of contributors and enterprise adopters leveraging Dify to build everything from intelligent chatbots to enterprise-grade automation systems. This talk will highlight Dify’s open-source ecosystem and its role in accelerating AI adoption; key technologies powering the platform and how they solve real-world challenges; and success stories from developers and enterprises.
  • May 6

    16:20 - 17:00

    Spatial Reasoning LLM: Enhancing 2D & 3D Understanding for Robotic Manipulation and Navigation

    Embodied AI
    Robotic systems require advanced spatial reasoning for navigation and manipulation. We introduce a research project enhancing LLMs for spatial intelligence: AlphaMaze, solving 2D mazes with self-correction; AlphaSpace, interpreting object positions for robot hand manipulation via language; and AlphaVoxel, using 3D voxel space for object recognition and robot navigation.
  • May 6

    17:00 - 18:00

    Spotlight Demos

    GOSIM AI Spotlight Finalists will present their projects in a short pitch.
  • May 6

    18:00 - 21:00

    VIP Dinner


Paris, France

May 6-7, 2025

Paris, the City of Light, transforms into the City of Artificial Brilliance this May. GOSIM AI 2025 invites visionaries, disruptors, and pioneers to converge at Station F—a crucible of innovation—to shape the next frontier of AI.