Shaofei Cai · Researcher at DeepSeek AI

Selected Publications (view all)

Scalable Multi-Task Reinforcement Learning for Generalizable Spatial Intelligence in Visuomotor Agents

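Shaofei Cai*, Zhancun Mu*, Haiwen Xia, Bowei Zhang, Anji Liu, Yitao Liang (* equal contribution)

arXiv 2025

The first multi-task policy in the Minecraft world trained with reinforcement learning, demonstrating zero-shot generalization to other 3D domains.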

[Paper] [Code] [Page]

ROCKET-2: Steering Visuomotor Policy via Cross-View Goal Alignment

Shaofei Cai, Zhancun Mu, Anji Liu, Yitao Liang

Proceedings of the AAAI Conference on Artificial Intelligence (AAAI'26) 2025

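We aim to develop a goal specification method that is semantically clear, spatially sensitive, and intuitive for users, in order to guide agent interactions in embodied environments. Specifically, we propose a novel cross-view goal alignment framework that allows users to specify target objects using segmentation masks from their own camera views rather than from the agent's observations.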

[Paper] [Code] [Page]

ROCKET-1: Mastering Open-World Interaction with Visual-Temporal Context Prompting

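Shaofei Cai, Zihao Wang, Kewei Lian, Zhancun Mu, Xiaojian Ma, Anji Liu, Yitao Liang

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR'25) 2025

We propose visual-temporal context prompting, a novel communication protocol between VLMs and policy models. The protocol uses object segmentation from past observations to guide the policy's interaction with the environment. Building on it, we train ROCKET-1, a low-level policy that predicts actions from concatenated visual observations and segmentation masks, supported by real-time object tracking from SAM-2.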

[Paper] [Code] [Demo] [Page] [Video] [Twitter]

GROOT: Learning to Follow Instructions by Watching Gameplay Videos

Shaofei Cai, Bowei Zhang, Zihao Wang, Xiaojian Ma, Anji Liu, Yitao Liang

International Conference on Learning Representations (ICLR'24) 2024 Spotlight Top 6.2%

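We study the problem of building a controller that can follow open-ended instructions in open-world environments. We propose using reference videos as instructions, which offer expressive goal specifications while eliminating the need for expensive text-gameplay annotations. We derive a new learning framework that learns such instruction-following controllers from gameplay videos while also producing a video instruction encoder that induces a structured goal space.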

[Paper] [Code] [Page] [Twitter]

Automatic Relation-aware Graph Network Proliferation

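Shaofei Cai, Liang Li, Xinzhe Han, Jiebo Luo, Zheng-Jun Zha, Qingming Huang

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR'22) 2022 Oral Top 4.2%

This paper proposes Automatic Relation-aware Graph Network Proliferation (ARGNP), which efficiently searches for graph neural networks using a relation-guided message passing mechanism. Specifically, we first design a novel dual relation-aware graph search space comprising both node and relation learning operations. These operations extract hierarchical node and relational information and provide anisotropic guidance for message passing on a graph. Second, by analogy with cell proliferation, we design a network proliferation search paradigm that progressively determines the GNN architecture by iteratively performing network division and differentiation.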

[Paper] [Code] [Poster]

Education

Experience

Honors & Awards

News
