Xiaofeng Gao

Ph.D. Candidate in Statistics at UCLA

Boelter Hall 9407
580 Portola Plaza
University of California, Los Angeles
Los Angeles, CA, 90095
Email: xfgao at ucla dot edu
[Google Scholar]   [GitHub]


I am a second-year Ph.D. candidate in the Department of Statistics at UCLA.

My research lies at the intersection of Robotics, Computer Vision, Machine Learning, and Cognitive Science. Currently, I am working in the Center for Vision, Cognition, Learning, and Autonomy (VCLA) under the supervision of Prof. Song-Chun Zhu. Before that, I obtained a bachelor's degree in Electronic Engineering from Fudan University.


05/2019: One paper was accepted to the ICML Workshop on Reinforcement Learning for Real Life. [Link]

03/2019: VRKitchen was covered by TechXplore. [Link]

03/2019: I was invited as a reviewer for IROS 2019.

03/2019: I passed the Oral Qualifying Exam and advanced to candidacy!

02/2019: I gave a poster presentation at the Third Annual Workshop on Naval Applications of Machine Learning.

09/2018: I served as a teaching assistant for "Stats 10: Introduction to Statistical Reasoning" in Fall 2018.

09/2017: I started my Ph.D. studies at UCLA.

03/2017: Our ICRA 2017 work was covered by New Scientist. [Link]


  • Learning Social Affordance Grammar from Videos: Transferring Human Interactions to Human-Robot Interactions ICRA'17

    Tianmin Shu, Xiaofeng Gao, Michael S. Ryoo, Song-Chun Zhu
    IEEE International Conference on Robotics and Automation (ICRA), 2017

    PDF Website
    In this paper, we present a general framework for learning a social affordance grammar as a spatiotemporal AND-OR graph (ST-AOG) from RGB-D videos of human interactions, and transfer the grammar to humanoids to enable real-time motion inference for human-robot interaction (HRI). Based on Gibbs sampling, our weakly supervised grammar learning can automatically construct a hierarchical representation of an interaction with long-term joint sub-tasks of both agents and short-term atomic actions of individual agents. On a new RGB-D video dataset with rich instances of human interactions, our experiments with Baxter simulation, human evaluation, and real Baxter tests demonstrate that the model learned from limited training data successfully generates human-like behaviors in unseen scenarios and outperforms both baselines.
  • VRKitchen: an Interactive 3D Environment for Learning Real Life Cooking Tasks RL4RealLife

    Xiaofeng Gao, Ran Gong, Tianmin Shu, Xu Xie, Shu Wang, Song-Chun Zhu
    ICML workshop on Reinforcement Learning for Real Life (RL4RealLife), 2019

    PDF Website
    One of the main challenges of applying reinforcement learning to real-world applications is the lack of realistic and standardized environments for training and testing AI agents. In this work, we design and implement a virtual reality (VR) system, VRKitchen, with integrated functions which i) enable embodied agents to perform real-life cooking tasks involving a wide range of object manipulations and state changes, and ii) allow human teachers to provide demonstrations for training agents. We also provide standardized evaluation benchmarks and data collection tools to facilitate broad use in research on learning real-life tasks. Video demos, code, and data will be available on the project website: sites.google.com/view/vr-kitchen.