Employment
NVIDIA
Research Scientist
Jan 2019 - Present
Seattle, WA
Robotics Institute, Carnegie Mellon University
Visiting Scholar
Nov 2017 - Apr 2018
Pittsburgh, PA
Tencent
Software Engineer (Intern)
Jul 2010 - Sep 2010
Shenzhen, China
Education
The Chinese University of Hong Kong
Ph.D. in Electronic Engineering, advised by Prof. Xiaogang Wang and Prof. Wanli Ouyang
Fall 2014 - Spring 2018
Sun Yat-sen University
Master of Science in Computer Science, advised by Prof. Liang Lin
Fall 2011 - Spring 2014
Sun Yat-sen University
Bachelor of Engineering in Software Engineering
Fall 2007 - Spring 2011
Publications
Conferences
Dexplore: Scalable Neural Control for Dexterous Manipulation from Reference-Scoped Exploration
A scalable neural control framework for dexterous manipulation using reference-scoped exploration.
CoRL, 2025
VT-Refine: Learning Bimanual Assembly with Visuo-Tactile Feedback via Simulation Fine-Tuning
Learning bimanual assembly tasks using visuo-tactile feedback through simulation fine-tuning.
CoRL, 2025
Slot-Level Robotic Placement via Visual Imitation from Single Human Video
Teaching robots slot-level pick-and-place manipulation from a single human video.
In Submission
Cosmos World Foundation Model Platform for Physical AI
A world foundation model platform that helps developers build customized world models for Physical AI applications, with a video curation pipeline, pre-trained models, and post-training examples.
Whitepaper, 2025
FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects
A foundation model for 6D pose estimation and tracking of novel objects.
CVPR, 2024
SynH2R: Synthesizing Hand-Object Motions for Learning Human-to-Robot Handovers
Using synthetic data to learn human-to-robot handovers.
ICRA, 2024
AnyTeleop: A General Vision-Based Dexterous Robot Arm-Hand Teleoperation System
A general vision-based teleoperation system for dexterous robot manipulation.
RSS, 2023
Learning Human-to-Robot Handovers from Point Clouds
Learning human-to-robot handovers directly from point cloud observations.
CVPR, 2023
Learning Robust Real-World Dexterous Grasping Policies via Implicit Shape Augmentation
Using implicit shape augmentation to learn robust dexterous grasping policies.
CoRL, 2022
Learning Perceptual Concepts by Bootstrapping from Human Queries
Learning perceptual concepts through human interaction.
IROS, 2022
Model Predictive Control for Fluid Human-to-Robot Handovers
Using MPC for smooth human-to-robot handovers.
ICRA, 2022
HandoverSim: A Simulation Framework and Benchmark for Human-to-Robot Object Handovers
A simulation framework for human-to-robot handovers.
ICRA, 2022
Goal-Auxiliary Actor-Critic for 6D Robotic Grasping with Point Clouds
Using goal-auxiliary learning for 6D robotic grasping.
CoRL, 2021
DexYCB: A Benchmark for Capturing Hand Grasping of Objects
A benchmark dataset for hand-object interaction.
CVPR, 2021
Reactive Human-to-Robot Handovers of Arbitrary Objects
A reactive approach for human-to-robot handovers.
ICRA, 2021
Human Grasp Classification for Reactive Human-to-Robot Handovers
A grasp classification approach for reactive human-to-robot handovers.
IROS, 2020
Collaborative Interaction Models for Optimized Human-Robot Teamwork
Optimizing human-robot teamwork through collaborative interaction models.
IROS, 2020
DexPilot: Vision Based Teleoperation of Dexterous Robotic Hand-Arm System
A vision-based teleoperation system for dexterous manipulation.
ICRA, 2020
Visual Semantic Navigation using Scene Priors
Using scene priors for visual semantic navigation.
ICLR, 2019
3D Human Pose Estimation in the Wild by Adversarial Learning
Using adversarial learning for 3D human pose estimation in unconstrained environments.
CVPR, 2018
Learning Feature Pyramids for Human Pose Estimation
A pyramid network architecture for human pose estimation.
ICCV, 2017
Identity-Aware Textual-Visual Matching with Latent Co-attention
Using latent co-attention for identity-aware textual-visual matching.
ICCV, 2017
Towards Multi-Person Pose Tracking: Bottom-up and Top-down Methods
Bottom-up and top-down methods for multi-person pose tracking.
ICCV PoseTrack Workshop, 2017
Multi-Context Attention for Human Pose Estimation
Using multi-context attention mechanisms for human pose estimation.
CVPR, 2017
End-to-End Learning of Deformable Mixture of Parts and Deep Convolutional Neural Networks for Human Pose Estimation
Learning deformable mixture of parts and CNNs end-to-end for pose estimation.
CVPR, 2016
Multi-task Recurrent Neural Network for Immediacy Prediction
Using multi-task RNNs for immediacy prediction.
ICCV, 2015
Clothing Co-Parsing by Joint Image Segmentation and Labeling
Joint image segmentation and labeling for clothing co-parsing.
CVPR, 2014
Data-Driven Scene Understanding by Adaptive Exemplar Retrieval
Using adaptive exemplar retrieval for scene understanding.
ICME, 2014
Learning Contour-Fragment-based Shape Model with And-Or Tree Representation
Using And-Or tree representation for shape modeling.
CVPR, 2012
Interactive CT Image Segmentation with Online Discriminative Learning
Interactive medical image segmentation with online learning.
ICIP, 2011
Journals
Journal of Open Source Software (JOSS), January 2025
Pattern Recognition (PR), 2019
IEEE Transactions on Multimedia (T-MM), 2016
IEEE Transactions on Cybernetics (T-Cybernetics), 2015
IEEE Multimedia, 2015
IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI), 37(5): 959-972, 2015
Mentoring
Mentored and worked with interns at NVIDIA:
- Sirui Xu, PhD at University of Illinois at Urbana-Champaign
- Binghao Huang, PhD at Columbia University
- Dandan Shan, PhD at University of Michigan
- Sammy Christen, Postdoc at Disney Research
- Yuzhe Qin, Co-founder of Dexmate
- Qiuyu Chen, University of Washington
- Andreea Bobu, Assistant Professor at MIT
- Tao Chen, Co-founder of Dexmate
- Lirui Wang, Researcher at OpenAI
- Adam Fishman, Technical Staff at OpenAI
Professional Activities
I have served as a reviewer for the following conferences and journals:
- Computer Vision and Pattern Recognition (CVPR), 2018-2024
- European Conference on Computer Vision (ECCV), 2018, 2020
- International Conference on Computer Vision (ICCV), 2017, 2019, 2021, 2025
- Conference on Robot Learning (CoRL), 2024, 2025
- Robotics: Science and Systems (RSS), 2025
- IEEE International Conference on Robotics and Automation (ICRA), 2021, 2022, 2024, 2025
- IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2020, 2024
- IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), 2022
- Asian Conference on Computer Vision (ACCV), 2018
- IEEE Conference on Virtual Reality and 3D User Interfaces (VR), 2018
- International Joint Conference on Artificial Intelligence (IJCAI), 2017
- IEEE Robotics and Automation Letters (IEEE R-AL)
- IEEE Transactions on Robotics (IEEE T-RO)
- IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
- IEEE Transactions on Multimedia (TMM)
- IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)
- IEEE Transactions on Cybernetics (TCYB)
- IEEE Transactions on Artificial Intelligence (TAI)
- International Journal of Computer Vision (IJCV)
- Elsevier Journal of Neurocomputing (NEUCOM)
- Elsevier Journal of Pattern Recognition (PR)
- Elsevier Journal of Computer Vision and Image Understanding (CVIU)
- IET Image Processing
Teaching
Teaching assistant at CUHK for the following courses:
- 2017, Spring. Introduction to Deep Learning (ELEG 5491).
- 2016, Fall. Complex Analysis and Differential Equations (ENGG 2420A).
- 2016, Spring. Probability and Statistics for Engineers (ENGG 2430D).
- 2015, Fall. Complex Analysis and Differential Equations for Engineers (ENGG 2420A).
- 2015, Summer. Solidworks.
- 2014, Fall. Digital Circuits and Systems (ELEG 2201).
Selected Awards
- IEEE ICRA Best Paper Award on Human-Robot Interaction, 2021.
- PoseTrack Challenge 2017, 2nd place, 2017.
- Tutor with Commendation, The Chinese University of Hong Kong, 2016/17.
- Green Walkers Award, The Chinese University of Hong Kong, July 2017.
- Scholarships:
  - National Scholarship, 2012.
  - The Third Prize Scholarship, 2010.
  - The Second Prize Scholarship, 2008-2009.
- Amway University IT Project Competition, Silver Medal, 2011.
- Computer Programming Competition of Sun Yat-sen University, Third prize, 2009.
Talks
- Vision-Based Human-to-Robot Object Handovers (in Chinese), TechBeat, 2021
- Human Pose Estimation with Deep Learning, SIAT, Shenzhen, China, 2018
- Human Pose Estimation with Deep Learning, VALSE, virtual, 2018