Employment

NVIDIA
Research Scientist
Jan 2019 - (current)
Seattle, WA
Robotics Institute, Carnegie Mellon University
Visiting Scholar
Nov 2017 - Apr 2018
Pittsburgh, PA
Tencent
Software Engineer (Intern)
Jul 2010 - Sep 2010
Shenzhen, China

Education

The Chinese University of Hong Kong
Ph.D. in Electronic Engineering, advised by Prof. Xiaogang Wang and Prof. Wanli Ouyang
Fall 2014 - Spring 2018
Sun Yat-sen University
Master of Science in Computer Science, advised by Prof. Liang Lin
Fall 2011 - Spring 2014
Sun Yat-sen University
Bachelor of Engineering in Software Engineering
Fall 2007 - Spring 2011

Publications

Conferences

Dexplore: Scalable Neural Control for Dexterous Manipulation from Reference-Scoped Exploration
A scalable neural control framework for dexterous manipulation using reference-scoped exploration.
CoRL, 2025
VT-Refine: Learning Bimanual Assembly with Visuo-Tactile Feedback via Simulation Fine-Tuning
Learning bimanual assembly tasks using visuo-tactile feedback through simulation fine-tuning.
CoRL, 2025
Slot-Level Robotic Placement via Visual Imitation from Single Human Video
Teaching robots slot-level pick-place manipulation from a single human video.
In Submission
Cosmos World Foundation Model Platform for Physical AI
A world foundation model platform that helps developers build customized world models for Physical AI applications, with a video curation pipeline, pre-trained models, and post-training examples.
Whitepaper, 2025
FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects
A foundation model for 6D pose estimation and tracking of novel objects.
CVPR, 2024
ICRA, 2024
AnyTeleop: A General Vision-Based Dexterous Robot Arm-Hand Teleoperation System
A general vision-based teleoperation system for dexterous robot manipulation.
RSS, 2023
Learning Human-to-Robot Handovers from Point Clouds
Learning human-to-robot handovers directly from point cloud observations.
CVPR, 2023
Learning Robust Real-World Dexterous Grasping Policies via Implicit Shape Augmentation
Using implicit shape augmentation to learn robust dexterous grasping policies.
CoRL, 2022
IROS, 2022
ICRA, 2022
Goal-Auxiliary Actor-Critic for 6D Robotic Grasping with Point Clouds
Using goal-auxiliary learning for 6D robotic grasping.
CoRL, 2021
Reactive Human-to-Robot Handovers of Arbitrary Objects
A reactive approach for human-to-robot handovers.
ICRA, 2021
Human Grasp Classification for Reactive Human-to-Robot Handovers
A grasp classification approach for reactive human-to-robot handovers.
IROS, 2020
Collaborative Interaction Models for Optimized Human-Robot Teamwork
Optimizing human-robot teamwork through collaborative interaction models.
IROS, 2020
ICRA, 2020
Visual Semantic Navigation using Scene Priors
Using scene priors for visual semantic navigation.
ICLR, 2019
3D Human Pose Estimation in the Wild by Adversarial Learning
Using adversarial learning for 3D human pose estimation in unconstrained environments.
CVPR, 2018
Learning Feature Pyramids for Human Pose Estimation
A pyramid network architecture for human pose estimation.
ICCV, 2017
Identity-Aware Textual-Visual Matching with Latent Co-attention
S. Li, T. Xiao, H. Li, W. Yang, X. Wang
Using latent co-attention for identity-aware textual-visual matching.
ICCV, 2017
Towards Multi-Person Pose Tracking: Bottom-up and Top-down Methods
Bottom-up and top-down methods for multi-person pose tracking.
ICCV PoseTrack Workshop, 2017
Multi-Context Attention for Human Pose Estimation
Using multi-context attention mechanisms for human pose estimation.
CVPR, 2017
End-to-End Learning of Deformable Mixture of Parts and Deep Convolutional Neural Networks for Human Pose Estimation
W. Yang, W. Ouyang, H. Li, X. Wang
Learning deformable mixture of parts and CNNs end-to-end for pose estimation.
CVPR, 2016
Multi-task Recurrent Neural Network for Immediacy Prediction
X. Chu, W. Ouyang, W. Yang, X. Wang
Using multi-task RNNs for immediacy prediction.
ICCV, 2015
Clothing Co-Parsing by Joint Image Segmentation and Labeling
W. Yang, P. Luo, L. Lin
Joint image segmentation and labeling for clothing co-parsing.
CVPR, 2014
Data-Driven Scene Understanding by Adaptive Exemplar Retrieval
X. Liu, W. Yang, Q. Wang, L. Lin, J. Lai
Using adaptive exemplar retrieval for scene understanding.
ICME, 2014
Learning Contour-Fragment-based Shape Model with And-Or Tree Representation
L. Lin, X. Wang, W. Yang, J. Lai
Using And-Or tree representation for shape modeling.
CVPR, 2012
Interactive CT image segmentation with online discriminative learning
W. Yang, X. Wang, L. Lin, C. Gao
Interactive medical image segmentation with online learning.
ICIP, 2011

Journals

IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI), 37(5): 959-972, 2015

Mentoring

Mentored and worked with interns at NVIDIA:

Professional Activities

I have served as a reviewer for the following conferences and journals:

  • Computer Vision and Pattern Recognition (CVPR), 2018-2024
  • European Conference on Computer Vision (ECCV), 2018, 2020
  • International Conference on Computer Vision (ICCV), 2017, 2019, 2021, 2025
  • Conference on Robot Learning (CoRL), 2024, 2025
  • Robotics: Science and Systems (RSS), 2025
  • IEEE International Conference on Robotics and Automation (ICRA), 2021, 2022, 2024, 2025
  • IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2020, 2024
  • IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), 2022
  • Asian Conference on Computer Vision (ACCV), 2018
  • IEEE Conference on Virtual Reality and 3D User Interfaces (VR), 2018
  • International Joint Conference on Artificial Intelligence (IJCAI), 2017
  • IEEE Robotics and Automation Letters (IEEE R-AL)
  • IEEE Transactions on Robotics (IEEE T-RO)
  • IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
  • IEEE Transactions on Multimedia (TMM)
  • IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)
  • IEEE Transactions on Cybernetics (TCYB)
  • IEEE Transactions on Artificial Intelligence (TAI)
  • International Journal of Computer Vision (IJCV)
  • Elsevier Journal of Neurocomputing (NEUCOM)
  • Elsevier Journal of Pattern Recognition (PR)
  • Elsevier Journal of Computer Vision and Image Understanding (CVIU)
  • IET Image Processing

Teaching

Teaching assistant at CUHK for the following courses:

  • 2017, Spring. Introduction to Deep Learning (ELEG 5491).
  • 2016, Fall. Complex Analysis and Differential Equations (ENGG 2420A).
  • 2016, Spring. Probability and Statistics for Engineers (ENGG 2430D).
  • 2015, Fall. Complex Analysis and Differential Equations for Engineers (ENGG 2420A).
  • 2015, Summer. Solidworks.
  • 2014, Fall. Digital Circuits and Systems (ELEG 2201).

Selected Awards

  • IEEE ICRA Best Paper Award on Human-Robot Interaction, 2021.
  • PoseTrack Challenge 2017, 2nd place, 2017.
  • Tutor with Commendation, The Chinese University of Hong Kong, 2016/17.
  • Green Walkers Award, The Chinese University of Hong Kong, July 2017.
  • Scholarships
    • National Scholarship, 2012.
    • The Third Prize Scholarship, 2010.
    • The Second Prize Scholarship, 2008-2009.
  • Amway University IT Project Competition, Silver Medal, 2011.
  • Computer Programming Competition of Sun Yat-sen University, Third prize, 2009.

Talks

Certifications

Control of Mobile Robots
Georgia Institute of Technology
July 2020
Fundamentals of Agents
Hugging Face
May 2025