Motion Imitation from Videos for Humanoid Robots
This project implements an end-to-end pipeline that converts videos of human motion into executable trajectories for humanoid robots. Inspired by SLoMo and VideoMimic, the system extracts 3D human pose from monocular video, retargets the motion to the robot's joints using motion imitation techniques, and deploys the resulting policies on real hardware via RoboJuDo. Motion imitation training runs in mjlab (an Isaac Lab API backed by MuJoCo-Warp), enabling efficient GPU-accelerated policy learning.
Video → 3D Pose → Motion Retargeting → Policy Training → Hardware Deployment
Demonstrated on Bharatanatyam dance, pick & place, and painting tasks
Deployed on Unitree G1 humanoid using RoboJuDo framework
Below are demonstrations of the motion imitation system on various tasks. The top row shows reference videos from which motions are extracted, and the bottom row shows the resulting robot motions in simulation.
The pipeline consists of three main stages:

1. Pose extraction: extract 3D human pose from monocular videos using state-of-the-art pose estimation models (in the spirit of PromptHMR and VideoMimic).
2. Motion retargeting and policy training: retarget the extracted motion to the robot's joints, then train tracking policies with mjlab's motion imitation framework; reference motions are preprocessed into NPZ files for GPU-accelerated training.
3. Hardware deployment: export trained policies to ONNX and deploy them through RoboJuDo's motion tracking controller for execution on the real robot.
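The retargeting stage maps estimated 3D keypoints onto robot joint angles. As a minimal sketch of the idea, the interior angle at a single joint can be recovered from three keypoints; real retargeters instead solve an optimization over the whole kinematic tree, and the function below is purely illustrative, not part of this project's API:

```python
import numpy as np

def joint_angle(parent, joint, child):
    """Interior angle (radians) at `joint` formed by keypoints parent-joint-child.

    A toy stand-in for full retargeting: actual pipelines fit all joints
    jointly, subject to the robot's joint limits and link lengths.
    """
    u = np.asarray(parent, dtype=float) - np.asarray(joint, dtype=float)
    v = np.asarray(child, dtype=float) - np.asarray(joint, dtype=float)
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.arccos(np.clip(cos, -1.0, 1.0))

# Straight arm: shoulder, elbow, wrist collinear -> pi (fully extended elbow).
print(joint_angle([0, 0, 0], [0.3, 0, 0], [0.6, 0, 0]))  # ~3.1416

# Arm bent 90 degrees at the elbow -> pi/2.
print(joint_angle([0, 0, 0], [0.3, 0, 0], [0.3, 0.3, 0]))  # ~1.5708
```

Per-joint angles computed this way would then be remapped to the robot's joint conventions and clipped to its limits before being used as reference motion.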
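The NPZ preprocessing step in stage 2 can be sketched as follows. The field names (`joint_pos`, `joint_vel`, `fps`) are illustrative assumptions, not mjlab's actual motion-file schema:

```python
import numpy as np

def save_reference_motion(path, joint_positions, joint_velocities, fps=30.0):
    """Pack a retargeted motion clip into an NPZ file for training.

    Field names are hypothetical; consult mjlab's motion loader for the
    schema it actually expects.
    """
    joint_positions = np.asarray(joint_positions, dtype=np.float32)
    joint_velocities = np.asarray(joint_velocities, dtype=np.float32)
    assert joint_positions.shape == joint_velocities.shape
    np.savez(
        path,
        joint_pos=joint_positions,   # (T, n_joints) target joint angles
        joint_vel=joint_velocities,  # (T, n_joints) finite-difference velocities
        fps=np.float32(fps),         # playback rate of the reference clip
    )

# Example: a 2-second clip for a 29-DoF humanoid at 30 fps.
T, n_joints = 60, 29
q = np.zeros((T, n_joints), dtype=np.float32)
dq = np.gradient(q, axis=0) * 30.0  # velocities via finite differences
save_reference_motion("reference_motion.npz", q, dq)

data = np.load("reference_motion.npz")
print(sorted(data.files))  # ['fps', 'joint_pos', 'joint_vel']
```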
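At deployment time, the exported ONNX policy sits inside a fixed-rate control loop. The sketch below shows the loop's shape with stub callables; the function names are illustrative, not RoboJuDo's actual API, and in a real deployment `policy` would wrap an ONNX Runtime session:

```python
import numpy as np

def run_control_loop(policy, get_obs, send_targets, steps):
    """Minimal deployment loop: each control tick reads the robot state,
    queries the policy, and streams joint targets to the controller.

    A real loop paces itself at the control frequency (e.g. 50 Hz);
    pacing is omitted in this stub.
    """
    for _ in range(steps):
        obs = get_obs()
        action = policy(obs)   # deployed: sess.run(None, {"obs": obs})
        send_targets(action)
    return steps

# Stub wiring: a scaling "policy", constant observations, a log of actions.
log = []
run_control_loop(
    policy=lambda obs: 0.5 * obs,
    get_obs=lambda: np.ones(29, dtype=np.float32),
    send_targets=log.append,
    steps=5,
)
print(len(log))  # 5
```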
An end-to-end pipeline for converting generative videos (e.g., Veo, Sora) into humanoid robot motions, covering pose extraction and retargeting.
VideoMimic: Visual Imitation Enables Contextual Humanoid Control (CoRL 2025 Best Student Paper). A real-to-sim pipeline for motion capture and humanoid policy training.
SLoMo: A General System for Legged Robot Motion Imitation from Casual Videos. Converts in-the-wild videos into robot motion primitives.
mjlab: an Isaac Lab API powered by MuJoCo-Warp for RL and robotics research; provides the GPU-accelerated motion imitation training pipeline used here.
RoboJuDo: a plug-and-play deployment framework for humanoid robots that supports multiple policies and enables execution on real hardware.