STRIDE: Physics-Guided Generative Dynamics

Abstract

We present STRIDE, a dynamics learning framework that explicitly separates conservative rigid-body mechanics from uncertain, stochastic non-conservative interaction effects. The structured component uses a Lagrangian Neural Network (LNN) to preserve energy-consistent inertial dynamics, while residual interaction forces are modeled using Conditional Flow Matching (CFM) to capture multi-modal contact phenomena. The two components are trained jointly end-to-end, enabling the model to retain physical structure while representing complex stochastic behavior.

20% Reduction

Improvement in long-horizon prediction error compared to deterministic baselines

30% Reduction

Improvement in contact force prediction error

Real-Time

3 ms inference time, enabling a 50 Hz control frequency on hardware

Method

STRIDE decomposes robot dynamics into two components:

q̈ = f_LNN(q, q̇, τ) + M^-1(q) ε_CFM

Figure 1: STRIDE combines a structured Lagrangian prior with a stochastic residual to capture interaction uncertainty while preserving physical consistency.
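As a minimal sketch of this decomposition, the snippet below adds a structured term to one sampled residual force mapped through the inverse mass matrix. It reads the result as a joint acceleration. The callables `lnn_accel`, `mass_matrix`, and `sample_residual`, along with the toy stand-ins, are hypothetical placeholders for the learned components, not the authors' implementation.

```python
import numpy as np

def stride_acceleration(q, qd, tau, lnn_accel, mass_matrix, sample_residual):
    """Combine the structured and residual terms: f_LNN + M^-1(q) eps_CFM."""
    structured = lnn_accel(q, qd, tau)      # conservative rigid-body term
    eps = sample_residual(q, qd, tau)       # one sampled residual force
    # Solve M x = eps rather than forming the explicit inverse.
    residual = np.linalg.solve(mass_matrix(q), eps)
    return structured + residual

# Toy stand-ins (assumptions, for illustration only).
def toy_lnn_accel(q, qd, tau):
    return tau - q                          # unit-stiffness spring driven by tau

def toy_mass_matrix(q):
    return 2.0 * np.eye(q.shape[0])         # constant diagonal inertia

def toy_residual(q, qd, tau):
    return np.full(q.shape[0], 0.5)         # fixed "sampled" force

q = np.array([1.0, 0.0])
qd = np.zeros(2)
tau = np.array([0.0, 1.0])
qdd = stride_acceleration(q, qd, tau, toy_lnn_accel, toy_mass_matrix, toy_residual)
```

Because the residual is sampled, repeated calls with a stochastic `sample_residual` would give a distribution of accelerations rather than a single value.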

Lagrangian Neural Network

Preserves energy-consistent dynamics by learning the kinetic and potential energy, with a mass matrix kept positive-definite through a Cholesky factorization
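One common way to obtain a positive-definite mass matrix from a Cholesky factor is sketched below. The softplus on the diagonal and the small `eps` offset are assumptions made for this illustration; the source only states that a Cholesky factorization is used. In a real model, `params` would be the output of a network evaluated at q.

```python
import numpy as np

def softplus(x):
    # Numerically stable log(1 + exp(x)); the output is strictly positive.
    return np.logaddexp(0.0, x)

def mass_matrix_from_params(params, n, eps=1e-6):
    """Build M = L L^T + eps*I from n*(n+1)/2 unconstrained parameters."""
    L = np.zeros((n, n))
    L[np.tril_indices(n)] = params          # fill the lower triangle row by row
    diag = np.diag_indices(n)
    L[diag] = softplus(L[diag])             # positive diagonal keeps L invertible
    return L @ L.T + eps * np.eye(n)

n = 3
rng = np.random.default_rng(0)
M = mass_matrix_from_params(rng.normal(size=n * (n + 1) // 2), n)
```

Any parameter vector yields a symmetric matrix with strictly positive eigenvalues, so the constraint holds by construction rather than by penalty.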

Flow Matching Residual

Captures stochastic contact forces (friction, impacts) efficiently using continuous transport maps instead of iterative diffusion
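The sketch below illustrates the two pieces of a flow matching model in general terms: the regression target used in training and the ODE integration used for sampling. It assumes the straight-line path between noise and data, which is a common choice; the source does not say which path or solver STRIDE uses, and `toy_velocity` is a hypothetical stand-in for a trained network.

```python
import numpy as np

def cfm_training_pair(x0, x1, t):
    """Point on the straight path from noise x0 to data x1, and its target velocity."""
    x_t = (1.0 - t) * x0 + t * x1
    target_velocity = x1 - x0
    return x_t, target_velocity

def sample_flow(velocity, x0, cond, num_steps=10):
    """Integrate dx/dt = velocity(x, t, cond) from t=0 to t=1 with Euler steps."""
    x = np.array(x0, dtype=float)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        x = x + dt * velocity(x, k * dt, cond)
    return x

# Toy velocity field (assumption): a constant drift equal to the conditioning vector.
def toy_velocity(x, t, cond):
    return cond

x_t, v_target = cfm_training_pair(np.zeros(2), np.ones(2), t=0.25)
sample = sample_flow(toy_velocity, np.zeros(2), cond=np.array([1.0, -2.0]))
```

Sampling cost is set by `num_steps` evaluations of the velocity field, which is why a short deterministic integration can be cheaper than a long iterative denoising chain.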

Results

83%

Rollout Error Reduction (Go1)

53%

Rollout Error Reduction (G1)

Long-Horizon Prediction Stability

30-Step Rollout Error

We evaluate multi-step rollout accuracy over H=30 steps. The black-box MLP (ONN) shows rapid, exponential error growth. DeLaN, a Lagrangian network, improves stability: its physical structure yields approximately linear error growth. STRIDE further reduces drift by capturing stochastic contact-induced variability, achieving 83% error reduction on Go1 and 53% on G1 compared to ONN.

Cumulative RMSE over the 30-step horizon. STRIDE (red) maintains the lowest error on both robots, compared with ONN (black-box MLP) and DeLaN (structured baseline).
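For readers who want to reproduce this kind of comparison, here is a hedged sketch of a cumulative rollout RMSE and a percentage error reduction. The exact metric definition is an assumption (RMSE over all steps up to each horizon step, averaged over state dimensions), and the numbers in the example are synthetic, not the reported data.

```python
import numpy as np

def cumulative_rmse(pred, true):
    """RMSE over all steps up to and including each horizon step.

    pred, true: arrays of shape (H, state_dim). Returns an array of shape (H,).
    """
    sq_err = np.mean((np.asarray(pred) - np.asarray(true)) ** 2, axis=1)
    running_mean = np.cumsum(sq_err) / np.arange(1, len(sq_err) + 1)
    return np.sqrt(running_mean)

def error_reduction_pct(baseline_err, model_err):
    """Percentage by which model_err is lower than baseline_err."""
    return 100.0 * (baseline_err - model_err) / baseline_err

# Synthetic example: a constant 0.1 offset in every state over H=30 steps.
H, state_dim = 30, 4
curve = cumulative_rmse(np.full((H, state_dim), 0.1), np.zeros((H, state_dim)))
reduction = error_reduction_pct(baseline_err=1.0, model_err=0.17)
```

Under this definition, an 83% reduction means the model's final-horizon error is 17% of the baseline's.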

Contact Force Prediction

Accurate contact force modeling is critical for legged locomotion. STRIDE captures sharp discontinuities at impact and swing-stance transitions, achieving ~30% force error improvement over DeLaN. Below: predicted vs ground-truth vertical ground reaction forces.

Unitree Go1 Quadruped

Contact forces across multiple gaits (trot, pronk, bound, pace). STRIDE accurately captures timing and magnitude of stance-swing transitions.

Unitree G1 Humanoid

Walking gait contact forces. STRIDE preserves sharp impact discontinuities that deterministic baselines smooth over.

Phase Portrait Quality

We analyze a 1-DoF pendulum near the unstable upright equilibrium, a sensitive region where deterministic predictors exhibit averaging bias. STRIDE preserves the correct phase-space topology, while the baselines show distortions.

Pendulum Phase Space

Near the unstable equilibrium (top), STRIDE captures the saddle structure correctly. DeLaN shows deviations, while ONN produces noisy flows. STRIDE also preserves the elliptical orbits around the stable equilibrium (bottom).
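The saddle and the elliptical orbits follow from linearizing the ideal pendulum, which the short sketch below makes concrete. It assumes a frictionless pendulum with the angle measured from the downward vertical; the function name and the default values of `g` and `length` are illustrative choices, not taken from the source.

```python
import math

def classify_pendulum_equilibrium(theta_eq, g=9.81, length=1.0):
    """Classify an equilibrium of theta'' = -(g/length) * sin(theta).

    The linearization about theta_eq has Jacobian [[0, 1], [a, 0]] with
    a = -(g/length) * cos(theta_eq), so its eigenvalues are +/- sqrt(a).
    Returns (kind, rate), where rate = sqrt(|a|).
    """
    a = -(g / length) * math.cos(theta_eq)
    rate = math.sqrt(abs(a))
    if a > 0:
        return "saddle", rate      # real eigenvalues +/- rate: one unstable direction
    if a < 0:
        return "center", rate      # imaginary eigenvalues +/- i*rate: closed orbits
    return "degenerate", rate

upright = classify_pendulum_equilibrium(math.pi)
hanging = classify_pendulum_equilibrium(0.0)
```

Near the saddle, small state errors grow at the rate returned for the upright case, which is why that region is a sensitive test of a learned model.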

Hardware Deployment

We deploy STRIDE within a Dreamer-MPC pipeline on a Unitree Go1. Inference takes 3 ms, which supports control at 50 Hz. We demonstrate zero-shot adaptation to 4 unseen terrains without retraining.
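The timing figures can be related with simple arithmetic, sketched below. It assumes one model inference per control tick, which the source does not state; a planner that queries the model several times per tick would use a proportionally larger share of the period.

```python
def control_budget(control_hz, inference_ms):
    """Return the control period in ms and the fraction of it used by one inference."""
    period_ms = 1000.0 / control_hz
    return period_ms, inference_ms / period_ms

period_ms, fraction = control_budget(control_hz=50, inference_ms=3.0)
```

At 50 Hz the period is 20 ms, so a single 3 ms inference occupies 15% of each control cycle.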

Real-World Deployment

The deployment demonstrates velocity tracking (0-2 m/s), elevation adaptation (up to 20°), and zero-shot terrain adaptation: high/low friction, 20° slopes, and muddy and grassy surfaces.

Hardware Demo Video

Unitree Go1

Quadruped

Unitree G1

Humanoid

Key Contributions

Novel architecture: First framework combining Lagrangian Neural Networks with Flow Matching residuals for robotics
Physical consistency: Preserves energy structure and positive-definite inertia while capturing stochastic effects
Efficient inference: Flow matching achieves ~10x faster sampling than diffusion alternatives
Hardware validation: Demonstrates real-time control at 50 Hz on a quadruped with zero-shot terrain adaptation
Scalability: Validated on systems from 1-DoF pendulum to 23-DoF humanoid