ET-SEED: Trajectory-Level SE(3) Equivariant Diffusion Model for Robot Manipulation

ET-SEED is a visual imitation learning algorithm that marries SE(3) equivariant visual representations with diffusion policies. (a) ET-SEED achieves higher data efficiency and better spatial generalization than baselines. (b) When the input object observation is rotated or translated, the output action sequence changes equivariantly. (c) Visualizations of simulation environments.

Abstract


Imitation learning methods such as diffusion policy have proven effective in a variety of robotic manipulation tasks. However, they require extensive demonstrations to achieve robustness and generalization. To reduce this reliance on demonstrations, we leverage spatial symmetry and propose ET-SEED, an efficient trajectory-level SE(3) equivariant diffusion model for generating action sequences in complex robot manipulation tasks. Previous equivariant diffusion models require per-step equivariance in the Markov process, and such a strong constraint makes the policy difficult to learn. We theoretically extend equivariant Markov kernels and simplify the condition for an equivariant diffusion process, which significantly improves the training efficiency of an end-to-end, trajectory-level SE(3) equivariant diffusion policy. We evaluate ET-SEED on representative robotic manipulation tasks involving rigid-body, articulated, and deformable objects. Experiments demonstrate the superior data efficiency and manipulation proficiency of our method, as well as its ability to generalize to unseen configurations from only a few demonstrations.
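To make the notion of trajectory-level equivariance concrete, the sketch below checks the property numerically on a toy example. It is not the ET-SEED model: `toy_policy`, `random_se3`, and `transform` are hypothetical helpers introduced only for illustration, and the "actions" are simplified to 3D waypoints rather than full SE(3) end-effector poses. The check is that transforming the observation and then running the policy gives the same result as running the policy and then transforming its output.

```python
import numpy as np


def random_se3(rng):
    """Sample a random rigid transform (rotation R, translation t)."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))      # fix column signs of the QR factor
    if np.linalg.det(q) < 0:         # force a proper rotation (det = +1)
        q[:, 0] = -q[:, 0]
    t = rng.normal(size=3)
    return q, t


def transform(points, R, t):
    """Apply x -> R x + t to every row of an (N, 3) array."""
    return points @ R.T + t


def toy_policy(points, horizon=4):
    """Hypothetical stand-in for a policy (NOT the ET-SEED network).

    Maps a point cloud to a short sequence of 3D waypoints that move from
    the centroid toward the point farthest from it. This construction is
    equivariant to rotations and translations by design.
    """
    centroid = points.mean(axis=0)
    offsets = points - centroid
    farthest = offsets[np.argmax(np.linalg.norm(offsets, axis=1))]
    steps = np.linspace(0.0, 1.0, horizon)[:, None]
    return centroid + steps * farthest


rng = np.random.default_rng(0)
cloud = rng.normal(size=(64, 3))     # synthetic object observation
R, t = random_se3(rng)

# Path 1: run the policy, then transform the action sequence.
actions_then_transform = transform(toy_policy(cloud), R, t)
# Path 2: transform the observation, then run the policy.
transform_then_actions = toy_policy(transform(cloud, R, t))

# For an equivariant policy, both paths agree up to floating-point error.
equivariance_error = np.abs(actions_then_transform - transform_then_actions).max()
print("max equivariance error:", equivariance_error)
```

A learned policy without built-in symmetry would generally fail this check on unseen poses unless it had seen enough transformed demonstrations, which is the demonstration reliance the abstract aims to reduce.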

Video



Method


Full Pipeline

Results


We evaluate our method on standard robot manipulation tasks in both simulation and real-world settings. The results demonstrate that our approach outperforms all baselines in terms of data efficiency and spatial generalization.


Simulation Results


Real World Results

Related Projects


BibTeX