RL in Robotics (Unitree A1)

Sim2Real Transfer Research with Wang Group (UCSD)

Description

This project addresses three problems:

Learning vision-guided locomotion in simulation using reinforcement learning
Learning to handle new situations never seen in simulation
Eventually transferring this ability from simulation to the real world

Achieving a robust policy that can execute complex tasks after sim2real transfer is one of the most
challenging problems in robotics, because the dynamics the agent trains on in simulation often differ
from those of the real world. To close this gap, a technique called Domain Randomization has been
introduced. It varies the environment parameters during training so that the policy is exposed to a
wide range of possible dynamics. The method is widely used for deploying robotic controllers in the
real world because of its computational simplicity and efficacy. An important aspect of domain
randomization recently explored by our group is delay randomization. Real robotic systems have limited
computational resources, yet the controller has to run in real time. Every step in the control process
of the real agent, from sensing to execution, introduces non-deterministic latency that affects the
execution of the policy. To account for this delay between sensing and execution, a Multi-Modal Delay
Randomization (MMDR) technique that models real-world delay in the simulation environment has been
introduced. In general, however, while domain randomization helps to create a robust controller able
to bridge the reality gap, the trade-off is that the policy tends to learn conservative behavior that
does not allow for complex tasks. Additionally, training an agent with many varying parameters
decreases sample efficiency and can even prevent the policy from converging…
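
The two ideas described above, randomizing dynamics parameters and randomizing sensing delay, can be illustrated with a minimal Python sketch. It is not the code used in this project or in the MMDR paper. The parameter names (`friction`, `payload_kg`, `motor_strength`), their ranges, the `DelayedObservation` helper, and the use of one buffer per sensor stream are all assumptions chosen for illustration.

```python
import random
from collections import deque
from dataclasses import dataclass


@dataclass
class DynamicsParams:
    # Hypothetical simulator parameters; a real setup would randomize many more.
    friction: float
    payload_kg: float
    motor_strength: float


def sample_dynamics(rng):
    """Draw one set of dynamics parameters (ranges are illustrative, not tuned)."""
    return DynamicsParams(
        friction=rng.uniform(0.4, 1.25),
        payload_kg=rng.uniform(0.0, 3.0),
        motor_strength=rng.uniform(0.8, 1.2),
    )


class DelayedObservation:
    """Hypothetical helper: returns an observation that is 0..max_delay_steps old."""

    def __init__(self, max_delay_steps, rng):
        self.max_delay_steps = max_delay_steps
        self.rng = rng
        # Keep just enough history to serve the largest possible delay.
        self.buffer = deque(maxlen=max_delay_steps + 1)

    def reset(self, first_obs):
        # Fill the history with the first observation so early steps are defined.
        self.buffer.clear()
        for _ in range(self.max_delay_steps + 1):
            self.buffer.append(first_obs)

    def step(self, new_obs):
        # Store the fresh observation, then hand back a randomly delayed one.
        self.buffer.append(new_obs)
        delay = self.rng.randint(0, self.max_delay_steps)
        return self.buffer[-1 - delay], delay


if __name__ == "__main__":
    rng = random.Random(0)
    print(sample_dynamics(rng))  # resampled at the start of every episode

    # One buffer per sensor stream, each with its own (assumed) delay range.
    proprio = DelayedObservation(max_delay_steps=2, rng=rng)
    depth = DelayedObservation(max_delay_steps=5, rng=rng)
    proprio.reset(0)
    depth.reset(0)
    for t in range(1, 6):
        print(t, proprio.step(t), depth.step(t))
```

In this sketch the policy would receive the delayed observations rather than the fresh ones, so it is trained to act on stale data. Giving each sensor stream its own delay range reflects the idea that different sensors lag by different amounts; the specific ranges here are placeholders.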

More info:

MMDR paper (IROS 2022, Tokyo): https://arxiv.org/pdf/2109.14549.pdf

MMDR + PAD paper: ECE 276C Final Report

Skills Used

Networking
Embedded Linux
Nvidia Isaac Simulation
PyTorch
Reinforcement Learning
Controls
LCM
Docker
Kubernetes
Sim2Real Transfer
Technical Documentation
Scientific Method
Hardware Debugging
ROS
C++
Computer Vision
Teamwork