Supported Algorithms
OmniSafe offers a highly modular framework that integrates an extensive collection of algorithms designed for Safe Reinforcement Learning (SafeRL) across various domains. The Adapter module in OmniSafe makes it easy to extend the framework with additional types of SafeRL algorithms.
| Domains | Types | Algorithms Registry |
|---|---|---|
| On Policy | Primal-Dual | TRPOLag; PPOLag; PDO; RCPO; TRPOPID; CPPOPID |
| On Policy | Convex Optimization | CPO; PCPO; FOCOPS; CUP |
| On Policy | Penalty Function | IPO; P3O |
| On Policy | Primal | OnCRPO |
| Off Policy | Primal-Dual | DDPGLag; TD3Lag; SACLag; DDPGPID; TD3PID; SACPID |
| Off Policy | Control Barrier Function | DDPGCBF; SACRCBF; CRABS |
| Model-based | Online Plan | SafeLOOP; CCEPETS; RCEPETS |
| Model-based | Pessimistic Estimate | CAPPETS |
| Offline | Q-Learning Based | BCQLag; C-CRR |
| Offline | DICE Based | COptDICE |
| Other Formulation MDP | ET-MDP | PPOEarlyTerminated; TRPOEarlyTerminated |
| Other Formulation MDP | SauteRL | PPOSaute; TRPOSaute |
| Other Formulation MDP | SimmerRL | PPOSimmerPID; TRPOSimmerPID |
Table 1: OmniSafe supports a wide variety of SafeRL algorithms. From the perspective of classic RL, OmniSafe includes on-policy, off-policy, offline, and model-based algorithms; from the perspective of the SafeRL learning paradigm, OmniSafe supports primal-dual, projection, penalty function, and primal methods, among others.
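To make the two-level grouping in Table 1 easier to query, the sketch below transcribes the table into a plain Python dictionary. It is only an illustration and is not part of OmniSafe's API: the names `TABLE_1`, `classify`, and `algorithms_in` are hypothetical helpers introduced here, and the PID variants are assumed to belong to the Primal-Dual type of their domain, as the table layout suggests.

```python
# Plain-Python transcription of Table 1 (illustrative; not OmniSafe's API).
# Layout: domain -> type -> list of algorithm names, in table order.
TABLE_1 = {
    "On Policy": {
        "Primal-Dual": ["TRPOLag", "PPOLag", "PDO", "RCPO", "TRPOPID", "CPPOPID"],
        "Convex Optimization": ["CPO", "PCPO", "FOCOPS", "CUP"],
        "Penalty Function": ["IPO", "P3O"],
        "Primal": ["OnCRPO"],
    },
    "Off Policy": {
        "Primal-Dual": ["DDPGLag", "TD3Lag", "SACLag", "DDPGPID", "TD3PID", "SACPID"],
        "Control Barrier Function": ["DDPGCBF", "SACRCBF", "CRABS"],
    },
    "Model-based": {
        "Online Plan": ["SafeLOOP", "CCEPETS", "RCEPETS"],
        "Pessimistic Estimate": ["CAPPETS"],
    },
    "Offline": {
        "Q-Learning Based": ["BCQLag", "C-CRR"],
        "DICE Based": ["COptDICE"],
    },
    "Other Formulation MDP": {
        "ET-MDP": ["PPOEarlyTerminated", "TRPOEarlyTerminated"],
        "SauteRL": ["PPOSaute", "TRPOSaute"],
        "SimmerRL": ["PPOSimmerPID", "TRPOSimmerPID"],
    },
}


def classify(algo):
    """Return (domain, type) for an algorithm name listed in Table 1."""
    for domain, types in TABLE_1.items():
        for algo_type, names in types.items():
            if algo in names:
                return domain, algo_type
    raise KeyError(f"{algo!r} is not listed in Table 1")


def algorithms_in(domain):
    """Return every algorithm name listed under a domain, in table order."""
    return [name for names in TABLE_1[domain].values() for name in names]
```

For instance, `classify("SACLag")` looks up the row containing `SACLag` and returns its domain and type, while `algorithms_in("Offline")` flattens all Offline rows into a single list.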