Supported Algorithms

OmniSafe is a highly modular framework that integrates an extensive collection of algorithms designed for Safe Reinforcement Learning (SafeRL) across several domains. Its Adapter module makes it easy to extend the framework with new types of SafeRL algorithms.

| Domains               | Types                    | Algorithms Registry                     |
|-----------------------|--------------------------|-----------------------------------------|
| On Policy             | Primal-Dual              | TRPOLag; PPOLag; PDO; RCPO              |
|                       |                          | TRPOPID; CPPOPID                        |
|                       | Convex Optimization      | CPO; PCPO; FOCOPS; CUP                  |
|                       | Penalty Function         | IPO; P3O                                |
|                       | Primal                   | OnCRPO                                  |
| Off Policy            | Primal-Dual              | DDPGLag; TD3Lag; SACLag                 |
|                       |                          | DDPGPID; TD3PID; SACPID                 |
|                       | Control Barrier Function | DDPGCBF; SACRCBF; CRABS                 |
| Model-based           | Online Plan              | SafeLOOP; CCEPETS; RCEPETS              |
|                       | Pessimistic Estimate     | CAPPETS                                 |
| Offline               | Q-Learning Based         | BCQLag; C-CRR                           |
|                       | DICE Based               | COptDICE                                |
| Other Formulation MDP | ET-MDP                   | PPOEarlyTerminated; TRPOEarlyTerminated |
|                       | SauteRL                  | PPOSaute; TRPOSaute                     |
|                       | SimmerRL                 | PPOSimmerPID; TRPOSimmerPID             |

Table 1: OmniSafe supports a wide variety of SafeRL algorithms. From the perspective of classic RL, it includes on-policy, off-policy, offline, and model-based algorithms; from the perspective of the SafeRL learning paradigm, it supports primal-dual, projection, penalty-function, and primal methods, among others.
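
The taxonomy in Table 1 can also be written down as a nested mapping, which makes it easy to look up where a given algorithm sits. The following is only an illustrative sketch: the names `ALGORITHM_TABLE`, `find_algorithm`, and `algorithms_in_domain` are hypothetical helpers introduced here, not part of OmniSafe's API, and the data is copied by hand from the table rather than read from the library's actual registry.

```python
# Hypothetical, hand-copied encoding of Table 1 (domain -> type -> algorithms).
# This is NOT OmniSafe's registry; it only mirrors the table for illustration.
ALGORITHM_TABLE = {
    "On Policy": {
        "Primal-Dual": ["TRPOLag", "PPOLag", "PDO", "RCPO", "TRPOPID", "CPPOPID"],
        "Convex Optimization": ["CPO", "PCPO", "FOCOPS", "CUP"],
        "Penalty Function": ["IPO", "P3O"],
        "Primal": ["OnCRPO"],
    },
    "Off Policy": {
        "Primal-Dual": ["DDPGLag", "TD3Lag", "SACLag", "DDPGPID", "TD3PID", "SACPID"],
        "Control Barrier Function": ["DDPGCBF", "SACRCBF", "CRABS"],
    },
    "Model-based": {
        "Online Plan": ["SafeLOOP", "CCEPETS", "RCEPETS"],
        "Pessimistic Estimate": ["CAPPETS"],
    },
    "Offline": {
        "Q-Learning Based": ["BCQLag", "C-CRR"],
        "DICE Based": ["COptDICE"],
    },
    "Other Formulation MDP": {
        "ET-MDP": ["PPOEarlyTerminated", "TRPOEarlyTerminated"],
        "SauteRL": ["PPOSaute", "TRPOSaute"],
        "SimmerRL": ["PPOSimmerPID", "TRPOSimmerPID"],
    },
}


def find_algorithm(name):
    """Return (domain, type) for an algorithm name, or None if it is not listed."""
    for domain, types in ALGORITHM_TABLE.items():
        for type_name, algorithms in types.items():
            if name in algorithms:
                return domain, type_name
    return None


def algorithms_in_domain(domain):
    """Return all algorithm names listed under a domain, in table order."""
    return [algo for algos in ALGORITHM_TABLE[domain].values() for algo in algos]


if __name__ == "__main__":
    print(find_algorithm("SACLag"))             # ('Off Policy', 'Primal-Dual')
    print(algorithms_in_domain("Model-based"))  # ['SafeLOOP', 'CCEPETS', 'RCEPETS', 'CAPPETS']
```

A nested dictionary keeps the two views from the caption separate: the outer keys follow the classic RL split (on-policy, off-policy, model-based, offline), while the inner keys follow the SafeRL learning paradigm.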