Supported Algorithms

OmniSafe offers a highly modular framework that integrates an extensive collection of algorithms for Safe Reinforcement Learning (SafeRL) across several domains. Its Adapter module makes it easy to extend the framework with additional types of SafeRL algorithms.

Domains	Types	Algorithms Registry
On Policy	Primal-Dual	TRPOLag; PPOLag; PDO; RCPO
	Primal-Dual	TRPOPID; CPPOPID
	Convex Optimization	CPO; PCPO; FOCOPS; CUP
	Penalty Function	IPO; P3O
	Primal	OnCRPO
Off Policy	Primal-Dual	DDPGLag; TD3Lag; SACLag
	Primal-Dual	DDPGPID; TD3PID; SACPID
	Control Barrier Function	DDPGCBF; SACRCBF; CRABS
Model-based	Online Plan	SafeLOOP; CCEPETS; RCEPETS
	Pessimistic Estimate	CAPPETS
Offline	Q-Learning Based	BCQLag; C-CRR
	DICE Based	COptDICE
Other Formulation MDP	ET-MDP	PPOEarlyTerminated; TRPOEarlyTerminated
	SauteRL	PPOSaute; TRPOSaute
	SimmerRL	PPOSimmerPID; TRPOSimmerPID

Table 1: OmniSafe supports a variety of SafeRL algorithms. From the perspective of classic RL, OmniSafe includes on-policy, off-policy, offline, and model-based algorithms; from the perspective of the SafeRL learning paradigm, it supports primal-dual, projection, penalty function, primal, and other methods.
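
The two-level grouping in Table 1 (domain, then type) can be mirrored in a small lookup structure, which is handy when scripting sweeps over a family of algorithms. The sketch below is plain Python and only transcribes the table; the names `ALGORITHMS`, `algorithms_in`, and `classify` are hypothetical helpers for illustration and are not part of OmniSafe's API. The two primal-dual rows in each of the on-policy and off-policy domains are merged under one key.

```python
from typing import Dict, List, Tuple

# Hypothetical mirror of Table 1: (domain, type) -> registry names.
# This is an illustration only, not OmniSafe's internal registry.
ALGORITHMS: Dict[Tuple[str, str], List[str]] = {
    ("On Policy", "Primal-Dual"): [
        "TRPOLag", "PPOLag", "PDO", "RCPO", "TRPOPID", "CPPOPID",
    ],
    ("On Policy", "Convex Optimization"): ["CPO", "PCPO", "FOCOPS", "CUP"],
    ("On Policy", "Penalty Function"): ["IPO", "P3O"],
    ("On Policy", "Primal"): ["OnCRPO"],
    ("Off Policy", "Primal-Dual"): [
        "DDPGLag", "TD3Lag", "SACLag", "DDPGPID", "TD3PID", "SACPID",
    ],
    ("Off Policy", "Control Barrier Function"): ["DDPGCBF", "SACRCBF", "CRABS"],
    ("Model-based", "Online Plan"): ["SafeLOOP", "CCEPETS", "RCEPETS"],
    ("Model-based", "Pessimistic Estimate"): ["CAPPETS"],
    ("Offline", "Q-Learning Based"): ["BCQLag", "C-CRR"],
    ("Offline", "DICE Based"): ["COptDICE"],
    ("Other Formulation MDP", "ET-MDP"): [
        "PPOEarlyTerminated", "TRPOEarlyTerminated",
    ],
    ("Other Formulation MDP", "SauteRL"): ["PPOSaute", "TRPOSaute"],
    ("Other Formulation MDP", "SimmerRL"): ["PPOSimmerPID", "TRPOSimmerPID"],
}


def algorithms_in(domain: str) -> List[str]:
    """Return every registry name listed under the given domain."""
    names: List[str] = []
    for (dom, _type), algos in ALGORITHMS.items():
        if dom == domain:
            names.extend(algos)
    return names


def classify(name: str) -> Tuple[str, str]:
    """Return the (domain, type) pair under which a registry name is listed."""
    for key, algos in ALGORITHMS.items():
        if name in algos:
            return key
    raise KeyError(f"{name!r} does not appear in Table 1")


if __name__ == "__main__":
    print(classify("CPO"))  # ('On Policy', 'Convex Optimization')
    print(algorithms_in("Offline"))  # ['BCQLag', 'C-CRR', 'COptDICE']
```

A registry name found this way is the string one would pass when selecting an algorithm; to the best of our knowledge, OmniSafe's quick-start passes it as the first argument to `omnisafe.Agent`, but check the installed version's documentation before relying on that.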