Model-based Algorithms

The OmniSafe Navigation Benchmark for model-based algorithms evaluates the effectiveness of OmniSafe’s model-based algorithms across two different environments from the Safety-Gymnasium task suite. For each supported algorithm and environment, we offer the following:

  • Default hyperparameters used for the benchmark and scripts that enable result replication.

  • Graphs and raw data that can be utilized for research purposes.

  • Detailed logs obtained during training.

Supported algorithms are listed below:

  • PETS

  • LOOP

  • SafeLOOP

  • CCEPETS

  • RCEPETS

  • CAPPETS

Safety-Gymnasium

We highly recommend using Safety-Gymnasium to run the following experiments. To install it on a Linux machine, run:

pip install safety_gymnasium

Run the Benchmark

You can set the main function of examples/benchmarks/run_experiment_grid.py as:

if __name__ == '__main__':
    eg = ExperimentGrid(exp_name='Model-Based-Benchmarks')

    # set up the algorithms.
    model_based_base_policy = ['LOOP', 'PETS']
    model_based_safe_policy = ['SafeLOOP', 'CCEPETS', 'CAPPETS', 'RCEPETS']
    eg.add('algo', model_based_base_policy + model_based_safe_policy)

    # you can use wandb to monitor the experiment.
    eg.add('logger_cfgs:use_wandb', [False])
    # you can use tensorboard to monitor the experiment.
    eg.add('logger_cfgs:use_tensorboard', [True])
    eg.add('train_cfgs:total_steps', [1000000])

    # set up the environment.
    eg.add('env_id', [
        'SafetyPointGoal1-v0-modelbased',
        'SafetyCarGoal1-v0-modelbased',
        ])
    eg.add('seed', [0, 5, 10, 15, 20])

    # The total number of experiments must be divisible by num_pool.
    # Choose num_pool according to the resources of your machine.
    eg.run(train, num_pool=5)
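
The divisibility requirement on num_pool can be checked before launching anything. The following is a minimal sketch, assuming the grid expands to the Cartesian product of all values passed to eg.add; it does not call OmniSafe, and count_experiments and valid_pool_sizes are hypothetical helpers introduced only for illustration.

```python
from itertools import product

# Grid values copied from the main function above.
grid = {
    'algo': ['LOOP', 'PETS', 'SafeLOOP', 'CCEPETS', 'CAPPETS', 'RCEPETS'],
    'logger_cfgs:use_wandb': [False],
    'logger_cfgs:use_tensorboard': [True],
    'train_cfgs:total_steps': [1000000],
    'env_id': ['SafetyPointGoal1-v0-modelbased', 'SafetyCarGoal1-v0-modelbased'],
    'seed': [0, 5, 10, 15, 20],
}


def count_experiments(grid):
    """Count the combinations in the Cartesian product of all grid values.

    Hypothetical helper: assumes one experiment per combination.
    """
    return sum(1 for _ in product(*grid.values()))


def valid_pool_sizes(total, max_pool):
    """Return every pool size up to max_pool that divides total evenly."""
    return [n for n in range(1, max_pool + 1) if total % n == 0]


total = count_experiments(grid)
print(total)                        # 6 algorithms x 2 environments x 5 seeds
print(total % 5 == 0)               # check the num_pool=5 used above
print(valid_pool_sizes(total, 8))   # candidate pool sizes for an 8-core machine
```

Under that assumption the grid above yields 60 experiments, so num_pool=5 divides it evenly. If you add or remove seeds or environments, rerun the check before choosing a new num_pool.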

Once the main function is set, run the benchmark with:

cd examples/benchmarks
python run_experiment_grid.py

To analyze the results, set the path to the results directory written by the experiment grid, for example:

path = 'omnisafe/examples/benchmarks/exp-x/Model-Based-Benchmarks'

You can also plot the results by running the following command:

cd examples
python analyze_experiment_results.py

For detailed usage of the OmniSafe statistics tool, please refer to this tutorial.

OmniSafe Benchmark

To demonstrate the reliability of the implemented algorithms, OmniSafe reports their performance in the Safety-Gymnasium environments. All results were obtained with cost_limit=1.00. The results are presented in Table 1 and Figure 1.

Performance Table

                     PETS                        LOOP                        SafeLOOP
Environment          Reward        Cost          Reward        Cost          Reward        Cost
SafetyCarGoal1-v0    33.07 ±1.33   61.20 ±7.23   25.41 ±1.23   62.64 ±8.34   22.09 ±0.30   0.16 ±0.15
SafetyPointGoal1-v0  27.66 ±0.07   49.16 ±2.69   25.08 ±1.47   55.23 ±2.64   22.94 ±0.72   0.04 ±0.07

                     CCEPETS                     RCEPETS                     CAPPETS
Environment          Reward        Cost          Reward        Cost          Reward        Cost
SafetyCarGoal1-v0    27.60 ±1.21   1.03 ±0.29    29.08 ±1.63   1.02 ±0.88    23.33 ±6.34   0.48 ±0.17
SafetyPointGoal1-v0  24.98 ±0.05   1.87 ±1.27    25.39 ±0.28   2.46 ±0.58    9.45 ±8.62    0.64 ±0.77

Table 1: Reward and cost of OmniSafe model-based algorithms in the Safety-Gymnasium environments. All model-based algorithms were evaluated after 1e6 training steps.
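
To read the cost columns against the stated cost_limit=1.00, the reported mean costs can be compared with the limit directly. The sketch below is illustrative only: the numbers are copied from Table 1, within_limit is a hypothetical helper, and comparing the mean alone is a simplification that ignores the reported spread. It is not the evaluation protocol used by the benchmark.

```python
COST_LIMIT = 1.00  # cost_limit stated for the benchmark

# (environment, algorithm) -> (mean cost, spread), copied from Table 1.
costs = {
    ('SafetyCarGoal1-v0', 'PETS'): (61.20, 7.23),
    ('SafetyCarGoal1-v0', 'LOOP'): (62.64, 8.34),
    ('SafetyCarGoal1-v0', 'SafeLOOP'): (0.16, 0.15),
    ('SafetyCarGoal1-v0', 'CCEPETS'): (1.03, 0.29),
    ('SafetyCarGoal1-v0', 'RCEPETS'): (1.02, 0.88),
    ('SafetyCarGoal1-v0', 'CAPPETS'): (0.48, 0.17),
    ('SafetyPointGoal1-v0', 'PETS'): (49.16, 2.69),
    ('SafetyPointGoal1-v0', 'LOOP'): (55.23, 2.64),
    ('SafetyPointGoal1-v0', 'SafeLOOP'): (0.04, 0.07),
    ('SafetyPointGoal1-v0', 'CCEPETS'): (1.87, 1.27),
    ('SafetyPointGoal1-v0', 'RCEPETS'): (2.46, 0.58),
    ('SafetyPointGoal1-v0', 'CAPPETS'): (0.64, 0.77),
}


def within_limit(mean_cost, limit=COST_LIMIT):
    """Return True if the reported mean cost does not exceed the limit."""
    return mean_cost <= limit


# Print one line per table entry with its relation to the limit.
for (env, algo), (mean, spread) in sorted(costs.items()):
    status = 'within' if within_limit(mean) else 'above'
    print(f'{env:<20} {algo:<9} cost {mean:6.2f} ±{spread:.2f}  {status} limit')
```

Because several mean costs sit close to the limit, the spread should be read alongside this simple comparison.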

Performance Curves


[Figure 1 panels: SafetyCarGoal1-v0, SafetyPointGoal1-v0]

Figure 1: Training curves in Safety-Gymnasium environments, covering classical reinforcement learning algorithms and safe learning algorithms mentioned in Table 1.