Official OpenAI Gym examples

Date: 2024-12-18

Create, render, and take random actions
CartPole-v0, used in the example below, is only one of Gym's environments. Others include MountainCar-v0, MsPacman-v0 (requires the Atari dependency), and Hopper-v1 (requires the MuJoCo dependencies). Environments all descend from the Env base class.

import gym

env = gym.make('CartPole-v0')
env.reset()
for _ in range(1000):
    env.render()
    env.step(env.action_space.sample())  # take a random action
env.close()  # release the render window

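Resetting the environment and reading the observation, reward, and done flag
After an action is passed to env.step(action), the environment executes it and returns four values: the new observation, the reward, a done flag that says whether the episode has ended, and an info dictionary.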

import gym

env = gym.make('CartPole-v0')
for i_episode in range(20):
    observation = env.reset()  # start a new episode
    for t in range(100):
        env.render()
        print(observation)
        action = env.action_space.sample()  # pick a random action
        observation, reward, done, info = env.step(action)
        if done:
            print("Episode finished after {} timesteps".format(t+1))
            break
env.close()
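
The loop above unpacks four values from env.step(action), which matches the Gym releases this example was written for. Newer releases (0.26 and later) return five values instead, splitting done into terminated and truncated, and env.reset() there returns an (observation, info) pair. The following is a minimal sketch of a helper that accepts either shape; the name normalize_step is hypothetical and not part of Gym, and the sample tuples are hand-written stand-ins for real step results.

```python
def normalize_step(result):
    """Convert a step() result to (observation, reward, done, info).

    Accepts the 4-tuple used in this article, or the 5-tuple
    (observation, reward, terminated, truncated, info) of Gym 0.26+.
    normalize_step is a hypothetical helper, not a Gym function.
    """
    if len(result) == 5:
        observation, reward, terminated, truncated, info = result
        # Either flag ends the episode, so fold them into one.
        return observation, reward, terminated or truncated, info
    observation, reward, done, info = result
    return observation, reward, done, info


# Hand-written sample tuples standing in for real env.step() results.
old_style = ([0.0, 0.1, 0.0, -0.1], 1.0, False, {})
new_style = ([0.0, 0.1, 0.0, -0.1], 1.0, False, True, {})

print(normalize_step(old_style))
print(normalize_step(new_style))
```

With such a helper, the loop body could read observation, reward, done, info = normalize_step(env.step(action)) and stay unchanged across versions.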

Action space and observation space
Print the action space and the observation space:

Discrete(2) means the action space is discrete, with two valid actions: 0 and 1.
Box(4,) means the observation space is a one-dimensional array of 4 numbers.
import gym
env = gym.make('CartPole-v0')
print(env.action_space)
#> Discrete(2)
print(env.observation_space)
#> Box(4,)

The upper and lower bounds of each dimension of the observation space are also available:

print(env.observation_space.high)
#> array([ 2.4 , inf, 0.20943951, inf])
print(env.observation_space.low)
#> array([-2.4 , -inf, -0.20943951, -inf])
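
To make the meaning of these bounds concrete, here is a small pure-Python sketch that checks whether an observation lies inside them and finds the dimensions with no finite limit. The bound values are copied from the output above; the function names within_bounds and unbounded_dims are made up for this illustration and are not Gym APIs (Gym's own spaces offer contains() for the membership check).

```python
import math

# Bounds copied from the CartPole-v0 output printed above.
HIGH = [2.4, math.inf, 0.20943951, math.inf]
LOW = [-2.4, -math.inf, -0.20943951, -math.inf]


def within_bounds(observation, low=LOW, high=HIGH):
    """Return True if every component lies inside [low, high]."""
    return all(lo <= x <= hi for x, lo, hi in zip(observation, low, high))


def unbounded_dims(low=LOW, high=HIGH):
    """Indices of dimensions that are unbounded on both sides."""
    return [i for i, (lo, hi) in enumerate(zip(low, high))
            if math.isinf(lo) and math.isinf(hi)]


print(within_bounds([0.0, 5.0, 0.1, -3.0]))  # True: dims 1 and 3 accept any value
print(within_bounds([3.0, 0.0, 0.0, 0.0]))   # False: 3.0 exceeds the 2.4 limit
print(unbounded_dims())                      # [1, 3]
```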

Gym provides space classes that you can use to define your own spaces:

from gym import spaces
space = spaces.Discrete(8) # Set with 8 elements {0, 1, 2, ..., 7}
x = space.sample()
assert space.contains(x)
assert space.n == 8

Environments bundled with Gym
Gym's environment registry can return every registered environment.
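
As a rough sketch of how that listing might look: older Gym releases expose the registry through gym.envs.registry.all(), while in newer releases (around 0.26) the registry is a plain dict. The helper name list_env_ids is hypothetical, and as an assumption made for this example it returns an empty list when gym is not installed instead of raising.

```python
import importlib.util


def list_env_ids():
    """Return the sorted ids of all registered environments.

    list_env_ids is a hypothetical helper, not part of Gym.
    Returns [] when the gym package is not installed.
    """
    if importlib.util.find_spec("gym") is None:
        return []
    import gym

    registry = gym.envs.registry
    # Newer releases: registry is a dict of id -> EnvSpec.
    # Older releases: registry is an object with an all() method.
    specs = registry.values() if hasattr(registry, "values") else registry.all()
    return sorted(spec.id for spec in specs)


env_ids = list_env_ids()
print(len(env_ids), "environments registered")
print(env_ids[:5])  # first few ids, if any
```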
