Specification

| | |
|---|---|
| Import | `from free_range_zoo.envs import rideshare_v0` |
| Actions | Discrete & Deterministic |
| Observations | Discrete and fully observed with private observations |
| Parallel API | Yes |
| Manual Control | No |
| Agent Names | [ |
| # Agents | |
| Action Shape | ( |
| Action Values | [ |
| Observation Shape | TensorDict: { |
| Observation Values | self: |
Usage

Parallel API
```python
import logging

import torch

from free_range_zoo.envs import rideshare_v0

main_logger = logging.getLogger(__name__)

# Initialize and reset environment to initial state
env = rideshare_v0.parallel_env(render_mode="human")
observations, infos = env.reset()

# Initialize agents and give initial observations
agents = {}  # Mapping from agent name to policy; fill with your own policy objects
cumulative_rewards = {agent: 0 for agent in env.agents}

current_step = 0
while not torch.all(env.finished):
    agent_actions = {
        agent_name: torch.stack([agents[agent_name].act()])
        for agent_name in env.agents
    }  # Policy action determination here

    observations, rewards, terminations, truncations, infos = env.step(agent_actions)
    rewards = {agent_name: rewards[agent_name].item() for agent_name in env.agents}

    for agent_name, agent in agents.items():
        agent.observe(observations[agent_name][0])  # Policy observation processing here
        cumulative_rewards[agent_name] += rewards[agent_name]

    main_logger.info(f"Step {current_step}: {rewards}")
    current_step += 1

env.close()
```
AEC API
```python
import logging

import torch

from free_range_zoo.envs import rideshare_v0

main_logger = logging.getLogger(__name__)

# Initialize and reset environment to initial state
env = rideshare_v0.env(render_mode="human")
observations, infos = env.reset()

# Initialize agents and give initial observations
agents = {}  # Mapping from agent name to policy; fill with your own policy objects
cumulative_rewards = {agent: 0 for agent in env.agents}

current_step = 0
while not torch.all(env.finished):
    for agent in env.agent_iter():
        observations, rewards, terminations, truncations, infos = env.last()

        # Policy action determination here
        action = env.action_space(agent).sample()

        env.step(action)

    rewards = {agent: rewards[agent].item() for agent in env.agents}
    for agent in env.agents:
        cumulative_rewards[agent] += rewards[agent]

    current_step += 1
    main_logger.info(f"Step {current_step}: {rewards}")

env.close()
```
Configuration

Agent settings for rideshare.

- Variables:
  - start_positions (torch.IntTensor) – starting positions of the agents
  - pool_limit (int) – maximum number of passengers that can be in a car
  - use_diagonal_travel (bool) – whether to enable diagonal travel for agents
  - use_fast_travel (bool) – whether to enable fast travel for agents
Task settings for rideshare.

- Variables:
  - schedule (torch.IntTensor) – tensor of shape <tasks, (timestep, batch, y, x, y_dest, x_dest, fare)>, where batch can be set to -1 as a wildcard for all batches
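As a hedged illustration of the schedule layout above, the rows below define two hypothetical passenger tasks (all coordinates and fares are made up; in the real API the rows would be packed into a `torch.IntTensor`):

```python
# Each row: (timestep, batch, y, x, y_dest, x_dest, fare).
# batch = -1 applies the task to all batches.
schedule_rows = [
    [0, -1, 0, 0, 3, 3, 10],  # step 0, all batches: pickup (0, 0) -> dropoff (3, 3), fare 10
    [2,  0, 1, 2, 4, 0,  7],  # step 2, batch 0 only: pickup (1, 2) -> dropoff (4, 0), fare 7
]

# In the real API: schedule = torch.tensor(schedule_rows, dtype=torch.int32)
assert all(len(row) == 7 for row in schedule_rows)
```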
Reward settings for rideshare.

- Variables:
  - pick_cost (float) – cost of picking up a passenger
  - move_cost (float) – cost of moving to a new location
  - drop_cost (float) – cost of dropping off a passenger
  - noop_cost (float) – cost of taking no action
  - accept_cost (float) – cost of accepting a passenger
  - pool_limit_cost (float) – cost of exceeding the pool limit
  - use_variable_move_cost (bool) – whether to use the variable move cost
  - use_variable_pick_cost (bool) – whether to use the variable pick cost
  - use_waiting_costs (bool) – whether to use waiting costs
  - wait_limit (torch.IntTensor) – wait limits for each state of the passenger: [unaccepted, accepted, riding]
  - long_wait_time (int) – time after which a passenger is considered to be waiting for a long time (defaults to the maximum of wait_limit)
  - general_wait_cost (float) – cost of waiting for a passenger
  - long_wait_cost (float) – cost of waiting for a passenger for a long time (added to the general wait cost)
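The documented default relationship between `long_wait_time` and `wait_limit` can be sketched as follows (the limit values themselves are hypothetical):

```python
# Hypothetical wait limits per passenger state: [unaccepted, accepted, riding]
wait_limit = [5, 10, 15]

# Documented default: long_wait_time is the maximum of wait_limit
long_wait_time = max(wait_limit)

# A passenger waiting beyond long_wait_time incurs long_wait_cost
# in addition to general_wait_cost.
```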
Configuration settings for the rideshare environment.

- Variables:
  - grid_height (int) – grid height for the rideshare environment space
  - grid_width (int) – grid width for the rideshare environment space
  - agent_config (free_range_zoo.envs.rideshare.env.structures.configuration.AgentConfiguration) – agent settings for the rideshare environment
  - passenger_config (free_range_zoo.envs.rideshare.env.structures.configuration.PassengerConfiguration) – passenger settings for the rideshare environment
  - reward_config (free_range_zoo.envs.rideshare.env.structures.configuration.RewardConfiguration) – reward configuration for the rideshare environment
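Putting the pieces together, a full configuration might be assembled as in the sketch below. The dataclasses here are minimal stand-ins that only mirror the documented fields: the real classes live in `free_range_zoo.envs.rideshare.env.structures.configuration`, their exact constructor signatures may differ, the top-level name `RideshareConfiguration` is assumed (the docs only call it "configuration settings"), the reward stand-in keeps just a subset of fields for brevity, and every numeric value is hypothetical.

```python
from dataclasses import dataclass
from typing import List

# Minimal stand-ins mirroring the documented fields; see the caveats above.

@dataclass
class AgentConfiguration:
    start_positions: List[List[int]]  # torch.IntTensor of (y, x) rows in the real API
    pool_limit: int
    use_diagonal_travel: bool
    use_fast_travel: bool

@dataclass
class PassengerConfiguration:
    schedule: List[List[int]]  # torch.IntTensor of task rows in the real API

@dataclass
class RewardConfiguration:  # subset of the documented fields
    pick_cost: float
    move_cost: float
    drop_cost: float
    noop_cost: float

@dataclass
class RideshareConfiguration:  # hypothetical name for the top-level configuration
    grid_height: int
    grid_width: int
    agent_config: AgentConfiguration
    passenger_config: PassengerConfiguration
    reward_config: RewardConfiguration

config = RideshareConfiguration(
    grid_height=5,
    grid_width=5,
    agent_config=AgentConfiguration(
        start_positions=[[0, 0], [2, 3]],  # two agents
        pool_limit=2,
        use_diagonal_travel=False,
        use_fast_travel=True,
    ),
    passenger_config=PassengerConfiguration(
        schedule=[[0, -1, 0, 0, 3, 3, 10]],  # one task, all batches (-1)
    ),
    reward_config=RewardConfiguration(
        pick_cost=-0.2, move_cost=-0.1, drop_cost=0.0, noop_cost=-1.0,
    ),
)
```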
API

AEC wrapped version of the rideshare environment.

- Parameters:
  - wrappers (List[Callable[[BatchedAECEnv], BatchedAECEnv]]) – the wrappers to apply to the environment
- Returns:
  - BatchedAECEnv – the rideshare environment
Implementation of the dynamic rideshare environment.
Initialize the simulation.