Q-Learning Agent playing Taxi-v3 🚕

This is a tabular Q-Learning agent trained to play Taxi-v3 as part of Unit 2 of the Hugging Face Deep RL Course.

Training Hyperparameters

n_training_episodes = 25000
learning_rate = 0.7
gamma = 0.95
max_epsilon = 1.0
min_epsilon = 0.05
decay_rate = 0.0005
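
The card does not include the training loop, so the following is only a sketch of how these hyperparameters are typically combined in tabular Q-Learning. It assumes an exponential epsilon decay per episode and the standard one-step Q-Learning update. The helper names `epsilon_for_episode` and `q_update` are hypothetical and are not taken from the original training code.

```python
import math

# Hyperparameters as listed on this card
learning_rate = 0.7
gamma = 0.95
max_epsilon = 1.0
min_epsilon = 0.05
decay_rate = 0.0005


def epsilon_for_episode(episode):
    """Exploration rate for a given episode (assumed exponential schedule)."""
    return min_epsilon + (max_epsilon - min_epsilon) * math.exp(-decay_rate * episode)


def q_update(q_row, action, reward, next_state_max_q):
    """Apply one Q-Learning update in place to the Q-values of the current state.

    q_row is the list of action values for the current state;
    next_state_max_q is the largest Q-value of the next state.
    """
    td_target = reward + gamma * next_state_max_q
    q_row[action] += learning_rate * (td_target - q_row[action])
    return q_row[action]
```

Under this assumed schedule, epsilon starts at 1.0 and is effectively at its floor of 0.05 by episode 25000, since exp(-12.5) is roughly 3.7e-6.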

Evaluation Results

Mean reward: 8.23 +/- 2.49
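
The card does not state how this figure was computed. A plausible reading is the mean and standard deviation of the total reward per episode when the agent acts greedily. The sketch below assumes that interpretation. The function name `evaluate_greedy` and the default of 100 evaluation episodes are hypothetical, and `env` is any object following the Gymnasium `reset`/`step` interface.

```python
import statistics


def evaluate_greedy(env, qtable, n_eval_episodes=100):
    """Run greedy episodes and return (mean, population std) of episode returns.

    n_eval_episodes=100 is an illustrative default, not a value from this card.
    """
    returns = []
    for _ in range(n_eval_episodes):
        state, _ = env.reset()
        done = False
        total = 0.0
        while not done:
            row = qtable[state]
            # Greedy action: index of the largest Q-value in this state's row
            action = max(range(len(row)), key=lambda a: row[a])
            state, reward, terminated, truncated, _ = env.step(action)
            total += reward
            done = terminated or truncated
        returns.append(total)
    return statistics.fmean(returns), statistics.pstdev(returns)
```

Calling `evaluate_greedy(gym.make("Taxi-v3"), qtable)` would return a pair comparable to the reported mean and spread, although the exact numbers depend on the episodes sampled.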

Usage

import numpy as np
import gymnasium as gym

# Load the Q-table
qtable = np.load("qtable.npy")

# Create environment
env = gym.make('Taxi-v3')

# Run an episode
state, _ = env.reset()
done = False
total_reward = 0

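while not done:
    # Act greedily: pick the action with the highest Q-value for this state
    action = int(np.argmax(qtable[state]))
    state, reward, terminated, truncated, _ = env.step(action)
    # The episode ends on success (terminated) or at the step limit (truncated)
    done = terminated or truncated
    total_reward += reward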

print(f"Total reward: {total_reward}")

