Changelog
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog.
[1.0.0] - 2026-01-XX
Added
Self-Play Training: New training framework for achieving 70%+ win rates
Population-based training with opponent pool
Curriculum learning from random to self-play
Checkpoint management for diverse opponents
Multiplayer Support: Extended game to 2-4 players
MultiplayerUnoEnv with proper turn direction
25-dimensional observation including opponent hand sizes
Skip and Reverse mechanics for 3-4 players
Model Battle Arena: New GUI for comparing models
2-4 player battle support
Batch evaluation (10-1000 games)
CSV export functionality
Multiple model selectors
GUI Improvements
ModelSelector dropdown in main menu
Model discovery from filesystem
Multiplayer launcher button
Glassmorphism design updates
Documentation
Complete ReadTheDocs documentation
LaTeX report with methodology
Presentation slides (Beamer)
API reference documentation
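As a rough illustration of the opponent pool and the random-to-self-play curriculum listed under Self-Play Training, the sketch below shows one way such a pool could work. The `OpponentPool` class, its methods, and the checkpoint names are hypothetical and are not taken from the project's code; the actual framework may sample opponents differently.

```python
import random


class OpponentPool:
    """Hypothetical pool of past checkpoints used as self-play opponents."""

    def __init__(self, max_size=10, seed=None):
        self.max_size = max_size
        self.checkpoints = []  # oldest checkpoint first
        self.rng = random.Random(seed)

    def add(self, checkpoint_path):
        """Store a checkpoint, evicting the oldest one when the pool is full."""
        self.checkpoints.append(checkpoint_path)
        if len(self.checkpoints) > self.max_size:
            self.checkpoints.pop(0)

    def sample(self, progress):
        """Pick an opponent for the next game.

        progress is the fraction of training completed, in [0, 1].
        Early in training the random opponent dominates; later the
        pool of saved checkpoints is used more and more often.
        """
        if not self.checkpoints or self.rng.random() > progress:
            return "random"
        return self.rng.choice(self.checkpoints)


pool = OpponentPool(max_size=2, seed=0)
for name in ["ckpt_a", "ckpt_b", "ckpt_c"]:  # assumed checkpoint names
    pool.add(name)
opponent = pool.sample(progress=1.0)  # always drawn from the pool at the end
```

Capping the pool size keeps opponents diverse without letting very old, weak checkpoints dominate the sampling.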
Changed
Updated uno_gui.py with model selection and multiplayer button
Extended model_battle_gui.py for 2-4 players
Added Self-Play Champion to all comparison scripts
Improved README with model performance table
Fixed
Fixed gymnasium import in train_selfplay.py
Corrected action masking for invalid plays
Fixed card rendering for special cards
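For readers unfamiliar with action masking, the following is a minimal sketch of the general idea behind the action-masking fix: only actions flagged as legal are considered when picking a move. The function name and the example scores are illustrative assumptions, not the project's implementation.

```python
def select_masked_action(scores, valid_mask):
    """Return the index of the highest-scoring action among the valid ones.

    scores     -- one score (e.g. a logit or Q-value) per action
    valid_mask -- one boolean per action, True if the play is legal
    """
    if len(scores) != len(valid_mask):
        raise ValueError("scores and valid_mask must have the same length")
    valid = [i for i, ok in enumerate(valid_mask) if ok]
    if not valid:
        raise ValueError("no valid action available")
    return max(valid, key=lambda i: scores[i])


# Action 0 has the best score but is illegal, so action 2 is chosen.
best = select_masked_action([0.9, 0.1, 0.5], [False, True, True])
```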
[0.2.0] - 2025-12-XX
Added
Recurrent PPO Implementation
LSTM-based policies for partial observability
Achieved a 60% win rate
Multiple training configurations
Training Scripts
train_recurrent_ppo.py
train_best_recurrent_ppo.py
train_optimal_recurrent_ppo.py
Model Comparison
compare_models.py for batch evaluation
CSV result export
TensorBoard logging
Changed
Switched from gym to gymnasium
Updated stable-baselines3 to v2.0+
Improved reward shaping
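The switch from gym to gymnasium changes the return shapes of the core environment methods: in gymnasium, `reset()` returns `(obs, info)` and `step()` returns `(obs, reward, terminated, truncated, info)` instead of the older 4-tuple with a single `done` flag. The helper below is a hedged sketch of adapting an old-style step result; `to_gymnasium_step` is a hypothetical name, and the use of the `"TimeLimit.truncated"` info key assumes the old environment was wrapped in gym's time-limit wrapper.

```python
def to_gymnasium_step(old_step_result):
    """Convert an old gym 4-tuple step result to the gymnasium 5-tuple."""
    obs, reward, done, info = old_step_result
    # Assumption: a time-limit cutoff is flagged in info under this key.
    truncated = bool(info.get("TimeLimit.truncated", False))
    # The episode terminated naturally only if it ended without a cutoff.
    terminated = bool(done) and not truncated
    return obs, reward, terminated, truncated, info


obs, reward, terminated, truncated, info = to_gymnasium_step(
    ("obs", 1.0, True, {})
)
```

Training loops then end an episode when either `terminated` or `truncated` is true.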
[0.1.0] - 2025-11-XX
Added
Core Game Engine
Complete UNO rules implementation
Card and Deck classes
Game state management
Basic RL Environment
Gymnasium-compatible UnoEnv
17-dimensional observation
9 discrete actions
Initial Agents
Q-Learning (tabular)
DQN agent
PPO and A2C via stable-baselines3
Main GUI
Pygame-based interface
Human vs AI mode
AI vs AI spectator mode
Training Infrastructure
train_rl.py general trainer
train_sb3.py for SB3 models
Evaluation callbacks