=========
Changelog
=========

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog.

[1.0.0] - 2026-01-XX
====================

Added
-----

- **Self-Play Training**: New training framework for achieving 70%+ win rates

  - Population-based training with opponent pool
  - Curriculum learning from random to self-play
  - Checkpoint management for diverse opponents

- **Multiplayer Support**: Extended game to 2-4 players

  - ``MultiplayerUnoEnv`` with proper turn direction
  - 25-dimensional observation including opponent hand sizes
  - Skip and Reverse mechanics for 3-4 players

- **Model Battle Arena**: New GUI for comparing models

  - 2-4 player battle support
  - Batch evaluation (10-1000 games)
  - CSV export functionality
  - Multiple model selectors

- **GUI Improvements**

  - ``ModelSelector`` dropdown in main menu
  - Model discovery from filesystem
  - Multiplayer launcher button
  - Glassmorphism design updates

- **Documentation**

  - Complete ReadTheDocs documentation
  - LaTeX report with methodology
  - Presentation slides (Beamer)
  - API reference documentation

Changed
-------

- Updated ``uno_gui.py`` with model selection and multiplayer button
- Extended ``model_battle_gui.py`` for 2-4 players
- Added Self-Play Champion to all comparison scripts
- Improved README with model performance table

Fixed
-----

- Fixed ``gymnasium`` import in ``train_selfplay.py``
- Corrected action masking for invalid plays
- Fixed card rendering for special cards

[0.2.0] - 2025-12-XX
====================

Added
-----

- **Recurrent PPO Implementation**

  - LSTM-based policies for partial observability
  - 60% win rate achievement
  - Multiple training configurations

- **Training Scripts**

  - ``train_recurrent_ppo.py``
  - ``train_best_recurrent_ppo.py``
  - ``train_optimal_recurrent_ppo.py``

- **Model Comparison**

  - ``compare_models.py`` for batch evaluation
  - CSV result export
  - TensorBoard logging

Changed
-------

- Switched from gym to gymnasium
- Updated stable-baselines3 to v2.0+
- Improved reward shaping

[0.1.0] - 2025-11-XX
====================

Added
-----

- **Core Game Engine**

  - Complete UNO rules implementation
  - Card and Deck classes
  - Game state management

- **Basic RL Environment**

  - Gymnasium-compatible ``UnoEnv``
  - 17-dimensional observation
  - 9 discrete actions

- **Initial Agents**

  - Q-Learning (tabular)
  - DQN agent
  - PPO and A2C via stable-baselines3

- **Main GUI**

  - Pygame-based interface
  - Human vs AI mode
  - AI vs AI spectator mode

- **Training Infrastructure**

  - ``train_rl.py`` general trainer
  - ``train_sb3.py`` for SB3 models
  - Evaluation callbacks