ROOK-LM-124M
A 124M-parameter language model for chess with chain-of-thought reasoning, trained on synthetic explanations generated with Stockfish 16.1.
Model Details
Model Description
ROOK-LM generates chess moves with detailed reasoning traces, incorporating position analysis, candidate evaluation, and move selection in a chain-of-thought format.
- Developed by: Jonathan Rahn, Jenia Jitsev (LAION/JSC), Qi Sun (Tokyo Tech/Sakana AI)
- Model type: GPT-2 (autoregressive language model)
- Language(s): Chess notation with natural language explanations
- License: MIT
- Repository: GitHub
- Paper: LAION Research Note
- Logs: Weights & Biases
Model Architecture
- Parameters: 124M
- Architecture: GPT-2 family
- Context Length: up to 2048 tokens
- Training Framework: llm.c for training; the Hugging Face scripts in this repository support additional experiments
Uses
Direct Use
- Chess move generation with explanations
- Chess position analysis
- Educational chess tutoring
- Research on reasoning in language models
Downstream Use
- Fine-tuning for specific chess styles
- Integration with chess interfaces
- Building chess teaching assistants
Training Details
Training Data
- Dataset: rook-40m
- Size: 40M positions (6B tokens)
- Generation: Stockfish 16.1 on the Tsubame 4.0 supercomputer (a generation sketch follows this list)
- Format: FEN position → reasoning → move
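A minimal sketch of how one such record could be produced with python-chess and a local Stockfish binary; the binary path, search depth, and MultiPV settings below are illustrative assumptions, not the actual Tsubame pipeline:
import chess
import chess.engine

# Illustrative only: the released dataset was generated with Stockfish 16.1
# on Tsubame 4.0; the binary path and search depth here are assumptions.
engine = chess.engine.SimpleEngine.popen_uci("stockfish")

board = chess.Board()  # starting position; pass any FEN to chess.Board(fen)

# MultiPV analysis returns one info dict per principal variation,
# giving the top 5 candidate moves with scores.
infos = engine.analyse(board, chess.engine.Limit(depth=16), multipv=5)

moves = [info["pv"][0].uci() for info in infos]
# Convert centipawns to pawn units (centipawns / 100), from the
# side-to-move's perspective; mates get a large sentinel score.
evals = [info["score"].pov(board.turn).score(mate_score=10000) / 100
         for info in infos]

print(board.fen())
print("M:", " ".join(moves))
print("E:", " ".join(f"{e:.1f}" for e in evals))
print("B:", moves[0])

engine.quit()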
Chain-of-Thought Format
ROOK-LM uses a structured format with position, candidate moves, evaluations, and best move:
<FEN position>
M: <candidate moves in UCI notation>
E: <evaluation scores for each candidate>
B: <best move in UCI notation>
Concrete Training Example:
rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1
M: e2e4 d2d4 g1f3 c2c4 g2g3
E: 0.3 0.3 0.2 0.1 0.0
B: e2e4
Breakdown:
- Position in FEN notation (padded to 90 chars for consistency)
- M: Top 5 candidate moves from Stockfish analysis (UCI format, padded to 30 chars)
- E: Evaluation scores for each candidate move, in pawn units (centipawns / 100; padded to 40 chars)
- B: Best move selected by Stockfish (a serialization sketch follows this list)
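As a sketch, a record could be serialized with the padding widths listed above (90/30/40 characters); the function name and exact delimiter details are assumptions for illustration:
def format_sample(fen: str, moves: list[str], evals: list[float], best: str) -> str:
    # Pad each field to the fixed widths from the model card so every
    # sample has a uniform layout; delimiters are an assumption.
    fen_part = fen.ljust(90)
    m_part = ("M: " + " ".join(moves)).ljust(30)
    e_part = ("E: " + " ".join(f"{e:.1f}" for e in evals)).ljust(40)
    return f"{fen_part}{m_part}{e_part}B: {best}"

sample = format_sample(
    "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1",
    ["e2e4", "d2d4", "g1f3", "c2c4", "g2g3"],
    [0.3, 0.3, 0.2, 0.1, 0.0],
    "e2e4",
)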
Generation Example (Inference):
prompt = "r1bqkbnr/pppp1ppp/2n5/4p3/4P3/5N2/PPPP1PPP/RNBQKB1R w KQkq - 2 3"
output = "M: d2d4 b1c3 f1c4 f1b5 d2d3 E: 0.6 0.5 0.4 0.3 0.2 B: d2d4"
The model learns to:
- Analyze the position
- Generate plausible candidate moves
- Evaluate each candidate
- Select the best move based on evaluations (see the inference sketch below)
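A hedged inference sketch using the standard transformers API; the checkpoint path is a placeholder and assumes the published checkpoint ships with a compatible tokenizer (the custom chess tokenizer described under Technical Details):
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder path: substitute the actual ROOK-LM-124M checkpoint.
model_id = "path/to/rook-lm-124m"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "r1bqkbnr/pppp1ppp/2n5/4p3/4P3/5N2/PPPP1PPP/RNBQKB1R w KQkq - 2 3"
inputs = tokenizer(prompt, return_tensors="pt")

# Greedy decoding; the completion should contain the M:/E:/B: fields.
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))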
Training Procedure
- Hardware: 2x NVIDIA RTX 4090
- Framework: llm.c (karpathy/llm.c)
- Trained for multiple epochs on rook-40m; sequence length up to 2048 tokens (a data-preparation sketch follows this list)
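llm.c consumes pre-tokenized .bin shards rather than raw text. A minimal data-preparation sketch, assuming the write_datafile helper from llm.c's dev/data/data_common.py is on the path and a hypothetical rook_samples.txt dump of formatted samples:
from data_common import write_datafile  # helper from llm.c, dev/data/

def tokenize(sample: str) -> list[int]:
    # Placeholder for the custom chess tokenizer described under
    # Technical Details; not implemented in this sketch.
    raise NotImplementedError

tokens: list[int] = []
with open("rook_samples.txt") as f:  # hypothetical dump of samples
    for line in f:
        tokens.extend(tokenize(line.rstrip("\n")))

# write_datafile emits the header plus uint16 token stream that
# the llm.c training binary expects for its train/val shards.
write_datafile("data/rook_train.bin", tokens)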
Evaluation
Performance Metrics
- Action accuracy (rook-40m, 3 epochs): 22.2%
- BIG-bench Checkmate-in-One: 24.4%
- Values from the LAION research note (an accuracy-scoring sketch follows this list)
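Action accuracy is exact-match agreement between the generated best move and Stockfish's choice. A minimal scoring sketch, assuming completions in the M:/E:/B: layout shown earlier:
import re

def predicted_best_move(completion: str) -> str | None:
    # Pull the move following the "B:" marker, e.g. "... B: e2e4".
    m = re.search(r"B:\s*([a-h][1-8][a-h][1-8][qrbn]?)", completion)
    return m.group(1) if m else None

def action_accuracy(completions: list[str], best_moves: list[str]) -> float:
    hits = sum(predicted_best_move(c) == b
               for c, b in zip(completions, best_moves))
    return hits / len(best_moves)

print(action_accuracy(["M: d2d4 ... B: d2d4"], ["d2d4"]))  # 1.0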
Reasoning Quality
The model generates coherent chess analysis including:
- Position evaluation
- Tactical motif identification
- Strategic planning
- Move justification
Technical Details
Tokenization
Custom chess tokenizer combining:
- FEN notation tokens
- UCI move notation
- Natural language vocabulary
- Special tokens for structure (an illustrative sketch follows this list)
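The actual vocabulary is defined in the repository; purely as illustration, a toy encoder combining these ingredients might look like the following (every name here is hypothetical):
# Toy illustration of the vocabulary structure; NOT the released tokenizer.
STRUCTURE = ["M:", "E:", "B:"]                              # structural markers
SQUARES = [f + r for f in "abcdefgh" for r in "12345678"]   # UCI squares
FEN_CHARS = sorted(set("rnbqkpRNBQKP0123456789/w- "))       # board characters

vocab = {tok: i for i, tok in enumerate(STRUCTURE + SQUARES + FEN_CHARS)}

def encode_move(uci: str) -> list[int]:
    # A UCI move is a from-square plus a to-square, e.g. "e2e4".
    return [vocab[uci[0:2]], vocab[uci[2:4]]]

print(encode_move("e2e4"))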
Integration with llm.c
The model uses the llm.c framework for efficient training. Note that llm.c's training binaries take single-letter flags (-i/-j for train/val token shards, -o for the log directory, -b for batch size, -t for sequence length); the invocation below is illustrative and the paths are placeholders:
./train_gpt2cu \
  -i "data/rook_train*.bin" \
  -j "data/rook_val*.bin" \
  -o log \
  -b 512 \
  -t 2048
Limitations
- Computation: No deep search capabilities
- Tactics: May miss complex combinations
- Consistency: Reasoning may not always align with move choice
- Context: Limited by 2048 token context window
Citation
@article{rook2024,
  title={ROOK: Strategic Reasoning in Chess Without Search},
  author={Rahn, Jonathan and Jitsev, Jenia and Sun, Qi},
  journal={LAION Research Notes},
  year={2024},
  url={https://laion.ai/notes/rook/}
}
Model Card Contact
Jonathan Rahn - GitHub | Research Page
Metrics Source
LAION research note: https://laion.ai/notes/rook/