ROOK-LM-124M
A 124M-parameter language model for chess with chain-of-thought reasoning, trained on synthetic explanations generated with Stockfish 16.1.
Model Details
Model Description
ROOK-LM generates chess moves with detailed reasoning traces, incorporating position analysis, candidate evaluation, and move selection in a chain-of-thought format.
- Developed by: Jonathan Rahn, Jenia Jitsev (LAION/JSC), Qi Sun (Tokyo Tech/Sakana AI)
- Model type: GPT-2 (autoregressive language model)
- Language(s): Chess notation with natural language explanations
- License: MIT
- Repository: GitHub
- Paper: LAION Research Note
- Logs: Weights & Biases
Model Architecture
- Parameters: 124M
- Architecture: GPT-2 family
- Context Length: up to 2048 tokens
- Training Framework: llm.c; the Hugging Face scripts in this repo support experiments
Uses
Direct Use
- Chess move generation with explanations
- Chess position analysis
- Educational chess tutoring
- Research on reasoning in language models
Downstream Use
- Fine-tuning for specific chess styles
- Integration with chess interfaces
- Building chess teaching assistants
Training Details
Training Data
- Dataset: rook-40m
- Size: 40M positions (6B tokens)
- Generation: Stockfish 16.1 on Tsubame 4.0 supercomputer
- Format: FEN position → reasoning → move
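The two dataset figures imply an average sample length. A quick back-of-the-envelope check, assuming both numbers are rounded and that every token belongs to exactly one position:

```python
# Figures quoted in the Training Data list (both are rounded).
positions = 40_000_000
tokens = 6_000_000_000

# Average number of tokens spent on one position, including any padding.
tokens_per_position = tokens / positions
print(f"{tokens_per_position:.0f} tokens per position on average")
```

This works out to roughly 150 tokens per position.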
Chain-of-Thought Format
ROOK-LM uses a structured format with position, candidate moves, evaluations, and best move:
<FEN position>
M: <candidate moves in UCI notation>
E: <evaluation scores for each candidate>
B: <best move in UCI notation>
Concrete Training Example:
rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1
M: e2e4 d2d4 g1f3 c2c4 g2g3
E: 0.3 0.3 0.2 0.1 0.0
B: e2e4
Breakdown:
- Position in FEN notation (padded to 90 chars for consistency)
- M: Top 5 candidate moves from Stockfish analysis (UCI format, padded to 30 chars)
- E: Evaluation scores for each candidate move (centipawns/100, padded to 40 chars)
- B: Best move selected by Stockfish
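The padding rules can be sketched in a few lines of Python. This is an illustration only, not the dataset's actual generation code: the function name `format_rook_sample` is hypothetical, and the sketch assumes that padding uses spaces, that the `M:`, `E:` and `B:` labels sit outside the padded fields, and that all fields are joined on a single line.

```python
def format_rook_sample(fen, moves, evals, best, pad_char=" "):
    """Build one sample with the field widths from the breakdown.

    Assumptions (not confirmed by the source): space padding, labels
    outside the padded fields, everything joined on one line.
    """
    if len(moves) != len(evals):
        raise ValueError("need exactly one evaluation per candidate move")
    fen_field = fen.ljust(90, pad_char)                    # FEN padded to 90 chars
    move_field = " ".join(moves).ljust(30, pad_char)       # UCI moves padded to 30 chars
    eval_field = " ".join(f"{e:.1f}" for e in evals).ljust(40, pad_char)  # scores padded to 40 chars
    return f"{fen_field}M: {move_field}E: {eval_field}B: {best}"


sample = format_rook_sample(
    "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1",
    ["e2e4", "d2d4", "g1f3", "c2c4", "g2g3"],
    [0.3, 0.3, 0.2, 0.1, 0.0],
    "e2e4",
)
```

Fixed-width fields make every sample the same length up to the best move, which keeps the `M:`, `E:` and `B:` markers at stable offsets.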
Generation Example (Inference):
# Input prompt
prompt = "r1bqkbnr/pppp1ppp/2n5/4p3/4P3/5N2/PPPP1PPP/RNBQKB1R w KQkq - 2 3"
# Model generates continuation (stripped padding)
output = "M: d2d4 b1c3 f1c4 f1b5 d2d3 E: 0.6 0.5 0.4 0.3 0.2 B: d2d4"
The model learns to:
- Analyze the position
- Generate plausible candidate moves
- Evaluate each candidate
- Select the best move based on evaluations
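To use a generation downstream, the continuation has to be split back into its three fields. A minimal parsing sketch, assuming the output follows the `M:` / `E:` / `B:` layout shown in the generation example (the helper name `parse_rook_output` is hypothetical, not part of the repository):

```python
import re

# Matches "M: <moves> E: <scores> B: <move>" with arbitrary whitespace or padding.
ROOK_PATTERN = re.compile(
    r"M:\s*(?P<moves>.*?)\s*E:\s*(?P<evals>.*?)\s*B:\s*(?P<best>\S+)",
    flags=re.DOTALL,
)


def parse_rook_output(text):
    """Return candidate moves, scores and best move, or None if malformed."""
    match = ROOK_PATTERN.search(text)
    if match is None:
        return None
    try:
        evals = [float(token) for token in match.group("evals").split()]
    except ValueError:
        return None  # a score was not a number
    return {
        "moves": match.group("moves").split(),
        "evals": evals,
        "best": match.group("best"),
    }


parsed = parse_rook_output(
    "M: d2d4 b1c3 f1c4 f1b5 d2d3 E: 0.6 0.5 0.4 0.3 0.2 B: d2d4"
)
```

Returning `None` for malformed text lets a caller fall back to another move source instead of crashing on an unexpected generation.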
Training Procedure
- Hardware: 2x NVIDIA RTX 4090
- Framework: llm.c (karpathy/llm.c)
- Trained for multiple epochs on rook-40m with llm.c, using sequences of up to 2048 tokens
Evaluation
Performance Metrics
- Action accuracy (rook-40m, 3 epochs): 22.2%
- BIG-bench Checkmate-in-One: 24.4%
- Both values are taken from the LAION research note
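The card does not define action accuracy. The sketch below assumes it is the fraction of positions where the model's `B:` move exactly matches the Stockfish best move. The function name and the sample moves are illustrative, not taken from the evaluation code.

```python
def action_accuracy(predicted, reference):
    """Share of positions where the predicted move equals the reference move."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        return 0.0
    hits = sum(p == r for p, r in zip(predicted, reference))
    return hits / len(reference)


# Illustrative data only: two of the four predictions match.
accuracy = action_accuracy(
    ["e2e4", "d2d4", "g1f3", "a2a3"],
    ["e2e4", "c2c4", "g1f3", "e2e4"],
)
```

Under this definition a move that is nearly as good as the reference still counts as a miss, so the metric is a strict lower bound on how often the model plays a strong move.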
Reasoning Quality
The model generates coherent chess analysis including:
- Position evaluation
- Tactical motif identification
- Strategic planning
- Move justification
Technical Details
Tokenization
Custom chess tokenizer combining:
- FEN notation tokens
- UCI move notation
- Natural language vocabulary
- Special tokens for structure
Integration with llm.c
The model uses the llm.c framework for efficient training:
# Training command
./train_gpt2 \
--input_bin data/rook_train.bin \
--val_bin data/rook_val.bin \
--model_file log/model.bin \
--batch_size 512 \
--sequence_length 2048
Limitations
- Computation: No deep search capabilities
- Tactics: May miss complex combinations
- Consistency: Reasoning may not always align with move choice
- Context: Limited by 2048 token context window
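The consistency limitation can be checked mechanically on each generation. A minimal sketch, assuming that a higher `E:` score is better for the side to move and that ties count as consistent (the helper name `is_consistent` is hypothetical):

```python
def is_consistent(moves, evals, best):
    """True when `best` is a listed candidate whose score ties for the highest.

    Assumption (not confirmed by the source): scores are from the
    perspective of the side to move, so higher is better.
    """
    if not moves or len(moves) != len(evals) or best not in moves:
        return False
    return evals[moves.index(best)] == max(evals)


candidates = ["e2e4", "d2d4", "g1f3", "c2c4", "g2g3"]
scores = [0.3, 0.3, 0.2, 0.1, 0.0]

print(is_consistent(candidates, scores, "e2e4"))  # top-scored candidate
print(is_consistent(candidates, scores, "g2g3"))  # lowest-scored candidate
```

A check like this only detects disagreement between the stated scores and the chosen move. It says nothing about whether the move is legal or strong.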
Related Models
- ROOK-CLF-9M: Classification approach
- RookWorld-LM-124M: Unified agent+environment model
Citation
@article{rook2024,
title={ROOK: Strategic Reasoning in Chess Without Search},
author={Rahn, Jonathan and Jitsev, Jenia and Sun, Qi},
journal={LAION Research Notes},
year={2024},
url={https://laion.ai/notes/rook/}
}
Model Card Contact
Jonathan Rahn - GitHub | Research Page
Metrics Source
LAION research note: https://laion.ai/notes/rook/