ViT-Beatrix: Fractal PE + Geometric Simplex Vision Transformer
This repository contains Vision Transformer models that integrate Devil's Staircase (Cantor function) positional encoding with geometric simplex features for image classification.
Key Features
- Fractal Positional Encoding: Devil's Staircase multi-scale position embeddings
- Geometric Simplex Features: k-simplex vertex computations from Cantor measure
- SimplexFactory Initialization: Pre-initialized simplices with geometrically meaningful shapes (regular/random/uniform)
- Adaptive Augmentation: Progressive augmentation escalation to prevent overfitting
- Beatrix Formula Suite: Flow alignment, hierarchical coherence, and multi-scale consistency losses
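The Devil's Staircase PE is derived from the Cantor function. As a point of reference, a generic approximation of that function (not the repository's implementation, which builds multi-scale embeddings on top of it) can be computed from the ternary expansion of a position in [0, 1):

```python
def cantor(x, levels=12):
    """Approximate the Cantor function (Devil's Staircase) on [0, 1)
    by walking `levels` ternary digits of x."""
    result = 0.0
    scale = 0.5
    for _ in range(levels):
        x *= 3
        digit = int(x)
        x -= digit
        if digit == 1:
            # landed in a removed middle third: the function plateaus here
            result += scale
            break
        # map ternary digit 2 to binary digit 1, digit 0 stays 0
        result += scale * (digit // 2)
        scale /= 2
    return result
```

The `levels` parameter mirrors the "PE Levels: 12" setting listed below: each additional level refines the staircase by one ternary digit.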
Simplex Initialization
Instead of random initialization, the model uses SimplexFactory to create geometrically sound starting configurations:
- Regular (default): All edges equal length, perfectly balanced symmetric structure
- Random: QR decomposition ensuring affine independence
- Uniform: Hypercube sampling with perturbations
Regular simplices provide the most stable and mathematically meaningful initialization, giving the model a better starting point for learning geometric features.
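A regular k-simplex can be constructed by centering the k+1 standard basis vectors of R^(k+1), which makes all pairwise edge lengths equal. This is a minimal sketch of that standard construction, assuming `regular_simplex` as a hypothetical stand-in for the `regular` mode of SimplexFactory:

```python
import numpy as np

def regular_simplex(k, scale=1.0):
    """Return the k+1 vertices of a regular k-simplex, centered at the origin.

    Construction: take the standard basis vectors of R^(k+1) and subtract
    their centroid; every pairwise distance is then sqrt(2) * scale.
    """
    verts = np.eye(k + 1)
    verts -= verts.mean(axis=0)
    return scale * verts
```

Affine independence of the vertices (which the `random` mode enforces via QR decomposition) holds here by construction, since the centered basis vectors span a k-dimensional subspace.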
Adaptive Augmentation System
The trainer includes an adaptive augmentation system that monitors the train/validation accuracy gap and progressively enables stronger augmentation:
- Baseline: RandomCrop + RandomHorizontalFlip
- Stage 1: + ColorJitter
- Stage 2: + RandomRotation
- Stage 3: + RandomAffine
- Stage 4: + RandomErasing
- Stage 5: + AutoAugment (CIFAR policy)
- Stage 6: Enable Mixup (α=0.2)
- Stage 7: Enable CutMix (α=1.0) - Final stage
When train accuracy exceeds validation accuracy by 2% or more, the system automatically escalates to the next augmentation stage.
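The escalation rule above can be sketched as a small state machine. This is a hedged illustration of the described behavior, not the trainer's actual class or API:

```python
# Augmentation stages as described in the list above.
STAGES = [
    "baseline", "+ColorJitter", "+RandomRotation", "+RandomAffine",
    "+RandomErasing", "+AutoAugment", "+Mixup(0.2)", "+CutMix(1.0)",
]

class AdaptiveAugmenter:
    """Hypothetical sketch: escalate one stage whenever the
    train/validation accuracy gap reaches the threshold (2% by default)."""

    def __init__(self, gap_threshold=0.02):
        self.stage = 0
        self.gap_threshold = gap_threshold

    def update(self, train_acc, val_acc):
        gap = train_acc - val_acc
        if gap >= self.gap_threshold and self.stage < len(STAGES) - 1:
            self.stage += 1
        return STAGES[self.stage]
```

Escalation is one stage per check, so even a large gap moves through the stages gradually rather than jumping straight to CutMix.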
Available Models (Best Checkpoints Only)
| Model Name | Training Session | Accuracy | Epoch | Weights Path | Logs Path |
|---|---|---|---|---|---|
| beatrix-cifar100 | 20251007_182851 | 0.5819 | 42 | weights/beatrix-cifar100/20251007_182851 | N/A |
| beatrix-simplex4-patch4-512d-flow | 20251008_115206 | 0.5674 | 87 | weights/beatrix-simplex4-patch4-512d-flow/20251008_115206 | logs/beatrix-simplex4-patch4-512d-flow/20251008_115206 |
| beatrix-simplex7-patch4-256d-ce | 20251008_034231 | 0.5372 | 77 | weights/beatrix-simplex7-patch4-256d-ce/20251008_034231 | logs/beatrix-simplex7-patch4-256d-ce/20251008_034231 |
| beatrix-simplex7-patch4-256d | 20251008_020048 | 0.5291 | 89 | weights/beatrix-simplex7-patch4-256d/20251008_020048 | logs/beatrix-simplex7-patch4-256d/20251008_020048 |
| beatrix-cifar100 | 20251007_215344 | 0.5161 | 41 | weights/beatrix-cifar100/20251007_215344 | logs/beatrix-cifar100/20251007_215344 |
| beatrix-cifar100 | 20251007_195812 | 0.4701 | 42 | weights/beatrix-cifar100/20251007_195812 | logs/beatrix-cifar100/20251007_195812 |
| beatrix-cifar100 | 20251008_002950 | 0.4363 | 49 | weights/beatrix-cifar100/20251008_002950 | logs/beatrix-cifar100/20251008_002950 |
| beatrix-cifar100 | 20251007_203741 | 0.4324 | 40 | weights/beatrix-cifar100/20251007_203741 | logs/beatrix-cifar100/20251007_203741 |
| beatrix-simplex7-patch4-45d | 20251008_010524 | 0.2917 | 95 | weights/beatrix-simplex7-patch4-45d/20251008_010524 | logs/beatrix-simplex7-patch4-45d/20251008_010524 |
| beatrix-4simplex-45d | 20251007_231008 | 0.2916 | 85 | weights/beatrix-4simplex-45d/20251007_231008 | logs/beatrix-4simplex-45d/20251007_231008 |
| beatrix-cifar100 | 20251007_193112 | 0.2802 | 10 | weights/beatrix-cifar100/20251007_193112 | N/A |
| beatrix-4simplex-45d | 20251008_001147 | 0.1382 | 10 | weights/beatrix-4simplex-45d/20251008_001147 | logs/beatrix-4simplex-45d/20251008_001147 |
Latest Updated Model: beatrix-simplex4-patch4-512d-flow (Session: 20251008_115206)
Model Details
- Architecture: Vision Transformer with fractal positional encoding
- Dataset: CIFAR-100 (100 classes)
- Embedding Dimension: 512
- Depth: 8 layers
- Patch Size: 4x4
- PE Levels: 12
- Simplex Dimension: 4-simplex
- Simplex Initialization: regular (scale=1.0)
Training Details
- Training Session: 20251008_115206
- Best Accuracy: 0.5674
- Epochs Trained: 87
- Batch Size: 512
- Learning Rate: 0.0001
- Adaptive Augmentation: Enabled
Loss Configuration
- Task Loss Weight: 0.5
- Flow Alignment Weight: 1.0
- Coherence Weight: 0.3
- Multi-Scale Weight: 0.2
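The weights above suggest a weighted sum of the four loss terms. This is a hedged sketch of that combination; only the weight values come from this card, and the term names (`task`, `flow`, `coherence`, `multiscale`) are placeholders for the Beatrix formula suite losses:

```python
# Weights as listed in the Loss Configuration above.
WEIGHTS = {"task": 0.5, "flow": 1.0, "coherence": 0.3, "multiscale": 0.2}

def total_loss(losses, weights=WEIGHTS):
    """Combine per-term loss values into a single scalar by weighted sum."""
    return sum(weights[name] * value for name, value in losses.items())
```

Note that the flow alignment term carries twice the weight of the task loss in this configuration, consistent with the `-flow` suffix of the best-performing model variant.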
TensorBoard Logs
Training logs are available in the repository at:
logs/beatrix-simplex4-patch4-512d-flow/20251008_115206
To view them locally:
```bash
# Clone the repo
git clone https://huggingface.co/AbstractPhil/vit-beatrix

# View logs in TensorBoard
tensorboard --logdir vit-beatrix/logs/beatrix-simplex4-patch4-512d-flow/20251008_115206
```
Usage
Installation
For Google Colab:
```python
# Install for Colab (the ! shell syntax only works inside a notebook cell)
try:
    !pip uninstall -qy geometricvocab
except Exception:
    pass
!pip install -q git+https://github.com/AbstractEyes/lattice_vocabulary.git
```
For local environments:
```bash
# Install the repo into your environment
pip install -q git+https://github.com/AbstractEyes/lattice_vocabulary.git
```
Loading Models
```python
from geovocab2.train.model.core.vit_beatrix import SimplifiedGeometricClassifier
from safetensors.torch import load_file
from huggingface_hub import hf_hub_download
import json

# Download and view manifest to see all available models
manifest_path = hf_hub_download(
    repo_id="AbstractPhil/vit-beatrix",
    filename="manifest.json"
)
with open(manifest_path, 'r') as f:
    manifest = json.load(f)

# List all available models sorted by accuracy
for key, info in sorted(manifest.items(), key=lambda x: x[1]['accuracy'], reverse=True):
    print(f"{info['model_name']} ({info['timestamp']}): {info['accuracy']:.4f}")

# Download weights for the latest training session of beatrix-simplex4-patch4-512d-flow
weights_path = hf_hub_download(
    repo_id="AbstractPhil/vit-beatrix",
    filename="weights/beatrix-simplex4-patch4-512d-flow/20251008_115206/model.safetensors"
)

# Load model
model = SimplifiedGeometricClassifier(
    num_classes=100,
    img_size=32,
    embed_dim=512,
    depth=8
)

# Load weights
state_dict = load_file(weights_path)
model.load_state_dict(state_dict)
model.eval()

# Inference
output = model(images)
```
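The loading snippet assumes `images` is already a batch of preprocessed 32x32 tensors. A minimal preprocessing sketch is shown below; note that the channel mean/std values are the commonly used CIFAR-100 statistics and are an assumption here, since this card does not state the normalization used in training:

```python
import numpy as np

# Assumed CIFAR-100 channel statistics (verify against the training code).
MEAN = np.array([0.5071, 0.4865, 0.4409], dtype=np.float32)
STD = np.array([0.2673, 0.2564, 0.2762], dtype=np.float32)

def preprocess(image_hwc):
    """Convert a uint8 HWC image (32, 32, 3) into a normalized
    NCHW batch of shape (1, 3, 32, 32)."""
    x = image_hwc.astype(np.float32) / 255.0
    x = (x - MEAN) / STD
    return np.transpose(x, (2, 0, 1))[None]
```

The resulting array can be wrapped with `torch.from_numpy` before being passed to the model.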
Citation
```bibtex
@misc{vit-beatrix,
  author = {AbstractPhil},
  title  = {ViT-Beatrix: Fractal Positional Encoding with Geometric Simplices},
  year   = {2025},
  url    = {https://github.com/AbstractEyes/lattice_vocabulary}
}
```
License
MIT License