ViT-Beatrix: Fractal PE + Geometric Simplex Vision Transformer

This repository contains Vision Transformer models that integrate Devil's Staircase positional encoding with geometric simplex features for image classification.

Key Features

  • Fractal Positional Encoding: Devil's Staircase multi-scale position embeddings (the underlying Cantor function is sketched after this list)
  • Geometric Simplex Features: k-simplex vertex computations from Cantor measure
  • SimplexFactory Initialization: Pre-initialized simplices with geometrically meaningful shapes (regular/random/uniform)
  • Adaptive Augmentation: Progressive augmentation escalation to prevent overfitting
  • Beatrix Formula Suite: Flow alignment, hierarchical coherence, and multi-scale consistency losses
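
The fractal PE builds on the Devil's Staircase, i.e. the Cantor function. As a reference point, here is a standalone sketch of that mathematical function (not the repository's implementation), truncated to a fixed number of ternary levels, mirroring the PE Levels hyperparameter below:

def cantor_function(x: float, levels: int = 12) -> float:
    """Devil's Staircase (Cantor function) on [0, 1], truncated to
    `levels` base-3 digits. Flat on every removed middle third."""
    if x >= 1.0:
        return 1.0
    y, scale = 0.0, 0.5
    for _ in range(levels):
        x *= 3.0
        digit = int(x)                 # next base-3 digit of x
        x -= digit
        if digit == 1:                 # x lies in a removed middle-third gap
            return y + scale           # the staircase is constant here
        y += scale * (digit // 2)      # digit 0 -> bit 0, digit 2 -> bit 1
        scale *= 0.5
    return y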

Simplex Initialization

Instead of random initialization, the model uses SimplexFactory to create geometrically sound starting configurations:

  • Regular (default): All edges equal length, perfectly balanced symmetric structure
  • Random: QR decomposition ensuring affine independence
  • Uniform: Hypercube sampling with perturbations

Regular simplices provide the most stable initialization: every pair of vertices is equidistant, giving the model a symmetric, well-conditioned starting point for learning geometric features.
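
For reference, a regular k-simplex with k+1 mutually equidistant vertices can be built from the centered standard basis of R^(k+1). This is a minimal standalone sketch of the geometry, independent of SimplexFactory's actual API:

import numpy as np

def regular_simplex(k: int, scale: float = 1.0) -> np.ndarray:
    """(k+1) x (k+1) array of vertices; every pair of vertices is
    the same distance apart (sqrt(2) * scale)."""
    verts = np.eye(k + 1)                         # standard basis vectors
    verts -= verts.mean(axis=0, keepdims=True)    # center at the origin
    return scale * verts

V = regular_simplex(4)                            # 5 vertices of a 4-simplex
assert np.isclose(np.linalg.norm(V[0] - V[1]), np.linalg.norm(V[1] - V[2]))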

Adaptive Augmentation System

The trainer includes an augmentation system that monitors the train/validation accuracy gap and progressively enables stronger augmentation:

  1. Baseline: RandomCrop + RandomHorizontalFlip
  2. Stage 1: + ColorJitter
  3. Stage 2: + RandomRotation
  4. Stage 3: + RandomAffine
  5. Stage 4: + RandomErasing
  6. Stage 5: + AutoAugment (CIFAR policy)
  7. Stage 6: Enable Mixup (α=0.2)
  8. Stage 7: Enable CutMix (α=1.0) - Final stage

When train accuracy exceeds validation accuracy by 2% or more, the system automatically escalates to the next augmentation stage.
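
A minimal sketch of that escalation rule (hypothetical class and stage names; the trainer's actual implementation may differ):

AUG_STAGES = ["baseline", "color_jitter", "rotation", "affine",
              "erasing", "autoaugment", "mixup", "cutmix"]

class AugmentationScheduler:
    """Advance one stage whenever train accuracy leads validation
    accuracy by at least `threshold` (2% by default)."""
    def __init__(self, threshold: float = 0.02):
        self.threshold = threshold
        self.stage = 0

    def step(self, train_acc: float, val_acc: float) -> str:
        if train_acc - val_acc >= self.threshold and self.stage < len(AUG_STAGES) - 1:
            self.stage += 1               # escalate once per check
        return AUG_STAGES[self.stage]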

Available Models (Best Checkpoints Only)

| Model Name | Training Session | Accuracy | Epoch | Weights Path | Logs Path |
|---|---|---|---|---|---|
| beatrix-cifar100 | 20251007_182851 | 0.5819 | 42 | weights/beatrix-cifar100/20251007_182851 | N/A |
| beatrix-simplex4-patch4-512d-flow | 20251008_115206 | 0.5674 | 87 | weights/beatrix-simplex4-patch4-512d-flow/20251008_115206 | logs/beatrix-simplex4-patch4-512d-flow/20251008_115206 |
| beatrix-simplex7-patch4-256d-ce | 20251008_034231 | 0.5372 | 77 | weights/beatrix-simplex7-patch4-256d-ce/20251008_034231 | logs/beatrix-simplex7-patch4-256d-ce/20251008_034231 |
| beatrix-simplex7-patch4-256d | 20251008_020048 | 0.5291 | 89 | weights/beatrix-simplex7-patch4-256d/20251008_020048 | logs/beatrix-simplex7-patch4-256d/20251008_020048 |
| beatrix-cifar100 | 20251007_215344 | 0.5161 | 41 | weights/beatrix-cifar100/20251007_215344 | logs/beatrix-cifar100/20251007_215344 |
| beatrix-cifar100 | 20251007_195812 | 0.4701 | 42 | weights/beatrix-cifar100/20251007_195812 | logs/beatrix-cifar100/20251007_195812 |
| beatrix-cifar100 | 20251008_002950 | 0.4363 | 49 | weights/beatrix-cifar100/20251008_002950 | logs/beatrix-cifar100/20251008_002950 |
| beatrix-cifar100 | 20251007_203741 | 0.4324 | 40 | weights/beatrix-cifar100/20251007_203741 | logs/beatrix-cifar100/20251007_203741 |
| beatrix-simplex7-patch4-45d | 20251008_010524 | 0.2917 | 95 | weights/beatrix-simplex7-patch4-45d/20251008_010524 | logs/beatrix-simplex7-patch4-45d/20251008_010524 |
| beatrix-4simplex-45d | 20251007_231008 | 0.2916 | 85 | weights/beatrix-4simplex-45d/20251007_231008 | logs/beatrix-4simplex-45d/20251007_231008 |
| beatrix-cifar100 | 20251007_193112 | 0.2802 | 10 | weights/beatrix-cifar100/20251007_193112 | N/A |
| beatrix-4simplex-45d | 20251008_001147 | 0.1382 | 10 | weights/beatrix-4simplex-45d/20251008_001147 | logs/beatrix-4simplex-45d/20251008_001147 |

Most Recently Updated Model: beatrix-simplex4-patch4-512d-flow (Session: 20251008_115206)

Model Details

  • Architecture: Vision Transformer with fractal positional encoding
  • Dataset: CIFAR-100 (100 classes)
  • Embedding Dimension: 512
  • Depth: 8 layers
  • Patch Size: 4x4
  • PE Levels: 12
  • Simplex Dimension: 4-simplex
  • Simplex Initialization: regular (scale=1.0)

Training Details

  • Training Session: 20251008_115206
  • Best Accuracy: 0.5674
  • Epochs Trained: 87
  • Batch Size: 512
  • Learning Rate: 0.0001
  • Adaptive Augmentation: Enabled

Loss Configuration

  • Task Loss Weight: 0.5
  • Flow Alignment Weight: 1.0
  • Coherence Weight: 0.3
  • Multi-Scale Weight: 0.2
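
In other words, the training objective is a weighted sum of the four terms. A minimal sketch (hypothetical loss-term names; each value is a scalar tensor computed per batch):

LOSS_WEIGHTS = {"task": 0.5, "flow_alignment": 1.0,
                "coherence": 0.3, "multi_scale": 0.2}

def combine_losses(losses: dict):
    # losses maps each term name to its scalar loss for the batch
    return sum(LOSS_WEIGHTS[name] * value for name, value in losses.items())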

TensorBoard Logs

Training logs are available in the repository at:

logs/beatrix-simplex4-patch4-512d-flow/20251008_115206

To view them locally:

# Clone the repo
git clone https://huggingface.co/AbstractPhil/vit-beatrix

# View logs in TensorBoard
tensorboard --logdir vit-beatrix/logs/beatrix-simplex4-patch4-512d-flow/20251008_115206

Usage

Installation

For Google Colab:

# Install for Colab: remove any previously installed copy, then install.
# (-y answers the uninstall prompt; this is a no-op if the package is absent.)
!pip uninstall -qy geometricvocab
!pip install -q git+https://github.com/AbstractEyes/lattice_vocabulary.git

For local environments:

# install the repo into your environment
pip install -q git+https://github.com/AbstractEyes/lattice_vocabulary.git

Loading Models

from geovocab2.train.model.core.vit_beatrix import SimplifiedGeometricClassifier
from safetensors.torch import load_file
from huggingface_hub import hf_hub_download
import json

# Download and view manifest to see all available models
manifest_path = hf_hub_download(
    repo_id="AbstractPhil/vit-beatrix",
    filename="manifest.json"
)

with open(manifest_path, 'r') as f:
    manifest = json.load(f)
    
# List all available models sorted by accuracy
for key, info in sorted(manifest.items(), key=lambda x: x[1]['accuracy'], reverse=True):
    print(f"{info['model_name']} ({info['timestamp']}): {info['accuracy']:.4f}")

# Download weights for the latest training session of beatrix-simplex4-patch4-512d-flow
weights_path = hf_hub_download(
    repo_id="AbstractPhil/vit-beatrix",
    filename="weights/beatrix-simplex4-patch4-512d-flow/20251008_115206/model.safetensors"
)

# Load model
model = SimplifiedGeometricClassifier(
    num_classes=100,
    img_size=32,
    embed_dim=512,
    depth=8
)

# Load weights
state_dict = load_file(weights_path)
model.load_state_dict(state_dict)
model.eval()

# Inference (building the `images` batch is sketched below)
output = model(images)
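
To build the `images` batch above, here is a hedged end-to-end example using torchvision with commonly cited CIFAR-100 normalization statistics (whether training used these exact values is an assumption; confirm against the training configuration):

import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Commonly used CIFAR-100 channel means/stds; verify before relying on them.
tfm = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5071, 0.4865, 0.4409),
                         (0.2673, 0.2564, 0.2762)),
])
testset = datasets.CIFAR100(root="./data", train=False,
                            download=True, transform=tfm)
images, labels = next(iter(DataLoader(testset, batch_size=8)))

with torch.no_grad():
    logits = model(images)          # assumes the forward pass returns logits
    preds = logits.argmax(dim=-1)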

Citation

@misc{vit-beatrix,
  author = {AbstractPhil},
  title = {ViT-Beatrix: Fractal Positional Encoding with Geometric Simplices},
  year = {2025},
  url = {https://github.com/AbstractEyes/lattice_vocabulary}
}

License

MIT License
