LLAMIAFlux - Pretrained Model

This model was pretrained on the coyo-hd-11m-llavanext dataset to predict CLIP embeddings from text descriptions.

  • Number of image generation heads: 4
  • Training parameters:
    • Batch size: 32
    • Learning rate: 3e-05
    • Weight decay: 0.01
    • Epochs: 2
    • Model base: ssingh22/LLAMIAFlux-7b-unprojector-inverted
    • Trained on 9438926 image-caption pairs from coyo-hd-11m-llavanext
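
As a rough sanity check on the scale of this run, the parameters above can be collected into a small config object and used to estimate the number of optimizer steps. This is only a sketch: the `PretrainConfig` class and the helper functions are hypothetical names introduced for illustration, and the step counts assume a single data-parallel worker, no gradient accumulation, and that the listed batch size is the global batch size, none of which the model card states.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class PretrainConfig:
    """Hypothetical container for the values listed in this model card."""
    num_image_heads: int = 4
    batch_size: int = 32
    learning_rate: float = 3e-5
    weight_decay: float = 0.01
    epochs: int = 2
    base_model: str = "ssingh22/LLAMIAFlux-7b-unprojector-inverted"
    num_pairs: int = 9_438_926  # image-caption pairs from coyo-hd-11m-llavanext


def steps_per_epoch(cfg: PretrainConfig, drop_last: bool = False) -> int:
    """Batches per epoch; the last partial batch is kept unless drop_last is set."""
    if drop_last:
        return cfg.num_pairs // cfg.batch_size
    return math.ceil(cfg.num_pairs / cfg.batch_size)


def total_steps(cfg: PretrainConfig, drop_last: bool = False) -> int:
    """Optimizer steps over all epochs (assumes one step per batch)."""
    return steps_per_epoch(cfg, drop_last) * cfg.epochs


cfg = PretrainConfig()
print(steps_per_epoch(cfg))  # 294967
print(total_steps(cfg))      # 589934
```

Under those assumptions, each epoch is about 295k steps, with a final partial batch of 14 pairs when the last batch is not dropped.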