jekunz/smollm-135m-lora-fineweb-faroese
A LoRA adapter for HuggingFaceTB/SmolLM2-135M, trained on the Faroese subset of HuggingFaceFW/fineweb-2. Format: Safetensors. License: apache-2.0.
LoRA setup (see the configuration sketch below):
- Rank: 256
- Alpha: 512
- Target modules: ["up_proj", "down_proj", "gate_proj", "o_proj"]
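A minimal sketch of this adapter configuration with the peft library; rank, alpha, and the target modules come from the list above, while lora_dropout and task_type are assumptions not stated on the card.

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Load the base model the adapter was trained on.
base = AutoModelForCausalLM.from_pretrained("HuggingFaceTB/SmolLM2-135M")

lora_config = LoraConfig(
    r=256,                 # rank, from the card
    lora_alpha=512,        # alpha, from the card
    target_modules=["up_proj", "down_proj", "gate_proj", "o_proj"],
    lora_dropout=0.0,      # assumed; not stated on the card
    task_type="CAUSAL_LM", # assumed; continued pretraining is causal LM
)

peft_model = get_peft_model(base, lora_config)
peft_model.print_trainable_parameters()
```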
Training (see the training sketch below):
- Epochs: 1
- Learning rate: 8e-4
- LR scheduler: cosine
- Warmup ratio: 0.05
- Batch size: 1 per device
- Hardware: 4× A100 (40 GB) GPUs
- Gradient accumulation steps: 64
- Effective batch size: 256 (1 per device × 4 GPUs × 64 accumulation steps)
- Max. context length: 8192 tokens
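A minimal sketch mapping these hyperparameters onto transformers.TrainingArguments, reusing peft_model from the sketch above. The output directory, the pad-token workaround, and the FineWeb-2 subset name ("fao_Latn") are assumptions; the multi-GPU launch (e.g. via accelerate or torchrun) is left out.

```python
from datasets import load_dataset
from transformers import (AutoTokenizer, DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM2-135M")
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # assumed fallback

# Subset name is an assumption based on FineWeb-2's language naming scheme.
dataset = load_dataset("HuggingFaceFW/fineweb-2", name="fao_Latn", split="train")

def tokenize(batch):
    # Truncate to the 8192-token maximum context length from the card.
    return tokenizer(batch["text"], truncation=True, max_length=8192)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

args = TrainingArguments(
    output_dir="smollm-135m-lora-fineweb-faroese",  # assumed name
    num_train_epochs=1,
    learning_rate=8e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    per_device_train_batch_size=1,   # per GPU
    gradient_accumulation_steps=64,  # 1 × 4 GPUs × 64 = 256 effective
)

trainer = Trainer(
    model=peft_model,  # from the configuration sketch above
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```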
(renamed from jekunz/smollm-135m-lora-fineweb-fao-test3)
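A minimal usage sketch for inference, assuming the repository hosts a standard PEFT adapter on top of the SmolLM2-135M base model listed below; the Faroese prompt is illustrative only.

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM2-135M")
base = AutoModelForCausalLM.from_pretrained("HuggingFaceTB/SmolLM2-135M")

# Attach the Faroese LoRA adapter to the base model.
model = PeftModel.from_pretrained(base, "jekunz/smollm-135m-lora-fineweb-faroese")
model.eval()

inputs = tokenizer("Føroyar eru", return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```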
Base model: HuggingFaceTB/SmolLM2-135M
Training dataset: HuggingFaceFW/fineweb-2 (Faroese subset)
Part of the SmolLM CPT LoRA collection.