---
library_name: transformers
license: mit
model-index:
- name: arshLlama
  results: []
pipeline_tag: text-generation
datasets:
- ajibawa-2023/Children-Stories-Collection
---

arshLlama is a 500M-parameter, Llama-architecture model created to write stories. It was pretrained for 4-5 hours on a small dataset using a single T4 GPU, reaching a training loss of 2.9.

This model should not be used as a finished project on its own. It needs further pretraining on larger datasets, followed by post-training on conversational datasets.

# License

This model is licensed under MIT.
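As a rough sanity check on the reported 2.9 training loss, the sketch below converts it to perplexity. This assumes the figure is a per-token cross-entropy in nats, the usual convention for causal language model training in `transformers`:

```python
import math

# Assumption: the reported 2.9 loss is per-token cross-entropy in nats,
# so perplexity is simply its exponential.
train_loss = 2.9
perplexity = math.exp(train_loss)
print(f"perplexity ≈ {perplexity:.1f}")  # ≈ 18.2
```

A perplexity around 18 is plausible for a small model briefly pretrained on a narrow children's-stories corpus, and helps explain why further pretraining on larger datasets is recommended.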