---
library_name: transformers
license: mit
model-index:
- name: arshLlama
  results: []
pipeline_tag: text-generation
datasets:
- ajibawa-2023/Children-Stories-Collection
---

arshLlama is a 500M-parameter, Llama-architecture model created to write stories. It was pretrained for 4-5 hours on a small dataset using a single T4 GPU, reaching a training loss of 2.9.

This model should not be used as a finished project on its own. It needs further pretraining on larger datasets, followed by post-training on conversational datasets.

# License

This model is licensed under MIT.
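As a rough sanity check on the reported 2.9 training loss, the sketch below converts it to perplexity. This assumes the figure is a per-token cross-entropy in nats, the usual convention for causal language model training in `transformers`:

```python
import math

# Assumption: the reported 2.9 loss is per-token cross-entropy in nats,
# so perplexity is simply its exponential.
train_loss = 2.9
perplexity = math.exp(train_loss)
print(f"perplexity ≈ {perplexity:.1f}")  # ≈ 18.2
```

A perplexity around 18 is plausible for a small model briefly pretrained on a narrow children's-stories corpus, and helps explain why further pretraining on larger datasets is recommended.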