GPT-2 Medium Catalan-English Model

The model is still being trained, and I will be making updates. Please do not expect great results just yet. 😀

Model Overview

This model is a GPT-2 Medium architecture trained from scratch, meaning it does not inherit any weights from existing models. It has been trained on Catalan datasets, specifically ELiRF/dacsa and projecte-aina/CATalog, and has roughly 357M parameters (stored as F32 Safetensors).
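
Because the model is trained from scratch, initialization amounts to building a fresh GPT-2 Medium configuration rather than loading a pretrained checkpoint. A minimal sketch of what that looks like with transformers (illustrative only; the exact training setup is not published here):

from transformers import GPT2Config, GPT2LMHeadModel

# GPT-2 Medium dimensions, with the 52,000-token Catalan vocabulary
config = GPT2Config(
    vocab_size=52000,  # matches the Catalan tokenizer
    n_positions=1024,  # context length
    n_embd=1024,       # GPT-2 Medium hidden size
    n_layer=24,        # GPT-2 Medium depth
    n_head=16,         # GPT-2 Medium attention heads
)

# Fresh random weights: no pretrained checkpoint is loaded
model = GPT2LMHeadModel(config)
print(f"{model.num_parameters() / 1e6:.0f}M parameters")

With the enlarged 52,000-token embedding matrix, this configuration comes out to roughly the 357M parameters reported for the model.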

License and Usage

This model is free to use under the MIT license. However, proper credit must be given when using it in research, applications, or any derived work.

Tokenizer

The model uses a 52,000-token vocabulary built on the GPT-2 tokenizer configuration and trained specifically to handle Catalan. The tokenizer is also available on its own as "Marxx01/gpt2-catalan-tokenizer".
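
If you only need the tokenizer, it can be loaded on its own from that repository. A small sketch (the sample sentence is just an example):

from transformers import AutoTokenizer

# Load the standalone Catalan tokenizer
tokenizer = AutoTokenizer.from_pretrained("Marxx01/gpt2-catalan-tokenizer")

# Inspect how a Catalan sentence is split into subword tokens
print(tokenizer.tokenize("Bon dia a tothom!"))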

How to Use

To use this model for text generation, you can load it with the transformers library as follows:

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and its tokenizer from the Hugging Face Hub
model_name = "Marxx01/test_gpt2_catalan"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Encode a Catalan prompt ("Hello, how are you?") and generate a continuation
text = "Hola, com estàs?"
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)  # limit the length of the continuation

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
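
The call above uses greedy decoding. For more varied output you can enable sampling; the parameter values below are illustrative and not tuned for this model:

outputs = model.generate(
    **inputs,
    max_new_tokens=100,
    do_sample=True,    # sample instead of greedy decoding
    temperature=0.8,   # illustrative value, not tuned for this model
    top_p=0.95,        # nucleus sampling cutoff
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))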