mhrtrk/xlm-roberta-xl-german-skill-classifier

Model Details

Model Description

Developed by: Mahir Yilmaz TURK
Funded by: Anonymous Client
Model type: XLM-RoBERTa-XL
Language(s) (NLP): German
Finetuned from model: XLM-RoBERTa-XL

Uses

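This model is fine-tuned to classify a German task or skill name into one or more of the following five classes. The usage code in this card treats the task as multi-label classification, so a single input can receive several labels, or none if no class passes the threshold: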

label_names = ["cognitive", "administrative", "social", "manual", "digital"]
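
The Direct Use code applies an independent sigmoid and threshold per class, which suggests that targets are multi-hot vectors over the label list above. The following is a minimal sketch of that representation, assuming the label order shown here matches the model's output order. `encode_labels` and `decode_labels` are hypothetical helper names used for illustration and are not part of the model repository.

```python
# Label order as listed in this model card; output index i is assumed
# to correspond to LABEL_NAMES[i].
LABEL_NAMES = ["cognitive", "administrative", "social", "manual", "digital"]


def encode_labels(labels):
    """Turn a collection of label names into a multi-hot list of 0/1 flags."""
    unknown = set(labels) - set(LABEL_NAMES)
    if unknown:
        raise ValueError(f"Unknown labels: {sorted(unknown)}")
    return [1 if name in labels else 0 for name in LABEL_NAMES]


def decode_labels(flags):
    """Inverse of encode_labels: multi-hot flags back to label names."""
    return [name for name, flag in zip(LABEL_NAMES, flags) if flag == 1]


print(encode_labels(["manual", "digital"]))  # [0, 0, 0, 1, 1]
print(decode_labels([1, 0, 1, 0, 0]))        # ['cognitive', 'social']
```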

Direct Use

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification, AutoConfig

MODEL_NAME = "mhrtrk/xlm-roberta-xl-german-skill-classifier"
label_names = ["cognitive", "administrative", "social", "manual", "digital"]
MAX_LENGTH = 128

# Load the config with five output labels and a multi-label head,
# so each class is scored independently.
config = AutoConfig.from_pretrained(
    MODEL_NAME,
    num_labels=len(label_names),
    problem_type="multi_label_classification",
)
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, config=config)

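def predict(text):
    """Return the labels with probability above 0.5, plus all class probabilities."""
    inputs = tokenizer(
        text,
        truncation=True,
        padding="max_length",
        max_length=MAX_LENGTH,
        return_tensors="pt",
    )

    # Inference only: skip gradient tracking to save memory.
    with torch.no_grad():
        logits = model(**inputs).logits

    # Multi-label setup: an independent sigmoid per class rather than a softmax.
    probs = torch.sigmoid(logits)
    preds = (probs > 0.5).numpy().astype(int)

    predicted_labels = [label for label, flag in zip(label_names, preds[0]) if flag == 1]
    return predicted_labels, probs.numpy()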

example_text = "d) Kraftübertragungssysteme, insbesondere Schaltgetriebe,\nAutomatikgetriebe und Allradsysteme, instand\nsetzen"
predicted_labels, probabilities = predict(example_text)
print("Predicted Labels:", predicted_labels)
print("Probabilities:", probabilities)
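
With a fixed 0.5 threshold, `predict` can return an empty list when no class is confident enough. The sketch below shows one possible way to post-process raw logits with an adjustable threshold and an optional fallback to the single most probable class. It uses only the standard library so that it runs without downloading the model. The function name `decode_logits`, its parameters, and the example logits are illustrative assumptions, not part of the model repository or of the author's pipeline.

```python
import math

LABEL_NAMES = ["cognitive", "administrative", "social", "manual", "digital"]


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def decode_logits(logits, threshold=0.5, fallback_to_top1=True):
    """Map one row of logits to label names.

    Labels whose sigmoid probability exceeds `threshold` are returned.
    If none do and `fallback_to_top1` is set, the most probable label
    is returned instead of an empty list.
    """
    probs = [sigmoid(v) for v in logits]
    labels = [name for name, p in zip(LABEL_NAMES, probs) if p > threshold]
    if not labels and fallback_to_top1:
        best = max(range(len(probs)), key=probs.__getitem__)
        labels = [LABEL_NAMES[best]]
    return labels, probs


# Made-up logits for illustration only (not real model output).
labels, _ = decode_logits([-2.0, -1.5, -3.0, 2.2, -0.4])
print(labels)  # ['manual']

# No class passes the threshold, so the fallback picks the highest score.
labels, _ = decode_logits([-2.0, -1.5, -3.0, -0.2, -0.4])
print(labels)  # ['manual']
```

Whether a fallback is appropriate depends on the downstream use: returning an empty list keeps low-confidence inputs visible, while the fallback guarantees at least one label per input.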

Metrics

Epoch	Training Loss	Validation Loss	F1	Exact Match
1	0.221100	0.194390	0.853914	0.727347
2	0.164300	0.172580	0.863857	0.745306
3	0.140200	0.181491	0.865734	0.744490
4	0.118300	0.185243	0.865450	0.752653
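
For readers unfamiliar with the two columns on the right: in a multi-label setting, Exact Match usually means the share of examples whose full predicted label set equals the true label set, which is why it is lower than F1. The card does not state which F1 averaging was used, so the sketch below assumes micro-averaging purely for illustration. The function names and the toy data are hypothetical and are not taken from the training run.

```python
def exact_match(y_true, y_pred):
    """Fraction of examples whose predicted multi-hot row equals the true row."""
    matches = sum(1 for t, p in zip(y_true, y_pred) if list(t) == list(p))
    return matches / len(y_true)


def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool true/false positives and negatives over all classes."""
    tp = fp = fn = 0
    for t_row, p_row in zip(y_true, y_pred):
        for t, p in zip(t_row, p_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data: columns follow the order
# cognitive, administrative, social, manual, digital.
y_true = [[0, 0, 0, 1, 0], [1, 0, 1, 0, 0], [0, 1, 0, 0, 1]]
y_pred = [[0, 0, 0, 1, 0], [1, 0, 0, 0, 0], [0, 1, 0, 0, 1]]

print(exact_match(y_true, y_pred))  # 2 of 3 rows match exactly
print(micro_f1(y_true, y_pred))     # 4 true positives, 0 false positives, 1 false negative
```

In the toy data the second example misses one of its two labels, so it counts as a complete miss for Exact Match but still contributes a true positive to F1.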

Model Card Authors

Mahir Yilmaz TURK

Model Card Contact

[email protected]
