MNLP_M3_mcqa_model

This model is a fine-tuned version of Thimphou/MNLP_M3_SFT_code_5percent for Multiple Choice Question Answering (MCQA) tasks.

Model Details

  • Base Model: Thimphou/MNLP_M3_SFT_code_5percent
  • Task: Multiple Choice Question Answering
  • Model Type: Classic
  • Training Context: With context
  • Evaluation Context: Without context
  • Fine-tuning Method: Causal Language Modeling

Training Details

  • Epochs: 3
  • Learning Rate: 5e-05
  • Batch Size: 2
  • Training Framework: Transformers + PyTorch (see the training sketch after this list)
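
The full training script is not included with this card. The sketch below shows how the hyperparameters above could be wired into the Transformers Trainer API; the data file mcqa_train.json and its "text" column are hypothetical stand-ins for illustration, not the actual training artifacts.

from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

base = "Thimphou/MNLP_M3_SFT_code_5percent"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# Hypothetical data file: one fully formatted MCQA prompt (question, options,
# answer) per record under a "text" key. Not the actual training data.
dataset = load_dataset("json", data_files="mcqa_train.json")["train"]

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

args = TrainingArguments(
    output_dir="MNLP_M3_mcqa_model",
    num_train_epochs=3,             # epochs listed above
    learning_rate=5e-5,             # learning rate listed above
    per_device_train_batch_size=2,  # batch size listed above
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    # Causal LM objective (mlm=False) matches the fine-tuning method listed above
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()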

Performance

Metric     Baseline   Fine-tuned   Improvement
Accuracy   48.00%     54.00%       +6.00%
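
The exact evaluation harness is not published. As an illustration, accuracy could be computed by generating a short continuation for each question and comparing the first A-D letter against the gold answer; the sketch below assumes a list of examples with hypothetical "prompt" and "answer" fields.

import re
import torch

def mcqa_accuracy(model, tokenizer, examples):
    # Hypothetical harness: each example is {"prompt": "...", "answer": "C"}.
    correct = 0
    for ex in examples:
        inputs = tokenizer(ex["prompt"], return_tensors="pt")
        with torch.no_grad():
            out = model.generate(**inputs, max_new_tokens=5)
        # Decode only the continuation and take its first A-D letter as the prediction
        completion = tokenizer.decode(out[0][inputs["input_ids"].shape[1]:],
                                      skip_special_tokens=True)
        match = re.search(r"[ABCD]", completion)
        if match and match.group(0) == ex["answer"]:
            correct += 1
    return correct / len(examples)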

Training Data

The model was fine-tuned on a custom MCQA dataset with the following characteristics:

  • Format: Multiple choice questions with 4 options (A, B, C, D)
  • Context: included in the prompt during training
  • Evaluation: context omitted from the prompt (see the formatting sketch after this list)
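
The exact prompt template is not published; the helper below is a hypothetical illustration of how a with-context training example and a without-context evaluation prompt might be formatted, matching the A) through D) option style shown in the Usage section.

def format_example(question, options, answer=None, context=None):
    # Hypothetical template; the actual one used for fine-tuning is not published.
    parts = []
    if context is not None:          # context was included at training time
        parts.append(f"Context: {context}")
    parts.append(f"Question: {question}")
    for letter, option in zip("ABCD", options):
        parts.append(f"{letter}) {option}")
    parts.append("Answer:" + (f" {answer}" if answer else ""))
    return "\n".join(parts)

# Training-style example (context and gold answer present):
train_text = format_example(
    "What is the capital of France?",
    ["London", "Berlin", "Paris", "Madrid"],
    answer="C",
    context="France is a country in Western Europe.",
)

# Evaluation-style prompt (no context; answer left for the model):
eval_prompt = format_example(
    "What is the capital of France?",
    ["London", "Berlin", "Paris", "Madrid"],
)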

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Thimphou/MNLP_M3_mcqa_model", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("Thimphou/MNLP_M3_mcqa_model", trust_remote_code=True)

# For MCQA tasks, provide the question and options, then generate the answer
prompt = "Question: What is the capital of France?\nA) London\nB) Berlin\nC) Paris\nD) Madrid\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=5)

# Decode only the newly generated tokens so the answer isn't prefixed by the prompt
answer = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(answer)  # expected to contain the answer letter, e.g. "C"
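
Free-form generation can drift off format. A common alternative for MCQA (not necessarily how this model was evaluated) is to score the four option letters directly from the next-token logits after "Answer:" and pick the most likely one. A minimal sketch, reusing the prompt, model, and tokenizer from above:

import torch

# Score each option letter from the next-token distribution after "Answer:".
# This likelihood-ranking trick is a common MCQA heuristic, shown here as an
# illustration rather than the card's documented evaluation method.
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    next_token_logits = model(**inputs).logits[0, -1]

# First token id of " A", " B", " C", " D" (the leading space matters for many BPE tokenizers)
option_ids = [tokenizer.encode(f" {letter}", add_special_tokens=False)[0] for letter in "ABCD"]
predicted = "ABCD"[next_token_logits[option_ids].argmax().item()]
print(predicted)  # e.g. "C"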

Model Size

  • Parameters: 596M
  • Tensor type: F32 (Safetensors)