deberta-v3-large-finetuned-squadv2
This model is microsoft/deberta-v3-large fine-tuned on the SQuAD 2.0 dataset. Fine-tuning and evaluation on an NVIDIA TITAN RTX (24GB) GPU took 15 hours.
Results reported in the ICLR 2023 paper "DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing" by Pengcheng He et al.:
- 'EM' : 89.0
- 'F1' : 91.5
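For readers unfamiliar with the two scores, the following is a simplified sketch of how exact match (EM) and token-level F1 are commonly computed for SQuAD-style answers. It is modeled on the normalization used by the official evaluation script, but it compares against a single reference answer only, whereas the real metric takes the best score over all reference answers. The function names are illustrative and are not part of this model's code.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, strip punctuation and English articles, collapse whitespace."""
    text = text.lower()
    punctuation = set(string.punctuation)
    text = "".join(ch for ch in text if ch not in punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, truth: str) -> int:
    """1 if the normalized strings are identical, else 0."""
    return int(normalize_answer(prediction) == normalize_answer(truth))


def f1_score(prediction: str, truth: str) -> float:
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize_answer(prediction).split()
    truth_tokens = normalize_answer(truth).split()
    # Unanswerable case: an empty side scores 1.0 only if both sides are empty
    if not pred_tokens or not truth_tokens:
        return float(pred_tokens == truth_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(truth_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(truth_tokens)
    return 2 * precision * recall / (precision + recall)
```

The empty-string branch is what makes the scores meaningful on SQuAD 2.0: predicting no answer for an unanswerable question counts as fully correct under both measures.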
Results calculated with:
import evaluate
metrics = evaluate.load("squad_v2")
squad_v2_metrics = metrics.compute(predictions=formatted_predictions, references=references)
for this fine-tuning:
- 'exact' : 88.70
- 'f1' : 91.52
- 'total' : 11873
- 'HasAns_exact' : 83.70
- 'HasAns_f1' : 89.35
- 'HasAns_total' : 5928
- 'NoAns_exact' : 93.68
- 'NoAns_f1' : 93.68
- 'NoAns_total' : 5945
- 'best_exact' : 88.70
- 'best_exact_thresh' : 0.0
- 'best_f1' : 91.52
- 'best_f1_thresh' : 0.0
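The overall scores are the count-weighted averages of the HasAns and NoAns subsets. As a quick consistency check, the sketch below recombines the per-subset numbers listed above; the variable and function names are illustrative only.

```python
# Per-subset scores and example counts copied from the list above
has_ans = {"exact": 83.70, "f1": 89.35, "total": 5928}
no_ans = {"exact": 93.68, "f1": 93.68, "total": 5945}

total = has_ans["total"] + no_ans["total"]


def combine(key: str) -> float:
    """Average a score over both subsets, weighted by subset size."""
    weighted = has_ans[key] * has_ans["total"] + no_ans[key] * no_ans["total"]
    return weighted / total


overall_exact = combine("exact")
overall_f1 = combine("f1")
print(total, round(overall_exact, 2), round(overall_f1, 2))
```

Within rounding, the recombined values agree with the reported 'exact' and 'f1' scores over the 11873 evaluation examples.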
Model description
For the authors' models, code & detailed information see: https://github.com/microsoft/DeBERTa
Intended uses
Extractive question answering on a given context
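A minimal usage sketch, assuming the Transformers `question-answering` pipeline and this model's Hub id. The helper name `answer_question` is hypothetical, and the first call downloads the model weights, so it needs network access and enough memory for a large model.

```python
from typing import Any, Dict

# Repository id of this model on the Hugging Face Hub
MODEL_ID = "ahotrod/deberta-v3-large-finetuned-squadv2"


def answer_question(question: str, context: str) -> Dict[str, Any]:
    """Return the pipeline's best answer span for a question about a context."""
    # Imported lazily so that defining this function does not load the library
    from transformers import pipeline

    qa = pipeline("question-answering", model=MODEL_ID)
    # handle_impossible_answer allows an empty answer for unanswerable questions
    return qa(question=question, context=context, handle_impossible_answer=True)
```

The returned dictionary contains `score`, `start`, `end`, and `answer`. Because SQuAD 2.0 includes unanswerable questions, enabling `handle_impossible_answer` lets the pipeline return an empty `answer` when no span in the context is judged to fit. Building the pipeline once and reusing it is preferable when answering many questions.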
Fine-tuning hyperparameters
The following hyperparameters, as suggested by the 2023 ICLR paper noted above, were used during fine-tuning:
- learning_rate : 1e-05
- train_batch_size : 8
- eval_batch_size : 8
- seed : 42
- gradient_accumulation_steps : 8
- total_train_batch_size : 64
- optimizer : Adam with betas = (0.9, 0.999) and epsilon = 1e-06
- lr_scheduler_type : linear
- lr_scheduler_warmup_steps : 1000
- training_steps : 5200
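To make the interaction of these settings concrete, the sketch below derives the effective batch size and the learning rate at a given optimizer step. It assumes the linear scheduler has the usual shape, ramping up from zero over the warmup steps and then decaying linearly to zero at the final step. The constant and function names are illustrative.

```python
LEARNING_RATE = 1e-5
WARMUP_STEPS = 1000
TRAINING_STEPS = 5200
TRAIN_BATCH_SIZE = 8
GRAD_ACCUM_STEPS = 8

# Examples contributing to one optimizer update on a single GPU
effective_batch_size = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / WARMUP_STEPS
    remaining = max(0, TRAINING_STEPS - step)
    return LEARNING_RATE * remaining / (TRAINING_STEPS - WARMUP_STEPS)
```

Under this assumption the rate peaks at 1e-05 at step 1000 and has fallen to half of that by step 3100, so later checkpoints are trained with progressively smaller updates.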
Framework versions
- Transformers : 4.35.0.dev0
- PyTorch : 2.1.0+cu121
- Datasets : 2.14.5
- Tokenizers : 0.14.0
System
- CPU : Intel(R) Core(TM) i9-9900K - 32GB RAM
- Python version : 3.11.5 [GCC 11.2.0] (64-bit runtime)
- Python platform : Linux-5.15.0-86-generic-x86_64-with-glibc2.35
- GPU : NVIDIA TITAN RTX - 24GB Memory
- CUDA runtime version : 12.1.105
- NVIDIA driver version : 535.113.01
Fine-tuning (Training) results before/after the best model (Step 3620)
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
0.5323 | 1.72 | 3500 | 0.5860 |
0.5129 | 1.73 | 3520 | 0.5656 |
0.5441 | 1.74 | 3540 | 0.5642 |
0.5624 | 1.75 | 3560 | 0.5873 |
0.4645 | 1.76 | 3580 | 0.5891 |
0.5577 | 1.77 | 3600 | 0.5816 |
0.5199 | 1.78 | 3620 | 0.5579 |
0.5061 | 1.79 | 3640 | 0.5837 |
0.4840 | 1.79 | 3660 | 0.5721 |
0.5095 | 1.80 | 3680 | 0.5821 |
0.5342 | 1.81 | 3700 | 0.5602 |
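The table is consistent with choosing the checkpoint by lowest validation loss, although the selection criterion is not stated explicitly here, so treat that as an assumption. Under it, the best step can be recovered from the logged rows as follows.

```python
# (step, training loss, validation loss) rows copied from the table above
rows = [
    (3500, 0.5323, 0.5860),
    (3520, 0.5129, 0.5656),
    (3540, 0.5441, 0.5642),
    (3560, 0.5624, 0.5873),
    (3580, 0.4645, 0.5891),
    (3600, 0.5577, 0.5816),
    (3620, 0.5199, 0.5579),
    (3640, 0.5061, 0.5837),
    (3660, 0.4840, 0.5721),
    (3680, 0.5095, 0.5821),
    (3700, 0.5342, 0.5602),
]

# Pick the row whose validation loss (third field) is smallest
best_step, _, best_val_loss = min(rows, key=lambda row: row[2])
print(best_step, best_val_loss)
```

Note that the lowest training loss in this window occurs at step 3580, which has the highest validation loss of the rows shown, so training loss alone would have pointed to a different checkpoint.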