medical_condition_classification
This model is a fine-tuned version of distilbert-base-uncased on a Drugs.com dataset. It achieves the following results on the test set:
- Loss: 0.8930
- Accuracy: 0.7951
Model description
The goal of the model is to predict a medical condition from the text of a drug review. There are 751 classes.
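To illustrate what a 751-way prediction looks like, here is a minimal sketch that turns one review's raw classifier scores (logits) into ranked condition probabilities. The `top_k` helper, the `condition_N` label names, and the logit values are all hypothetical placeholders; the real label names come from the model's configuration.

```python
import math


def top_k(logits, id2label, k=3):
    """Return the k most probable (label, probability) pairs for one review."""
    # Numerically stable softmax: subtract the max logit before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Rank class indices by probability, highest first.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]


# Hypothetical label names and logits, for illustration only.
NUM_CLASSES = 751
id2label = {i: f"condition_{i}" for i in range(NUM_CLASSES)}
logits = [0.0] * NUM_CLASSES
logits[10] = 5.0
logits[42] = 4.0

predictions = top_k(logits, id2label, k=2)
for label, prob in predictions:
    print(f"{label}: {prob:.3f}")
```

With this many classes, even a confident prediction can carry a modest probability, so inspecting the top few candidates is often more informative than the single best label.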
Intended uses & limitations
More information needed
Training and evaluation data
The training, evaluation, and test data are available as samsaara/medical_condition_classification on 🤗 Datasets, and the full process is documented in the modeling.ipynb notebook.
By default, the dataset has train and test splits. The train split is further divided into train and validation splits in a 0.8/0.2 ratio. The final results shown are on the test split.
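The 0.8/0.2 division can be sketched in plain Python as a seeded shuffle followed by a cut. This is an illustration only: the helper name `split_train_validation` is hypothetical, reusing seed 42 for the split is an assumption, and the notebook may instead rely on a library routine such as `Dataset.train_test_split` from 🤗 Datasets.

```python
import random


def split_train_validation(examples, validation_fraction=0.2, seed=42):
    """Shuffle deterministically, then hold out a fraction for validation."""
    rng = random.Random(seed)  # seed value is an assumption, not confirmed by the card
    indices = list(range(len(examples)))
    rng.shuffle(indices)
    n_val = int(round(len(examples) * validation_fraction))
    val_idx = indices[:n_val]
    train_idx = indices[n_val:]
    return [examples[i] for i in train_idx], [examples[i] for i in val_idx]


# Placeholder data standing in for the original train split.
reviews = [f"review {i}" for i in range(10)]
train, validation = split_train_validation(reviews)
print(len(train), len(validation))
```

The test split is left untouched, so it stays unseen until the final evaluation.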
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 3e-05
- train_batch_size: 24
- eval_batch_size: 24
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5
- mixed_precision_training: Native AMP
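As a rough illustration of what the linear scheduler does with the 3e-05 learning rate, the sketch below decays the rate linearly to zero over training. The `linear_lr` helper is hypothetical, the absence of warmup is an assumption (none is listed above), and `TOTAL_STEPS` is a placeholder since the real value depends on dataset size and batch size.

```python
def linear_lr(step, total_steps, base_lr=3e-5, warmup_steps=0):
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if warmup_steps and step < warmup_steps:
        # Ramp up from 0 to base_lr during warmup.
        return base_lr * step / warmup_steps
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


TOTAL_STEPS = 1_000  # hypothetical placeholder, not the actual run length

for step in (0, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(step, linear_lr(step, TOTAL_STEPS))
```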
Training results
Training Loss | Epoch | Step | Validation Loss | Accuracy |
---|---|---|---|---|
1.8625 | 0.4329 | 2000 | 1.7199 | 0.6397 |
1.459 | 0.8658 | 4000 | 1.3696 | 0.6890 |
1.1737 | 1.2987 | 6000 | 1.2131 | 0.7172 |
1.042 | 1.7316 | 8000 | 1.1014 | 0.7329 |
0.8431 | 2.1645 | 10000 | 1.0322 | 0.7510 |
0.8012 | 2.5974 | 12000 | 0.9889 | 0.7587 |
0.7312 | 3.0303 | 14000 | 0.9497 | 0.7727 |
0.6561 | 3.4632 | 16000 | 0.9338 | 0.7805 |
0.6132 | 3.8961 | 18000 | 0.9073 | 0.7875 |
0.5195 | 4.3290 | 20000 | 0.9011 | 0.7929 |
0.5015 | 4.7619 | 22000 | 0.8930 | 0.7951 |
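The Epoch and Step columns also make it possible to estimate the run length. The sketch below derives steps per epoch from two logged rows and, from that, an approximate training-set size. It assumes a single device, no gradient accumulation, and that a final partial batch is kept; none of these is stated in the card.

```python
# (epoch, step) pairs copied from the first and last rows of the table.
logged = [(0.4329, 2000), (4.7619, 22000)]

TRAIN_BATCH_SIZE = 24
NUM_EPOCHS = 5

# Each row should imply the same number of optimizer steps per epoch.
estimates = [round(step / epoch) for epoch, step in logged]
steps_per_epoch = estimates[0]
total_steps = steps_per_epoch * NUM_EPOCHS

# With the last batch possibly partial, the training split holds between
# (steps_per_epoch - 1) * batch + 1 and steps_per_epoch * batch examples.
max_examples = steps_per_epoch * TRAIN_BATCH_SIZE
min_examples = (steps_per_epoch - 1) * TRAIN_BATCH_SIZE + 1

print(steps_per_epoch, total_steps, min_examples, max_examples)
```

Under these assumptions the last logged row (step 22000) falls shortly before the end of the fifth epoch.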
Framework versions
- Transformers 4.45.2
- Pytorch 2.4.1
- Datasets 3.0.1
- Tokenizers 0.20.1