maximuspowers committed
Commit 5e73f53 · verified · 1 Parent(s): a31df60

Model save

README.md CHANGED
@@ -16,15 +16,15 @@ should probably proofread and complete it, then remove this comment. -->
 
  This model is a fine-tuned version of [maximuspowers/bert-philosophy-adapted](https://huggingface.co/maximuspowers/bert-philosophy-adapted) on the None dataset.
  It achieves the following results on the evaluation set:
- - Loss: 0.5291
- - Exact Match Accuracy: 0.4
- - Macro Precision: 0.1658
- - Macro Recall: 0.1265
- - Macro F1: 0.1410
- - Micro Precision: 0.92
- - Micro Recall: 0.4035
- - Micro F1: 0.5610
- - Hamming Loss: 0.0529
+ - Loss: 0.7948
+ - Exact Match Accuracy: 0.225
+ - Macro Precision: 0.2908
+ - Macro Recall: 0.1502
+ - Macro F1: 0.1930
+ - Micro Precision: 0.7083
+ - Micro Recall: 0.2982
+ - Micro F1: 0.4198
+ - Hamming Loss: 0.0691
 
  ## Model description
 
@@ -52,22 +52,23 @@ The following hyperparameters were used during training:
  - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
  - lr_scheduler_type: linear
  - lr_scheduler_warmup_steps: 100
- - num_epochs: 50
+ - num_epochs: 500
  - mixed_precision_training: Native AMP
 
  ### Training results
 
  | Training Loss | Epoch | Step | Validation Loss | Exact Match Accuracy | Macro Precision | Macro Recall | Macro F1 | Micro Precision | Micro Recall | Micro F1 | Hamming Loss |
  |:-------------:|:-----:|:----:|:---------------:|:--------------------:|:---------------:|:------------:|:--------:|:---------------:|:------------:|:--------:|:------------:|
- | 1.7889 | 5.0 | 100 | 1.0021 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0853 |
- | 1.156 | 10.0 | 200 | 0.8631 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0838 |
- | 0.8775 | 15.0 | 300 | 0.9324 | 0.05 | 0.0588 | 0.0267 | 0.0368 | 1.0 | 0.0877 | 0.1613 | 0.0765 |
- | 0.7747 | 20.0 | 400 | 0.7537 | 0.1 | 0.1092 | 0.0615 | 0.0784 | 0.875 | 0.1228 | 0.2154 | 0.075 |
- | 0.7074 | 25.0 | 500 | 0.8191 | 0.175 | 0.1487 | 0.0845 | 0.1056 | 0.7857 | 0.1930 | 0.3099 | 0.0721 |
- | 0.6281 | 30.0 | 600 | 0.8507 | 0.275 | 0.1574 | 0.1134 | 0.1298 | 0.8421 | 0.2807 | 0.4211 | 0.0647 |
- | 0.5506 | 35.0 | 700 | 0.7439 | 0.25 | 0.1563 | 0.1075 | 0.1256 | 0.8333 | 0.2632 | 0.4 | 0.0662 |
- | 0.5091 | 40.0 | 800 | 0.7972 | 0.275 | 0.1574 | 0.1134 | 0.1298 | 0.8421 | 0.2807 | 0.4211 | 0.0647 |
- | 0.5038 | 45.0 | 900 | 0.8156 | 0.275 | 0.1574 | 0.1134 | 0.1298 | 0.8421 | 0.2807 | 0.4211 | 0.0647 |
+ | 1.796 | 5.0 | 100 | 0.9528 | 0.0 | 0.0588 | 0.0053 | 0.0098 | 1.0 | 0.0175 | 0.0345 | 0.0824 |
+ | 1.142 | 10.0 | 200 | 0.8632 | 0.0 | 0.0588 | 0.0053 | 0.0098 | 1.0 | 0.0175 | 0.0345 | 0.0824 |
+ | 0.8805 | 15.0 | 300 | 0.9825 | 0.05 | 0.0490 | 0.0267 | 0.0346 | 0.8333 | 0.0877 | 0.1587 | 0.0779 |
+ | 0.7442 | 20.0 | 400 | 0.7654 | 0.1 | 0.1046 | 0.0668 | 0.0804 | 0.8 | 0.1404 | 0.2388 | 0.075 |
+ | 0.6332 | 25.0 | 500 | 0.8304 | 0.175 | 0.1433 | 0.0904 | 0.1080 | 0.7059 | 0.2105 | 0.3243 | 0.0735 |
+ | 0.5572 | 30.0 | 600 | 0.7903 | 0.225 | 0.1597 | 0.0968 | 0.1200 | 0.8667 | 0.2281 | 0.3611 | 0.0676 |
+ | 0.4788 | 35.0 | 700 | 0.7919 | 0.25 | 0.2151 | 0.1173 | 0.1424 | 0.8421 | 0.2807 | 0.4211 | 0.0647 |
+ | 0.418 | 40.0 | 800 | 0.7885 | 0.2 | 0.3301 | 0.1355 | 0.1810 | 0.8421 | 0.2807 | 0.4211 | 0.0647 |
+ | 0.3975 | 45.0 | 900 | 0.8244 | 0.225 | 0.2291 | 0.1261 | 0.1554 | 0.7273 | 0.2807 | 0.4051 | 0.0691 |
+ | 0.3431 | 50.0 | 1000 | 0.7948 | 0.225 | 0.2908 | 0.1502 | 0.1930 | 0.7083 | 0.2982 | 0.4198 | 0.0691 |
 
 
  ### Framework versions
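
For reference, the hyperparameters in the updated README map onto a `transformers` `TrainingArguments` configuration roughly like the sketch below. This is a reconstruction, not the author's script: `output_dir` is a placeholder, and the learning rate and batch size come from the `training_summary.json` added later in this commit. Note the README now lists `num_epochs: 500` while `training_summary.json` records 50.

```python
# Minimal sketch of a TrainingArguments setup matching the README's
# listed hyperparameters; output_dir is a placeholder, not the author's path.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="bert-philosophy-classifier",  # hypothetical
    learning_rate=2e-5,               # "learning_rate" in training_summary.json
    per_device_train_batch_size=16,   # "batch_size" in training_summary.json
    num_train_epochs=500,             # README value; training_summary.json says 50
    optim="adamw_torch",              # OptimizerNames.ADAMW_TORCH
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=100,
    fp16=True,                        # "Native AMP" mixed precision
)
```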
all_results.json CHANGED
@@ -1,5 +1,5 @@
  {
- "epoch": 45.0,
+ "epoch": 50.0,
  "eval_exact_match_accuracy": 0.4,
  "eval_hamming_loss": 0.052941176470588235,
  "eval_loss": 0.5290737152099609,
@@ -13,8 +13,8 @@
  "eval_samples_per_second": 188.615,
  "eval_steps_per_second": 23.577,
  "total_flos": 0.0,
- "train_loss": 0.9705644819471572,
- "train_runtime": 232.6541,
- "train_samples_per_second": 67.912,
- "train_steps_per_second": 4.298
+ "train_loss": 0.8574352493286133,
+ "train_runtime": 257.7927,
+ "train_samples_per_second": 612.896,
+ "train_steps_per_second": 38.791
  }
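
The eval fields above use standard multi-label definitions: exact-match accuracy is subset accuracy, and Hamming loss is the per-label error rate. Below is a minimal sketch of computing them with scikit-learn over binary indicator arrays; the card does not state how the author actually computed them, so this is an assumed implementation with toy data.

```python
# Sketch: the reported multi-label metrics computed with scikit-learn.
# y_true / y_pred are (n_samples, n_labels) 0/1 indicator arrays; toy values here.
import numpy as np
from sklearn.metrics import (accuracy_score, hamming_loss,
                             precision_recall_fscore_support)

y_true = np.array([[1, 0, 1], [0, 1, 0]])
y_pred = np.array([[1, 0, 0], [0, 1, 0]])

exact_match = accuracy_score(y_true, y_pred)  # subset accuracy: every label must match
ham = hamming_loss(y_true, y_pred)            # fraction of individual label errors

macro_p, macro_r, macro_f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="macro", zero_division=0)
micro_p, micro_r, micro_f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="micro", zero_division=0)
```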
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:890f9065c802bc97c554e035af9eaa8ef8da20f13c0f284d224585cdb51a36aa
+ oid sha256:9f93741bca9dca2d72ab973d0b940e200dc08392794fd15ca078df27cb31350e
  size 441154988
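
Because this commit replaces the LFS pointer's `oid`, these exact weights are only reachable by revision once newer commits land. A sketch of pinning a download to this commit with `huggingface_hub`; the repo id is an assumption based on `model_name` in the `training_summary.json` added below.

```python
# Sketch: pin a weights download to this exact commit.
# The repo id is assumed from training_summary.json's model_name,
# not confirmed by the card.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="maximuspowers/bert-philosophy-classifier",  # assumed repo id
    filename="model.safetensors",
    revision="5e73f53",  # this commit's short hash
)
```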
runs/Jun15_00-53-11_92b2e0e6fb20/events.out.tfevents.1749948792.92b2e0e6fb20.2194.12 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:846a6f1b494a7193b33d2ffe15091a5d41830e73450cd6194604a44ab3d77dc9
+ size 6680
runs/Jun15_00-58-48_92b2e0e6fb20/events.out.tfevents.1749949130.92b2e0e6fb20.2194.13 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4fb5bb1e51ed2a05490aa715749d0ae6cbe279aeca58c68e340c95ed949b284c
+ size 62762
train_results.json CHANGED
@@ -1,8 +1,8 @@
  {
- "epoch": 45.0,
+ "epoch": 50.0,
  "total_flos": 0.0,
- "train_loss": 0.9705644819471572,
- "train_runtime": 232.6541,
- "train_samples_per_second": 67.912,
- "train_steps_per_second": 4.298
+ "train_loss": 0.8574352493286133,
+ "train_runtime": 257.7927,
+ "train_samples_per_second": 612.896,
+ "train_steps_per_second": 38.791
  }
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:66e01343304a8027b49b07fccbfd92f2c7fc70a061de471b2412977d28ec9eac
+ oid sha256:d901cfad6e06e460cafaa9ca2058746b5b75a1a99806fb8871440df59f4b2356
  size 5368
training_summary.json ADDED
@@ -0,0 +1,45 @@
+ {
+   "model_name": "bert-philosophy-classifier",
+   "base_model": "maximuspowers/bert-philosophy-adapted",
+   "dataset": "maximuspowers/philosophai-papers-labeled",
+   "training_samples": 316,
+   "validation_samples": 40,
+   "test_samples": 40,
+   "num_epochs": 50,
+   "learning_rate": 2e-05,
+   "batch_size": 16,
+   "contrastive_weight": 0.2,
+   "test_results": {
+     "loss": 0.5290737152099609,
+     "exact_match_accuracy": 0.4,
+     "macro_precision": 0.1657754010695187,
+     "macro_recall": 0.1264705882352941,
+     "macro_f1": 0.14097904608067482,
+     "micro_precision": 0.92,
+     "micro_recall": 0.40350877192982454,
+     "micro_f1": 0.5609756097560976,
+     "hamming_loss": 0.052941176470588235,
+     "runtime": 0.2121,
+     "samples_per_second": 188.615,
+     "steps_per_second": 23.577
+   },
+   "philosophy_schools": [
+     "Effective Altruism",
+     "Existentialism",
+     "Idealism",
+     "Empiricism",
+     "Utilitarianism",
+     "Stoicism",
+     "Rationalism",
+     "Pragmatism",
+     "Cynicism",
+     "Confucianism",
+     "Hedonism",
+     "Deontology",
+     "Fanaticism",
+     "Nihilism",
+     "Absurdism",
+     "Transcendentalism",
+     "Machiavellanism"
+   ]
+ }
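
Given the 17 `philosophy_schools` labels and the micro/macro metrics reported above, inference is presumably sigmoid-thresholded multi-label sequence classification. A minimal sketch under that assumption; the repo id and the 0.5 threshold are guesses, not documented in the summary.

```python
# Sketch: multi-label inference over the philosophy_schools labels.
# The repo id and the 0.5 threshold are assumptions; the summary does
# not document the actual inference setup.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

repo_id = "maximuspowers/bert-philosophy-classifier"  # assumed
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSequenceClassification.from_pretrained(repo_id)

inputs = tokenizer("Virtue alone suffices for the good life.",
                   return_tensors="pt", truncation=True)
with torch.no_grad():
    probs = torch.sigmoid(model(**inputs).logits.squeeze(0))

predicted = [model.config.id2label[i]
             for i, p in enumerate(probs.tolist()) if p > 0.5]
print(predicted)  # e.g. ["Stoicism"] if that label clears the threshold
```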