---
library_name: transformers
license: mit
base_model: microsoft/Phi-4-multimodal-instruct
tags:
- generated_from_trainer
model-index:
- name: Phi4-5.6B-transformers-ex1
  results: []
---


# Phi4-5.6B-transformers-ex1

This model is a fine-tuned version of [microsoft/Phi-4-multimodal-instruct](https://huggingface.co/microsoft/Phi-4-multimodal-instruct) on an unknown dataset.
It achieves the following results on the evaluation set (final checkpoint, after 10 epochs):
- Loss: 0.4529

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 4
- optimizer: paged_adamw_8bit with betas=(0.9, 0.95) and epsilon=1e-07; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 50
- num_epochs: 10
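
With the linear scheduler and 50 warmup steps, the learning rate ramps from 0 to the peak of 2e-4 over the first 50 optimizer steps, then decays linearly toward 0 by the end of training. A minimal sketch of that schedule in plain Python (assuming ~2500 total optimizer steps, as the log below reaches; this mirrors the behavior of `transformers`' `get_linear_schedule_with_warmup`):

```python
def linear_warmup_lr(step, peak_lr=2e-4, warmup_steps=50, total_steps=2500):
    """Learning rate at a given optimizer step under a linear
    warmup + linear decay schedule."""
    if step < warmup_steps:
        # linear ramp from 0 up to peak_lr
        return peak_lr * step / warmup_steps
    # linear decay from peak_lr down to 0 at total_steps
    return peak_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))
```

Note also that the effective batch size is train_batch_size × gradient_accumulation_steps = 1 × 4 = 4, which is the `total_train_batch_size` listed above.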

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.1653        | 0.0799 | 20   | 0.1542          |
| 0.1324        | 0.1598 | 40   | 0.1429          |
| 0.2598        | 0.2398 | 60   | 0.3326          |
| 0.1638        | 0.3197 | 80   | 0.1500          |
| 0.1499        | 0.3996 | 100  | 0.4031          |
| 0.15          | 0.4795 | 120  | 0.3213          |
| 0.1679        | 0.5594 | 140  | 0.1489          |
| 0.1431        | 0.6394 | 160  | 0.1531          |
| 0.1462        | 0.7193 | 180  | 0.1488          |
| 0.1464        | 0.7992 | 200  | 0.1485          |
| 0.1379        | 0.8791 | 220  | 0.1482          |
| 0.1414        | 0.9590 | 240  | 0.1567          |
| 0.1328        | 1.0360 | 260  | 0.1472          |
| 0.134         | 1.1159 | 280  | 0.1466          |
| 0.1415        | 1.1958 | 300  | 0.1447          |
| 0.141         | 1.2757 | 320  | 0.1470          |
| 0.1378        | 1.3556 | 340  | 0.1685          |
| 0.1425        | 1.4356 | 360  | 0.1560          |
| 0.1405        | 1.5155 | 380  | 0.1412          |
| 0.135         | 1.5954 | 400  | 0.1512          |
| 0.1359        | 1.6753 | 420  | 0.1410          |
| 0.1336        | 1.7552 | 440  | 0.1394          |
| 0.1317        | 1.8352 | 460  | 0.1408          |
| 0.1323        | 1.9151 | 480  | 0.1497          |
| 0.1349        | 1.9950 | 500  | 0.1387          |
| 0.1204        | 2.0719 | 520  | 0.1407          |
| 0.1286        | 2.1518 | 540  | 0.1399          |
| 0.1333        | 2.2318 | 560  | 0.1414          |
| 0.1315        | 2.3117 | 580  | 0.1398          |
| 0.1313        | 2.3916 | 600  | 0.1455          |
| 0.1308        | 2.4715 | 620  | 0.1377          |
| 0.1327        | 2.5514 | 640  | 0.1400          |
| 0.1324        | 2.6314 | 660  | 0.1370          |
| 0.1309        | 2.7113 | 680  | 0.1343          |
| 0.1274        | 2.7912 | 700  | 0.1384          |
| 0.1287        | 2.8711 | 720  | 0.1353          |
| 0.1285        | 2.9510 | 740  | 0.1341          |
| 0.1256        | 3.0280 | 760  | 0.1380          |
| 0.1256        | 3.1079 | 780  | 0.1340          |
| 0.1224        | 3.1878 | 800  | 0.1372          |
| 0.1244        | 3.2677 | 820  | 0.1358          |
| 0.1256        | 3.3477 | 840  | 0.1337          |
| 0.1229        | 3.4276 | 860  | 0.1336          |
| 0.1252        | 3.5075 | 880  | 0.1333          |
| 0.1234        | 3.5874 | 900  | 0.1360          |
| 0.1276        | 3.6673 | 920  | 0.1344          |
| 0.1258        | 3.7473 | 940  | 0.1327          |
| 0.1249        | 3.8272 | 960  | 0.1357          |
| 0.1273        | 3.9071 | 980  | 0.1346          |
| 0.1266        | 3.9870 | 1000 | 0.1356          |
| 0.1172        | 4.0639 | 1020 | 0.1413          |
| 0.1236        | 4.1439 | 1040 | 0.1396          |
| 0.1219        | 4.2238 | 1060 | 0.1368          |
| 0.1187        | 4.3037 | 1080 | 0.1399          |
| 0.1225        | 4.3836 | 1100 | 0.1387          |
| 0.1243        | 4.4635 | 1120 | 0.1370          |
| 0.1218        | 4.5435 | 1140 | 0.1360          |
| 0.1189        | 4.6234 | 1160 | 0.1325          |
| 0.1185        | 4.7033 | 1180 | 0.1373          |
| 0.1251        | 4.7832 | 1200 | 0.1352          |
| 0.1214        | 4.8631 | 1220 | 0.1333          |
| 0.1225        | 4.9431 | 1240 | 0.1339          |
| 0.1138        | 5.0200 | 1260 | 0.1348          |
| 0.1205        | 5.0999 | 1280 | 0.1415          |
| 0.1208        | 5.1798 | 1300 | 0.1434          |
| 0.1165        | 5.2597 | 1320 | 0.1415          |
| 0.1154        | 5.3397 | 1340 | 0.1392          |
| 0.1143        | 5.4196 | 1360 | 0.1442          |
| 0.1165        | 5.4995 | 1380 | 0.1397          |
| 0.1162        | 5.5794 | 1400 | 0.1414          |
| 0.1148        | 5.6593 | 1420 | 0.1389          |
| 0.1133        | 5.7393 | 1440 | 0.1391          |
| 0.1145        | 5.8192 | 1460 | 0.1393          |
| 0.1152        | 5.8991 | 1480 | 0.1397          |
| 0.113         | 5.9790 | 1500 | 0.1407          |
| 0.0993        | 6.0559 | 1520 | 0.1625          |
| 0.0962        | 6.1359 | 1540 | 0.1609          |
| 0.0995        | 6.2158 | 1560 | 0.1573          |
| 0.1028        | 6.2957 | 1580 | 0.1582          |
| 0.0983        | 6.3756 | 1600 | 0.1620          |
| 0.0989        | 6.4555 | 1620 | 0.1572          |
| 0.0987        | 6.5355 | 1640 | 0.1602          |
| 0.0992        | 6.6154 | 1660 | 0.1593          |
| 0.0997        | 6.6953 | 1680 | 0.1644          |
| 0.0967        | 6.7752 | 1700 | 0.1630          |
| 0.0988        | 6.8551 | 1720 | 0.1596          |
| 0.098         | 6.9351 | 1740 | 0.1605          |
| 0.0915        | 7.0120 | 1760 | 0.1662          |
| 0.0666        | 7.0919 | 1780 | 0.2258          |
| 0.0638        | 7.1718 | 1800 | 0.2135          |
| 0.0581        | 7.2517 | 1820 | 0.2290          |
| 0.065         | 7.3317 | 1840 | 0.2115          |
| 0.0611        | 7.4116 | 1860 | 0.2396          |
| 0.059         | 7.4915 | 1880 | 0.2205          |
| 0.0598        | 7.5714 | 1900 | 0.2314          |
| 0.0608        | 7.6513 | 1920 | 0.2309          |
| 0.063         | 7.7313 | 1940 | 0.2383          |
| 0.0621        | 7.8112 | 1960 | 0.2304          |
| 0.0586        | 7.8911 | 1980 | 0.2433          |
| 0.0622        | 7.9710 | 2000 | 0.2354          |
| 0.0369        | 8.0480 | 2020 | 0.3233          |
| 0.0246        | 8.1279 | 2040 | 0.3437          |
| 0.022         | 8.2078 | 2060 | 0.3361          |
| 0.0243        | 8.2877 | 2080 | 0.3413          |
| 0.0235        | 8.3676 | 2100 | 0.3458          |
| 0.0229        | 8.4476 | 2120 | 0.3473          |
| 0.0218        | 8.5275 | 2140 | 0.3523          |
| 0.0234        | 8.6074 | 2160 | 0.3610          |
| 0.0228        | 8.6873 | 2180 | 0.3496          |
| 0.0221        | 8.7672 | 2200 | 0.3519          |
| 0.0223        | 8.8472 | 2220 | 0.3515          |
| 0.0224        | 8.9271 | 2240 | 0.3514          |
| 0.0193        | 9.0040 | 2260 | 0.3542          |
| 0.0081        | 9.0839 | 2280 | 0.4155          |
| 0.0071        | 9.1638 | 2300 | 0.4363          |
| 0.0065        | 9.2438 | 2320 | 0.4446          |
| 0.0057        | 9.3237 | 2340 | 0.4485          |
| 0.0064        | 9.4036 | 2360 | 0.4495          |
| 0.0071        | 9.4835 | 2380 | 0.4502          |
| 0.0058        | 9.5634 | 2400 | 0.4518          |
| 0.0066        | 9.6434 | 2420 | 0.4530          |
| 0.0072        | 9.7233 | 2440 | 0.4535          |
| 0.0064        | 9.8032 | 2460 | 0.4532          |
| 0.0076        | 9.8831 | 2480 | 0.4533          |
| 0.0063        | 9.9630 | 2500 | 0.4529          |
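
The validation loss bottoms out around epoch 4.6 (0.1325 at step 1160) and then climbs steadily, a pattern consistent with overfitting; the headline loss of 0.4529 is the final checkpoint, not the best one. A quick sketch for selecting the best checkpoint from an eval log (the pairs here are just a few rows sampled from the table above):

```python
# (step, validation_loss) pairs sampled from the training results table
eval_log = [
    (500, 0.1387),
    (1000, 0.1356),
    (1160, 0.1325),
    (1500, 0.1407),
    (2000, 0.2354),
    (2500, 0.4529),
]

# pick the checkpoint with the lowest validation loss
best_step, best_loss = min(eval_log, key=lambda row: row[1])
```

In practice, `load_best_model_at_end=True` with `metric_for_best_model="eval_loss"` in `TrainingArguments` would make the Trainer keep that checkpoint automatically.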


### Framework versions

- Transformers 4.48.2
- Pytorch 2.6.0+cu124
- Datasets 3.4.1
- Tokenizers 0.21.1