DARJYO/persadian_14B-GRPO

darjyo committed on Feb 17

Commit 0d7a541 · verified · 1 Parent(s): d741ce3

Update README.md

Files changed (1)

README.md +6 -6

@@ -1,5 +1,4 @@
 ---
-developed by: DARJYO
 license: apache-2.0
 language:
 - en
@@ -7,10 +6,6 @@ metrics:
 - accuracy
 base_model:
 - unsloth/phi-4
-base_type:
-- Fine-tuned language model
-base_architecture:
-- Transformer-based/Phi-4
 library_name: transformers
 tags:
 - text-generation-inference
@@ -20,7 +15,12 @@ tags:
 - datasets
 ---
-# Fine-Tuned Model
+# Model
+- **Developed by:** DARJYO
+- **Base Type:** Fine-tuned language model
+- **Finetuned model :** persadian_14B-GRPO
+- **Base Architecture:** Transformer-based/Phi-4
 This model is fine-tuned on datasets for tasks with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
 It is based on the `unsloth/Phi-4` model and uses reinforcement learning for improved performance.
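
The front matter declares `library_name: transformers`, so the model can presumably be loaded through the standard `transformers` auto classes. The following is a minimal sketch under stated assumptions, not the author's documented usage: the repository id `DARJYO/persadian_14B-GRPO` is inferred from the page header, the choice of `AutoModelForCausalLM` assumes the checkpoint keeps Phi-4's causal language model head, and the helper names `load_model` and `generate` are hypothetical.

```python
# Assumption: repository id inferred from the page header (owner/model name).
REPO_ID = "DARJYO/persadian_14B-GRPO"


def load_model(repo_id: str = REPO_ID):
    """Load the tokenizer and model; this downloads the full checkpoint."""
    # Imported lazily so that defining this helper does not require transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(
        repo_id,
        torch_dtype="auto",  # use the dtype stored in the checkpoint config
        device_map="auto",   # requires the `accelerate` package
    )
    return tokenizer, model


def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Generate a completion for `prompt` (hypothetical convenience wrapper)."""
    tokenizer, model = load_model()
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("...")` triggers the download and needs enough memory for a model of this size, so it is left out of the sketch itself.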