Update README.md
---
library_name: peft
license: apache-2.0
base_model: deepseek-ai/DeepSeek-R1-Distill-Llama-8B
tags:
- trl
- sft
- medical
- diagnosis
- generated_from_trainer
model-index:
- name: sajjadhadi-Disease-Diagnosis-DeepSeek-R1-Distill-Llama-8B
  results: []
datasets:
- sajjadhadi/disease-diagnosis-dataset
metrics:
- accuracy
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. -->

## Disclaimer

**Important Notice**: This model is a research tool for disease diagnosis and is **NOT INTENDED** for clinical or medical use. It is designed for educational and experimental purposes only. The model's outputs should **NOT** be used to make medical decisions, diagnose conditions, or guide treatment. Always consult a qualified healthcare professional for medical advice.

The developers and contributors of this model are not responsible for any misuse or consequences arising from its application in medical contexts. Use this model responsibly and in compliance with ethical guidelines.

---

## Model Description

This model is a fine-tuned version of **DeepSeek-R1-Distill-Llama-8B**, adapted for disease diagnosis research. It leverages **LoRA (Low-Rank Adaptation)** to efficiently fine-tune the base model on a specialized dataset. The model is designed to analyze symptom descriptions and provide diagnostic suggestions in a structured format.

### Key Features:
- **Base Model**: `deepseek-ai/DeepSeek-R1-Distill-Llama-8B`
- **Fine-Tuning Method**: LoRA (Low-Rank Adaptation)
- **Training Framework**: PEFT (Parameter-Efficient Fine-Tuning)
- **Intended Use**: Research and educational applications in medical diagnosis.

---

## Intended Uses & Limitations

### Intended Uses:
- **Research**: Study of AI applications in medical diagnosis.
- **Education**: Simulation of diagnostic scenarios for training purposes.
- **Prototyping**: Development of AI-assisted diagnostic tools (non-clinical).

### Limitations:
- **Not for Clinical Use**: This model is not validated for real-world medical applications.
- **Data Dependency**: The model's performance depends on the quality and scope of its training data.
- **Ethical Concerns**: The model may generate incomplete or inaccurate suggestions. Always verify outputs with medical professionals.

---

## Training and Evaluation Data

The model was fine-tuned on a dataset containing symptom-disease mappings. The dataset includes:
- **Symptom Descriptions**: Textual descriptions of patient symptoms.
- **Disease Labels**: Corresponding disease classifications based on symptoms.

The dataset was preprocessed and tokenized to ensure compatibility with the base model's architecture. Specific details about the dataset size and composition are not disclosed.

---

## Training Procedure

### Training Hyperparameters:

| Parameter | Value |
|-----------|-------|
| Learning Rate | 1e-4 |
| Batch Size | 64 |
| Evaluation Batch Size | 8 |
| Optimizer | Paged AdamW (32-bit) |
| Scheduler | Cosine with 3% warmup |
| Epochs | 1 |
| Seed | 42 |

### Technical Stack:
- **PEFT**: 0.14.0
- **Transformers**: 4.49.0
- **PyTorch**: 2.6.0+cu124
- **Datasets**: 3.3.2
- **Tokenizers**: 0.21.0

---

## Ethical Considerations

### Responsible Use:
- **Transparency**: Users should be aware of the model's limitations and intended use cases.
- **Bias Mitigation**: The model may inherit biases from its training data. Careful evaluation is required.
- **Privacy**: No real patient data was used in training.

### Prohibited Uses:
- Clinical diagnosis or treatment decisions.
- Self-diagnosis tools for patients.
- Applications that could harm individuals or communities.

---

## Acknowledgments

This model was developed using the **DeepSeek-R1-Distill-Llama-8B** base model and fine-tuned with the **PEFT** library. Special thanks to the open-source community for their contributions to AI research.

---

**Note**: This model is a work in progress. Further evaluation and documentation will be provided in future updates.