---
license: apache-2.0
language: en
tags:
- qwen3
- medical
- chat
- fine-tuned
- gguf
- healthcare
datasets:
- custom-medical-qa
model_type: qwen3
base_model: Qwen/Qwen3-0.6B
---
# Qwen3-0.6B-Medical-Finetuned-v1
This model is a fine-tuned version of `Qwen/Qwen3-0.6B` specialized for medical question-answering. It's designed to provide helpful, accurate medical information while emphasizing the importance of professional medical consultation.
## Model Description
- **Base Model**: `Qwen/Qwen3-0.6B`
- **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
- **Dataset**: Custom medical Q&A dataset covering common health topics.
- **Training**: Optimized for conversational medical assistance.
## Important Disclaimer
**This model is NOT a substitute for professional medical advice, diagnosis, or treatment. Always consult qualified healthcare providers for medical concerns. Do not use this model in emergency situations; call emergency services immediately.**
## Usage
### With `transformers`
```python
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
import torch
model_id = "rohitnagareddy/Qwen3-0.6B-Medical-Finetuned-v1"
# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
model_id,
torch_dtype=torch.float16,
device_map="auto",
trust_remote_code=True
)
# Create a conversation pipeline
pipe = pipeline(
"text-generation",
model=model,
tokenizer=tokenizer
)
# Build a ChatML-formatted prompt (the chat format used by Qwen models)
prompt = "<|im_start|>system\nYou are a helpful medical assistant providing accurate, evidence-based information.<|im_end|>\n<|im_start|>user\nWhat are the symptoms of hypertension?<|im_end|>\n<|im_start|>assistant\n"
# Generate response
response = pipe(prompt, max_new_tokens=300, temperature=0.7, top_p=0.9, do_sample=True)
print(response[0]["generated_text"])
```
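Instead of writing the `<|im_start|>` tags by hand, the tokenizer's built-in chat template can assemble the same prompt from a message list. This is a small sketch that reuses the `tokenizer` and `pipe` objects from the example above and assumes the fine-tuned tokenizer keeps the standard Qwen chat template:

```python
# Alternative: let the tokenizer build the ChatML prompt from a message list
messages = [
    {"role": "system", "content": "You are a helpful medical assistant providing accurate, evidence-based information."},
    {"role": "user", "content": "What are the symptoms of hypertension?"},
]
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,  # appends the assistant turn header
)
response = pipe(prompt, max_new_tokens=300, temperature=0.7, top_p=0.9, do_sample=True)
print(response[0]["generated_text"])
```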
## GGUF Versions
This repository includes quantized GGUF versions for use with `llama.cpp` and compatible tools (a minimal Python loading sketch follows the list):
- `Qwen3-0.6B-Medical-Finetuned-v1.fp16.gguf` - Full precision (largest, best quality)
- `Qwen3-0.6B-Medical-Finetuned-v1.Q8_0.gguf` - 8-bit quantization (good balance)
- `Qwen3-0.6B-Medical-Finetuned-v1.Q5_K_M.gguf` - 5-bit quantization (smaller, fast)
- `Qwen3-0.6B-Medical-Finetuned-v1.Q4_K_M.gguf` - 4-bit quantization (smallest, fastest)
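As a rough illustration, one of the quantized files can be loaded with the `llama-cpp-python` bindings. The snippet below is a minimal sketch, not part of this repository, and assumes the `Q4_K_M` file has been downloaded to the working directory and that `llama-cpp-python` is installed:

```python
# Minimal sketch: chat with a local GGUF file via llama-cpp-python
# (assumes: pip install llama-cpp-python, and the Q4_K_M file is in the working directory)
from llama_cpp import Llama

llm = Llama(
    model_path="Qwen3-0.6B-Medical-Finetuned-v1.Q4_K_M.gguf",
    n_ctx=4096,  # context window
)

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful medical assistant providing accurate, evidence-based information."},
        {"role": "user", "content": "What are the symptoms of hypertension?"},
    ],
    max_tokens=300,
    temperature=0.7,
)
print(response["choices"][0]["message"]["content"])
```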
### Using with Ollama
```bash
# Pull the model (once available on the Hub)
ollama pull rohitnagareddy/Qwen3-0.6B-Medical-Finetuned-v1
# Run the model
ollama run rohitnagareddy/Qwen3-0.6B-Medical-Finetuned-v1 "What are the early signs of diabetes?"
```
## Training Details
- **Training Epochs**: 2
- **Batch Size**: 2 (with 4 steps of gradient accumulation)
- **Learning Rate**: 2e-4
- **Optimizer**: Paged AdamW 32-bit
- **LoRA Rank**: 16
- **LoRA Alpha**: 32
- **Target Modules**: Auto-detected linear layers (see the configuration sketch below)
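For readers who want to reproduce a similar setup, the hyperparameters above map roughly onto the following PEFT configuration. This is a hedged sketch rather than the exact training script: the output directory is a placeholder, and the `all-linear` target-module shortcut stands in for the auto-detected linear layers.

```python
# Sketch of a LoRA fine-tuning setup matching the listed hyperparameters
# (assumes peft, transformers, and bitsandbytes are installed)
from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=16,                         # LoRA rank
    lora_alpha=32,                # LoRA alpha
    target_modules="all-linear",  # stand-in for "auto-detected linear layers"
    bias="none",
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="qwen3-0.6b-medical-lora",  # placeholder path
    num_train_epochs=2,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    optim="paged_adamw_32bit",    # paged AdamW 32-bit
    logging_steps=10,
)
```

A model wrapped with this `LoraConfig` can then be trained with the standard `Trainer` or `trl`'s `SFTTrainer`; the exact trainer used for this checkpoint is not documented in the card.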
---