---
library_name: transformers
base_model: medgemma27B
tags:
- medical
- medical-coding
- icd10
- cpt
- hcpcs
- healthcare
- clinical
- fine-tuned
- peft
- lora
license: apache-2.0
language:
- en
pipeline_tag: text-generation
---
# medgemma-27b-medical-coding
## Model Description
This is a **LoRA adapter** fine-tuned on **medgemma27B** for medical coding tasks. The model is specifically designed to:
- Extract diseases and medical conditions from discharge summaries
- Identify medical procedures and interventions
- Assign appropriate medical codes (ICD-10, CPT, HCPCS)
- Process free-text clinical documentation such as discharge summaries

**Base Model:** `medgemma27B`

**Fine-tuning Method:** LoRA (Low-Rank Adaptation)
## Training Details
### LoRA Configuration
- **Rank (r):** 8
- **Alpha:** 16
- **Dropout:** 0.1
- **Target Modules:** q_proj, up_proj, v_proj, gate_proj
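
To give a feel for what these hyperparameters imply, the sketch below computes the LoRA scaling factor and the number of trainable parameters added per adapted weight matrix. It assumes the standard LoRA formulation (scaling = alpha / r, with one `r x d_in` and one `d_out x r` matrix per target module). The `LoraSettings` class, the `lora_param_count` helper, and the 4096 x 4096 layer shape are hypothetical illustrations, not the base model's actual dimensions or the training code used for this adapter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraSettings:
    """Hypothetical container mirroring the values listed in this card."""
    r: int = 8
    alpha: int = 16
    dropout: float = 0.1
    target_modules: tuple = ("q_proj", "up_proj", "v_proj", "gate_proj")

    @property
    def scaling(self) -> float:
        # Standard LoRA scales the low-rank update by alpha / r
        return self.alpha / self.r


def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix."""
    # A has shape (r, d_in) and B has shape (d_out, r)
    return r * (d_in + d_out)


settings = LoraSettings()

# Assumed layer shape for illustration only
d_in = d_out = 4096
added = lora_param_count(d_in, d_out, settings.r)
full = d_in * d_out

print(f"scaling factor: {settings.scaling}")
print(f"added parameters per matrix: {added} ({added / full:.4%} of the full matrix)")
```

With rank 8, the adapter trains a small fraction of the parameters of each targeted projection, which is why the adapter file is much smaller than the base model.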
## Usage
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("TachyHealthResearch/medgemma-27b-medical-coding")
# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "medgemma27B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)
# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "TachyHealthResearch/medgemma-27b-medical-coding")
```
### Example Usage
```python
# Define the system prompt for medical coding
system_prompt = """You are an expert medical coding specialist.
Analyze the discharge summary to extract diseases, procedures, and assign appropriate medical codes.
Return the response in JSON format with this structure:
{"diseases": ["disease1", "disease2"], "icd10_codes": ["code1", "code2"],
"procedures": ["procedure1", "procedure2"], "cpt_codes": ["code1", "code2"],
"hcpcs_codes": ["code1", "code2"]}"""
# Example discharge summary
discharge_summary = """
Patient admitted with chest pain and shortness of breath.
Diagnosed with acute myocardial infarction and congestive heart failure.
Underwent percutaneous coronary intervention with stent placement.
"""
# Prepare the conversation
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": f"Please analyze this discharge summary:\n\n{discharge_summary}"}
]
# Apply chat template and generate
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        temperature=0.1,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
    )
# Decode only the newly generated tokens. Slicing the decoded string by len(text)
# is unreliable because skip_special_tokens changes the length of the prompt.
prompt_length = inputs["input_ids"].shape[1]
generated_response = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True).strip()
print("Generated Medical Codes:")
print(generated_response)
```
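
Because the system prompt asks for JSON, the generated text usually needs to be parsed before it can be used downstream. The following is a minimal sketch of one way to do that with the standard library, assuming the model may wrap the JSON object in extra text. The `parse_coding_response` helper and the `sample_output` string are hypothetical: the sample is hand-written to match the structure in the system prompt and is not real model output.

```python
import json

# Keys requested by the system prompt in this card
EXPECTED_KEYS = ("diseases", "icd10_codes", "procedures", "cpt_codes", "hcpcs_codes")


def parse_coding_response(text: str) -> dict:
    """Extract the first JSON object from `text` and normalize its fields."""
    start = text.find("{")
    if start == -1:
        raise ValueError("no JSON object found in model output")
    # raw_decode parses one JSON value starting at `start` and ignores trailing text;
    # it raises json.JSONDecodeError (a ValueError subclass) on malformed JSON
    obj, _ = json.JSONDecoder().raw_decode(text, start)

    result = {}
    for key in EXPECTED_KEYS:
        value = obj.get(key, [])  # treat a missing key as an empty list
        if not isinstance(value, list):
            raise ValueError(f"field {key!r} should be a list, got {type(value).__name__}")
        result[key] = [str(item) for item in value]
    return result


# Hand-written stand-in for `generated_response`, for illustration only
sample_output = (
    "Here is the analysis:\n"
    '{"diseases": ["acute myocardial infarction"], "icd10_codes": ["I21.9"], '
    '"procedures": [], "cpt_codes": [], "hcpcs_codes": []}'
)

parsed = parse_coding_response(sample_output)
print(parsed["diseases"])
print(parsed["icd10_codes"])
```

Failing loudly on missing or malformed JSON, rather than returning an empty result, makes it easier to route bad generations to a human reviewer.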
## Model Performance
This card does not report quantitative evaluation metrics. The model was fine-tuned specifically for the following medical coding tasks:
- Disease extraction from clinical text
- Medical procedure identification
- Medical code assignment (ICD-10, CPT, HCPCS)
- Structured JSON response generation
## Intended Use
### Primary Use Cases
- Medical coding automation
- Clinical documentation analysis
- Healthcare data processing
### Limitations
- Always verify generated codes with qualified medical coding professionals
- Performance may vary on clinical documents significantly different from training data
- Intended for use in appropriate healthcare environments only
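
As a cheap pre-check before human review, generated codes can be screened for obviously malformed values. The sketch below only tests the general shape of each code type using approximate patterns (an assumption of this example, not an official grammar). It does not check that a code exists in the current code sets or that it matches the clinical text, so it cannot replace verification by a qualified coder. The `flag_malformed` helper and the sample values are hypothetical.

```python
import re

# Approximate shape checks only; these are assumptions, not authoritative definitions
CODE_PATTERNS = {
    # Letter, digit, alphanumeric, then an optional dot and 1-4 alphanumerics
    "icd10_codes": re.compile(r"^[A-Z][0-9][0-9A-Z](\.[0-9A-Z]{1,4})?$"),
    # Five characters: four digits followed by a digit or a letter
    "cpt_codes": re.compile(r"^[0-9]{4}[0-9A-Z]$"),
    # One letter (A-V) followed by four digits
    "hcpcs_codes": re.compile(r"^[A-V][0-9]{4}$"),
}


def flag_malformed(codes_by_type: dict) -> dict:
    """Return, per code type, the codes that do not match the expected shape."""
    flagged = {}
    for code_type, pattern in CODE_PATTERNS.items():
        bad = []
        for code in codes_by_type.get(code_type, []):
            normalized = code.strip().upper()
            if not pattern.match(normalized):
                bad.append(code)
        flagged[code_type] = bad
    return flagged


# Sample values chosen to exercise the patterns, not real model output
candidate_codes = {
    "icd10_codes": ["I21.9", "21.9"],
    "cpt_codes": ["92928", "9292"],
    "hcpcs_codes": ["C1874", "1874C"],
}

print(flag_malformed(candidate_codes))
```

Codes that pass this screen still need to be checked against the official code sets and the source documentation.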
## License
This model is released under the Apache 2.0 License.
## Citation
If you use this model in your research or applications, please cite:
```bibtex
@misc{TachyHealthResearch_medgemma_27b_medical_coding_2024,
  title = {medgemma-27b-medical-coding: Medical Coding Model},
  author = {TachyHealthResearch},
  year = {2024},
  publisher = {Hugging Face},
  url = {https://huggingface.co/TachyHealthResearch/medgemma-27b-medical-coding}
}
```
---
**Important**: This model is intended for research and healthcare applications. Always ensure proper validation and human oversight when using AI models in medical contexts.