	---
	library_name: transformers
	base_model: medgemma27B
	tags:
	- medical
	- medical-coding
	- icd10
	- cpt
	- hcpcs
	- healthcare
	- clinical
	- fine-tuned
	- peft
	- lora
	license: apache-2.0
	language:
	- en
	pipeline_tag: text-generation
	---

	# medgemma-27b-medical-coding

	## Model Description

	This is a LoRA adapter fine-tuned from the `medgemma27B` base model for medical coding tasks. It is designed to:

	- Extract diseases and medical conditions from discharge summaries
	- Identify medical procedures and interventions
	- Assign appropriate medical codes (ICD-10, CPT, HCPCS)
	- Process clinical documentation with high accuracy

	- Base Model: `medgemma27B`
	- Fine-tuning Method: LoRA (Low-Rank Adaptation)

	## Training Details

	### LoRA Configuration
	- Rank (r): 8
	- Alpha: 16
	- Dropout: 0.1
	- Target Modules: q_proj, up_proj, v_proj, gate_proj
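The hyperparameters above map onto a PEFT `LoraConfig` roughly as follows. This is a minimal sketch, not the authors' training script: `task_type="CAUSAL_LM"` is an assumption, the names `LORA_KWARGS`, `LORA_SCALING`, `build_lora_config`, and `lora_params_per_layer` are hypothetical, and the card does not state the remaining training settings. With rank 8 and alpha 16, standard LoRA scales the low-rank update by alpha / r = 2.

```python
# Hyperparameters as listed in this card.
LORA_KWARGS = {
    "r": 8,
    "lora_alpha": 16,
    "lora_dropout": 0.1,
    "target_modules": ["q_proj", "up_proj", "v_proj", "gate_proj"],
}

# Standard LoRA multiplies the low-rank update B @ A by lora_alpha / r.
LORA_SCALING = LORA_KWARGS["lora_alpha"] / LORA_KWARGS["r"]


def build_lora_config():
    """Build a PEFT LoraConfig from the values above (requires `peft`)."""
    from peft import LoraConfig  # imported lazily so the constants work without peft

    # task_type is an assumption: the card only tags the model as text-generation.
    return LoraConfig(task_type="CAUSAL_LM", **LORA_KWARGS)


def lora_params_per_layer(d_in: int, d_out: int, r: int = LORA_KWARGS["r"]) -> int:
    """Trainable parameters LoRA adds to one d_in -> d_out linear layer."""
    # A has shape (r, d_in) and B has shape (d_out, r).
    return r * (d_in + d_out)
```

Because the added parameter count grows linearly with the rank, a rank-8 adapter stays small relative to the base model.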

	## Usage

	```python
	import torch
	from transformers import AutoTokenizer, AutoModelForCausalLM
	from peft import PeftModel

	# Load tokenizer
	tokenizer = AutoTokenizer.from_pretrained("TachyHealthResearch/medgemma-27b-medical-coding")

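	# Load base model ("medgemma27B" is the name given in this card; substitute the
	# Hub ID or local path of the MedGemma 27B checkpoint you have access to)
	base_model = AutoModelForCausalLM.from_pretrained(
	    "medgemma27B",
	    torch_dtype=torch.bfloat16,
	    device_map="auto",
	    trust_remote_code=True,
	)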

	# Load LoRA adapter
	model = PeftModel.from_pretrained(base_model, "TachyHealthResearch/medgemma-27b-medical-coding")
	```
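Loading in `torch.bfloat16` stores each weight in 2 bytes, so the base model's weights alone need substantial accelerator memory; `device_map="auto"` lets the layers be spread across the available devices. The sketch below gives a rough estimate, assuming about 27 billion parameters (inferred from the model name, not stated in this card) and ignoring activations, the KV cache, and the small LoRA adapter. `weight_memory_gb` and `ASSUMED_PARAMS` are hypothetical names.

```python
def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate memory for model weights in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


ASSUMED_PARAMS = 27e9  # assumption: taken from the "27B" in the model name

# bfloat16 uses 2 bytes per parameter
print(f"bf16 weights: ~{weight_memory_gb(ASSUMED_PARAMS, 2):.0f} GB")
```

Under these assumptions the weights take roughly 54 GB, so a single 24 GB or 48 GB GPU would not hold them without sharding or offloading.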

	### Example Usage

	```python
	# Define the system prompt for medical coding
	system_prompt = """You are an expert medical coding specialist.
	Analyze the discharge summary to extract diseases, procedures, and assign appropriate medical codes.
	Return the response in JSON format with this structure:
	{"diseases": ["disease1", "disease2"], "icd10_codes": ["code1", "code2"],
	"procedures": ["procedure1", "procedure2"], "cpt_codes": ["code1", "code2"],
	"hcpcs_codes": ["code1", "code2"]}"""

	# Example discharge summary
	discharge_summary = """
	Patient admitted with chest pain and shortness of breath.
	Diagnosed with acute myocardial infarction and congestive heart failure.
	Underwent percutaneous coronary intervention with stent placement.
	"""

	# Prepare the conversation
	messages = [
	    {"role": "system", "content": system_prompt},
	    {"role": "user", "content": f"Please analyze this discharge summary:\n\n{discharge_summary}"},
	]

	# Apply chat template and generate
	text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
	inputs = tokenizer(text, return_tensors="pt").to(model.device)

	with torch.no_grad():
	    outputs = model.generate(
	        **inputs,
	        max_new_tokens=512,
	        temperature=0.1,
	        do_sample=True,
	        pad_token_id=tokenizer.eos_token_id,
	    )

	# Decode only the newly generated tokens. Slicing the decoded string by len(text)
	# is unreliable because skip_special_tokens=True drops the template's special tokens.
	prompt_length = inputs["input_ids"].shape[1]
	generated_response = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True).strip()
	print("Generated Medical Codes:")
	print(generated_response)
	```
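The system prompt asks for a JSON object, but generated text may wrap that object in extra words or omit some keys. The helper below is a minimal sketch of one way to recover it using only the standard library; `parse_coding_response` and `EXPECTED_KEYS` are hypothetical names, and filling missing keys with empty lists is an assumption about how downstream code wants to handle incomplete output.

```python
import json

# Keys requested in the system prompt.
EXPECTED_KEYS = ("diseases", "icd10_codes", "procedures", "cpt_codes", "hcpcs_codes")


def parse_coding_response(generated_text: str) -> dict:
    """Return the first JSON object found in the text, with all expected keys present."""
    decoder = json.JSONDecoder()
    start = generated_text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and ignores trailing text.
            parsed, _ = decoder.raw_decode(generated_text, start)
        except json.JSONDecodeError:
            start = generated_text.find("{", start + 1)
            continue
        result = {}
        for key in EXPECTED_KEYS:
            value = parsed.get(key, [])
            # Anything that is not a list is treated as missing.
            result[key] = value if isinstance(value, list) else []
        return result
    raise ValueError("no JSON object found in model output")


# Placeholder values, mirroring the structure shown in the system prompt.
sample = 'Result: {"diseases": ["disease1"], "icd10_codes": ["code1"]} Done.'
print(parse_coding_response(sample))
```

In practice the function would be called on `generated_response`; a `ValueError` signals that the output should be regenerated or reviewed manually.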

	## Model Performance

	This model has been fine-tuned specifically for medical coding tasks. The authors report strong performance in the following areas (no benchmark figures are given in this card):

	- Disease extraction from clinical text
	- Medical procedure identification
	- Medical code assignment (ICD-10, CPT, HCPCS)
	- Structured JSON response generation

	## Intended Use

	### Primary Use Cases
	- Medical coding automation
	- Clinical documentation analysis
	- Healthcare data processing

	### Limitations
	- Always verify generated codes with qualified medical coding professionals
	- Performance may vary on clinical documents significantly different from training data
	- Intended for use in appropriate healthcare environments only
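As a first automated filter ahead of human review, generated codes can be checked for plausible shape. The patterns below are simplified assumptions about code formats: ICD-10 as a letter, a digit, an alphanumeric character, then up to four more alphanumerics after an optional dot; CPT as four digits followed by a digit or letter; HCPCS Level II as one letter followed by four digits. They do not confirm that a code exists or matches the documentation, so they do not replace verification by a coding professional. `CODE_PATTERNS` and `find_malformed_codes` are hypothetical names.

```python
import re

# Simplified shape checks only; these are assumptions, not official validators.
CODE_PATTERNS = {
    "icd10_codes": re.compile(r"[A-Z][0-9][0-9A-Z](\.?[0-9A-Z]{1,4})?"),
    "cpt_codes": re.compile(r"[0-9]{4}[0-9A-Z]"),
    "hcpcs_codes": re.compile(r"[A-Z][0-9]{4}"),
}


def find_malformed_codes(result: dict) -> dict:
    """Map each code field to the entries whose shape does not match its pattern."""
    malformed = {}
    for key, pattern in CODE_PATTERNS.items():
        bad = [
            code
            for code in result.get(key, [])
            if not pattern.fullmatch(str(code).strip().upper())
        ]
        if bad:
            malformed[key] = bad
    return malformed
```

An empty dictionary means every code passed the shape check; any flagged entry is a candidate for regeneration or manual correction.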

	## License

	This model is released under the Apache 2.0 License.

	## Citation

	If you use this model in your research or applications, please cite:

	```bibtex
	@misc{TachyHealthResearch_medgemma_27b_medical_coding_2024,
	  title = {medgemma-27b-medical-coding: Medical Coding Model},
	  author = {TachyHealthResearch},
	  year = {2024},
	  publisher = {Hugging Face},
	  url = {https://huggingface.co/TachyHealthResearch/medgemma-27b-medical-coding}
	}
	```

	---

	Important: This model is intended for research and healthcare applications. Always ensure proper validation and human oversight when using AI models in medical contexts.