	---
	license: apache-2.0
	language:
	- zh
	tags:
	- medical
	- perioperative
	- complications
	- lora
	- adapter
	- clinical-ai
	datasets:
	- perioperative-complications
	pipeline_tag: text-classification
	paper_url: https://doi.org/10.1101/2025.06.11.25329235
	paper_title: "Enhancing Privacy-Preserving Deployable Large Language Models for Perioperative Complication Detection: A Targeted Strategy with LoRA Fine-tuning"
	repository: https://github.com/gscfwid/PeriComp
	---

	# PeriComp: Perioperative Complication Detection LoRA Adaptors

	![PeriComp Performance](figure6b.png)
	Figure: Performance comparison of fine-tuned models across different sizes

	## 🩺 Model Overview

	PeriComp is a collection of specialized LoRA (Low-Rank Adaptation) adaptors designed for perioperative complication detection from clinical narratives. These adaptors enhance smaller open-source language models to achieve expert-level performance in identifying and grading 22 distinct perioperative complications based on European Perioperative Clinical Outcome (EPCO) definitions.

	### 🎯 Key Features

	- Expert-level Performance: Reported in the accompanying preprint to match or exceed human clinician accuracy
	- Multi-scale Detection: Simultaneous identification and severity grading (mild/moderate/severe)
	- Comprehensive Coverage: 22 distinct perioperative complications
	- Resource Efficient: Optimized for deployment on standard clinical infrastructure
	- Privacy Preserving: Fully deployable on-premises without data transmission

	## 📊 Model Collection

	This collection includes five optimized LoRA adaptors:

	| Model | Base Model | Parameters | F1 Score | Use Case |
	|-------|------------|------------|----------|----------|
	| PeriComp-4B | Qwen3-4B | 4B | 0.55 | Resource-constrained environments |
	| PeriComp-8B | Qwen3-8B | 8B | 0.61 | Balanced performance/efficiency |
	| PeriComp-14B | Qwen3-14B | 14B | 0.65 | High-performance deployment |
	| PeriComp-32B | Qwen3-32B | 32B | 0.68 | Maximum accuracy requirements |
	| PeriComp-QwQ-32B | QwQ-32B | 32B | 0.70 | Reasoning-enhanced performance |
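
The table can also be expressed as data when choosing an adaptor programmatically. The sketch below copies the model names, sizes, and F1 scores from the table; the `AdapterInfo` class and `pick_adapter` function are hypothetical helpers for illustration and are not part of the PeriComp repository.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AdapterInfo:
    name: str        # model name as listed in the table
    base_model: str  # base model the adaptor is applied to
    params_b: int    # base model size in billions of parameters
    f1: float        # F1 score as listed in the table


# Values copied from the model collection table.
ADAPTERS = [
    AdapterInfo("PeriComp-4B", "Qwen3-4B", 4, 0.55),
    AdapterInfo("PeriComp-8B", "Qwen3-8B", 8, 0.61),
    AdapterInfo("PeriComp-14B", "Qwen3-14B", 14, 0.65),
    AdapterInfo("PeriComp-32B", "Qwen3-32B", 32, 0.68),
    AdapterInfo("PeriComp-QwQ-32B", "QwQ-32B", 32, 0.70),
]


def pick_adapter(max_params_b: int, min_f1: float = 0.0) -> Optional[AdapterInfo]:
    """Return the highest-F1 adaptor within the size budget, or None if none fits."""
    candidates = [
        a for a in ADAPTERS
        if a.params_b <= max_params_b and a.f1 >= min_f1
    ]
    return max(candidates, key=lambda a: a.f1, default=None)


if __name__ == "__main__":
    choice = pick_adapter(max_params_b=14)
    print(choice.name if choice else "no adaptor fits the budget")
```

Parameter count is only a rough proxy for hardware requirements; actual memory use also depends on precision and context length.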

	## 🔬 Research Background

	Perioperative complications affect millions of patients globally, with traditional manual detection suffering from:
	- 27% under-reporting rate in clinical registries
	- High variability in expert performance across institutions
	- Cognitive load limitations with complex documentation

	Our research, published as a preprint on [medRxiv](https://doi.org/10.1101/2025.06.11.25329235), demonstrates that targeted task decomposition combined with LoRA fine-tuning enables smaller models to achieve expert-level diagnostic capabilities while maintaining practical deployability.

	![Strict Performance Evaluation](figure7.png)
	Figure: Strict performance evaluation requiring exact complication type and severity matching
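
To illustrate what strict matching means in practice, the sketch below scores a single case twice: once requiring both complication type and severity to match, and once on type alone. It is a minimal illustration, not the evaluation code from the preprint; how scores are aggregated across cases is not described here, and the function names and example labels are assumptions.

```python
from typing import Iterable, Set, Tuple

Label = Tuple[str, str]  # (complication type, severity grade)


def f1_score(predicted: Set, reference: Set) -> float:
    """F1 between two label sets; 0.0 when there are no true positives."""
    true_pos = len(predicted & reference)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(predicted)
    recall = true_pos / len(reference)
    return 2 * precision * recall / (precision + recall)


def strict_f1(predicted: Iterable[Label], reference: Iterable[Label]) -> float:
    """A prediction counts only if both type and severity match."""
    return f1_score(set(predicted), set(reference))


def type_only_f1(predicted: Iterable[Label], reference: Iterable[Label]) -> float:
    """A prediction counts if the complication type matches, ignoring severity."""
    return f1_score({t for t, _ in predicted}, {t for t, _ in reference})


if __name__ == "__main__":
    # Hypothetical labels for one case.
    predicted = [("pneumonia", "moderate"), ("delirium", "mild")]
    reference = [("pneumonia", "severe"), ("delirium", "mild")]
    print(strict_f1(predicted, reference))     # 0.5
    print(type_only_f1(predicted, reference))  # 1.0
```

In this example the model finds both complications but misgrades one, so the strict score drops to 0.5 while the type-only score stays at 1.0.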

	## 🚀 Quick Start

	### Installation

	```bash
	pip install transformers peft torch
	```

	### Basic Usage

	```python
	from transformers import AutoTokenizer, AutoModelForCausalLM
	from peft import PeftModel

	# Load base model and tokenizer
	model_name = "Qwen/Qwen3-8B"
	tokenizer = AutoTokenizer.from_pretrained(model_name)
	base_model = AutoModelForCausalLM.from_pretrained(model_name)

	# Load PeriComp adaptor
	adaptor_name = "gscfwid/Qwen3-8B-PeriComp"
	model = PeftModel.from_pretrained(base_model, adaptor_name)

	# Prepare clinical input
	clinical_text = """
	# Objective
	The objective is to identify postoperative complications from patient data in medical records, mimicking the diagnostic expertise of a senior surgeon.

	# Diagnostic Criteria
	The diagnostic criteria for the 22 postoperative complications are as follows:

	{the diagnostic criteria for the 22 postoperative complications}

	# Guidelines of Output structure

	The output format is specified as:
	{defined the output structure}

	# Data of medical records

	- {General Information (De-identified)}
	- {Postoperative Medical Record}
	- {Abnormal Test Results}
	- {Examination Results}
	"""

	# Prompt preparation format details can be found in the example files:
	# - comprehensive_prompts.json for QwQ 32B adapter
	# - targeted_prompts.json for Qwen 3 adapters
	# Note: Models are trained on Chinese clinical texts; performance on other languages is not validated

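	# Generate complication assessment
	inputs = tokenizer(clinical_text, return_tensors="pt").to(model.device)
	outputs = model.generate(**inputs, max_new_tokens=512)
	# Decode only the newly generated tokens, not the echoed prompt
	prompt_length = inputs["input_ids"].shape[1]
	result = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
	print(result)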
	```
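
The prompt skeleton in the example above can be filled in with a small helper. This is a sketch that only mirrors the section headings shown in the template; `build_prompt` and its argument names are hypothetical, and the authoritative prompt wording is in `targeted_prompts.json` and `comprehensive_prompts.json`.

```python
OBJECTIVE = (
    "The objective is to identify postoperative complications from patient data "
    "in medical records, mimicking the diagnostic expertise of a senior surgeon."
)


def build_prompt(
    criteria: str,
    output_spec: str,
    general_info: str,
    postoperative_record: str,
    abnormal_tests: str,
    examinations: str,
) -> str:
    """Assemble a prompt with the four sections used in the Quick Start template."""
    fields = {
        "criteria": criteria,
        "output_spec": output_spec,
        "general_info": general_info,
        "postoperative_record": postoperative_record,
        "abnormal_tests": abnormal_tests,
        "examinations": examinations,
    }
    # Refuse to build a prompt with missing sections rather than send a partial one.
    empty = [name for name, value in fields.items() if not value.strip()]
    if empty:
        raise ValueError(f"empty prompt fields: {', '.join(empty)}")

    record_items = [general_info, postoperative_record, abnormal_tests, examinations]
    sections = [
        ("Objective", OBJECTIVE),
        (
            "Diagnostic Criteria",
            "The diagnostic criteria for the 22 postoperative complications "
            "are as follows:\n\n" + criteria,
        ),
        ("Guidelines of Output structure", "The output format is specified as:\n" + output_spec),
        ("Data of medical records", "\n".join(f"- {item}" for item in record_items)),
    ]
    return "\n\n".join(f"# {title}\n{body}" for title, body in sections)
```

The returned string can be passed to the tokenizer in place of `clinical_text`. All record fields should be de-identified before they are inserted.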

	## 🔧 Technical Details

	### Training Methodology

	- Base Architecture: Qwen3 series and QwQ-32B
	- Fine-tuning Method: LoRA (Low-Rank Adaptation)
	- Training Data: 146 complex surgical cases
	- Validation: Dual-center external validation (52 cases)
	- Task Strategy: Targeted decomposition approach

	### LoRA Configuration

	```python
	lora_config = {
	    "lora_rank": 16,
	    "lora_alpha": 32,
	    "learning_rate": 1e-4,
	    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
	}
	```
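
As a rough guide to what these values imply, the sketch below applies the standard LoRA formulation, in which the low-rank update is scaled by `alpha / rank` and each adapted weight matrix gains `rank * (d_in + d_out)` trainable parameters. The 4096-wide projection is a hypothetical size chosen for illustration; real shapes depend on the base model, and `k_proj` and `v_proj` are often narrower under grouped-query attention. In PEFT's `LoraConfig`, rank and alpha correspond to `r` and `lora_alpha`, while the learning rate is an optimizer setting.

```python
def lora_scaling(rank: int, alpha: int) -> float:
    """Scaling factor applied to the low-rank update B @ A."""
    return alpha / rank


def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters added to one weight matrix of shape (d_out, d_in).

    A has shape (rank, d_in) and B has shape (d_out, rank).
    """
    return rank * (d_in + d_out)


if __name__ == "__main__":
    rank, alpha = 16, 32  # values from the configuration above
    print(lora_scaling(rank, alpha))  # 2.0

    # Hypothetical square projection of width 4096 (not a confirmed model dimension).
    print(lora_trainable_params(4096, 4096, rank))  # 131072
```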

	### 💻 Code and Data Access

	- GitHub Repository: [gscfwid/PeriComp](https://github.com/gscfwid/PeriComp)
	- Complete Implementation: Training scripts, evaluation code, and data processing pipelines
	- Prompt Templates: Each model includes optimized prompt files:
	  - `comprehensive_prompts.json`: For QwQ-32B adapter (comprehensive approach)
	  - `targeted_prompts.json`: For Qwen3 adapters (targeted strategy)
	- Clinical Data: Available upon reasonable request through institutional collaboration with appropriate ethical approval

	## 📋 Supported Complications

	The models detect and grade 22 perioperative complications based on European Perioperative Clinical Outcome (EPCO) definitions¹:

	1. Cardiovascular: Myocardial injury, cardiac arrhythmias
	2. Respiratory: Pneumonia, respiratory failure
	3. Renal: Acute kidney injury
	4. Gastrointestinal: Paralytic ileus, anastomotic leakage
	5. Infectious: Surgical site infections, sepsis
	6. Neurological: Delirium, stroke
	7. Hematological: Bleeding, thromboembolism
	8. And more...

	Each complication is graded as:
	- Mild: Minor intervention required
	- Moderate: Significant medical management
	- Severe: Life-threatening, intensive intervention
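
Downstream code may want to treat the three grades as an ordered scale. The sketch below is one possible representation; the `Severity` and `Finding` types and the helper functions are hypothetical, and the actual model output format is defined by the prompt files rather than by this code.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Iterable, Optional


class Severity(IntEnum):
    """Ordered severity grades, so that comparisons like SEVERE > MILD work."""
    MILD = 1
    MODERATE = 2
    SEVERE = 3


@dataclass(frozen=True)
class Finding:
    complication: str
    severity: Severity


def parse_severity(text: str) -> Severity:
    """Map a grade string such as 'moderate' to a Severity value."""
    try:
        return Severity[text.strip().upper()]
    except KeyError:
        raise ValueError(f"unknown severity grade: {text!r}") from None


def worst_finding(findings: Iterable[Finding]) -> Optional[Finding]:
    """Return the most severe finding, or None when there are no findings."""
    return max(findings, key=lambda f: f.severity, default=None)
```

Ordering the grades makes it straightforward to, for example, route only moderate or severe findings to clinician review.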

	---
	¹ Jammer, I. et al. Standards for definitions and use of outcome measures for clinical effectiveness research in perioperative medicine: European Perioperative Clinical Outcome (EPCO) definitions: a statement from the ESA-ESICM joint taskforce on perioperative outcome measures. Eur J Anaesthesiol 32, 88-105 (2015). DOI: 10.1097/EJA.0000000000000118

	## 🏥 Clinical Applications

	### Primary Use Cases

	- Automated Screening: Continuous 24/7 complication monitoring
	- Quality Assurance: Systematic complication registry validation
	- Clinical Decision Support: "Second opinion" for complex cases
	- Research: Standardized outcome assessment for clinical studies

	### Deployment Scenarios

	- Resource-limited Settings: Use PeriComp-4B/8B models
	- Standard Clinical Environment: PeriComp-14B recommended
	- High-accuracy Requirements: PeriComp-32B for maximum performance
	- Reasoning-enhanced Tasks: PeriComp-QwQ-32B for complex diagnostic reasoning

	## ⚠️ Important Considerations

	### Clinical Validation Required

	⚠️ These models are research tools and require clinical validation before use in patient care

	### Limitations

	- Trained on Chinese medical records; performance on other languages and documentation styles has not been validated
	- Performance depends on documentation quality and completeness
	- Not a replacement for clinical judgment

	### Best Practices

	- Use as a screening tool with clinical oversight
	- Validate outputs against clinical judgment
	- Consider local adaptation for specific institutional practices

	### Data Access

	⚠️ Clinical datasets are not publicly available due to patient privacy protection

	Data Request Process:
	- Clinical datasets can be requested from corresponding authors for legitimate research purposes
	- Requests must include detailed research protocol and intended use
	- Institutional ethical approval is required before data sharing
	- Data sharing agreements must comply with local privacy regulations
	- Contact: [email protected] for data access inquiries

	## 📚 Citation

	If you use PeriComp in your research, please cite:

	```bibtex
	@article{gao2025pericomp,
	  title={Enhancing Privacy-Preserving Deployable Large Language Models for Perioperative Complication Detection: A Targeted Strategy with LoRA Fine-tuning},
	  author={Gao, Shaowei and Zhao, Xu and Chen, Lihui and Yu, Junrong and Tian, Shuning and Zhou, Huaqiang and Chen, Jingru and Long, Sizhe and He, Qiulan and Feng, Xia},
	  journal={medRxiv},
	  pages={2025.06.11.25329235},
	  year={2025},
	  doi={10.1101/2025.06.11.25329235},
	  url={https://doi.org/10.1101/2025.06.11.25329235},
	  publisher={Cold Spring Harbor Laboratory Press}
	}
	```

	Paper: [Enhancing Privacy-Preserving Deployable Large Language Models for Perioperative Complication Detection: A Targeted Strategy with LoRA Fine-tuning](https://doi.org/10.1101/2025.06.11.25329235)

	Code: [GitHub Repository - gscfwid/PeriComp](https://github.com/gscfwid/PeriComp)

	## 📧 Contact & Support

	For questions, issues, or collaboration opportunities:

	- Research Team: Department of Anesthesiology, First Affiliated Hospital of Sun Yat-sen University, Guangzhou, China
	- Technical Issues: [email protected]
	- Clinical Data Requests: [email protected] (requires ethical approval and institutional collaboration)
	- Clinical Applications: Perioperative Complications Detection
	- Code Repository: [GitHub Issues](https://github.com/gscfwid/PeriComp/issues) for implementation questions

	## 📄 License

	This work is licensed under Apache License 2.0. See LICENSE for details.

	---

	PeriComp: Advancing perioperative patient safety through AI-powered complication detection