---
license: mit
---

# Instruction Tuning Large Language Models to Understand Electronic Health Records

**Authors:** Zhenbang Wu, Anant Dadu, Michael Nalls, Faraz Faghri, Jimeng Sun

**Published at:** NeurIPS 2024 Datasets and Benchmarks Track (Spotlight)

[[📄 Paper](https://openreview.net/pdf?id=Dgy5WVgPd2)] [[🔗 Code](https://github.com/zzachw/Llemr)]

This repository contains the model weights for Llemr, a large language model (LLM) capable of processing and interpreting electronic health records (EHR) with complex data structures.

## Model Description

Llemr is trained on MIMIC-Instr, a dataset comprising 350K schema-alignment examples and 100K clinical-reasoning examples generated from the MIMIC-IV EHR database. The model generates relevant, context-aware responses to patient-related queries by leveraging:

- BiomedBERT as the event encoder.
- Vicuna as the backbone language model.

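Conceptually, the encoder and the backbone are bridged by projecting each encoded event into the LLM's embedding space so the events can be attended to like ordinary tokens. The sketch below is only an illustration: the 1027/4096 dimensions and the single linear projector are assumptions in the style of LLaVA-like designs, not details stated in this card.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: 1027-dim event embeddings and 4096-dim
# Vicuna-7B hidden states. The linear projector is an assumption,
# not a detail documented in this README.
EVENT_DIM, LLM_DIM = 1027, 4096
W = rng.normal(scale=0.02, size=(EVENT_DIM, LLM_DIM))  # illustrative projector weights

def project_events(event_embs: np.ndarray) -> np.ndarray:
    """Map encoder-side event embeddings into the LLM's embedding space."""
    return event_embs @ W

# 10 encoded events become 10 "soft tokens" the LLM can attend to.
soft_tokens = project_events(rng.normal(size=(10, EVENT_DIM)))
assert soft_tokens.shape == (10, LLM_DIM)
```

The actual fusion mechanism is defined in the [code repository](https://github.com/zzachw/Llemr).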
## How to Load Weights

Follow the steps below to load the pre-trained weights:

1. Clone the repository:

   ```bash
   git clone https://huggingface.co/zzachw/llemr-v1
   cd llemr-v1
   ```

2. Load the weights in Python:

   ```python
   from peft import PeftModel

   from src.model.init_llemr import init_llemr

   # Define paths for the base model and the LoRA weights
   llm_pretrained_model_name_or_path = "lmsys/vicuna-7b-v1.5"
   lora_name_or_path = "zzachw12/llemr-v1"

   # Initialize the base model and tokenizer
   model, tokenizer = init_llemr(llm_pretrained_model_name_or_path, hidden_size=1027)

   # Integrate the LoRA weights into the model
   model = PeftModel.from_pretrained(model, lora_name_or_path)
   ```

**Note:** This model requires pre-computed event embeddings generated by BiomedBERT. Refer to the [GitHub repository](https://github.com/zzachw/Llemr) for detailed instructions on data preprocessing and event-embedding preparation.

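Those pre-computed event embeddings come from encoding each event's text with BiomedBERT and pooling the token vectors. The snippet below sketches the pooling step with dummy arrays standing in for BiomedBERT outputs; mean pooling and the 1024 hidden size are assumptions here, and the authoritative preprocessing script lives in the GitHub repository.

```python
import numpy as np

def mean_pool(token_embs: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token embeddings, ignoring padding positions."""
    mask = attention_mask[:, None].astype(float)
    return (token_embs * mask).sum(axis=0) / mask.sum()

# Dummy stand-ins for BiomedBERT-large outputs: 4 tokens with hidden
# size 1024, where the last token is padding.
token_embs = np.ones((4, 1024))
attention_mask = np.array([1, 1, 1, 0])
event_emb = mean_pool(token_embs, attention_mask)
assert event_emb.shape == (1024,)
```
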
## Notes on Model Enhancements

Llemr incorporates several minor improvements over the original implementation described in the paper:

1. **Enhanced event encoder:** Replaced ClinicalBERT (`emilyalsentzer/Bio_ClinicalBERT`) with BiomedBERT-large (`microsoft/BiomedNLP-BiomedBERT-large-uncased-abstract`), improving the quality of the event embeddings.
2. **Improved event embeddings:** Concatenated event timestamps and numeric values (where available) to the final event embeddings, giving a better representation of time-sensitive and quantitative data.
3. **Expanded dataset:** Doubled the clinical-reasoning subset from 50K to 100K examples for more comprehensive coverage.
4. **Unified training:** Adopted a single-step training process that uses the schema-alignment and clinical-reasoning subsets together, streamlining the training pipeline.

Together, these changes improve the model's ability to interpret and reason over EHR data relative to the version described in the paper.

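The timestamp/value concatenation described above can be pictured as appending two scalar features to each text embedding. This is a minimal illustration only: the exact feature layout, scaling, and missing-value handling (and how they relate to the `hidden_size=1027` passed to `init_llemr` in the loading example) are defined in the Llemr codebase, not here.

```python
import numpy as np

def build_event_embedding(text_emb, timestamp, value=None):
    """Append an event's timestamp and numeric value (NaN when absent)
    to its text embedding. Illustrative layout; see the Llemr repo for
    the actual feature construction."""
    extras = np.array([timestamp, value if value is not None else np.nan])
    return np.concatenate([text_emb, extras])

# A 1024-dim BiomedBERT-large embedding grows by the two scalar features.
final_emb = build_event_embedding(np.zeros(1024), timestamp=3.5, value=7.2)
assert final_emb.shape == (1026,)
```
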
## Citation

If you utilize this work in your research or projects, please consider citing us:

```bibtex
@inproceedings{wu2024instruction,
  title={Instruction Tuning Large Language Models to Understand Electronic Health Records},
  author={Zhenbang Wu and Anant Dadu and Michael Nalls and Faraz Faghri and Jimeng Sun},
  booktitle={The Thirty-eighth Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
  year={2024},
  url={https://openreview.net/forum?id=Dgy5WVgPd2}
}
```