tsilva committed · Commit 89383af · verified · 1 Parent(s): c3cc334

Push classification fine-tuned model

README.md ADDED
@@ -0,0 +1,98 @@
+ ---
+ library_name: transformers
+ license: apache-2.0
+ base_model: distilbert/distilgpt2
+ tags:
+ - generated_from_trainer
+ model-index:
+ - name: clinical-field-mapper-classification
+   results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # clinical-field-mapper-classification
+
+ This model is a fine-tuned version of [distilbert/distilgpt2](https://huggingface.co/distilbert/distilgpt2) on an unknown dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 0.2783
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 0.0005
+ - train_batch_size: 512
+ - eval_batch_size: 512
+ - seed: 42
+ - distributed_type: multi-GPU
+ - optimizer: Use OptimizerNames.ADAMW_BNB with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
+ - lr_scheduler_type: cosine
+ - lr_scheduler_warmup_ratio: 0.01
+ - num_epochs: 50
+ - mixed_precision_training: Native AMP
+
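The scheduler settings listed above (cosine schedule, warmup ratio 0.01, base rate 0.0005) imply a learning-rate curve that can be sketched in plain Python. This is a minimal sketch, assuming a linear warmup followed by a half-cosine decay to zero and 900 planned optimizer steps (18 steps per epoch, as reported in the results table, times the 50 configured epochs). The helper name `cosine_lr` and the rounding of warmup steps with `ceil` are assumptions for illustration, not taken from the commit.

```python
import math

BASE_LR = 5e-4          # learning_rate from the model card
WARMUP_RATIO = 0.01     # lr_scheduler_warmup_ratio from the model card
STEPS_PER_EPOCH = 18    # from the results table (step 18 at epoch 1.0)
NUM_EPOCHS = 50         # num_epochs from the model card

TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS
# Assumption: warmup steps are the ratio times total steps, rounded up.
WARMUP_STEPS = math.ceil(TOTAL_STEPS * WARMUP_RATIO)


def cosine_lr(step: int) -> float:
    """Learning rate at a given optimizer step (hypothetical helper)."""
    if step < WARMUP_STEPS:
        # Linear warmup from 0 to BASE_LR.
        return BASE_LR * step / max(1, WARMUP_STEPS)
    # Half-cosine decay from BASE_LR to 0 over the remaining steps.
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for step in (0, WARMUP_STEPS, 684, TOTAL_STEPS):
        print(step, f"{cosine_lr(step):.6f}")
```

Step 684 is the last step reported in the results table, so printing it shows where on the curve the logged run ended under these assumptions.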
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss |
+ |:-------------:|:-----:|:----:|:---------------:|
+ | 7.615 | 1.0 | 18 | 5.8320 |
+ | 5.7769 | 2.0 | 36 | 5.5586 |
+ | 5.099 | 3.0 | 54 | 3.5 |
+ | 2.3325 | 4.0 | 72 | 1.2383 |
+ | 0.7197 | 5.0 | 90 | 0.6011 |
+ | 0.2701 | 6.0 | 108 | 0.4207 |
+ | 0.1317 | 7.0 | 126 | 0.3694 |
+ | 0.0744 | 8.0 | 144 | 0.3362 |
+ | 0.0522 | 9.0 | 162 | 0.3208 |
+ | 0.039 | 10.0 | 180 | 0.2910 |
+ | 0.0301 | 11.0 | 198 | 0.2922 |
+ | 0.0239 | 12.0 | 216 | 0.2900 |
+ | 0.0198 | 13.0 | 234 | 0.2935 |
+ | 0.0174 | 14.0 | 252 | 0.2825 |
+ | 0.0149 | 15.0 | 270 | 0.2788 |
+ | 0.0135 | 16.0 | 288 | 0.2842 |
+ | 0.0122 | 17.0 | 306 | 0.2825 |
+ | 0.0103 | 18.0 | 324 | 0.2788 |
+ | 0.0097 | 19.0 | 342 | 0.2776 |
+ | 0.0083 | 20.0 | 360 | 0.2798 |
+ | 0.0075 | 21.0 | 378 | 0.2793 |
+ | 0.0071 | 22.0 | 396 | 0.2788 |
+ | 0.0067 | 23.0 | 414 | 0.2756 |
+ | 0.0059 | 24.0 | 432 | 0.2754 |
+ | 0.0059 | 25.0 | 450 | 0.2788 |
+ | 0.0067 | 26.0 | 468 | 0.2864 |
+ | 0.0058 | 27.0 | 486 | 0.2769 |
+ | 0.0046 | 28.0 | 504 | 0.2742 |
+ | 0.0044 | 29.0 | 522 | 0.2786 |
+ | 0.0038 | 30.0 | 540 | 0.2776 |
+ | 0.0039 | 31.0 | 558 | 0.2764 |
+ | 0.0034 | 32.0 | 576 | 0.2764 |
+ | 0.0032 | 33.0 | 594 | 0.2705 |
+ | 0.0029 | 34.0 | 612 | 0.2766 |
+ | 0.0029 | 35.0 | 630 | 0.2742 |
+ | 0.0026 | 36.0 | 648 | 0.2751 |
+ | 0.0027 | 37.0 | 666 | 0.2771 |
+ | 0.0024 | 38.0 | 684 | 0.2783 |
+
+
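In the results table, validation loss settles between roughly 0.27 and 0.29 from about epoch 10 onward while training loss keeps falling. A short sketch, using a handful of rows copied from that table, shows how the row with the lowest validation loss could be picked out. The `ROWS` list and the `best_row` helper are illustrative names, not part of the commit.

```python
# (epoch, training_loss, validation_loss) rows copied from the results table.
ROWS = [
    (1, 7.615, 5.8320),
    (5, 0.7197, 0.6011),
    (10, 0.039, 0.2910),
    (15, 0.0149, 0.2788),
    (20, 0.0083, 0.2798),
    (25, 0.0059, 0.2788),
    (30, 0.0038, 0.2776),
    (33, 0.0032, 0.2705),
    (38, 0.0024, 0.2783),
]


def best_row(rows):
    """Return the row with the lowest validation loss (hypothetical helper)."""
    return min(rows, key=lambda row: row[2])


def generalization_gap(row):
    """Validation loss minus training loss for one row."""
    return row[2] - row[1]


if __name__ == "__main__":
    epoch, train_loss, val_loss = best_row(ROWS)
    print(f"lowest validation loss {val_loss} at epoch {epoch}")
    print(f"gap at final row: {generalization_gap(ROWS[-1]):.4f}")
```

Whether the pushed weights correspond to the final row or to an earlier checkpoint is not stated in the commit; the model card's headline loss of 0.2783 matches the epoch 38 row.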
+ ### Framework versions
+
+ - Transformers 4.51.3
+ - Pytorch 2.6.0+cu124
+ - Datasets 3.5.1
+ - Tokenizers 0.21.1
added_tokens.json ADDED
@@ -0,0 +1,3 @@
+ {
+   "<|pad|>": 50257
+ }
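The single entry in added_tokens.json gives `<|pad|>` the ID 50257, one past the last ID of the 50,257-token GPT-2 vocabulary (IDs 0 to 50256). That lines up with the `pad_token_id` and `vocab_size` values in this commit's config.json. A minimal sketch of that consistency check follows, using values copied from the two files; the function name `check_pad_token` is hypothetical.

```python
import json

# Values copied from added_tokens.json and config.json in this commit.
ADDED_TOKENS = json.loads('{"<|pad|>": 50257}')
CONFIG = {"pad_token_id": 50257, "eos_token_id": 50256, "vocab_size": 50258}


def check_pad_token(added_tokens, config):
    """Return a list of inconsistencies (empty when everything lines up)."""
    problems = []
    pad_id = added_tokens.get("<|pad|>")
    if pad_id is None:
        problems.append("no <|pad|> entry in added tokens")
        return problems
    if pad_id != config["pad_token_id"]:
        problems.append("pad_token_id does not match added_tokens.json")
    if pad_id >= config["vocab_size"]:
        problems.append("pad ID falls outside the embedding table")
    if pad_id == config["eos_token_id"]:
        problems.append("pad token shares an ID with the EOS token")
    return problems


if __name__ == "__main__":
    print(check_pad_token(ADDED_TOKENS, CONFIG) or "consistent")
```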
config.json ADDED
@@ -0,0 +1,721 @@
+ {
+   "_num_labels": 1,
+   "activation_function": "gelu_new",
+   "architectures": [
+     "GPT2ForSequenceClassification"
+   ],
+   "attn_pdrop": 0.1,
+   "bos_token_id": 50256,
+   "embd_pdrop": 0.1,
+   "eos_token_id": 50256,
+   "id2label": {
+     "0": "a1at_status",
+     "1": "acq_score",
+     "2": "act_score",
+     "3": "adas_cog_score",
+     "4": "admission_date",
+     "5": "admission_id",
+     "6": "admission_reason",
+     "7": "admission_source",
+     "8": "adverse_event_flag",
+     "9": "adverse_event_management_end_date",
+     "10": "adverse_event_management_start_date",
+     "11": "adverse_event_management_type",
+     "12": "adverse_event_permanent_discontinuation_flag",
+     "13": "adverse_event_type",
+     "14": "age",
+     "15": "alcohol_status",
+     "16": "allergy_reaction_type",
+     "17": "allergy_severity",
+     "18": "ami_event_date",
+     "19": "ana_result_status",
+     "20": "anesthesia_type",
+     "21": "animal_verbal_fluency_score",
+     "22": "ann_arbor_stage",
+     "23": "anti_ccp_level",
+     "24": "anti_ccp_status",
+     "25": "anxiety_depression_flag",
+     "26": "asa_physical_status",
+     "27": "asdas_score",
+     "28": "asthma_flag",
+     "29": "atrial_fibrillation_flag",
+     "30": "autoimmune_condition_reported",
+     "31": "basdai_score",
+     "32": "basfi_score",
+     "33": "benzodiazepine_use_flag",
+     "34": "best_response_date",
+     "35": "beta2_microglobulin_level",
+     "36": "biopsy_results_note",
+     "37": "biopsy_type",
+     "38": "birth_date",
+     "39": "bmi",
+     "40": "bmi_category",
+     "41": "bone_marrow_involvement_flag",
+     "42": "bone_marrow_plasma_cell_percent",
+     "43": "breast_cancer_flag",
+     "44": "bronchiectasis_flag",
+     "45": "cancer_history_flag",
+     "46": "cardiovascular_disease_flag",
+     "47": "care_site_id",
+     "48": "cart_manufacturing_completion_date",
+     "49": "cart_manufacturing_delay_reason",
+     "50": "cart_manufacturing_start_date",
+     "51": "cart_target_antigen",
+     "52": "cat_score",
+     "53": "cause_of_death_code",
+     "54": "cdr_score",
+     "55": "charlson_comorbidity_index",
+     "56": "chronic_bronchitis_flag",
+     "57": "chronic_kidney_disease_flag",
+     "58": "ck_mb_level",
+     "59": "clinical_response",
+     "60": "cns_risk_stratification_category",
+     "61": "comorbidities_reported",
+     "62": "concomitant_corticosteroid_flag",
+     "63": "concomitant_csdmard_flag",
+     "64": "condition_status",
+     "65": "copd_flag",
+     "66": "copd_severity_stage",
+     "67": "covid19_severity",
+     "68": "covid19_test_date",
+     "69": "covid19_test_result",
+     "70": "covid19_vaccination_date",
+     "71": "covid19_vaccination_status",
+     "72": "cpet_assessment",
+     "73": "crab_features_reported",
+     "74": "creatinine_level",
+     "75": "crp_level",
+     "76": "crs_grade",
+     "77": "cytogenetics_result",
+     "78": "dactylitis_severity_score",
+     "79": "dapsa_score",
+     "80": "death_date",
+     "81": "death_flag",
+     "82": "dementia_flag",
+     "83": "diabetes_flag",
+     "84": "diagnosis_code",
+     "85": "diagnosis_date",
+     "86": "diastolic_bp",
+     "87": "discharge_date",
+     "88": "discharge_destination",
+     "89": "discharge_medications_reported",
+     "90": "discontinuation_reason",
+     "91": "disease_activity_score_das28",
+     "92": "dlco_percent_predicted",
+     "93": "drug_administration_frequency",
+     "94": "drug_dose",
+     "95": "drug_name",
+     "96": "drug_regimen_name",
+     "97": "drug_route",
+     "98": "dyslipidemia_flag",
+     "99": "ecog_performance_status",
+     "100": "education_years",
+     "101": "egfr_level",
+     "102": "emphysema_flag",
+     "103": "eortc_qlq_c30_emotional_score",
+     "104": "eortc_qlq_c30_fatigue_score",
+     "105": "eortc_qlq_c30_global_health_score",
+     "106": "eortc_qlq_c30_nausea_vomiting_score",
+     "107": "eortc_qlq_c30_pain_score",
+     "108": "eortc_qlq_c30_physical_score",
+     "109": "eortc_qlq_c30_role_score",
+     "110": "eosinophil_category",
+     "111": "eosinophilic_inflammation_v1_flag",
+     "112": "eosinophilic_inflammation_v2_flag",
+     "113": "eq5d_score",
+     "114": "er_status",
+     "115": "erosion_score",
+     "116": "esr_level",
+     "117": "exercise_intolerance_6mwt_flag",
+     "118": "exertional_desaturation_flag",
+     "119": "extra_articular_manifestation_flag",
+     "120": "extra_articular_manifestation_reported",
+     "121": "extranodal_sites_reported",
+     "122": "fact_lym_emotional_score",
+     "123": "fact_lym_lymphoma_subscale_score",
+     "124": "fact_lym_physical_score",
+     "125": "fact_lym_total_score",
+     "126": "family_history_reported",
+     "127": "fasting_glucose_level",
+     "128": "feno_above_threshold_flag",
+     "129": "feno_level",
+     "130": "ferritin_level",
+     "131": "fev1_gold_stage",
+     "132": "fev1_percent_predicted",
+     "133": "follow_up_appointment_scheduled_flag",
+     "134": "follow_up_specialty",
+     "135": "frailty_status",
+     "136": "frequent_exacerbator_flag",
+     "137": "frequent_readmissions_flag",
+     "138": "functional_status",
+     "139": "fvc_percent_predicted",
+     "140": "gerd_flag",
+     "141": "global_disease_vas_patient",
+     "142": "global_disease_vas_physician",
+     "143": "gold_group",
+     "144": "hamd_score",
+     "145": "haq_di_score",
+     "146": "hba1c_level",
+     "147": "hdl_cholesterol_level",
+     "148": "heart_failure_etiology",
+     "149": "heart_failure_event_type",
+     "150": "heart_failure_flag",
+     "151": "heart_failure_primary_cause",
+     "152": "heart_failure_primary_cause_other",
+     "153": "heart_failure_type",
+     "154": "heart_rate",
+     "155": "height",
+     "156": "hemoglobin_level",
+     "157": "hepatitis_b_status",
+     "158": "hepatitis_c_status",
+     "159": "her2_status",
+     "160": "hiv_status",
+     "161": "hla_b27_status",
+     "162": "hospitalization_heart_failure_flag",
+     "163": "hpylori_status",
+     "164": "hsct_details_note",
+     "165": "hyperinflation_rv_tlc_flag",
+     "166": "hyperinflation_rv_tlc_tlc_flag",
+     "167": "hypertension_flag",
+     "168": "icans_grade",
+     "169": "imaging_extramedullary_involvement_sites_reported",
+     "170": "imaging_joint_inflammation_flag",
+     "171": "imaging_joint_space_narrowing_grade",
+     "172": "imaging_leukemia_involvement_flag",
+     "173": "imaging_lymphoma_involvement_flag",
+     "174": "imaging_myeloma_findings_reported",
+     "175": "imaging_progressive_disease_sites_reported",
+     "176": "imaging_reporting_standard_flag",
+     "177": "imaging_suv_max",
+     "178": "imaging_type",
+     "179": "immunophenotyping_marker_reported",
+     "180": "infusion_reaction_flag",
+     "181": "inhaler_technique_assessment",
+     "182": "initial_diagnostic_method",
+     "183": "insurance_type",
+     "184": "ischemic_heart_disease_flag",
+     "185": "iv_diuretic_use_flag",
+     "186": "last_visit_date",
+     "187": "ldh_level_category",
+     "188": "ldl_cholesterol_level",
+     "189": "leeds_enthesitis_index",
+     "190": "length_of_stay_days",
+     "191": "leukemia_risk_stratification",
+     "192": "long_covid_symptoms_reported",
+     "193": "long_term_oxygen_therapy_flag",
+     "194": "long_term_toxicity_reported",
+     "195": "loss_of_response_date",
+     "196": "lost_followup_flag",
+     "197": "lugano_classification",
+     "198": "lvef_percent",
+     "199": "madrs_score",
+     "200": "malnutrition_flag",
+     "201": "manufacturing_issue_note",
+     "202": "measurement_date",
+     "203": "measurement_lower_limit",
+     "204": "measurement_unit",
+     "205": "measurement_upper_limit",
+     "206": "medication_adherence_assessment",
+     "207": "mild_exacerbation_count",
+     "208": "minicog_score",
+     "209": "mmrc_dyspnea_score",
+     "210": "mmse_score",
+     "211": "moca_score",
+     "212": "moderate_exacerbation_count",
+     "213": "molecular_genetic_marker_reported",
+     "214": "mortality_30day_flag",
+     "215": "niv_bipap_flag",
+     "216": "nodal_mass_size",
+     "217": "note_date",
+     "218": "note_text",
+     "219": "note_type",
+     "220": "nutritional_status",
+     "221": "nyha_class",
+     "222": "obesity_flag",
+     "223": "opioid_use_flag",
+     "224": "osteoporosis_flag",
+     "225": "other_inhaled_exposure_status",
+     "226": "oxygen_requirement_status",
+     "227": "pacu_discharge_intermediate_care_flag",
+     "228": "pain_vas",
+     "229": "pasdas_score",
+     "230": "pasqal_tool_used_flag",
+     "231": "past_surgical_history_reported",
+     "232": "peripheral_arterial_disease_flag",
+     "233": "person_id",
+     "234": "phq9_score",
+     "235": "physical_activity_flag",
+     "236": "platelet_count",
+     "237": "polypharmacy_status",
+     "238": "postoperative_complication_flag",
+     "239": "postoperative_delirium_flag",
+     "240": "postoperative_delirium_history_flag",
+     "241": "pr_status",
+     "242": "prescription_days_supply",
+     "243": "prescription_dispense_date",
+     "244": "pretreatment_screening_reported",
+     "245": "prior_corticosteroids_count",
+     "246": "prior_dmards_count",
+     "247": "prior_nsaids_count",
+     "248": "procedure_code",
+     "249": "procedure_date",
+     "250": "procedure_notes",
+     "251": "provider_specialty",
+     "252": "psa_subtype",
+     "253": "pulmonary_hypertension_flag",
+     "254": "ra_functional_class",
+     "255": "race",
+     "256": "radiation_fraction_count",
+     "257": "radiation_fraction_dose",
+     "258": "radiation_total_dose",
+     "259": "radiation_treated_sites_reported",
+     "260": "radiographic_stage",
+     "261": "random_glucose_level",
+     "262": "readmission_date",
+     "263": "readmission_flag",
+     "264": "readmission_prevention_measures_reported",
+     "265": "readmission_reason",
+     "266": "readmission_risk_score",
+     "267": "relapse_confirmation_method",
+     "268": "relapse_date",
+     "269": "remission_flag",
+     "270": "reported_allergy_drug",
+     "271": "respiratory_insufficiency_flag",
+     "272": "respiratory_insufficiency_type",
+     "273": "respiratory_rate",
+     "274": "rheumatoid_factor_level",
+     "275": "rheumatoid_factor_status",
+     "276": "rso2_baseline_percent",
+     "277": "rv_percent_predicted",
+     "278": "rv_tlc_ratio",
+     "279": "serum_albumin_level",
+     "280": "serum_calcium_level",
+     "281": "serum_free_light_chain_level",
+     "282": "severe_exacerbation_count",
+     "283": "sex",
+     "284": "sf36_score",
+     "285": "sleep_apnea_flag",
+     "286": "smoking_status",
+     "287": "social_determinants_reported",
+     "288": "social_support_assessment",
+     "289": "sparcc_score",
+     "290": "spirometry_reversibility_flag",
+     "291": "steroid_equivalent_dose",
+     "292": "steroid_taper_duration",
+     "293": "substance_use_history_reported",
+     "294": "support_system_assessment",
+     "295": "supportive_medication_reported",
+     "296": "surgical_risk_classification",
+     "297": "surgical_specialty",
+     "298": "symptoms_at_presentation_reported",
+     "299": "synovial_fluid_analysis_reported",
+     "300": "systolic_bp",
+     "301": "temperature",
+     "302": "therapeutic_adherence_assessment",
+     "303": "therapeutic_class",
+     "304": "time_from_diagnosis_to_biologic",
+     "305": "tlc_percent_predicted",
+     "306": "total_cholesterol_level",
+     "307": "toxicity_grade",
+     "308": "treatment_cycle_count",
+     "309": "treatment_end_date",
+     "310": "treatment_inefficacy_flag",
+     "311": "treatment_intent",
+     "312": "treatment_line",
+     "313": "treatment_modification_date",
+     "314": "treatment_modification_reason",
+     "315": "treatment_modification_type",
+     "316": "treatment_start_date",
+     "317": "treatment_switch_count",
+     "318": "treatment_type",
+     "319": "triglycerides_level",
+     "320": "troponin_level",
+     "321": "tumor_grade",
+     "322": "tumor_stage_m",
+     "323": "tumor_stage_n",
+     "324": "tumor_stage_t",
+     "325": "ultrasound_doppler_grade",
+     "326": "unknown_field",
+     "327": "urine_albumin_creatinine_ratio",
+     "328": "visit_end_date",
+     "329": "visit_id",
+     "330": "visit_start_date",
+     "331": "visit_type",
+     "332": "wbc_count",
+     "333": "weight",
+     "334": "worsening_heart_failure_episode_order",
+     "335": "worsening_heart_failure_event_type",
+     "336": "worsening_heart_failure_flag",
+     "337": "worsening_heart_failure_start_date"
+   },
+   "initializer_range": 0.02,
+   "label2id": {
+     "a1at_status": 0,
+     "acq_score": 1,
+     "act_score": 2,
+     "adas_cog_score": 3,
+     "admission_date": 4,
+     "admission_id": 5,
+     "admission_reason": 6,
+     "admission_source": 7,
+     "adverse_event_flag": 8,
+     "adverse_event_management_end_date": 9,
+     "adverse_event_management_start_date": 10,
+     "adverse_event_management_type": 11,
+     "adverse_event_permanent_discontinuation_flag": 12,
+     "adverse_event_type": 13,
+     "age": 14,
+     "alcohol_status": 15,
+     "allergy_reaction_type": 16,
+     "allergy_severity": 17,
+     "ami_event_date": 18,
+     "ana_result_status": 19,
+     "anesthesia_type": 20,
+     "animal_verbal_fluency_score": 21,
+     "ann_arbor_stage": 22,
+     "anti_ccp_level": 23,
+     "anti_ccp_status": 24,
+     "anxiety_depression_flag": 25,
+     "asa_physical_status": 26,
+     "asdas_score": 27,
+     "asthma_flag": 28,
+     "atrial_fibrillation_flag": 29,
+     "autoimmune_condition_reported": 30,
+     "basdai_score": 31,
+     "basfi_score": 32,
+     "benzodiazepine_use_flag": 33,
+     "best_response_date": 34,
+     "beta2_microglobulin_level": 35,
+     "biopsy_results_note": 36,
+     "biopsy_type": 37,
+     "birth_date": 38,
+     "bmi": 39,
+     "bmi_category": 40,
+     "bone_marrow_involvement_flag": 41,
+     "bone_marrow_plasma_cell_percent": 42,
+     "breast_cancer_flag": 43,
+     "bronchiectasis_flag": 44,
+     "cancer_history_flag": 45,
+     "cardiovascular_disease_flag": 46,
+     "care_site_id": 47,
+     "cart_manufacturing_completion_date": 48,
+     "cart_manufacturing_delay_reason": 49,
+     "cart_manufacturing_start_date": 50,
+     "cart_target_antigen": 51,
+     "cat_score": 52,
+     "cause_of_death_code": 53,
+     "cdr_score": 54,
+     "charlson_comorbidity_index": 55,
+     "chronic_bronchitis_flag": 56,
+     "chronic_kidney_disease_flag": 57,
+     "ck_mb_level": 58,
+     "clinical_response": 59,
+     "cns_risk_stratification_category": 60,
+     "comorbidities_reported": 61,
+     "concomitant_corticosteroid_flag": 62,
+     "concomitant_csdmard_flag": 63,
+     "condition_status": 64,
+     "copd_flag": 65,
+     "copd_severity_stage": 66,
+     "covid19_severity": 67,
+     "covid19_test_date": 68,
+     "covid19_test_result": 69,
+     "covid19_vaccination_date": 70,
+     "covid19_vaccination_status": 71,
+     "cpet_assessment": 72,
+     "crab_features_reported": 73,
+     "creatinine_level": 74,
+     "crp_level": 75,
+     "crs_grade": 76,
+     "cytogenetics_result": 77,
+     "dactylitis_severity_score": 78,
+     "dapsa_score": 79,
+     "death_date": 80,
+     "death_flag": 81,
+     "dementia_flag": 82,
+     "diabetes_flag": 83,
+     "diagnosis_code": 84,
+     "diagnosis_date": 85,
+     "diastolic_bp": 86,
+     "discharge_date": 87,
+     "discharge_destination": 88,
+     "discharge_medications_reported": 89,
+     "discontinuation_reason": 90,
+     "disease_activity_score_das28": 91,
+     "dlco_percent_predicted": 92,
+     "drug_administration_frequency": 93,
+     "drug_dose": 94,
+     "drug_name": 95,
+     "drug_regimen_name": 96,
+     "drug_route": 97,
+     "dyslipidemia_flag": 98,
+     "ecog_performance_status": 99,
+     "education_years": 100,
+     "egfr_level": 101,
+     "emphysema_flag": 102,
+     "eortc_qlq_c30_emotional_score": 103,
+     "eortc_qlq_c30_fatigue_score": 104,
+     "eortc_qlq_c30_global_health_score": 105,
+     "eortc_qlq_c30_nausea_vomiting_score": 106,
+     "eortc_qlq_c30_pain_score": 107,
+     "eortc_qlq_c30_physical_score": 108,
+     "eortc_qlq_c30_role_score": 109,
+     "eosinophil_category": 110,
+     "eosinophilic_inflammation_v1_flag": 111,
+     "eosinophilic_inflammation_v2_flag": 112,
+     "eq5d_score": 113,
+     "er_status": 114,
+     "erosion_score": 115,
+     "esr_level": 116,
+     "exercise_intolerance_6mwt_flag": 117,
+     "exertional_desaturation_flag": 118,
+     "extra_articular_manifestation_flag": 119,
+     "extra_articular_manifestation_reported": 120,
+     "extranodal_sites_reported": 121,
+     "fact_lym_emotional_score": 122,
+     "fact_lym_lymphoma_subscale_score": 123,
+     "fact_lym_physical_score": 124,
+     "fact_lym_total_score": 125,
+     "family_history_reported": 126,
+     "fasting_glucose_level": 127,
+     "feno_above_threshold_flag": 128,
+     "feno_level": 129,
+     "ferritin_level": 130,
+     "fev1_gold_stage": 131,
+     "fev1_percent_predicted": 132,
+     "follow_up_appointment_scheduled_flag": 133,
+     "follow_up_specialty": 134,
+     "frailty_status": 135,
+     "frequent_exacerbator_flag": 136,
+     "frequent_readmissions_flag": 137,
+     "functional_status": 138,
+     "fvc_percent_predicted": 139,
+     "gerd_flag": 140,
+     "global_disease_vas_patient": 141,
+     "global_disease_vas_physician": 142,
+     "gold_group": 143,
+     "hamd_score": 144,
+     "haq_di_score": 145,
+     "hba1c_level": 146,
+     "hdl_cholesterol_level": 147,
+     "heart_failure_etiology": 148,
+     "heart_failure_event_type": 149,
+     "heart_failure_flag": 150,
+     "heart_failure_primary_cause": 151,
+     "heart_failure_primary_cause_other": 152,
+     "heart_failure_type": 153,
+     "heart_rate": 154,
+     "height": 155,
+     "hemoglobin_level": 156,
+     "hepatitis_b_status": 157,
+     "hepatitis_c_status": 158,
+     "her2_status": 159,
+     "hiv_status": 160,
+     "hla_b27_status": 161,
+     "hospitalization_heart_failure_flag": 162,
+     "hpylori_status": 163,
+     "hsct_details_note": 164,
+     "hyperinflation_rv_tlc_flag": 165,
+     "hyperinflation_rv_tlc_tlc_flag": 166,
+     "hypertension_flag": 167,
+     "icans_grade": 168,
+     "imaging_extramedullary_involvement_sites_reported": 169,
+     "imaging_joint_inflammation_flag": 170,
+     "imaging_joint_space_narrowing_grade": 171,
+     "imaging_leukemia_involvement_flag": 172,
+     "imaging_lymphoma_involvement_flag": 173,
+     "imaging_myeloma_findings_reported": 174,
+     "imaging_progressive_disease_sites_reported": 175,
+     "imaging_reporting_standard_flag": 176,
+     "imaging_suv_max": 177,
+     "imaging_type": 178,
+     "immunophenotyping_marker_reported": 179,
+     "infusion_reaction_flag": 180,
+     "inhaler_technique_assessment": 181,
+     "initial_diagnostic_method": 182,
+     "insurance_type": 183,
+     "ischemic_heart_disease_flag": 184,
+     "iv_diuretic_use_flag": 185,
+     "last_visit_date": 186,
+     "ldh_level_category": 187,
+     "ldl_cholesterol_level": 188,
+     "leeds_enthesitis_index": 189,
+     "length_of_stay_days": 190,
+     "leukemia_risk_stratification": 191,
+     "long_covid_symptoms_reported": 192,
+     "long_term_oxygen_therapy_flag": 193,
+     "long_term_toxicity_reported": 194,
+     "loss_of_response_date": 195,
+     "lost_followup_flag": 196,
+     "lugano_classification": 197,
+     "lvef_percent": 198,
+     "madrs_score": 199,
+     "malnutrition_flag": 200,
+     "manufacturing_issue_note": 201,
+     "measurement_date": 202,
+     "measurement_lower_limit": 203,
+     "measurement_unit": 204,
+     "measurement_upper_limit": 205,
+     "medication_adherence_assessment": 206,
+     "mild_exacerbation_count": 207,
+     "minicog_score": 208,
+     "mmrc_dyspnea_score": 209,
+     "mmse_score": 210,
+     "moca_score": 211,
+     "moderate_exacerbation_count": 212,
+     "molecular_genetic_marker_reported": 213,
+     "mortality_30day_flag": 214,
+     "niv_bipap_flag": 215,
+     "nodal_mass_size": 216,
+     "note_date": 217,
+     "note_text": 218,
+     "note_type": 219,
+     "nutritional_status": 220,
+     "nyha_class": 221,
+     "obesity_flag": 222,
+     "opioid_use_flag": 223,
+     "osteoporosis_flag": 224,
+     "other_inhaled_exposure_status": 225,
+     "oxygen_requirement_status": 226,
+     "pacu_discharge_intermediate_care_flag": 227,
+     "pain_vas": 228,
+     "pasdas_score": 229,
+     "pasqal_tool_used_flag": 230,
+     "past_surgical_history_reported": 231,
+     "peripheral_arterial_disease_flag": 232,
+     "person_id": 233,
+     "phq9_score": 234,
+     "physical_activity_flag": 235,
+     "platelet_count": 236,
+     "polypharmacy_status": 237,
+     "postoperative_complication_flag": 238,
+     "postoperative_delirium_flag": 239,
+     "postoperative_delirium_history_flag": 240,
+     "pr_status": 241,
+     "prescription_days_supply": 242,
+     "prescription_dispense_date": 243,
+     "pretreatment_screening_reported": 244,
+     "prior_corticosteroids_count": 245,
+     "prior_dmards_count": 246,
+     "prior_nsaids_count": 247,
+     "procedure_code": 248,
+     "procedure_date": 249,
+     "procedure_notes": 250,
+     "provider_specialty": 251,
+     "psa_subtype": 252,
+     "pulmonary_hypertension_flag": 253,
+     "ra_functional_class": 254,
+     "race": 255,
+     "radiation_fraction_count": 256,
+     "radiation_fraction_dose": 257,
+     "radiation_total_dose": 258,
+     "radiation_treated_sites_reported": 259,
+     "radiographic_stage": 260,
+     "random_glucose_level": 261,
+     "readmission_date": 262,
+     "readmission_flag": 263,
+     "readmission_prevention_measures_reported": 264,
+     "readmission_reason": 265,
+     "readmission_risk_score": 266,
+     "relapse_confirmation_method": 267,
+     "relapse_date": 268,
+     "remission_flag": 269,
+     "reported_allergy_drug": 270,
+     "respiratory_insufficiency_flag": 271,
+     "respiratory_insufficiency_type": 272,
+     "respiratory_rate": 273,
+     "rheumatoid_factor_level": 274,
+     "rheumatoid_factor_status": 275,
+     "rso2_baseline_percent": 276,
+     "rv_percent_predicted": 277,
+     "rv_tlc_ratio": 278,
+     "serum_albumin_level": 279,
+     "serum_calcium_level": 280,
+     "serum_free_light_chain_level": 281,
+     "severe_exacerbation_count": 282,
+     "sex": 283,
+     "sf36_score": 284,
+     "sleep_apnea_flag": 285,
+     "smoking_status": 286,
+     "social_determinants_reported": 287,
+     "social_support_assessment": 288,
+     "sparcc_score": 289,
+     "spirometry_reversibility_flag": 290,
+     "steroid_equivalent_dose": 291,
+     "steroid_taper_duration": 292,
+     "substance_use_history_reported": 293,
+     "support_system_assessment": 294,
+     "supportive_medication_reported": 295,
+     "surgical_risk_classification": 296,
+     "surgical_specialty": 297,
+     "symptoms_at_presentation_reported": 298,
+     "synovial_fluid_analysis_reported": 299,
+     "systolic_bp": 300,
+     "temperature": 301,
+     "therapeutic_adherence_assessment": 302,
+     "therapeutic_class": 303,
+     "time_from_diagnosis_to_biologic": 304,
+     "tlc_percent_predicted": 305,
+     "total_cholesterol_level": 306,
+     "toxicity_grade": 307,
+     "treatment_cycle_count": 308,
+     "treatment_end_date": 309,
+     "treatment_inefficacy_flag": 310,
+     "treatment_intent": 311,
+     "treatment_line": 312,
+     "treatment_modification_date": 313,
+     "treatment_modification_reason": 314,
+     "treatment_modification_type": 315,
+     "treatment_start_date": 316,
+     "treatment_switch_count": 317,
+     "treatment_type": 318,
+     "triglycerides_level": 319,
+     "troponin_level": 320,
+     "tumor_grade": 321,
+     "tumor_stage_m": 322,
+     "tumor_stage_n": 323,
+     "tumor_stage_t": 324,
+     "ultrasound_doppler_grade": 325,
+     "unknown_field": 326,
+     "urine_albumin_creatinine_ratio": 327,
+     "visit_end_date": 328,
+     "visit_id": 329,
+     "visit_start_date": 330,
+     "visit_type": 331,
+     "wbc_count": 332,
+     "weight": 333,
+     "worsening_heart_failure_episode_order": 334,
+     "worsening_heart_failure_event_type": 335,
+     "worsening_heart_failure_flag": 336,
+     "worsening_heart_failure_start_date": 337
+   },
+   "layer_norm_epsilon": 1e-05,
+   "model_type": "gpt2",
+   "n_ctx": 1024,
+   "n_embd": 768,
+   "n_head": 12,
+   "n_inner": null,
+   "n_layer": 6,
+   "n_positions": 1024,
+   "pad_token_id": 50257,
+   "problem_type": "single_label_classification",
+   "reorder_and_upcast_attn": false,
+   "resid_pdrop": 0.1,
+   "scale_attn_by_inverse_layer_idx": false,
+   "scale_attn_weights": true,
+   "summary_activation": null,
+   "summary_first_dropout": 0.1,
+   "summary_proj_to_labels": true,
+   "summary_type": "cls_index",
+   "summary_use_proj": true,
+   "task_specific_params": {
+     "text-generation": {
+       "do_sample": true,
+       "max_length": 50
+     }
+   },
+   "torch_dtype": "float16",
+   "transformers_version": "4.51.3",
+   "use_cache": true,
+   "vocab_size": 50258
+ }
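The `id2label` and `label2id` maps in config.json let a caller turn the classifier's 338 output logits into a field name, and `problem_type` marks the task as single-label. As a minimal sketch of that decoding step, assuming the logits arrive as a plain list of floats, the snippet below applies a softmax and looks up the top class. Only a three-entry excerpt of the map is used, and the `decode` helper and the example logits are made up for illustration.

```python
import math

# Excerpt of id2label from config.json (the full map has 338 entries).
ID2LABEL = {"0": "a1at_status", "1": "acq_score", "2": "act_score"}
# label2id is the inverse mapping, with integer values.
LABEL2ID = {label: int(idx) for idx, label in ID2LABEL.items()}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    # JSON object keys are strings, so the index is converted before lookup.
    return ID2LABEL[str(best)], probs[best]


if __name__ == "__main__":
    label, prob = decode([0.1, 2.5, -1.0])  # made-up logits
    print(label, round(prob, 3))
```

In the raw JSON the `id2label` keys are strings; code that loads the config through a library may see them converted to integers, so the `str(best)` lookup here applies to the file as shown.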
merges.txt ADDED
The diff for this file is too large to render. See raw diff
 
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f4946b1321e11372c1c8714b6ac4a251744bb552ecf521617fb828a6291c3411
+ size 164353504
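The three added lines are a Git LFS pointer rather than the weights themselves: the spec version, the SHA-256 of the real file, and its size in bytes. A rough sketch, assuming the pointer text is available as a string, parses it and estimates the parameter count from the `float16` dtype recorded in config.json (two bytes per parameter). The estimate ignores the safetensors header, so it slightly overstates the count, and `parse_lfs_pointer` is a hypothetical helper.

```python
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:f4946b1321e11372c1c8714b6ac4a251744bb552ecf521617fb828a6291c3411
size 164353504
"""


def parse_lfs_pointer(text):
    """Split each 'key value' line of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    fields["size"] = int(fields["size"])
    return fields


pointer = parse_lfs_pointer(POINTER)
BYTES_PER_PARAM = 2  # float16, per "torch_dtype" in config.json
approx_params = pointer["size"] // BYTES_PER_PARAM

if __name__ == "__main__":
    print(pointer["oid"][:14], pointer["size"], approx_params)
```

Roughly 82 million parameters is in line with what one would expect from a six-layer, 768-wide GPT-2 variant with a small classification head on top.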
special_tokens_map.json ADDED
@@ -0,0 +1,12 @@
+ {
+   "bos_token": "<|endoftext|>",
+   "eos_token": "<|endoftext|>",
+   "pad_token": {
+     "content": "<|pad|>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": "<|endoftext|>"
+ }
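A dedicated pad token matters for this architecture because a decoder-only classifier such as `GPT2ForSequenceClassification` reads its prediction from the last real token of each sequence, so it has to know which positions are padding. Below is a simplified sketch of that lookup in plain Python, assuming right-padded ID lists. The helper name and the sample token IDs are illustrative, and the library's actual tensor code may differ in detail.

```python
PAD_ID = 50257  # "<|pad|>" from added_tokens.json
EOS_ID = 50256  # "<|endoftext|>"


def last_non_pad_index(input_ids, pad_id=PAD_ID):
    """Index of the rightmost token that is not padding (-1 if all padding)."""
    for pos in range(len(input_ids) - 1, -1, -1):
        if input_ids[pos] != pad_id:
            return pos
    return -1


if __name__ == "__main__":
    batch = [
        [3919, 1438, PAD_ID, PAD_ID],  # two real tokens, two pads
        [3919, 1438, 4522, EOS_ID],    # no padding; EOS counts as a real token
    ]
    print([last_non_pad_index(row) for row in batch])
```

If the pad token reused the EOS ID instead of getting its own, a trailing `<|endoftext|>` would be indistinguishable from padding in a lookup like this one.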
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,29 @@
+ {
+   "add_prefix_space": false,
+   "added_tokens_decoder": {
+     "50256": {
+       "content": "<|endoftext|>",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "50257": {
+       "content": "<|pad|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "bos_token": "<|endoftext|>",
+   "clean_up_tokenization_spaces": false,
+   "eos_token": "<|endoftext|>",
+   "extra_special_tokens": {},
+   "model_max_length": 1024,
+   "pad_token": "<|pad|>",
+   "tokenizer_class": "GPT2Tokenizer",
+   "unk_token": "<|endoftext|>"
+ }
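Together, `pad_token` and `model_max_length` determine how a batch of variable-length inputs is made rectangular. The sketch below works in plain Python rather than through the tokenizer API: it truncates each sequence to 1024 tokens, right-pads with ID 50257, and builds the matching attention mask. Right-side padding is an assumption (the file does not set a padding side), and `pad_batch` and the sample IDs are hypothetical.

```python
PAD_ID = 50257           # ID of "<|pad|>" in added_tokens_decoder
MODEL_MAX_LENGTH = 1024  # "model_max_length" in tokenizer_config.json


def pad_batch(sequences, pad_id=PAD_ID, max_length=MODEL_MAX_LENGTH):
    """Truncate, right-pad to the longest sequence, and build attention masks."""
    truncated = [list(seq[:max_length]) for seq in sequences]
    width = max((len(seq) for seq in truncated), default=0)
    input_ids, attention_mask = [], []
    for seq in truncated:
        padding = width - len(seq)
        input_ids.append(seq + [pad_id] * padding)
        # 1 marks a real token, 0 marks padding.
        attention_mask.append([1] * len(seq) + [0] * padding)
    return input_ids, attention_mask


if __name__ == "__main__":
    ids, mask = pad_batch([[11, 22, 33], [44]])  # made-up token IDs
    print(ids)
    print(mask)
```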
training_args.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7697c4e9b1007db58fda5e695eceb61866ceb9d5c73b48287d7bd931be45a4d9
+ size 7352
vocab.json ADDED
The diff for this file is too large to render. See raw diff