UMCU committed on
Commit 1ded9e1 · verified · 1 Parent(s): f664fcd

Create README.md

Files changed (1)
  1. README.md +16 -0
README.md ADDED
@@ -0,0 +1,16 @@
+ ---
+ license: gpl-3.0
+ datasets:
+ - oscar-corpus/OSCAR-2301
+ language:
+ - nl
+ base_model:
+ - DTAI-KULeuven/robbert-2023-dutch-base
+ pipeline_tag: text-classification
+ tags:
+ - medical
+ ---
+
+
+ We used GPT-4.1 nano to classify generic texts from OSCAR as medical or non-medical. We labeled 400,000 texts, of which about 40,000 were labeled as positive (medical).
+ We then trained a sequence classifier on 80,000 samples with a 50/50 class ratio, i.e. 40,000 texts per class.
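
The balanced sampling step described in the README could look roughly like the sketch below. It assumes the LLM labels are available as `(text, label)` pairs with `1` for medical and `0` for non-medical. The function name `balanced_sample`, the seed, and the toy data are hypothetical and are not taken from the original pipeline.

```python
import random
from collections import Counter


def balanced_sample(labeled, n_total, seed=0):
    """Draw n_total items with an equal number of positives and negatives.

    `labeled` is a list of (text, label) pairs; label 1 = medical, 0 = non-medical.
    (Hypothetical helper: the original sampling code is not shown in the README.)
    """
    if n_total % 2:
        raise ValueError("n_total must be even for a 50/50 split")
    per_class = n_total // 2

    positives = [item for item in labeled if item[1] == 1]
    negatives = [item for item in labeled if item[1] == 0]
    if len(positives) < per_class or len(negatives) < per_class:
        raise ValueError("not enough examples in one of the classes")

    # Sample without replacement from each class, then mix the two halves.
    rng = random.Random(seed)
    sample = rng.sample(positives, per_class) + rng.sample(negatives, per_class)
    rng.shuffle(sample)
    return sample


# Toy stand-in for the labeled corpus: about 10% positives, as in the description.
toy = [(f"text {i}", 1 if i % 10 == 0 else 0) for i in range(1000)]
train = balanced_sample(toy, n_total=200)
print(Counter(label for _, label in train))
```

With the figures from the description, `n_total=80_000` would give 40,000 texts per class, which would use roughly all of the approximately 40,000 positively labeled texts.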