Minibase commited on
Commit
396411e
·
verified ·
1 Parent(s): 7f35bef

Upload benchmarks.txt with huggingface_hub

Browse files
Files changed (1) hide show
  1. benchmarks.txt +14 -40
benchmarks.txt CHANGED
@@ -2,66 +2,40 @@
2
  **Model:** Minibase-DeId-Small
3
  **Dataset:** Personal_De-identifier_Benchmark_SFT.jsonl
4
  **Sample Size:** 100
5
- **Date:** 2025-09-25T12:35:05.897062
6
 
7
  ## Overall Performance
8
 
9
  | Metric | Score | Description |
10
  |--------|-------|-------------|
11
- | PII Detection Rate | 0.203 | How well personal identifiers are detected |
12
- | Completeness Score | 0.640 | Percentage of texts fully de-identified |
13
  | Semantic Preservation | 0.109 | How well meaning is preserved |
14
- | Average Latency | 492.4ms | Response time performance |
15
 
16
- ## Domain Performance
17
 
18
- ### Medical Domain (33 samples)
19
- - PII Detection: 0.214
20
- - Completeness: 0.606
21
- - Semantic Preservation: 0.110
22
-
23
- ### Legal Domain (6 samples)
24
- - PII Detection: 0.113
25
- - Completeness: 0.500
26
- - Semantic Preservation: 0.056
27
-
28
- ### Hr Domain (11 samples)
29
- - PII Detection: 0.202
30
- - Completeness: 0.273
31
- - Semantic Preservation: 0.108
32
-
33
- ### General Domain (40 samples)
34
- - PII Detection: 0.218
35
- - Completeness: 0.750
36
- - Semantic Preservation: 0.120
37
-
38
- ### Research Domain (4 samples)
39
- - PII Detection: 0.192
40
- - Completeness: 0.500
41
- - Semantic Preservation: 0.108
42
-
43
- ### Customer_Service Domain (6 samples)
44
- - PII Detection: 0.140
45
- - Completeness: 1.000
46
- - Semantic Preservation: 0.083
47
 
48
  ## Example Results
49
 
50
- ### Example 1 (medical domain)
51
  **Input:** Patient Sarah Johnson, DOB 05/12/1980, visited Dr. Lee at St. Jude Hospital on 2023-10-26. Her conta...
52
  **Expected:** Patient [NAME_1], DOB [DOB_1], visited [NAME_2] at [HOSPITAL_1] on [DATE_1]. Her contact is [PHONE_1...
53
  **Predicted:** Patient [FIRSTNAME_1] [MIDDLENAME_1], DOB [DOB_1], visited Dr. [LASTNAME_1] at [CITY_1] Hospital on ...
54
- **PII Detection:** 0.286
55
 
56
- ### Example 2 (legal domain)
57
  **Input:** Deponent Mr. Robert Davis, CEO of GlobalCorp Inc., stated under oath on December 1, 2022, that his a...
58
  **Expected:** Deponent [NAME_1], CEO of [ORGANIZATION_1], stated under oath on [DATE_1], that his attorney, [NAME_...
59
  **Predicted:** Deponent [PREFIX_1] [FIRSTNAME_1] [LASTNAME_1], CEO of [COMPANYNAME_1], stated under oath on [DATE_1...
60
- **PII Detection:** 0.167
61
 
62
- ### Example 3 (hr domain)
63
  **Input:** Employee ID: EMP-001-XYZ. Name: John Doe. Salary: $85,000. Email: [email protected]. Marital Stat...
64
  **Expected:** Employee ID: [EMPLOYEE_ID_1]. Name: [NAME_1]. Salary: [SALARY_1]. Email: [EMAIL_1]. Marital Status: ...
65
  **Predicted:** Employee ID: EMP-[BUILDINGNUMBER_1]. Name: [FIRSTNAME_1] Doe. Salary: [CURRENCYSYMBOL_1][AMOUNT_1]. ...
66
- **PII Detection:** 0.167
67
 
 
2
  **Model:** Minibase-DeId-Small
3
  **Dataset:** Personal_De-identifier_Benchmark_SFT.jsonl
4
  **Sample Size:** 100
5
+ **Date:** 2025-09-25T12:38:54.363196
6
 
7
  ## Overall Performance
8
 
9
  | Metric | Score | Description |
10
  |--------|-------|-------------|
11
+ | PII Detection Rate | 1.000 | How well personal identifiers are detected |
12
+ | Completeness Score | 0.670 | Percentage of texts fully de-identified |
13
  | Semantic Preservation | 0.109 | How well meaning is preserved |
14
+ | Average Latency | 483.7ms | Response time performance |
15
 
16
+ ## Key Improvements
17
 
18
+ - **PII Detection**: Now measures if model generates ANY placeholders when PII is present in input
19
+ - **Unified Evaluation**: All examples evaluated together (no domain separation)
20
+ - **Lenient Scoring**: Focuses on detection capability rather than exact placeholder matching
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
21
 
22
  ## Example Results
23
 
24
+ ### Example 1
25
  **Input:** Patient Sarah Johnson, DOB 05/12/1980, visited Dr. Lee at St. Jude Hospital on 2023-10-26. Her conta...
26
  **Expected:** Patient [NAME_1], DOB [DOB_1], visited [NAME_2] at [HOSPITAL_1] on [DATE_1]. Her contact is [PHONE_1...
27
  **Predicted:** Patient [FIRSTNAME_1] [MIDDLENAME_1], DOB [DOB_1], visited Dr. [LASTNAME_1] at [CITY_1] Hospital on ...
28
+ **PII Detection:** 1.000
29
 
30
+ ### Example 2
31
  **Input:** Deponent Mr. Robert Davis, CEO of GlobalCorp Inc., stated under oath on December 1, 2022, that his a...
32
  **Expected:** Deponent [NAME_1], CEO of [ORGANIZATION_1], stated under oath on [DATE_1], that his attorney, [NAME_...
33
  **Predicted:** Deponent [PREFIX_1] [FIRSTNAME_1] [LASTNAME_1], CEO of [COMPANYNAME_1], stated under oath on [DATE_1...
34
+ **PII Detection:** 1.000
35
 
36
+ ### Example 3
37
  **Input:** Employee ID: EMP-001-XYZ. Name: John Doe. Salary: $85,000. Email: [email protected]. Marital Stat...
38
  **Expected:** Employee ID: [EMPLOYEE_ID_1]. Name: [NAME_1]. Salary: [SALARY_1]. Email: [EMAIL_1]. Marital Status: ...
39
  **Predicted:** Employee ID: EMP-[BUILDINGNUMBER_1]. Name: [FIRSTNAME_1] Doe. Salary: [CURRENCYSYMBOL_1][AMOUNT_1]. ...
40
+ **PII Detection:** 1.000
41