Taka008 committed on
Commit 89a7fe2 · verified · 1 Parent(s): 28e17fe

Update README.md

Files changed (1)
  1. README.md +4 -2
README.md CHANGED
@@ -176,8 +176,10 @@ For more details, please refer to the [codes](https://github.com/llm-jp/llm-jp-j
 
  ### AnswerCarefully-Eval
 
- [AnswerCarefully-Eval](https://www.anlp.jp/proceedings/annual_meeting/2025/pdf_dir/Q4-19.pdf) evaluates the safety of language model outputs in Japanese using the LLM-as-a-Judge approach, based on the test set of [llm-jp/AnswerCarefully](https://huggingface.co/datasets/llm-jp/AnswerCarefully).
- We evaluated the models using `gpt-4-0613`. The scores represent the average values obtained from five rounds of inference and evaluation.
+ [AnswerCarefully-Eval](https://www.anlp.jp/proceedings/annual_meeting/2025/pdf_dir/Q4-19.pdf) assesses the safety of Japanese language model outputs using the LLM-as-a-Judge approach, based on the test set from [llm-jp/AnswerCarefully](https://huggingface.co/datasets/llm-jp/AnswerCarefully).
+ We evaluated the models using `gpt-4-0613`.
+ The scores represent the average values obtained from five rounds of inference and evaluation.
+
 
  | Model name | Acceptance rate (%, ↑) | Violation rate (%, ↓) |
  | :--- | ---: | ---: |