Question About Accuracy Calculation in HaluEval

#28
by SungJoo - opened

Hi,
I am using HaluEval to evaluate models, but I am unsure whether the accuracy metric is being calculated correctly.

In my results, the exact match (em) score is higher than accuracy (acc), which seems counterintuitive. After reviewing the code, I noticed that res["acc"] = 1.0 if (is_correct and is_exact) else 0.0 may be the cause. Because acc is set to 1.0 only when both flags are True, it can never exceed the rate at which is_exact alone is True. Since accuracy should typically be greater than or equal to exact match, I believe using is_correct alone would be a more appropriate condition than requiring both is_correct and is_exact to be True.
Could you confirm whether this is the intended behavior? If not, would changing the calculation to res["acc"] = 1.0 if is_correct else 0.0 be an appropriate fix?
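To illustrate the difference, here is a minimal sketch using made-up per-sample flags (not actual HaluEval outputs). It assumes em is the mean of is_exact, which may not match the actual implementation; the names samples, mean, acc_current and acc_proposed are only for illustration.

```python
# Hypothetical (is_correct, is_exact) flags for five samples.
# These values are invented for illustration, not real HaluEval results.
samples = [
    (True, True),
    (True, False),
    (True, False),
    (False, True),
    (False, False),
]


def mean(values):
    """Average of an iterable of floats."""
    values = list(values)
    return sum(values) / len(values)


# Assumption: em is the fraction of samples where is_exact is True.
em = mean(1.0 if is_exact else 0.0 for _, is_exact in samples)

# Current condition: both flags must be True.
acc_current = mean(
    1.0 if (is_correct and is_exact) else 0.0
    for is_correct, is_exact in samples
)

# Proposed condition: is_correct alone.
acc_proposed = mean(1.0 if is_correct else 0.0 for is_correct, _ in samples)

print(f"em={em:.2f} acc_current={acc_current:.2f} acc_proposed={acc_proposed:.2f}")
```

With these flags the current condition gives an acc below em, while the proposed condition gives an acc above it. Note that acc computed from is_correct alone is only guaranteed to be at least em if every exact match is also counted as correct.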
Thank you!
