autoeval

#1
by fieryTransition - opened

It would be nice to add autoeval or lm-evaluation-harness to automate running some common benchmarks on the models as well :-)
