Werewolf Benchmark
The Werewolf Benchmark tests LLMs' social intelligence.
Reshaping businesses for the agentic era.
Foaster.ai is a French start-up focused on reshaping businesses for the agentic era. At Foaster Labs, our Werewolf Benchmark studies how LLMs behave under social pressure: leadership, bluffing, and resistance to manipulation.
ELO-W = wolf (manipulation power) · ELO-V = villager (manipulation resistance)