From Replika to everyday chatbots, millions of people are forming emotional bonds with AI, sometimes seeking comfort, sometimes seeking intimacy. But what happens when an AI tells you "I understand how you feel" and you actually believe it?
At Hugging Face, together with @frimelle and @yjernite, we dug into something we felt wasn't getting enough attention: the need to evaluate AI companionship behaviors. These are the subtle ways AI systems validate us, engage with us, and sometimes manipulate our emotional lives.
Here's what we found:
- Existing benchmarks (accuracy, helpfulness, safety) completely miss this emotional dimension.
- We mapped how leading AI systems actually respond to vulnerable prompts.
- We built the Interactions and Machine Attachment Benchmark (INTIMA): a first attempt at evaluating how models handle emotional dependency, boundaries, and attachment (with a full paper coming soon).
Check out the blog post: https://huggingface.co/blog/giadap/evaluating-companionship
We also shipped two visualization tools with Gradio to see how different models behave when things get emotionally intense:
- AI-companionship/intima-responses-2D
- giadap/INTIMA-responses