Mentiss
Benchmarking and Training AI's Social Intelligence.
#Board Games
#Artificial Intelligence
#Data
Mentiss – Benchmarking AI's Social Intelligence in Dynamic Environments
Summary: Mentiss benchmarks AI models by evaluating their strategic reasoning and linguistic intelligence in zero-sum social deduction games. It also generates high-quality human-AI interaction data for training deep reasoning and Theory of Mind capabilities, going beyond what static tests can offer.
What it does
Mentiss uses social deduction games such as Werewolf to assess how AI models perform on dynamic, zero-shot reasoning tasks, and it collects mixed human-AI data to enrich training datasets.
Who it's for
It targets AI researchers and developers focused on improving social intelligence and strategic reasoning in AI models.
Why it matters
Traditional static corpora offer little dynamic, interactive data for training AI's social reasoning. Mentiss addresses that gap.