Theory of Mind May Have Spontaneously Emerged in Large Language Models
Theory of mind (ToM), or the ability to impute unobservable mental states to others, is central to human social interactions, communication, empathy, self-consciousness, and morality. We tested several language models using 40 classic false-belief tasks widely used to test ToM in humans. The models published before 2020 showed virtually no ability to solve ToM tasks. Yet, the first version of GPT-3 ("davinci-001"), published in May 2020, solved about 40% of false-belief tasks, a performance comparable with 3.5-year-old children. Its second version ("davinci-002"; January 2022) solved 70% of false-belief tasks, a performance comparable with six-year-olds. Its most recent version, GPT-3.5 ("davinci-003"; November 2022), solved 90% of false-belief tasks, at the level of seven-year-olds. GPT-4, published in March 2023, solved nearly all the tasks (95%). These findings suggest that ToM-like ability (thus far considered to be uniquely human) may have spontaneously emerged as a byproduct of language models' improving language skills.