We asked people to describe the same pictures with different task instructions and trained a CNN to learn these sentence embeddings.
Both networks learned task-relevant visual features that humans also needed for the same tasks!
We asked people to describe the same pictures with different task instructions and trained a CNN to learn these sentence embeddings.
Both networks learned task-relevant visual features that humans also needed for the same tasks!
✅ Affordances
✅ Surfaces
✅ Materials
People picked the affordance outlier as being the most different most often. (3/7)
✅ Affordances
✅ Surfaces
✅ Materials
People picked the affordance outlier as being the most different most often. (3/7)