This study investigates how contextual information influences emotion recognition in both humans and deep neural networks (DNNs). We examined the extent to which external context affects the perception of emotions and compared human assessments with those made by DNN models.
Participants evaluated images depicting individuals expressing emotions within various contexts. The same images were processed by DNN models trained for emotion recognition. Our findings indicate that while humans naturally incorporate contextual cues into their interpretations, DNN models tend to rely mainly on facial expressions and overlook the surrounding context.
These results highlight the need to integrate contextual understanding into AI systems so that their emotion recognition aligns more closely with human perception.