Interpretable Time-Dependent Convolutional Emotion Recognition with Contextual Data Streams

Abstract

Emotion prediction is important when interacting with computers. However, emotions are complex and difficult to assess, understand, and classify. Current emotion classification strategies do not explain why a specific emotion was predicted, which complicates the user’s understanding of affective and empathic interface behaviors. Advances in deep learning have shown that convolutional networks can learn powerful time-series patterns while exposing classification decisions and feature importances. We present a novel convolution-based model that classifies emotions robustly. Our model not only offers high emotion-prediction performance but also makes the model decisions transparent. It provides a time-aware feature interpretation of classification decisions using saliency maps. We evaluate the system on a contextual, real-world driving dataset involving twelve participants. Our model achieves a mean accuracy of 70% in 5-class emotion classification on unknown roads and outperforms in-car facial expression recognition by 14%. We conclude by discussing how emotion prediction can be improved by incorporating emotion sensing into interactive computing systems.

Publication
In Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems