Multimodal Emotion Recognition from Advertisements with Application to Computational Advertising
Abhinav Shukla (Home Page)
Advertisements (ads) are often filled with strong affective content covering a gamut of emotions, intended to capture viewer attention and convey an effective message. However, most approaches to computationally analyzing the emotion present in ads rely on the text modality, and only a limited amount of work has been done on affective understanding of advertisement videos from the content-centric and user-centric perspectives. This work brings together recent advances in deep learning (especially in visual recognition) and affective computing, and applies them to affect estimation for advertisements. We first create a dataset of 100 ads, annotated by 5 experts and evenly distributed over the valence-arousal plane. We then perform content-based affect recognition via a transfer learning approach, estimating the affective content of this dataset using prior affective knowledge gained from a large annotated movie dataset. We employ both visual features from video frames and audio features from spectrograms to train our deep neural networks; this approach vastly outperforms the existing benchmark. Human physiological signals, such as those captured by Electroencephalography (EEG), also provide useful affective insights into the content from a user-centric perspective. Using this time series of the brain's electrical activity, we train models that classify the emotional dimensions of the viewed content. This also enables us to benchmark user-centric performance against the content-centric deep learning models; we find that the user-centric models outperform the content-centric ones and set the state of the art in ad affect recognition.
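The audio features mentioned above are derived from spectrograms of the ad soundtrack. As a minimal illustration of what such a representation looks like (not the thesis's actual feature pipeline, whose window sizes and preprocessing are assumptions here), a magnitude spectrogram can be computed with a short-time Fourier transform:

```python
import numpy as np

def spectrogram(signal, win_len=256, hop=128):
    """Magnitude spectrogram via a Hann-windowed STFT (illustrative sketch).

    win_len and hop are hypothetical parameters, not values from the thesis.
    """
    window = np.hanning(win_len)
    frames = [signal[i:i + win_len] * window
              for i in range(0, len(signal) - win_len + 1, hop)]
    # One-sided FFT magnitude per frame; rows = time frames, cols = frequency bins
    return np.abs(np.fft.rfft(np.array(frames), axis=1))

# Toy 1-second audio clip at 8 kHz: a pure 440 Hz tone
sr = 8000
t = np.arange(sr) / sr
spec = spectrogram(np.sin(2 * np.pi * 440 * t))
print(spec.shape)  # (num_frames, win_len // 2 + 1)
```

A 2-D array like `spec` can then be treated as an image and fed to a CNN, which is what makes transfer learning from visual-recognition networks applicable to the audio modality.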
We also combine the two kinds of modalities (audiovisual and EEG) using decision fusion and find that the fused performance exceeds that of either single modality, which shows that human physiological signals and the audiovisual content carry complementary affective information. We further apply multi-task learning (MTL) on top of each feature type to exploit the intrinsic relatedness of the data and boost performance. Lastly, we validate the hypothesis that better affect estimation can enhance a real-world application by supplying the affective values computed by our methods to a computational advertising framework, which produces a video program sequence with ads inserted at emotionally relevant points, chosen according to the affective relevance between the program content and the ads. Multiple user studies find that our methods significantly outperform existing algorithms and come very close to (and sometimes exceed) human-level performance, achieving much more emotionally relevant and non-disruptive advertisement insertion into a program video stream. In summary, this work (1) compiles an affective ad dataset capable of evoking coherent emotions across users; (2) explores the efficacy of content-centric convolutional neural network (CNN) features for affect recognition (AR), finding that CNN features outperform low-level audio-visual descriptors; (3) studies user-centric ad affect recognition from Electroencephalogram (EEG) responses acquired while viewing the content (with conventional classifiers as well as a novel CNN architecture for EEG), which outperform content descriptors; (4) examines a multi-task learning framework based on CNN and EEG features which provides state-of-the-art AR from ads; and (5) demonstrates how better affect predictions facilitate more effective computational advertising in a real-world application.
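The decision fusion described above operates at the level of classifier outputs rather than raw features. A minimal sketch of this idea, assuming each modality's classifier emits per-class posteriors and using a hypothetical mixing weight `w` (the thesis's actual fusion weights are not reproduced here):

```python
import numpy as np

def fuse(p_av, p_eeg, w=0.5):
    """Late (decision-level) fusion: weighted average of per-class posteriors
    from the audiovisual and EEG classifiers. w is a hypothetical tuning
    parameter, e.g. chosen on a validation set."""
    p = w * np.asarray(p_av, dtype=float) + (1 - w) * np.asarray(p_eeg, dtype=float)
    return p / p.sum()  # renormalize to a valid distribution

# Toy posteriors over {low, high} valence from the two modalities:
# the audiovisual model leans "low", the EEG model leans "high".
p_av, p_eeg = [0.6, 0.4], [0.3, 0.7]
print(fuse(p_av, p_eeg))
```

Because the fused prediction can only improve on a single modality when the two classifiers err in different places, a fusion gain is itself evidence that the content-centric and user-centric signals carry complementary affective information.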
|Year of completion:||June 2018|
|Advisor:||Ramanathan Subramanian|
Abhinav Shukla, Shruti Shriya Gullapuram, Harish Katti, Karthik Yadati, Mohan Kankanhalli and Ramanathan Subramanian - Evaluating Content-centric vs User-centric Ad Affect Recognition. In Proceedings of the 19th ACM International Conference on Multimodal Interaction (ICMI '17). [PDF]
Abhinav Shukla, Shruti Shriya Gullapuram, Harish Katti, Karthik Yadati, Mohan Kankanhalli and Ramanathan Subramanian - Affect Recognition in Ads with Application to Computational Advertising. In Proceedings of the 25th ACM International Conference on Multimedia (ACM MM '17). [PDF]