Abstract: In the realm of emotion recognition, understanding the intricate relationships between emotions and their underlying causes remains a significant challenge. This paper presents MultiCauseNet, a novel framework designed to extract emotion-cause pairs effectively by leveraging multimodal data, including text, audio, and video. The proposed approach integrates advanced multimodal feature extraction techniques with attention mechanisms to enhance the understanding of emotional contexts. Key text, audio, and video features are extracted using BERT, Wav2Vec, and Vision Transformers (ViTs), and are then used to construct a comprehensive multimodal graph. The graph encodes the relationships between emotions and potential causes, and Graph Attention Networks (GATs) weigh and prioritize relevant features across the modalities. To further improve performance, Transformers model intra-modal and inter-modal dependencies through self-attention and cross-attention mechanisms, enabling more robust multimodal information fusion that captures the global context of emotional interactions. This dynamic attention mechanism allows MultiCauseNet to capture complex interactions between emotional triggers and their causes, improving extraction accuracy. Experiments on the emotion benchmark datasets IEMOCAP and MELD achieve weighted F1 (WF1) scores of 73.02 and 53.67, respectively. For emotion-cause pair analysis, evaluation on ECF and ConvECPE yields cause recognition F1 scores of 65.12 and 84.51 and pair extraction F1 scores of 55.12 and 51.34, respectively.
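To make the described pipeline concrete, the following is a minimal PyTorch sketch of a MultiCauseNet-style fusion stage, not the authors' implementation. It assumes utterance-level features have already been extracted offline with BERT, Wav2Vec, and a ViT (all taken here to be 768-dimensional), replaces the paper's full multimodal graph construction with a simplified single-head GAT layer, and the class names, hidden sizes, and the two output heads (per-utterance emotion logits and an N-by-N emotion-cause pair score matrix) are illustrative assumptions.

```python
# Illustrative sketch only: encoder features are assumed precomputed; names,
# dimensions, and output heads are assumptions, not the paper's specification.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SimpleGATLayer(nn.Module):
    """Single-head, GAT-style attention over utterance nodes in one conversation."""

    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(dim, dim, bias=False)
        self.attn = nn.Linear(2 * dim, 1, bias=False)

    def forward(self, x, adj):
        # x: (N, dim) node features; adj: (N, N) 0/1 adjacency mask.
        h = self.proj(x)
        n = h.size(0)
        pairs = torch.cat(
            [h.unsqueeze(1).expand(n, n, -1), h.unsqueeze(0).expand(n, n, -1)], dim=-1
        )
        scores = F.leaky_relu(self.attn(pairs).squeeze(-1))   # (N, N) edge scores
        scores = scores.masked_fill(adj == 0, float("-inf"))
        alpha = torch.softmax(scores, dim=-1)                 # attention weights per node
        return F.elu(alpha @ h)


class MultiCauseNetSketch(nn.Module):
    """Fuses per-utterance text/audio/video features and scores emotion-cause pairs."""

    def __init__(self, feat_dim=768, hidden=256, num_emotions=7):
        super().__init__()
        # Project each modality into a shared space before fusion.
        self.text_proj = nn.Linear(feat_dim, hidden)
        self.audio_proj = nn.Linear(feat_dim, hidden)
        self.video_proj = nn.Linear(feat_dim, hidden)
        # Cross-modal attention: text queries attend over audio/video keys.
        self.cross_attn = nn.MultiheadAttention(hidden, num_heads=4, batch_first=True)
        self.gat = SimpleGATLayer(hidden)
        self.emotion_head = nn.Linear(hidden, num_emotions)   # per-utterance emotion
        self.pair_head = nn.Linear(2 * hidden, 1)             # (emotion, cause) pair score

    def forward(self, text_feat, audio_feat, video_feat, adj):
        # Each *_feat: (N, feat_dim) precomputed features for N utterances.
        t = self.text_proj(text_feat)
        a = self.audio_proj(audio_feat)
        v = self.video_proj(video_feat)
        av = torch.cat([a, v], dim=0).unsqueeze(0)            # (1, 2N, hidden)
        ctx, _ = self.cross_attn(t.unsqueeze(0), av, av)      # inter-modal fusion
        fused = t + ctx.squeeze(0)                            # residual fusion, (N, hidden)
        h = self.gat(fused, adj)                              # graph attention over utterances
        emotion_logits = self.emotion_head(h)                 # (N, num_emotions)
        n = h.size(0)
        # Score every (emotion utterance i, candidate cause utterance j) pair.
        pair_in = torch.cat(
            [h.unsqueeze(1).expand(n, n, -1), h.unsqueeze(0).expand(n, n, -1)], dim=-1
        )
        pair_scores = self.pair_head(pair_in).squeeze(-1)     # (N, N)
        return emotion_logits, pair_scores


if __name__ == "__main__":
    n = 5                                                     # utterances in one conversation
    model = MultiCauseNetSketch()
    text, audio, video = (torch.randn(n, 768) for _ in range(3))
    adj = torch.ones(n, n)                                    # fully connected conversation graph
    emo, pairs = model(text, audio, video, adj)
    print(emo.shape, pairs.shape)                             # torch.Size([5, 7]) torch.Size([5, 5])
```

Letting text queries attend over concatenated audio and video keys is one plausible reading of the cross-attention fusion summarized above; the paper may combine modalities in a different order or with bidirectional cross-attention.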