arXiv submission date: 2026-04-14
📄 Abstract - AffectAgent: Collaborative Multi-Agent Reasoning for Retrieval-Augmented Multimodal Emotion Recognition

LLM-based multimodal emotion recognition relies on static parametric memory and often hallucinates when interpreting nuanced affective states. Because single-round retrieval-augmented generation is highly susceptible to modal ambiguity and therefore struggles to capture complex affective dependencies across modalities, we introduce AffectAgent, an affect-oriented multi-agent retrieval-augmented generation framework that leverages collaborative decision-making among agents for fine-grained affective understanding. Specifically, AffectAgent comprises three jointly optimized specialized agents, namely a query planner, an evidence filter, and an emotion generator, which collaboratively perform analytical reasoning to retrieve cross-modal samples, assess evidence, and generate predictions. These agents are optimized end-to-end using Multi-Agent Proximal Policy Optimization (MAPPO) with a shared affective reward to ensure consistent emotion understanding. Furthermore, we introduce Modality-Balancing Mixture of Experts (MB-MoE) and Retrieval-Augmented Adaptive Fusion (RAAF): MB-MoE dynamically regulates the contribution of each modality to mitigate representation mismatch caused by cross-modal heterogeneity, while RAAF improves semantic completion under missing-modality conditions by incorporating retrieved audiovisual embeddings. Extensive experiments on MER-UniBench demonstrate that AffectAgent achieves superior performance across complex scenarios. Our code will be released at: this https URL.
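The abstract does not give implementation details for MB-MoE; as a rough illustration of its core idea (a learned gate that balances per-modality contributions so no single modality dominates the fused representation), here is a minimal sketch. The function name `modality_gate`, the feature dimensions, and the use of a plain softmax over scalar gating logits are all assumptions for illustration, not the paper's actual design.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()

def modality_gate(embeddings, gate_logits):
    """Hypothetical modality-balancing gate (illustrative only).

    embeddings:  dict of modality name -> 1-D feature vector (same dim).
    gate_logits: dict of modality name -> scalar logit, assumed to come
                 from a learned gating network.
    Returns the weighted fusion of the modality embeddings plus the
    per-modality weights, which sum to 1 by construction.
    """
    names = sorted(embeddings)
    logits = np.array([gate_logits[n] for n in names])
    weights = softmax(logits)
    fused = sum(w * embeddings[n] for w, n in zip(weights, names))
    return fused, dict(zip(names, weights))

# Toy usage: three modalities with 4-dim features and equal gate logits,
# so each modality receives weight 1/3.
emb = {"text": np.ones(4), "audio": np.zeros(4), "vision": 2 * np.ones(4)}
fused, w = modality_gate(emb, {"text": 0.0, "audio": 0.0, "vision": 0.0})
```

In an actual MoE layer the gate would route to expert sub-networks and the logits would be input-dependent; the sketch only shows the balancing mechanism the abstract alludes to.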

Top-level tags: multi-modal agents, natural language processing
Detailed tags: multimodal emotion recognition, retrieval-augmented generation, multi-agent systems, affective computing, modality fusion

AffectAgent: Collaborative Multi-Agent Reasoning for Retrieval-Augmented Multimodal Emotion Recognition


1️⃣ One-sentence summary

This paper proposes AffectAgent, a collaborative multi-agent framework in which three specialized agents work together to understand and recognize complex, nuanced multimodal emotions more accurately, mitigating the misrecognition and "hallucination" that conventional methods suffer from modality discrepancies or missing information.

From arXiv: 2604.12735