ICED: Concept-level Machine Unlearning via Interpretable Concept Decomposition

📄 Abstract - ICED: Concept-level Machine Unlearning via Interpretable Concept Decomposition

Machine unlearning in Vision-Language Models (VLMs) is typically performed at the image or instance level, making it difficult to precisely remove target knowledge without affecting unrelated semantics. This issue is especially pronounced since a single image often contains multiple entangled concepts, including both target concepts to be forgotten and contextual information that should be preserved. In this paper, we propose an interpretable concept-level unlearning framework for VLMs, which constructs a compact task-specific concept vocabulary from the forgetting set using a multimodal large language model. In addition to modality alignment, visual representations are decomposed into sparse, nonnegative combinations of semantic concepts, providing an explicit interface for fine-grained knowledge manipulation. Based on this decomposition, our method formulates unlearning as concept-level optimization, where target concepts are selectively suppressed while intra-instance non-target semantics and global cross-modal knowledge are preserved. Extensive experiments across both in-domain and out-of-domain forgetting settings demonstrate that our method enables more comprehensive target forgetting, better preserves non-target knowledge within the same image, and maintains competitive model utility compared with existing VLM unlearning methods.
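To make the decomposition step concrete, here is a minimal sketch of nonnegative sparse coding over a small concept vocabulary, followed by zeroing out a target concept's coefficient. It is an illustration under stated assumptions, not the paper's implementation: the function names (`decompose`, `suppress`, `reconstruct`), the toy three-concept vocabulary, the L1 penalty `lam`, and the projected-gradient solver are all hypothetical choices, and the abstract does not specify how the actual optimization is carried out.

```python
def decompose(x, concepts, lam=0.01, iters=500):
    """Approximate x as a sparse, nonnegative combination of concept vectors.

    Solves min_a 0.5 * ||x - sum_k a[k] * concepts[k]||^2 + lam * sum(a), a >= 0,
    with projected gradient descent (an assumed solver, not the paper's).
    """
    k, d = len(concepts), len(x)
    # Step size 1/L: the squared Frobenius norm upper-bounds the Lipschitz constant.
    step = 1.0 / sum(v * v for c in concepts for v in c)
    a = [0.0] * k
    for _ in range(iters):
        recon = [sum(a[j] * concepts[j][i] for j in range(k)) for i in range(d)]
        resid = [recon[i] - x[i] for i in range(d)]
        for j in range(k):
            grad = sum(concepts[j][i] * resid[i] for i in range(d))
            # Gradient step on the smooth part plus L1 penalty, then clip at zero.
            a[j] = max(0.0, a[j] - step * (grad + lam))
    return a


def suppress(coeffs, target_indices):
    """Zero the coefficients of the concepts to be forgotten; keep the rest."""
    targets = set(target_indices)
    return [0.0 if j in targets else c for j, c in enumerate(coeffs)]


def reconstruct(coeffs, concepts):
    """Rebuild a representation from concept coefficients."""
    d = len(concepts[0])
    return [sum(coeffs[j] * concepts[j][i] for j in range(len(concepts))) for i in range(d)]


# Hypothetical toy vocabulary: three orthogonal concept directions.
vocab = ["dog", "grass", "sky"]
concepts = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
image_feature = [0.8, 0.0, 0.5]  # mostly "dog", some "sky", no "grass"

coeffs = decompose(image_feature, concepts)
kept = suppress(coeffs, [vocab.index("dog")])  # forget "dog", preserve the rest
edited_feature = reconstruct(kept, concepts)
```

In this toy setting the coefficient for "sky" is untouched when "dog" is suppressed, which mirrors the stated goal of removing a target concept while preserving non-target semantics within the same image. A real vocabulary would have correlated concept embeddings, so the coefficients would not separate this cleanly.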

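1️⃣ One-sentence summary

The paper proposes a method called ICED that decomposes the visual information in an image into multiple interpretable semantic concepts. This lets an AI model forget a specific concept (such as an object or a scene) without affecting what it retains about other, unrelated content in the same image, addressing the inability of existing methods to remove target knowledge precisely.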
