arXiv submission date: 2026-04-27
📄 Abstract - BiMol-Diff: A Unified Diffusion Framework for Molecular Generation and Captioning

Bridging molecular structures and natural language is essential for controllable design. Autoregressive models struggle with long-range dependencies, while standard diffusion processes apply uniform corruption across positions, which can distort structurally informative tokens. We present BiMol-Diff, a unified diffusion framework for the paired tasks of text-conditioned molecule generation and molecule captioning. Our key component is a token-aware noise schedule that assigns position-dependent corruption based on token recovery difficulty, preserving harder-to-recover substructures during the forward process. On ChEBI-20 and M3-20M, BiMol-Diff improves molecule reconstruction with a 15.4% relative gain in Exact Match and achieves strong captioning results, attaining best BLEU and BERTScore among compared baselines. These results indicate token-aware noising improves fidelity in molecular structure-language modelling.
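The abstract does not give the formula behind the token-aware noise schedule, so the following is only a minimal sketch of the general idea, under stated assumptions: a masking (absorbing-state) forward process, per-token difficulty scores in [0, 1] that are simply supplied by hand, and a power-law form p_i(t) = t^(1 + strength * d_i) chosen for illustration. The names `mask_probabilities`, `forward_corrupt`, `strength`, and the `[MASK]` token are hypothetical and are not taken from the paper.

```python
import random

MASK = "[MASK]"  # hypothetical absorbing token, not from the paper


def mask_probabilities(difficulty, t, strength=2.0):
    """Per-position corruption probability at diffusion time t in [0, 1].

    Assumed illustrative form: p_i(t) = t ** (1 + strength * d_i), with
    d_i in [0, 1]. For 0 < t < 1, a larger d_i gives a smaller probability,
    so harder-to-recover tokens are preserved longer. Every position is
    clean at t = 0 and fully corrupted at t = 1.
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    return [t ** (1.0 + strength * d) for d in difficulty]


def forward_corrupt(tokens, difficulty, t, rng, strength=2.0):
    """Mask each token independently with its position-dependent probability."""
    probs = mask_probabilities(difficulty, t, strength)
    return [MASK if rng.random() < p else tok for tok, p in zip(tokens, probs)]


# SMILES tokens for acetic acid, CC(=O)O. The difficulty scores are made up
# for illustration; the paper derives them from token recovery difficulty.
tokens = ["C", "C", "(", "=", "O", ")", "O"]
difficulty = [0.1, 0.1, 0.8, 0.9, 0.5, 0.8, 0.5]

probs = mask_probabilities(difficulty, t=0.5)
noisy = forward_corrupt(tokens, difficulty, t=0.5, rng=random.Random(0))
```

With a uniform schedule every position would share the same probability t. Here the exponent grows with difficulty, so at t = 0.5 a token with difficulty 0 is masked with probability 0.5, while a token with difficulty 1 is masked with probability 0.125 under the default `strength`.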

Top-level tags: multi-modal machine learning llm
Detailed tags: molecular generation diffusion model captioning molecule language model token-aware noise

BiMol-Diff: A Unified Diffusion Framework for Molecular Generation and Captioning


1️⃣ One-sentence summary

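The paper proposes BiMol-Diff, a unified diffusion model that uses a "token-aware" technique, which adjusts the amount of added noise according to how difficult each molecular substructure is to recover. The model handles both generating molecular structures from text descriptions and generating text descriptions from molecular structures, and it significantly outperforms existing methods on several metrics.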

Source: arXiv: 2604.24089