arXiv submission date: 2026-03-16
📄 Abstract - LaPro-DTA: Latent Dual-View Drug Representations and Salient Protein Feature Extraction for Generalizable Drug-Target Affinity Prediction

Drug-target affinity (DTA) prediction is pivotal for accelerating drug discovery, yet existing methods suffer from significant performance degradation in realistic cold-start scenarios (unseen drugs, targets, or pairs), primarily driven by overfitting to training instances and information loss from irrelevant target sequences. In this paper, we propose LaPro-DTA, a framework designed to achieve robust and generalizable DTA prediction. To tackle overfitting, we devise a latent dual-view drug representation mechanism. It combines an instance-level view, which captures fine-grained substructures under stochastic perturbation, with a distribution-level view, which distills generalized chemical scaffolds via semantic remapping, thereby forcing the model to learn transferable structural rules rather than memorize specific samples. To mitigate information loss, we introduce a salient protein feature extraction strategy using pattern-aware top-$k$ pooling, which filters background noise and isolates high-response bioactive regions. Furthermore, a cross-view multi-head attention mechanism fuses these purified features to model comprehensive interactions. Extensive experiments on benchmark datasets demonstrate that LaPro-DTA significantly outperforms state-of-the-art methods, achieving an 8% MSE reduction on the Davis dataset in the challenging unseen-drug setting, while offering interpretable insights into binding mechanisms.
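
To make the idea of top-$k$ pooling concrete, the following is a minimal sketch and not the authors' implementation: the abstract does not specify how the pooling is made "pattern-aware", so this version simply averages the k strongest responses per feature channel along the sequence axis. The function name `topk_pool`, the array shapes, and the toy values are all assumptions introduced for illustration.

```python
import numpy as np


def topk_pool(features: np.ndarray, k: int) -> np.ndarray:
    """Average the k largest activations per channel along the sequence axis.

    features: array of shape (seq_len, channels), e.g. per-residue responses.
    Returns an array of shape (channels,).
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    seq_len = features.shape[0]
    k = min(k, seq_len)
    # np.partition puts the k largest values of each column in the last k rows
    # (in no particular order), which is enough for taking their mean.
    top = np.partition(features, seq_len - k, axis=0)[seq_len - k:]
    return top.mean(axis=0)


# Toy input (hypothetical values): 6 sequence positions, 2 feature channels.
responses = np.array([
    [0.1, 0.0],
    [0.9, 0.2],
    [0.2, 0.1],
    [0.8, 0.7],
    [0.0, 0.3],
    [0.1, 0.9],
])
pooled = topk_pool(responses, k=2)
```

Compared with mean pooling over the whole sequence, low-response positions do not dilute the result here, which is one plausible reading of "filtering background noise" in the abstract.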

Top-level tags: medical, machine learning, model training
Detailed tags: drug-target affinity, cold-start generalization, representation learning, protein feature extraction, bioinformatics

LaPro-DTA: Latent Dual-View Drug Representations and Salient Protein Feature Extraction for Generalizable Drug-Target Affinity Prediction


1️⃣ One-sentence summary

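This paper proposes a new method called LaPro-DTA. It combines a dual-view drug representation, which keeps the model from memorizing its training data, with extraction of each protein's key active regions, which filters out interfering information, so that the binding strength between a drug and its target is predicted more accurately and generalizes better when the drug or target is entirely new.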

Source: arXiv: 2603.14792