arXiv submission date: 2026-02-23
📄 Abstract - Exploring Anti-Aging Literature via ConvexTopics and Large Language Models

The rapid expansion of biomedical publications creates challenges for organizing knowledge and detecting emerging trends, underscoring the need for scalable and interpretable methods. Common clustering and topic modeling approaches such as K-means or LDA remain sensitive to initialization and prone to local optima, limiting reproducibility and evaluation. We propose a reformulation of a convex optimization based clustering algorithm that produces stable, fine-grained topics by selecting exemplars from the data and guaranteeing a global optimum. Applied to about 12,000 PubMed articles on aging and longevity, our method uncovers topics validated by medical experts. It yields interpretable topics spanning from molecular mechanisms to dietary supplements, physical activity, and gut microbiota. The method performs favorably, and most importantly, its reproducibility and interpretability distinguish it from common clustering approaches, including K-means, LDA, and BERTopic. This work provides a basis for developing scalable, web-accessible tools for knowledge discovery.
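To make the exemplar-selection idea in the abstract concrete, here is a minimal sketch of one generic convex exemplar-based mixture formulation. It is an illustrative assumption, not necessarily the reformulation proposed in the paper. Every data point is a candidate exemplar with weight `q[j]`, and the weights are fitted by maximizing a concave log-likelihood over the probability simplex, so a deterministic uniform start suffices and no random initialization is involved. The function name, the squared Euclidean distance, and the parameters `beta`, `n_iter`, and `tol` are all hypothetical choices.

```python
import numpy as np

def exemplar_convex_clustering(X, beta=1.0, n_iter=500, tol=1e-4):
    """Illustrative exemplar-based convex clustering (not the paper's exact method).

    Maximizes sum_i log(sum_j q_j * exp(-beta * d(x_i, x_j))) over the simplex,
    which is a concave problem in q, using a multiplicative update.
    """
    n = X.shape[0]
    # Pairwise squared Euclidean distances (assumed dissimilarity measure).
    sq = np.sum(X ** 2, axis=1)
    D = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    S = np.exp(-beta * D)          # S[i, j]: affinity of point i to candidate exemplar j
    q = np.full(n, 1.0 / n)        # deterministic uniform start, no random seed needed
    for _ in range(n_iter):
        Z = S @ q                  # Z[i] = sum_j q_j * S[i, j]
        q = q * (S.T @ (1.0 / Z)) / n   # multiplicative update; q keeps summing to 1
    exemplars = np.flatnonzero(q > tol)  # points that retain non-negligible weight
    # Assign each point to the exemplar with the largest weighted affinity.
    labels = exemplars[np.argmax(S[:, exemplars] * q[exemplars], axis=1)]
    return exemplars, labels, q
```

In this sketch a larger `beta` tends to keep more exemplars and therefore yields finer-grained clusters. Because the starting point is fixed, repeated runs on the same data return the same exemplars, which is the kind of reproducibility the abstract contrasts with K-means and LDA.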

Top-level tags: medical, natural language processing, model evaluation
Detailed tags: topic modeling, biomedical literature, convex optimization, knowledge discovery, reproducibility

Exploring Anti-Aging Literature via ConvexTopics and Large Language Models


1️⃣ One-Sentence Summary

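This study proposes a stable topic modeling method based on convex optimization and applies it to about 12,000 articles on aging research; compared with conventional methods, it identifies research topics across scales, from molecular mechanisms to diet and exercise, more reliably and interpretably, offering a new tool for large-scale biomedical knowledge discovery.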

Source: arXiv: 2602.20224