Perspectives - Interactive Document Clustering in the Discourse Analysis Tool Suite
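1️⃣ One-sentence summary
This paper introduces Perspectives, an interactive tool that brings digital humanities scholars into the clustering process, helping them flexibly explore and organize large, unstructured document collections, uncover patterns such as topics and sentiments, and prepare their data for subsequent in-depth analysis.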
This paper introduces Perspectives, an interactive extension of the Discourse Analysis Tool Suite designed to empower Digital Humanities (DH) scholars to explore and organize large, unstructured document collections. Perspectives implements a flexible, aspect-focused document clustering pipeline with human-in-the-loop refinement capabilities. We showcase how this process can be initially steered by defining analytical lenses through document rewriting prompts and instruction-based embeddings, and further aligned with user intent through tools for refining clusters and mechanisms for fine-tuning the embedding model. The demonstration highlights a typical workflow, illustrating how DH researchers can use the interactive document map in Perspectives to uncover topics, sentiments, or other relevant categories, thereby gaining insights and preparing their data for subsequent in-depth analysis.
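The pipeline shape described in the abstract (define a lens, embed, cluster, then let the user refine) can be illustrated with a toy sketch. This is not the Perspectives implementation: the keyword-count `embed` function is a hypothetical stand-in for the rewriting prompts and instruction-based embeddings, the k-means routine is a generic choice the abstract does not name, and `SENTIMENT_LENS`, `kmeans`, and `apply_user_feedback` are names invented here for illustration.

```python
from math import dist

# Hypothetical "lens": keywords standing in for a rewriting prompt plus an
# instruction-based embedding model (the abstract names neither concretely).
SENTIMENT_LENS = ["good", "great", "bad", "awful"]


def embed(doc, lens):
    """Toy embedding: count how often each lens keyword occurs in the document."""
    words = doc.lower().split()
    return [float(words.count(term)) for term in lens]


def kmeans(vectors, k, iterations=10):
    """Plain k-means with deterministic initialisation (the first k vectors)."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector joins its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(v, centroids[c])) for v in vectors]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def apply_user_feedback(labels, corrections):
    """Human-in-the-loop step: override labels for documents the user reassigned."""
    refined = list(labels)
    for doc_index, cluster in corrections.items():
        refined[doc_index] = cluster
    return refined


docs = [
    "great film with good acting",
    "awful plot and bad pacing",
    "good soundtrack great visuals",
    "bad script awful ending",
]
vectors = [embed(d, SENTIMENT_LENS) for d in docs]
labels = kmeans(vectors, k=2)
# The user drags document 3 into cluster 0 on the (hypothetical) document map.
refined = apply_user_feedback(labels, {3: 0})
```

Swapping `SENTIMENT_LENS` for a topic-oriented keyword list would regroup the same documents along a different aspect, which is the idea behind steering the clustering with an analytical lens. In a real system the corrections would presumably also feed back into the embedding model, as the abstract mentions fine-tuning, but that step is omitted here.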
Source: arXiv:2602.15540