arXiv submission date: 2026-06-10
📄 Abstract - Tree-Structured Orthonormal Decomposition of the Aitchison Simplex

Compositional data -- vectors encoding relative proportions -- arise across scientific domains, including ecology, geochemistry, and genomics. The features in these data often come with known hierarchical structure (e.g., taxonomies, phylogenies, ontologies), yet existing methods either ignore this structure, discard the intrinsic Aitchison geometry, are designed for binary trees, or yield incomplete coordinate systems. We describe PolyILR, a canonical orthonormal decomposition of the Aitchison tangent space aligned with any tree topology. Our construction defines a weighted local geometry at each internal node capturing full branching structure, then lifts these to a global orthonormal basis where every coordinate corresponds to a specific tree location. On microbiome and single-cell benchmarks, PolyILR yields stable, interpretable features and enables inference at multiscale tree resolution. We also establish a novel theoretical connection to softmax classifiers, suggesting possible applications to probabilistic modeling.

Top-level tags: machine learning, biology
Detailed tags: compositional data, hierarchical structure, orthonormal decomposition, interpretability, microbiome

Tree-Structured Orthonormal Decomposition of the Aitchison Simplex


1️⃣ One-sentence summary

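This paper proposes PolyILR, a method that decomposes compositional data (for example, relative-proportion data from microbiome or gene-expression studies) along a known hierarchy (for example, a taxonomic tree) into a set of orthonormal coordinates, each tied to a specific location in the tree. The decomposition preserves the Aitchison geometry of the data while producing interpretable, multiscale features, so researchers can analyze and model the data at different branches and levels of the tree.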
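To make the idea of a tree-aligned orthonormal basis concrete, here is a minimal Python sketch. It is not the authors' PolyILR implementation, which this page does not specify. It assumes one plausible construction: at each internal node, the children are weighted by the square root of their leaf counts, and an arbitrary orthonormal set of contrasts between the children is taken from a QR factorization. The toy tree `TREE`, the helper names, and the example composition `x` are all hypothetical. The final round trip relies on the standard identity that softmax inverts the centered log-ratio (clr) transform on the simplex; whether this is the softmax connection the paper establishes is not stated here.

```python
import numpy as np

# Hypothetical toy tree: internal nodes are dict keys, leaves are integer feature indices.
TREE = {"root": ["A", "B", 5], "A": [0, 1, 2], "B": [3, 4]}


def leaves_under(tree, node):
    """Collect the leaf indices below a node (a leaf returns itself)."""
    if node not in tree:
        return [node]
    return [leaf for child in tree[node] for leaf in leaves_under(tree, child)]


def tree_basis(tree, n_leaves):
    """Return (V, labels): rows of V are orthonormal, sum-to-zero contrasts,
    with k - 1 rows for each internal node that has k children."""
    rows, labels = [], []
    for node, children in tree.items():
        groups = [leaves_under(tree, c) for c in children]
        # Assumed local weighting: sqrt of the number of leaves under each child.
        w = np.sqrt([len(g) for g in groups])
        # Complete QR of w: column 0 is parallel to w, the rest are orthonormal to it.
        q, _ = np.linalg.qr(w.reshape(-1, 1), mode="complete")
        for j in range(1, len(children)):
            v = np.zeros(n_leaves)
            for g, u_i, w_i in zip(groups, q[:, j], w):
                v[g] = u_i / w_i  # constant on each child's leaf set
            rows.append(v)
            labels.append((node, j))
    return np.array(rows), labels


def clr(x):
    """Centered log-ratio transform of a strictly positive composition."""
    logx = np.log(x)
    return logx - logx.mean()


def softmax(y):
    """Numerically stable softmax."""
    e = np.exp(y - y.max())
    return e / e.sum()


V, labels = tree_basis(TREE, 6)
x = np.array([0.30, 0.20, 0.10, 0.15, 0.05, 0.20])  # made-up composition, sums to 1
z = V @ clr(x)             # one coordinate per (internal node, contrast index)
x_back = softmax(V.T @ z)  # softmax inverts clr, recovering x
```

Each row of `V` is nonzero only on the leaves below one internal node, so every coordinate in `z` can be read as a log-contrast between the children of that node. The choice of contrasts within a node is not unique in this sketch (QR simply picks one), whereas the paper describes its decomposition as canonical.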

Source: arXiv: 2606.11646