arXiv submission date: 2026-05-02
📄 Abstract - EFGPP: Exploratory framework for genotype-phenotype prediction

Predicting complex human traits from genetic data is challenging because different genetic, clinical, and molecular data sources often contain different parts of the signal. Here, we present EFGPP, a reproducible framework for generating, ranking, and combining multiple types of data for genotype-to-phenotype prediction. We applied EFGPP to migraine prediction using UK Biobank data from 733 individuals. The framework combined genotype-derived features, principal components, clinical and metabolomic covariates, and polygenic risk scores generated from migraine and depression GWAS using PLINK, PRSice-2, AnnoPred, and LDAK-GWAS. The best single data type achieved a test AUC of 0.644, while combining multiple data types improved performance to 0.688 using migraine-focused inputs and 0.663 using cross-trait depression-derived inputs. Genetic features alone did not outperform the covariates-only baseline, but genotype-derived features performed better than PRS alone, and depression-derived PRS showed useful predictive signal. Overall, EFGPP provides a practical proof-of-concept framework for prioritising and integrating heterogeneous genetic data sources for complex phenotype prediction.

Top-level tags: medical, machine learning, data
Detailed tags: genotype-phenotype, polygenic risk scores, data integration, migraine prediction, AUC evaluation

EFGPP: Exploratory framework for genotype-phenotype prediction


1️⃣ One-sentence summary

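This paper proposes and validates EFGPP, a general framework that systematically integrates genetic, clinical, and metabolomic data from different sources. In a migraine prediction case study, combining multiple data types raised the test AUC from 0.644 for the best single data type to 0.688.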

Source: arXiv: 2605.02954