Robustness, Cost, and Attack-Surface Concentration in Phishing Detection
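1️⃣ One-sentence summary
This paper finds that the security of phishing website detectors depends mainly on how costly it is for an attacker to modify website features, not on the complexity of the machine learning model, because attackers can always find a few low-cost features to attack effectively.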
Phishing detectors built on engineered website features attain near-perfect accuracy under i.i.d. evaluation, yet deployment security depends on robustness to post-deployment feature manipulation. We study this gap through a cost-aware evasion framework that models discrete, monotone feature edits under explicit attacker budgets. Three diagnostics are introduced: minimal evasion cost (MEC), the evasion survival rate $S(B)$, and the robustness concentration index (RCI). On the UCI Phishing Websites benchmark (11,055 instances, 30 ternary features), Logistic Regression, Random Forests, Gradient Boosted Trees, and XGBoost all achieve $\mathrm{AUC}\ge 0.979$ under static evaluation. Under budgeted sanitization-style evasion, robustness converges across architectures: the median MEC equals 2 with full features, and over 80% of successful minimal-cost evasions concentrate on three low-cost surface features. Feature restriction improves robustness only when it removes all dominant low-cost transitions. Under strict cost schedules, infrastructure-leaning feature sets exhibit 17-19% infeasible mass for ensemble models, while the median MEC among evadable instances remains unchanged. We formalize this convergence: if a positive fraction of correctly detected phishing instances admit evasion through a single feature transition of minimal cost $c_{\min}$, no classifier can raise the corresponding MEC quantile above $c_{\min}$ without modifying the feature representation or cost model. Adversarial robustness in phishing detection is governed by feature economics rather than model complexity.
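To make the first two diagnostics concrete, the following is a minimal sketch of how MEC and $S(B)$ could be computed on a toy problem. Everything specific here is an assumption for illustration: the linear scorer `is_phishing`, the `WEIGHTS` and `COSTS` values, the four-feature instances, and the simplification that each edit sets a feature directly to the legitimate-looking value 1. The search is brute force over edit subsets, which is only practical for a handful of features, and RCI is left out because its exact definition is not given in this summary.

```python
from itertools import combinations

# Hypothetical toy detector. Features are ternary: -1 phishing-like,
# 0 suspicious, 1 legitimate-looking. Weights and costs are illustrative
# assumptions, not values from the paper.
WEIGHTS = [2, 1, 1, 3]
COSTS = [1, 1, 2, 5]  # assumed attacker cost of sanitizing each feature


def is_phishing(x):
    """Flag the instance when the weighted feature sum is negative."""
    return sum(w * v for w, v in zip(WEIGHTS, x)) < 0


def minimal_evasion_cost(x):
    """Cheapest set of monotone edits (feature -> 1) that flips the verdict.

    Returns 0 if the instance is already unflagged and None if no
    combination of edits evades the detector.
    """
    if not is_phishing(x):
        return 0
    editable = [i for i, v in enumerate(x) if v != 1]
    best = None
    for r in range(1, len(editable) + 1):
        for subset in combinations(editable, r):
            edited = list(x)
            for i in subset:
                edited[i] = 1
            if not is_phishing(edited):
                cost = sum(COSTS[i] for i in subset)
                if best is None or cost < best:
                    best = cost
    return best


def survival_rate(mecs, budget):
    """S(B): share of instances that cannot be evaded within budget B."""
    survivors = sum(1 for m in mecs if m is None or m > budget)
    return survivors / len(mecs)


# Three made-up detected instances and their survival curve.
detected = [[-1, -1, -1, 1], [-1, -1, -1, -1], [0, -1, 1, -1]]
mecs = [minimal_evasion_cost(x) for x in detected]
for budget in (0, 1, 2, 4):
    print(budget, survival_rate(mecs, budget))
```

In this toy setting the cheapest evasions go through the two cost-1 features whenever a single cheap edit suffices, which mirrors the abstract's point that the cost schedule, rather than the scorer, determines how low the MEC can fall.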
Source: arXiv:2603.19204