Accelerating Large-Scale Dataset Distillation via Exploration-Exploitation Optimization
1️⃣ One-sentence summary
This paper proposes a new method called E^2D, which uses a two-phase "exploration-exploitation" optimization strategy to compress large-scale datasets while preserving high model accuracy and substantially improving computational efficiency, resolving the difficulty existing methods have in balancing accuracy and speed.
Dataset distillation compresses the original data into compact synthetic datasets, reducing training time and storage while retaining model performance, enabling deployment under limited resources. Although recent decoupling-based distillation methods enable dataset distillation at large scale, they still face an efficiency gap: optimization-based decoupling methods achieve higher accuracy but demand intensive computation, whereas optimization-free decoupling methods are efficient but sacrifice accuracy. To overcome this trade-off, we propose Exploration-Exploitation Distillation (E^2D), a simple, practical method that minimizes redundant computation through an efficient pipeline. The pipeline begins with full-image initialization to preserve semantic integrity and feature diversity, then applies a two-phase optimization strategy: an exploration phase that performs uniform updates and identifies high-loss regions, and an exploitation phase that focuses updates on those regions to accelerate convergence. We evaluate E^2D on large-scale benchmarks, surpassing the state of the art on ImageNet-1K while being 18x faster; on ImageNet-21K, our method substantially improves accuracy while remaining 4.3x faster. These results demonstrate that targeted, redundancy-reducing updates, rather than brute-force optimization, bridge the gap between accuracy and efficiency in large-scale dataset distillation. Code is available at this https URL.
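To make the two-phase strategy concrete, here is a minimal sketch of what an exploration-exploitation update loop could look like. This is not the paper's released implementation: the function name `e2d_update`, the per-pixel `loss_map_fn` interface, and the `patch` / `top_ratio` parameters are all illustrative assumptions; the paper only states that uniform updates identify high-loss regions, which later updates then target.

```python
# Hypothetical sketch of an exploration-exploitation update loop (not the
# authors' code). Assumes loss_map_fn(images) returns a differentiable
# per-pixel loss map of shape (N, H, W).
import torch
import torch.nn.functional as F

def e2d_update(images, loss_map_fn, lr=0.1,
               explore_steps=50, exploit_steps=50,
               patch=16, top_ratio=0.25):
    # images: (N, C, H, W) synthetic tensor, initialized from full real
    # images to preserve semantic integrity and feature diversity.
    images = images.detach().clone().requires_grad_(True)
    opt = torch.optim.SGD([images], lr=lr)

    # Exploration: uniform updates over whole images while accumulating
    # the average loss of each non-overlapping patch.
    patch_loss = None
    for _ in range(explore_steps):
        opt.zero_grad()
        loss_map = loss_map_fn(images)                       # (N, H, W)
        loss_map.mean().backward()
        opt.step()
        with torch.no_grad():
            pl = F.avg_pool2d(loss_map.unsqueeze(1), patch)  # (N, 1, H/p, W/p)
            patch_loss = pl if patch_loss is None else patch_loss + pl

    # Build a binary mask keeping only the highest-loss patches.
    with torch.no_grad():
        n, _, gh, gw = patch_loss.shape
        flat = patch_loss.view(n, -1)
        k = max(1, int(top_ratio * flat.shape[1]))
        idx = flat.topk(k, dim=1).indices
        mask = torch.zeros_like(flat).scatter_(1, idx, 1.0)
        mask = F.interpolate(mask.view(n, 1, gh, gw),
                             scale_factor=patch, mode="nearest")  # (N,1,H,W)

    # Exploitation: restrict gradient updates to the high-loss regions,
    # skipping redundant computation on already-converged pixels.
    for _ in range(exploit_steps):
        opt.zero_grad()
        loss_map = loss_map_fn(images)
        loss_map.mean().backward()
        with torch.no_grad():
            images.grad.mul_(mask)  # zero gradients outside hot regions
        opt.step()

    return images.detach()
```

The key design point the sketch illustrates is that exploration is cheap bookkeeping (a pooled loss map per step), while exploitation reuses the same backward pass and simply masks the gradient, so the targeted phase costs no more per step than the uniform one.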
Source: arXiv: 2602.15277