Optimizing Feature Extraction for On-device Model Inference with User Behavior Sequences
1️⃣ One-sentence summary
This paper proposes an automated system named AutoFeature that optimizes the process of extracting model input features from raw application logs, significantly reducing the end-to-end latency of on-device AI model execution and thereby improving the user experience of mobile apps.
Machine learning models are widely integrated into modern mobile apps to analyze user behaviors and deliver personalized services. Ensuring low-latency on-device model execution is critical for maintaining high-quality user experiences. While prior research has primarily focused on accelerating model inference with given input features, we identify an overlooked bottleneck in real-world on-device model execution pipelines: extracting input features from raw application logs. In this work, we explore a new direction of feature extraction optimization by analyzing and eliminating redundant extraction operations across different model features and consecutive model inferences. We then introduce AutoFeature, an automated feature extraction engine designed to accelerate the on-device feature extraction process without compromising model inference accuracy. AutoFeature comprises three core designs: (1) graph abstraction to formulate the extraction workflows of different input features as one directed acyclic graph, (2) graph optimization to identify and fuse redundant operation nodes across different features within the graph, and (3) efficient caching to minimize operations on overlapping raw data between consecutive model inferences. We implement a system prototype of AutoFeature and integrate it into five industrial mobile services spanning the search, video, and e-commerce domains. Online evaluations show that AutoFeature reduces end-to-end on-device model execution latency by 1.33x-3.93x during daytime and 1.43x-4.53x at night.
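The three core designs can be illustrated with a minimal sketch. All names below (`ExtractionGraph`, the operator names, the cache keying) are illustrative assumptions, not the paper's actual API: extraction pipelines are expressed as nodes of one shared DAG, structurally identical nodes used by different features are fused so they run once, and node outputs are cached per input log window so consecutive inferences over overlapping logs skip recomputation.

```python
# Hypothetical sketch of AutoFeature's three ideas; names are illustrative,
# not the paper's actual API.
class ExtractionGraph:
    def __init__(self):
        self.fused = {}      # (op name, input node ids) -> node id (fusion table)
        self.ops = {}        # node id -> (op fn, input node ids)
        self.cache = {}      # (node id, log window) -> computed value
        self.exec_count = 0  # operator executions actually performed

    def add(self, op_name, fn, inputs=()):
        """Register an operation node, fusing it with an identical existing node."""
        key = (op_name, tuple(inputs))
        if key in self.fused:            # same op on same inputs: reuse the node
            return self.fused[key]
        nid = len(self.ops)
        self.fused[key] = nid
        self.ops[nid] = (fn, tuple(inputs))
        return nid

    def evaluate(self, nid, logs):
        """Evaluate a node on a (hashable) log window, caching intermediate results."""
        cache_key = (nid, logs)
        if cache_key in self.cache:      # reuse work across consecutive inferences
            return self.cache[cache_key]
        fn, inputs = self.ops[nid]
        args = [self.evaluate(i, logs) for i in inputs]
        self.exec_count += 1
        out = fn(logs, *args)
        self.cache[cache_key] = out
        return out

# Two features that share a "parse clicks" step: with fusion and caching,
# the shared parse runs once per distinct log window instead of once per feature.
g = ExtractionGraph()
parse = g.add("parse_clicks", lambda logs: [e for e in logs if e.startswith("click")])
f_count = g.add("click_count", lambda logs, clicks: len(clicks), [parse])
f_last = g.add("last_click", lambda logs, clicks: clicks[-1] if clicks else None, [parse])

logs = ("click:a", "view:b", "click:c")  # tuple, so it is hashable as a cache key
features = (g.evaluate(f_count, logs), g.evaluate(f_last, logs))
print(features, g.exec_count)  # 3 executions instead of 4: parse ran only once
```

Without fusion, each feature would carry its own parse node; without caching, a second inference over the same log window would redo all three executions.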
Source: arXiv: 2603.21508