arXiv submission date: 2026-03-26
📄 Abstract - Once-for-All Channel Mixers (HYPERTINYPW): Generative Compression for TinyML

Deploying neural networks on microcontrollers is constrained by kilobytes of flash and SRAM, where 1x1 pointwise (PW) mixers often dominate memory even after INT8 quantization across vision, audio, and wearable sensing. We present HYPER-TINYPW, a compression-as-generation approach that replaces most stored PW weights with generated weights: a shared micro-MLP synthesizes PW kernels once at load time from tiny per-layer codes, caches them, and executes them with standard integer operators. This preserves commodity MCU runtimes and adds only a one-off synthesis cost; steady-state latency and energy match INT8 separable CNN baselines. Enforcing a shared latent basis across layers removes cross-layer redundancy, while keeping PW1 in INT8 stabilizes early, morphology-sensitive mixing. We contribute (i) TinyML-faithful packed-byte accounting covering generator, heads/factorization, codes, kept PW1, and backbone; (ii) a unified evaluation with validation-tuned t* and bootstrap confidence intervals; and (iii) a deployability analysis covering integer-only inference and boot versus lazy synthesis. On three ECG benchmarks (Apnea-ECG, PTB-XL, MIT-BIH), HYPER-TINYPW shifts the macro-F1 versus flash Pareto frontier: at about 225 kB it matches a roughly 1.4 MB CNN while being 6.31x smaller (84.15% fewer bytes), retaining at least 95% of large-model macro-F1. Under 32-64 kB budgets it sustains balanced detection where compact baselines degrade. The mechanism applies broadly to other 1D biosignals, on-device speech, and embedded sensing tasks where per-layer redundancy dominates, indicating a wider role for compression-as-generation in resource-constrained ML systems. Beyond ECG, HYPER-TINYPW transfers to TinyML audio: on Speech Commands it reaches 96.2% test accuracy (98.2% best validation), supporting broader applicability to embedded sensing workloads where repeated linear mixers dominate memory.
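The load-time synthesis described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function and class names, the tanh hidden layer, the symmetric per-tensor INT8 quantization, and the simplification that every generated layer has the same shape are all assumptions made for illustration. It shows a single shared micro-MLP turning a small per-layer code into a pointwise kernel, with a cache that supports either boot-time or lazy synthesis.

```python
import math
import random


def make_generator(code_dim, hidden_dim, out_dim, seed=0):
    """Build a tiny shared MLP as nested lists (hypothetical sizes and init)."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-1, 1) for _ in range(code_dim)] for _ in range(hidden_dim)]
    w2 = [[rng.uniform(-1, 1) for _ in range(hidden_dim)] for _ in range(out_dim)]
    return w1, w2


def synthesize_pw(generator, code, c_out, c_in):
    """Generate one c_out x c_in pointwise kernel from a per-layer code."""
    w1, w2 = generator
    assert len(w2) == c_out * c_in, "generator output must match kernel size"
    hidden = [math.tanh(sum(w * z for w, z in zip(row, code))) for row in w1]
    flat = [sum(w * h for w, h in zip(row, hidden)) for row in w2]
    # Assumed quantizer: symmetric per-tensor INT8 with a single scale.
    scale = max(abs(v) for v in flat) / 127.0 or 1.0
    q = [max(-127, min(127, round(v / scale))) for v in flat]
    kernel = [q[r * c_in:(r + 1) * c_in] for r in range(c_out)]
    return kernel, scale


class PWCache:
    """Cache generated kernels so synthesis happens once per layer."""

    def __init__(self, generator, codes, c_out, c_in):
        self.generator = generator
        self.codes = codes  # layer name -> small code vector
        self.c_out, self.c_in = c_out, c_in
        self._cache = {}

    def get(self, layer):
        # Lazy synthesis: generate on first use, reuse afterwards.
        if layer not in self._cache:
            self._cache[layer] = synthesize_pw(
                self.generator, self.codes[layer], self.c_out, self.c_in
            )
        return self._cache[layer]

    def boot(self):
        # Boot-time synthesis: generate every layer up front.
        for layer in self.codes:
            self.get(layer)
```

In this sketch, only the generator weights and the per-layer codes would need to be stored; the kernels returned by `get` are ordinary integer matrices, so after the one-off synthesis a standard INT8 pointwise operator could consume them unchanged.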

Top-level tags: machine learning model training systems
Detailed tags: model compression tinyml edge computing generative compression microcontrollers

Once-for-All Channel Mixers (HYPERTINYPW): Generative Compression for TinyML


1️⃣ One-sentence summary

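This paper proposes HYPER-TINYPW, a generative compression technique in which a tiny neural network generates most of the model's weights when the model is loaded. This shrinks the model's memory footprint by more than 6x while keeping accuracy close to that of the original model, making it feasible to deploy complex AI models on severely memory-constrained devices such as microcontrollers.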
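The size figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The helper names below are hypothetical, and treating the "roughly 1.4 MB" baseline as exactly 1400 kB is an assumption; the sketch only relates a compression ratio to the fraction of bytes saved.

```python
def fraction_saved(ratio):
    """Fraction of bytes removed when a model becomes `ratio` times smaller."""
    return 1.0 - 1.0 / ratio


def compressed_size_kb(baseline_kb, ratio):
    """Size after compression, given the baseline size and the ratio."""
    return baseline_kb / ratio


saved_percent = round(fraction_saved(6.31) * 100, 2)
approx_kb = compressed_size_kb(1400, 6.31)  # assumes 1.4 MB == 1400 kB
print(saved_percent, round(approx_kb))
```

A 6.31x reduction corresponds to 84.15% fewer bytes, and a 1400 kB baseline would shrink to roughly 222 kB, which is consistent with the "about 225 kB" operating point reported in the abstract.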

Source: arXiv: 2603.24916