arXiv submission date: 2026-04-16
📄 Abstract - GenRec: A Preference-Oriented Generative Framework for Large-Scale Recommendation

Generative Retrieval (GR) offers a promising paradigm for recommendation through next-token prediction (NTP). However, scaling it to large-scale industrial systems introduces three challenges: (i) within a single request, identical model inputs may produce inconsistent outputs due to the pagination request mechanism; (ii) encoding long user behavior sequences with multi-token item representations based on semantic IDs is prohibitively expensive; and (iii) the generative policy must be aligned with nuanced user preference signals. We present GenRec, a preference-oriented generative framework deployed on the JD App that addresses these challenges within a single decoder-only architecture. For the training objective, we propose a Page-wise NTP task, which supervises over an entire interaction page rather than each interacted item individually, providing a denser gradient signal and resolving the one-to-many ambiguity of point-wise training. On the prefilling side, an asymmetric linear Token Merger compresses multi-token Semantic IDs in the prompt while preserving full-resolution decoding, reducing input length by ~2× with negligible accuracy loss. To further align outputs with user satisfaction, we introduce GRPO-SR, a reinforcement learning method that pairs Group Relative Policy Optimization with NLL regularization for training stability, and employs Hybrid Rewards combining a dense reward model with a relevance gate to mitigate reward hacking. In month-long online A/B tests serving production traffic, GenRec achieves a 9.5% improvement in click count and an 8.7% improvement in transaction count over the existing pipeline.
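The abstract does not give the GRPO-SR formulas, so the following is only a rough sketch of the general idea it describes: a gated reward, group-relative advantages, and an added NLL term. Everything here is an assumption made for illustration: the function names, the hard threshold used as the relevance gate, the simple unclipped policy-gradient surrogate, and the `nll_weight` value are hypothetical and are not taken from the paper.

```python
import math
from typing import List


def hybrid_reward(dense_score: float, relevance: float,
                  gate_threshold: float = 0.5) -> float:
    """Hypothetical hybrid reward: the dense reward-model score counts
    only when the relevance score passes the gate; otherwise it is zero."""
    return dense_score if relevance >= gate_threshold else 0.0


def group_relative_advantages(rewards: List[float],
                              eps: float = 1e-8) -> List[float]:
    """Standardize rewards within one sampled group (the core GRPO idea):
    subtract the group mean and divide by the group standard deviation."""
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = math.sqrt(var)
    return [(r - mean) / (std + eps) for r in rewards]


def regularized_loss(logprobs: List[float], advantages: List[float],
                     target_logprob: float, nll_weight: float = 0.1) -> float:
    """Simplified surrogate: advantage-weighted negative log-likelihood of
    the sampled outputs, plus an NLL term on a ground-truth target.
    Ratio clipping and KL terms used in full GRPO are omitted here."""
    pg = -sum(a * lp for a, lp in zip(advantages, logprobs)) / len(logprobs)
    nll = -target_logprob
    return pg + nll_weight * nll


# Three sampled candidates for one request: (dense score, relevance score).
samples = [(0.9, 0.2), (0.6, 0.8), (0.3, 0.9)]
rewards = [hybrid_reward(d, r) for d, r in samples]
advantages = group_relative_advantages(rewards)
loss = regularized_loss([-1.0, -1.0, -1.0], advantages, target_logprob=-2.0)
```

In this toy group, the first candidate has the highest dense score but fails the gate, so it ends up with the lowest advantage. That is the kind of reward hacking a relevance gate is meant to discourage.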

Top-level tags: machine learning, model training, systems
Detailed tags: generative retrieval, recommendation systems, reinforcement learning, large-scale deployment, next-token prediction

GenRec: A Preference-Oriented Generative Framework for Large-Scale Recommendation


1️⃣ One-sentence summary

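This paper proposes GenRec, a generative recommendation framework that combines a novel page-level training objective, an efficient input compression technique, and a reinforcement-learning-based preference alignment method to address three challenges in large-scale industrial recommendation systems (inconsistent model outputs, high computational cost, and difficulty matching user preferences), and it significantly increases clicks and transactions in production.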

Source: arXiv: 2604.14878