arXiv submission date: 2026-04-21
📄 Abstract - Tstars-Tryon 1.0: Robust and Realistic Virtual Try-On for Diverse Fashion Items

Recent advances in image generation and editing have opened new opportunities for virtual try-on. However, existing methods still struggle to meet complex real-world demands. We present Tstars-Tryon 1.0, a commercial-scale virtual try-on system that is robust, realistic, versatile, and highly efficient. First, our system maintains a high success rate across challenging cases like extreme poses, severe illumination variations, motion blur, and other in-the-wild conditions. Second, it delivers highly photorealistic results with fine-grained details, faithfully preserving garment texture, material properties, and structural characteristics, while largely avoiding common AI-generated artifacts. Third, beyond apparel try-on, our model supports flexible multi-image composition (up to 6 reference images) across 8 fashion categories, with coordinated control over person identity and background. Fourth, to overcome the latency bottlenecks of commercial deployment, our system is heavily optimized for inference speed, delivering near real-time generation for a seamless user experience. These capabilities are enabled by an integrated system design spanning end-to-end model architecture, a scalable data engine, robust infrastructure, and a multi-stage training paradigm. Extensive evaluation and large-scale product deployment demonstrate that Tstars-Tryon 1.0 achieves leading overall performance. To support future research, we also release a comprehensive benchmark. The model has been deployed at an industrial scale on the Taobao App, serving millions of users with tens of millions of requests.

Top-level tags: computer vision, aigc, multi-modal
Detailed tags: virtual try-on, image generation, garment preservation, real-time inference, benchmark

Tstars-Tryon 1.0: Robust and Realistic Virtual Try-On for Diverse Fashion Items


1️⃣ One-sentence summary

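This paper presents Tstars-Tryon 1.0, a commercial-scale virtual try-on system that, by optimizing the model architecture, data engine, and inference speed, handles complex real-world conditions (such as extreme poses and illumination changes), reproduces garment details with high fidelity, supports multi-image composition across multiple fashion categories, and has been deployed at scale on the Taobao App, where it serves millions of users.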

Source: arXiv: 2604.19748