Menu

Submit a new paper

AI Paper Reading
🔍 Search and Filter
Category tags
All tags
reinforcement learning 90 diffusion models 58 benchmark 42 vision-language models 39 reasoning 36 benchmark evaluation 34 visual reasoning 32 mathematical reasoning 31 policy optimization 29 text-to-image 27 vision-language-action 27 chain-of-thought 23 multimodal reasoning 23 embodied ai 21 image generation 21 synthetic data 21 evaluation framework 20 multimodal llms 18 evaluation benchmark 16 image editing 16 knowledge distillation 16 spatial reasoning 16 video understanding 15 efficient inference 14 fine-tuning 14 mixture-of-experts 14 retrieval-augmented generation 14 tool usage 14 vision-language model 14 computational efficiency 13 video reasoning 13 3d reconstruction 12 curriculum learning 12 dataset 12 instruction tuning 12 multimodal evaluation 12 video generation 12 3d generation 11 code generation 11 domain adaptation 11 multi-agent systems 11 reasoning models 11 self-supervised learning 11 diffusion transformer 10 diffusion transformers 10 multimodal llm 10 video diffusion 10 attention mechanisms 9 evaluation 9 flow matching 9 generative models 9 post-training 9 transformer 9 gaussian splatting 8 generalization 8 preference optimization 8 quantization 8 representation learning 8 reward modeling 8 sparse attention 8 test-time scaling 8 transformer architecture 8 agentic framework 7 attention mechanism 7 data synthesis 7 in-context learning 7 instruction following 7 reasoning capabilities 7 robotic manipulation 7 robustness 7 scaling laws 7 supervised fine-tuning 7 text-to-video 7 video synthesis 7 visual question answering 7 web agents 7 alignment 6 autoregressive generation 6 autoregressive models 6 benchmark dataset 6 dataset creation 6 diffusion language models 6 efficiency 6 efficiency optimization 6 foundation models 6 gui agents 6 medical imaging 6 model compression 6 rlvr 6 self-improvement 6 software engineering 6 world models 6 adversarial training 5 benchmarking 5 contrastive learning 5 depth estimation 5 explainable ai 5 foundation model 5 
imitation learning 5 inference acceleration 5 kv cache 5 latent space 5 long context 5 model scaling 5 multi-agent collaboration 5 multi-step reasoning 5 optimization 5 preference learning 5 reasoning efficiency 5 safety alignment 5 tool integration 5 tool use 5 training stability 5 agent evaluation 4 automated evaluation 4 autonomous agents 4 autonomous driving 4 catastrophic forgetting 4 classifier-free guidance 4 cross-modal alignment 4 data augmentation 4 data curation 4 diffusion distillation 4 document understanding 4 end-to-end training 4 high-resolution generation 4 inference optimization 4 latency optimization 4 latent diffusion 4 long-context reasoning 4 long-form video 4 long-horizon tasks 4 low-resource languages 4 memory management 4 model architecture 4 motion generation 4 multi-agent system 4 multimodal models 4 object detection 4 physics simulation 4 policy learning 4 positional encoding 4 pre-training 4 preference alignment 4 prompt engineering 4 reasoning evaluation 4 reasoning tasks 4 research agents 4 reward shaping 4 robot control 4 robot manipulation 4 scientific discovery 4 self-evolution 4 sim-to-real 4 small language models 4 synthetic data generation 4 temporal alignment 4 theorem proving 4 token compression 4 training efficiency 4 training-free 4 unsupervised learning 4 vision language models 4 zero-shot learning 4 3d consistency 3 3d vision 3 adaptive thinking 3 adversarial attack 3 agent training 3 agentic reasoning 3 ai safety 3 automated theorem proving 3 autoregressive video diffusion 3 camera control 3 causal reasoning 3 chain of thought 3 computer-use agents 3 continual learning 3 conversational ai 3 credit assignment 3 cross-lingual transfer 3 data efficiency 3 data generation 3 dense retrieval 3 dexterous manipulation 3 distribution matching 3 document parsing 3 dynamic scenes 3 edge deployment 3 efficient training 3 egocentric vision 3 error correction 3 generation efficiency 3 generative modeling 3 geometric reasoning 3 
geometry synthesis 3 gradient alignment 3 gui grounding 3 hallucination analysis 3 hallucination mitigation 3 hallucination reduction 3 hierarchical reasoning 3 image restoration 3 information retrieval 3 interactive environments 3 large multimodal models 3 long video generation 3 long-context 3 math reasoning 3 mechanistic interpretability 3 mllm 3 mllm evaluation 3 model context protocol 3 model evaluation 3 model optimization 3 motion control 3 multi-agent framework 3 multi-modal 3 multi-turn agents 3 multi-turn reasoning 3 multilingual evaluation 3 multimodal generation 3 multimodal learning 3 multimodal understanding 3 neural rendering 3 novel view synthesis 3 physical reasoning 3 point cloud 3 policy gradient 3 procedural generation 3 process supervision 3 question answering 3 reasoning benchmarks 3 reasoning verification 3 rectified flow 3 reference-guided generation 3 reinforcement fine-tuning 3 representation alignment 3 resource allocation 3 rl training 3 safety 3 safety evaluation 3 search agents 3 self-correction 3 self-evolving agents 3 self-verification 3 sim2real 3 software development 3 sparsity 3 speculative decoding 3 speech translation 3 streaming video 3 survey 3 test-time adaptation 3 text-to-3d 3 tool-augmented reasoning 3 training-free optimization 3 uncertainty quantification 3 unified framework 3 video comprehension 3 video editing 3 video foundation models 3 video quality 3 vision foundation models 3 vision transformer 3 vision transformers 3 vision-language 3 workflow automation 3 world model 3 world modeling 3 zero-shot generalization 3 3d city generation 2 3d editing 2 3d modeling 2 3d reasoning 2 3d scene generation 2 3d scene synthesis 2 3d world generation 2 4d generation 2 4d reconstruction 2 academic writing 2 action prediction 2 active learning 2 adaptive inference 2 adaptive routing 2 adversarial attacks 2 agent benchmarking 2 agent evolution 2 agentic ai 2 agentic systems 2 ai alignment 2 ai security 2 articulated objects 2 
attention alignment 2 audio-driven 2 audio-language models 2 audio-video generation 2 autonomous improvement 2 avatar animation 2 benchmark design 2 brain-computer interface 2 browser automation 2 character animation 2 clinical decision-making 2 clinical reasoning 2 co-evolution 2 code execution 2 cognitive reasoning 2 compute optimization 2 consistency models 2 constraint satisfaction 2 context compression 2 context engineering 2 context modeling 2 continuous control 2 controllable generation 2 creative writing 2 cross-modal reasoning 2 cuda optimization 2 data collection 2 dataset generation 2 decision-making 2 diffusion model 2 discrete diffusion 2 distributed systems 2 drug discovery 2 e-commerce 2 edge computing 2 efficient deployment 2 efficient generation 2 electronic health records 2 embedding models 2 embodied intelligence 2 embodied navigation 2 emotion recognition 2 emotional intelligence 2 empirical study 2 entropy analysis 2 environment scaling 2 episodic memory 2 evaluation metrics 2 evaluation protocol 2 external memory 2 fact verification 2 failure modes 2 faithful generation 2 faithfulness evaluation 2 fast sampling 2 feature upsampling 2 few-shot learning 2 financial markets 2 formal verification 2 game development 2 geometric constraints 2 geometric fidelity 2 gui interaction 2 hallucination detection 2 hard negative mining 2 hierarchical control 2 hierarchical retrieval 2 human alignment 2 human evaluation 2 human feedback 2 humanoid locomotion 2 image captioning 2 image segmentation 2 image synthesis 2 image-to-video 2 inference efficiency 2 information flow 2 information seeking 2 interactive agents 2 interactive segmentation 2 interleaved generation 2 interleaved reasoning 2 interpretability 2 iterative refinement 2 knowledge preservation 2 knowledge synthesis 2 language modeling 2 language models 2 latent representations 2 length extrapolation 2 lifelong learning 2 llm evaluation 2 llm-as-a-judge 2 load balancing 2 long-horizon reasoning 2 
lora 2 mamba-transformer 2 medical ai 2 medical benchmark 2 medical diagnosis 2 memory optimization 2 memory systems 2 meta-reasoning 2 mixture of experts 2 mixture-of-transformers 2 model acceleration 2 model alignment 2 model distillation 2 model fingerprinting 2 model fusion 2 model interpretability 2 model merging 2 model pruning 2 monocular video 2 monte carlo tree search 2 motion imitation 2 multi-agent 2 multi-hop reasoning 2 multi-label classification 2 multi-person generation 2 multi-stage training 2 multi-step tasks 2 multi-task learning 2 multi-turn interaction 2 multi-view 2 multi-view consistency 2 multilingual 2 multilingual llm 2 multilingual models 2 multilingual translation 2 multimodal agents 2 multimodal assessment 2 multimodal consistency 2 multimodal dataset 2 multimodal fusion 2 multimodal integration 2 multimodal interaction 2 multimodal memory 2 normalizing flows 2 numeracy 2 off-policy learning 2 off-policy rl 2 olympiad problems 2 one-step generation 2 open-ended learning 2 open-vocabulary 2 out-of-distribution generalization 2 parallel decoding 2 parameter efficiency 2 parameter-efficient fine-tuning 2 parameter-efficient training 2 partial observability 2 performance optimization 2 personalization 2 physical dynamics 2 physical plausibility 2 physical simulation 2 physics reasoning 2 physics-aware generation 2 planning 2 positional embeddings 2 pretraining 2 progressive learning 2 progressive training 2 prompt injection 2 prompt optimization 2 puzzle solving 2 query decomposition 2 rag systems 2 real-time rendering 2 reasoning analysis 2 reasoning benchmark 2 remote sensing 2 representation analysis 2 research automation 2 resolution enhancement 2 retrieval augmentation 2 rl framework 2 rlhf 2 robot learning 2 robotic control 2 safety guardrails 2 safety vulnerabilities 2 sample efficiency 2 sampling methods 2 scene generation 2 scientific reasoning 2 self-play 2 self-rewarding 2 semantic alignment 2 sequential decision-making 2 
sequential modeling 2 simulation 2 small models 2 sparse activation 2 spatial cognition 2 spatial control 2 spatial grounding 2 spatio-temporal 2 speech synthesis 2 streaming generation 2 structured data extraction 2 structured knowledge 2 supervised learning 2 tabular data 2 task planning 2 temporal consistency 2 temporal reasoning 2 temporal understanding 2 test-time optimization 2 text embeddings 2 text generation 2 text-to-image generation 2 texture generation 2 theoretical analysis 2 tool augmentation 2 tool calling 2 tool composition 2 tool orchestration 2 tool-integrated reasoning 2 tool-use agents 2 training dynamics 2 training-free extrapolation 2 trajectory optimization 2 transfer learning 2 transformer theory 2 tree search 2 uncertainty 2 user interface 2 value alignment 2 verifiable rewards 2 video diffusion models 2 video models 2 video question answering 2 video relighting 2 video retrieval 2 video segmentation 2 video-to-video 2 virtual try-on 2 vision-language navigation 2 vision-language pretraining 2 visual generation 2 visual grounding 2 visual language models 2 visual representation 2 visual tokens 2 voxel representation 2 web navigation 2 zero-shot 2 2:4 sparsity 1 3d consistent video 1 3d ct analysis 1 3d data generation 1 3d geometry 1 3d grounding 1 3d human generation 1 3d inpainting 1 3d multimodal 1 3d physics 1 3d point tracks 1 3d scene composition 1 3d scenes 1 3d shape generation 1 3d stylization 1 3d tracking 1 3d trajectory estimation 1 3d understanding 1 3d virtual worlds 1 4d control 1 4d representation 1 4d scene understanding 1 4d scenes 1 4d understanding 1 4d video rendering 1 4d world modeling 1 a/b testing 1 abstract reasoning 1 academic search 1 academic seminars 1 accelerated sampling 1 accessibility 1 accessibility tree 1 accountability 1 accuracy-latency tradeoff 1 acoustic-semantic gap 1 action chunking 1 action control 1 action decoding 1 action degeneration 1 action plausibility 1 action representation 1 action space 
construction 1 action understanding 1 activation analysis 1 activation functions 1 activation steering 1 actor-critic 1 adaptation 1 adapter heads 1 adaptive clipping 1 adaptive difficulty 1 adaptive learning 1 adaptive memory 1 adaptive reasoning 1 adaptive retrieval 1 adaptive training 1 adaptive transforms 1 adaptive vision 1 adoption practices 1 advantage estimation 1 advantage normalization 1 adventure games 1 adversarial evaluation 1 adversarial perturbations 1 adversarial robustness 1 adversarial simulation 1 aesthetic adaptation 1 affective reasoning 1 affordance-aware composition 1 agent capabilities 1 agent co-evolution 1 agent collaboration 1 agent economies 1 agent fine-tuning 1 agent frameworks 1 agent performance 1 agent planning 1 agent reinforcement learning 1 agent robustness 1 agent scaffold 1 agent scaffolds 1 agent-environment interaction 1 agentic enhancement 1 agentic foundation models 1 agentic intelligence 1 agentic markets 1 agentic rl 1 agentic thinking 1 agentic web 1 agentic workflow 1 agi definition 1 agi evaluation 1 agi safety 1 ai agents 1 ai evaluation 1 ai governance 1 ai research agents 1 ai scientist 1 ai systems 1 ai-generated code 1 ai-generated content detection 1 aleatoric epistemic 1 algorithm design 1 algorithmic reasoning 1 algorithmic trading 1 alignment degradation 1 alignment evaluation 1 alignment strategies 1 alpha matting 1 alzheimer's detection 1 ambiguity handling 1 amodal completion 1 analogical reasoning 1 analogue modelling 1 anime hairstyle 1 anomaly detection 1 anomaly generation 1 anti-exploration 1 antisemitism detection 1 ar/vr integration 1 arabic nlp 1 arbitrary resolution 1 arc-agi benchmark 1 architectural optimization 1 architecture optimization 1 architecture search 1 arena evaluation 1 asset creation 1 assistive technology 1 asynchronous denoising 1 asynchronous execution 1 asynchronous inference 1 asynchronous reasoning 1 attention analysis 1 attention bias 1 attention complexity 1 attention 
compression 1 attention control 1 attention dispersion 1 attention guidance 1 attention heads 1 attention masking 1 attention optimization 1 attention pooling 1 attention sinks 1 attribute knowledge 1 attribute transfer 1 attribution 1 audio classification 1 audio editing 1 audio generation 1 audio plugin 1 audio reasoning 1 audio understanding 1 audio-video joint denoising 1 audio-video synchronization 1 audio-visual alignment 1 audio-visual fusion 1 audio-visual interaction 1 audio-visual segmentation 1 audio-visual speech recognition 1 audio-visual sync 1 audio-visual synchronization 1 audio-visual understanding 1 audiovisual captioning 1 audiovisual learning 1 authorship attribution 1 auto-thinking 1 autoformalization 1 automated annotation 1 automated assessment 1 automated grading 1 automated kernel tuning 1 automated patching 1 automated testing 1 automated trading 1 automated training 1 automatic differentiation 1 automatic ml research 1 automatic speech recognition 1 automl 1 autonomous coding 1 autonomous evaluation 1 autonomous research 1 autoregressive decoding 1 autoregressive synthesis 1 autoscaling 1 auxiliary constructions 1 auxiliary objective 1 avatar creation 1 ayurveda 1 bandit algorithms 1 base models 1 batch sampling 1 battle outcomes 1 bavarian language 1 bayesian optimization 1 behavior cloning 1 behavioral analysis 1 behavioral biases 1 behavioral cloning 1 behavioral taxonomy 1 behavioral traits 1 beir benchmark 1 benchmark performance 1 benchmark validity 1 beyond-accuracy objectives 1 bi-level reward 1 bi-mode annealing 1 bias detection 1 bilingual model 1 bilingual reasoning 1 bimanual robotics 1 binaural audio 1 biographical features 1 biological annotation 1 biomedical 1 blind users 1 block diffusion 1 block inpainting 1 block-causal models 1 block-diffusion 1 bounded response 1 brain systems 1 bridge models 1 bronchoscopy simulation 1 browser agents 1 budget-aware training 1 bundle adjustment 1 business process modeling 1 byte-pair 
encoding 1 cad generation 1 calibration 1 calibration error 1 camera extrinsic 1 camera motion control 1 camera pose 1 camera pose decoupling 1 camera trajectory 1 capability evaluation 1 capability probing 1 caption-assisted reasoning 1 capture the flag 1 card games 1 cascaded optimization 1 catalog alignment 1 category theory 1 causal attention 1 causal discovery 1 causal mask 1 causal representation learning 1 cfg augmentation 1 chain of guidance 1 chains-of-thought 1 challenge 1 chaos engineering 1 chart comprehension 1 chart grounding 1 chart understanding 1 chat chain 1 cheating behavior 1 checkpoint recycling 1 chemical compositions 1 chemical reaction 1 chemistry 1 chest x-ray 1 chinese context 1 chinese language 1 chunk-level optimization 1 cinematic narratives 1 cinematic video 1 citation attribution 1 citation evaluation 1 citation grounding 1 citation-aware 1 city-scale synthesis 1 clinical ai 1 clinical benchmarks 1 clinical decision support 1 clinical language models 1 clip 1 clip evaluation 1 closed-loop training 1 cloud computing 1 clustering 1 co-training 1 coarse-to-fine generation 1 code competition 1 code completion 1 code diff 1 code editing 1 code embeddings 1 code interpreter 1 code quality 1 code reasoning 1 code retrieval 1 code understanding 1 code-as-tool 1 code-driven pipeline 1 coding agents 1 coevolution 1 cognitive abilities 1 cognitive benchmarking 1 cognitive capacity 1 cognitive comparison 1 cognitive decline 1 cognitive elements 1 cognitive framework 1 cognitive inertia 1 cognitive modeling 1 cognitive neuroscience 1 cognitive patterns 1 cognitive perception 1 cognitive rules 1 cognitive science 1 cognitive simulation 1 cognitive skills 1 cognitive systems 1 coherence metrics 1 colbert 1 collaborative learning 1 collaborative modeling 1 collaborative training 1 collaborative workflows 1 collective intelligence 1 color alignment 1 color spaces 1 comic generation 1 common crawl 1 common ground 1 communication efficiency 1 
communicative agents 1 comparative reasoning 1 competitive programming 1 complexity 1 composition 1 compositional assembly 1 compositional frameworks 1 compositional generation 1 compound generation 1 compressed reasoning 1 computational art 1 computational imaging 1 computational pathology 1 computational procedures 1 computational social science 1 compute budget 1 compute efficiency 1 computer vision 1 computer-using agents 1 concept generation 1 concept segmentation 1 concept-aware 1 concept-guided learning 1 conceptual memory 1 condition injection 1 conditional flow matching 1 conditional generation 1 confidence calibration 1 conformer 1 consistency distillation 1 contamination detection 1 content creation 1 content moderation 1 content verification 1 context comprehension 1 context consistency 1 context fusion 1 context knowledge 1 context optimization 1 context pruning 1 context summarization 1 context-free grammar 1 contextual bandit 1 continual pre-training 1 continual pretraining 1 continuous learning 1 continuous-time dynamics 1 continuum robots 1 contrastive attention 1 control protocols 1 controllable assets 1 controllable editing 1 convergence analysis 1 conversational agents 1 conversational recommender systems 1 coordinate prediction 1 coordinate-free 1 coordination mechanisms 1 copy-paste artifacts 1 correspondence estimation 1 coset sampling 1 cost optimization 1 cost-optimal planning 1 counter-intuitive ability 1 counterfactual regret minimization 1 creative composition 1 criteria following 1 critic models 1 critic-free rl 1 cross reconstruction 1 cross-cultural evaluation 1 cross-device orchestration 1 cross-domain coordination 1 cross-domain generalization 1 cross-embodied 1 cross-lingual 1 cross-lingual alignment 1 cross-lingual generalization 1 cross-modal adapter 1 cross-modal co-training 1 cross-modal fusion 1 cross-modal generation 1 cross-modal interaction 1 cross-modal retrieval 1 cross-model adaptation 1 cross-model transferability 1 
cross-platform agents 1 cross-robot generalization 1 cross-session memory 1 cross-video analysis 1 crystal representation 1 cultural bias 1 cultural understanding 1 curriculum reinforcement learning 1 custom workflows 1 cyber threat intelligence 1 cybersecurity 1 damage assessment 1 dark humor detection 1 data alignment 1 data analysis 1 data engineering 1 data flywheel 1 data imbalance 1 data leakage 1 data management 1 data mixture 1 data quality 1 data recycling 1 data refinement 1 data sampling 1 data scaling 1 data selection 1 data transparency 1 data-free learning 1 dataset augmentation 1 dataset construction 1 dataset curation 1 dataset evaluation 1 dataset quality 1 dataset splitting 1 de-identification 1 debiasing methods 1 deblurring 1 decentralized learning 1 deception 1 decision support 1 declarative framework 1 decoding acceleration 1 decoding strategies 1 decoupled training 1 deep learning 1 deep research 1 deep research agents 1 deep research systems 1 deepfake detection 1 defense bypass 1 deformable objects 1 degradation modeling 1 dehallucination 1 deliberate decontextualization 1 delimiter sensitivity 1 deming cycle 1 demographic diversity 1 demonstration retrieval 1 denoising 1 denoising process 1 dense alignment 1 dense correspondence 1 dense geometry estimation 1 dense video captioning 1 dental imaging 1 depth-recurrent 1 desktop automation 1 detail correction 1 detail preservation 1 detection transformers 1 determinantal point processes 1 deterministic inference 1 deterministic sampling 1 developer perspective 1 diagnostic analysis 1 diagnostic error reduction 1 diagnostic feedback 1 diagram generation 1 dialect robustness 1 dialogue grounding 1 dialogue robustness 1 dialogue systems 1 diff representation 1 differentiable critics 1 differentiable rasterization 1 differentiable simulation 1 differential privacy 1 difficult problem generation 1 difficulty calibration 1 difficulty levels 1 difficulty progression 1 diffusion 1 diffusion decoder 1 
diffusion decoding 1 diffusion sampling 1 digital agents 1 digital humans 1 digital signal processing 1 digital twin 1 dino adaptation 1 direct preference optimization 1 directed acyclic graphs 1 directional degeneration 1 discriminative verification 1 discriminator design 1 disease detection 1 disentangled learning 1 dishonesty 1 distributed computing 1 distributed dag 1 distributed training 1 distribution shift 1 divergence selection 1 diverse generation 1 diversity optimization 1 diversity preservation 1 diversity-quality tradeoff 1 document conversion 1 document generation 1 document memory 1 document reranking 1 document retrieval 1 document search 1 document structure 1 domain shift 1 doppler imaging 1 dpo training 1 drag-based editing 1 drawing generation 1 dream narratives 1 dual-agent architecture 1 dual-brain architecture 1 dual-clock denoising 1 dual-encoder 1 dual-process theory 1 dynamic balancing 1 dynamic benchmark 1 dynamic communication 1 dynamic context window 1 dynamic environments 1 dynamic evaluation 1 dynamic improvement reward 1 dynamic optimization 1 dynamic point clouds 1 dynamic process modeling 1 dynamic pruning 1 dynamic range 1 dynamic routing 1 dynamic sampling 1 dynamic scene understanding 1 dynamic time warping 1 dynamic verification 1 dynamics model 1 early convergence 1 early exit 1 early experience 1 early stopping 1 earth observation 1 echo training 1 ecology vision 1 economic decision-making 1 economic reasoning 1 economic risk 1 edge ai 1 editor plugin 1 educational technology 1 efficient attention 1 efficient mllms 1 efficient pretraining 1 efficient rl 1 efficient transformers 1 efficient verification 1 ego-motion 1 egocentric interaction 1 egocentric video 1 elo rating 1 embedding analysis 1 embedding evaluation 1 embedding fusion 1 embedding learning 1 embedding retraining 1 embedding selection 1 embedding tasks 1 embodied cognition 1 embodied environments 1 embodied reasoning 1 emergent behavior 1 emergent misalignment 1 
emotion control 1 emotional reasoning 1 emotional variation 1 encoder-decoder 1 energy efficiency 1 energy-based models 1 enterprise workflows 1 entity embeddings 1 entity normalization 1 entity-aware control 1 entropy balancing 1 entropy control 1 entropy modulation 1 entropy regularization 1 entropy stabilization 1 environment generation 1 environment stabilization 1 environment tuning 1 epipolar geometry 1 epistemic humility 1 epistemic uncertainty 1 error accumulation 1 error compensation 1 error localization 1 ethical reasoning 1 evaluation methodology 1 evaluation metric 1 evaluation pipeline 1 evaluation protocols 1 evaluation reliability 1 evaluation robustness 1 event forecasting 1 evidence grounding 1 evidence localization 1 evidence retrieval 1 evidence synthesis 1 evidence-based reasoning 1 evidence-seeking 1 evolutionary algorithms 1 evolutionary search 1 execution environments 1 experience generation 1 experience inheritance 1 experience replay 1 experience synthesis 1 experience-guided learning 1 expert assessment 1 expert iteration 1 expert-amateur contrast 1 expert-level reasoning 1 exploration bottleneck 1 exploration dynamics 1 exploration enhancement 1 exploration stability 1 exploration strategies 1 exploration techniques 1 exploration-exploitation 1 exposure-aware 1 f-divergence 1 facial animation 1 fact-checking 1 factor mining 1 factual accuracy 1 factual alignment 1 factual correction 1 factual qa 1 factual recall 1 factuality detection 1 failure taxonomy 1 fairness 1 fairness assessment 1 fast decoding 1 feature aggregation 1 feature agnostic 1 feature guidance 1 feature matching 1 federated learning 1 feed-forward editing 1 feedback 1 few-shot adaptation 1 few-step generation 1 fidelity metrics 1 financial auditing 1 financial reasoning 1 financial services 1 fine-grained action 1 fine-grained assessment 1 fine-grained classification 1 fine-grained detection 1 fine-grained evaluation 1 fine-tuning degradation 1 fine-tuning strategies 1 
first frame analysis 1 fisher information 1 flow environment 1 flow maps 1 flow models 1 flow-based models 1 fmri reconstruction 1 foot contact estimation 1 force estimation 1 foreground-background fusion 1 formal analysis 1 formal reasoning 1 forward learning 1 foundation policy 1 fpga acceleration 1 frame selection 1 frequency-domain generation 1 function calling 1 functional shifts 1 fusion mechanism 1 gaap compliance 1 game engines 1 game theory 1 garment registration 1 garment transfer 1 gaze-guided reasoning 1 generalist agent 1 generalization failure 1 generation assessment 1 generative judges 1 generative model 1 generative policies 1 generative prior 1 generative reasoning 1 generative recommendation 1 generative testing 1 genetic algorithm 1 genetic algorithms 1 geolocalization 1 geometric consistency 1 geometric deep learning 1 geometric dense prediction 1 geometric reconstruction 1 geometric regularization 1 geometry 1 geometry learning 1 geometry preservation 1 geometry representation 1 geometry solver 1 geometry-aware 1 geometry-aware generation 1 german language data 1 gnn comparison 1 goal tracking 1 gpt evaluation 1 gradient conflict 1 gradient descent 1 gradient reparameterization 1 graph neural networks 1 graph optimization 1 graph understanding 1 graph-structured pruning 1 graphic design automation 1 graphics pipeline 1 graphics-ready scenes 1 ground-aware features 1 grounded language models 1 group influence 1 group preferences 1 groupwise ranking 1 gui automation 1 gui navigation 1 gumbel-softmax 1 hadamard 1 hadamard transform 1 hair cards 1 hallucination 1 hallucination benchmark 1 hard instance mining 1 hardware acceleration 1 hardware optimization 1 hardware-aware design 1 harmful content detection 1 harmonized system 1 hate speech detection 1 head-tail rebalancing 1 headline generation 1 health indicators 1 healthcare ai 1 hessian optimization 1 heterogeneous acceleration 1 heterogeneous agents 1 heterogeneous hardware 1 heuristic search 
1 hgemm 1 hidden embeddings 1 hierarchical context 1 hierarchical framework 1 hierarchical rules 1 hierarchical summarization 1 high-frequency trading 1 high-resolution synthesis 1 high-resolution video 1 histopathology 1 history context 1 hit identification 1 hospital operations 1 html elements 1 human activities 1 human animation 1 human cognition 1 human demonstrations 1 human mesh recovery 1 human motion evaluation 1 human movement analysis 1 human pose animation 1 human preference 1 human priors 1 human reenactment 1 human-agent collaboration 1 human-agent interaction 1 human-ai collaboration 1 human-ai interaction 1 human-centric video 1 human-in-the-loop 1 human-robot coordination 1 human-robot interaction 1 humanoid robots 1 humanoid teleoperation 1 hybrid architecture 1 hybrid interaction 1 hybrid rewards 1 ideation diversity 1 identity-consistent generation 1 illumination control 1 image classification 1 image consistency 1 image customization 1 image deblurring 1 image diffusion 1 image inpainting 1 image inversion 1 image manipulation 1 image personalization 1 image processing 1 image quality assessment 1 image reconstruction 1 image refinement 1 image relighting 1 image statistics 1 image understanding 1 image-text alignment 1 image-to-3d 1 image-to-image translation 1 imaginative scenarios 1 implicit operators 1 implicit reasoning 1 importance sampling 1 in-context conditioning 1 in-context generation 1 in-hand rotation 1 in-tool learning 1 inconsistency analysis 1 indic languages 1 industrial inspection 1 inference compute 1 inference latency 1 inference scaling 1 inference stability 1 inference-time adaptation 1 inference-time framework 1 inference-time manipulation 1 inference-time optimization 1 inference-time processing 1 inference-time scaling 1 infinite environments 1 infinite-length video 1 influence functions 1 influence maximization 1 information gain 1 information loss 1 information theory 1 information-seeking 1 infrastructure 1 innovation 
evaluation 1 inpainting 1 input reformulation 1 inspiration engine 1 instance segmentation 1 instruction diversity 1 instruction grounding 1 instruction optimization 1 instruction reasoning 1 instruction routing 1 instruction-based control 1 instruction-conditioned generation 1 instruction-driven 1 instruction-guided 1 instruction-guided video editing 1 instructional scaffolding 1 intelligence measurement 1 intent recognition 1 intention modeling 1 interaction rewards 1 interaction scaling 1 interactive code generation 1 interactive editing 1 interactive exploration 1 interactive generation 1 interactive poses 1 interactive reasoning 1 interactive recommendation 1 interactive reinforcement learning 1 interactive video 1 interactive world model 1 interactivity 1 intermediate images 1 intermediate reasoning 1 internal activations 1 internal knowledge 1 internal representations 1 internal states 1 interoperability 1 interruptibility 1 intrinsic reward 1 intuitive physics 1 inverse graphics 1 inverse problems 1 inversion process 1 invertible networks 1 ip-adapter 1 item response theory 1 iterative planning 1 iterative reasoning 1 javascript 1 joint denoising 1 joint parameter estimation 1 joint training 1 judge models 1 judge reliability 1 kernel optimization 1 keyword graphs 1 kinematic parts 1 kinematic synthesis 1 kl divergence 1 knowledge agents 1 knowledge compression 1 knowledge editing 1 knowledge extraction 1 knowledge graph construction 1 knowledge interaction 1 knowledge networks 1 knowledge reasoning 1 knowledge recall 1 knowledge retention 1 knowledge traversal 1 knowledge validation 1 knowledge-intensive evaluation 1 kolmogorov-arnold network 1 korean language 1 kubernetes 1 kv cache compression 1 kv-cache management 1 label fusion 1 language grounding 1 language guidance 1 language model integration 1 language of thought 1 language understanding 1 language-driven generation 1 language-guided policies 1 language-spatial mapping 1 laplacian eigenfunctions 1 
Papers updated in the last 24 hours: 53 · Papers updated in the last 72 hours: 136 · Latest update: UltraImage: Rethinking Resolution Extrapolation in Image Diffusion Transformers (12-06 15:03)
🎯 Personalized Recommendations

The latest content, recommended automatically based on the paper topics you are interested in

📄

2511.23127

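🤖 System
12-03 15:03

DualCamCtrl: Dual-Branch Diffusion Model for Geometry-Aware Camera-Controlled Video Generation


1️⃣ One-sentence summary

This paper proposes DualCamCtrl, a dual-branch framework that generates color and depth video together and fuses the two through a semantically guided mechanism. It markedly improves the accuracy and geometric consistency of video generated along a specified camera trajectory, reducing camera motion error by more than 40% compared with previous methods.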


📄

2512.03040

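🤖 System
12-03 15:03

Video4Spatial: Towards Visuospatial Intelligence with Context-Guided Video Generation


1️⃣ One-sentence summary

This paper proposes Video4Spatial, a framework showing that a video generation model trained only on video data can understand complex spatial relationships much as a person does, and can complete tasks that require spatial reasoning, such as scene navigation and object localization.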


📄

2512.01248

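🤖 System
12-03 14:59

TRivia: Self-supervised Fine-tuning of Vision-Language Models for Table Recognition


1️⃣ One-sentence summary

This paper proposes TRivia, a self-supervised fine-tuning method that lets vision-language models learn to recognize and structure tables directly from large collections of unlabeled table images, with no human-annotated data. The authors use it to train TRivia-3B, an open-source model that outperforms existing state-of-the-art systems.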


📄

2511.22609

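🤖 System
12-03 14:59

MG-Nav: Dual-Scale Visual Navigation via Sparse Spatial Memory


1️⃣ One-sentence summary

This paper proposes MG-Nav, a dual-scale visual navigation framework that unifies global path planning and local obstacle-avoidance control through a compact sparse spatial memory graph. It requires no scene-specific training and navigates efficiently and robustly in unfamiliar environments.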


📄

2511.15948

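🤖 System
12-03 14:52

Click2Graph: Interactive Panoptic Video Scene Graphs from a Single Click


1️⃣ One-sentence summary

This paper proposes Click2Graph, an interactive framework in which the user only needs to click or draw a box around one target in a video. The system then tracks that target automatically, finds the other objects it interacts with, and infers the relationships among them, producing a structured video scene graph that is easy to understand and control.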


📄

2512.01481

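🤖 System
12-02 16:47

ChronosObserver: Taming 4D World with Hyperspace Diffusion Sampling


1️⃣ One-sentence summary

This paper proposes ChronosObserver, a training-free method that builds a "world state hyperspace" to represent the spatiotemporal constraints of a 4D scene and uses that hyperspace to synchronize the diffusion sampling trajectories of multiple views. It directly generates high-fidelity, 3D-consistent, time-synchronized multi-view video without any additional training or fine-tuning of existing diffusion models.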


📄

2511.19990

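🤖 System
12-02 15:50

OmniRefiner: Reinforcement-Guided Local Diffusion Refinement


1️⃣ One-sentence summary

This paper proposes OmniRefiner, a two-stage image refinement framework that combines diffusion models with reinforcement learning. It addresses the difficulty existing methods have in preserving fine textures and visual consistency when editing generated images against a reference image.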


📄

2511.13944

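🤖 System
12-02 15:50

Find the Leak, Fix the Split: Cluster-Based Method to Prevent Leakage in Video-Derived Datasets


1️⃣ One-sentence summary

This paper proposes a cluster-based frame selection strategy: visually similar video frames are grouped before the data is divided into training, validation, and test sets. This prevents information leakage between the sets and makes each resulting split more representative, better balanced, and more reliable.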
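The cluster-then-split idea in the summary above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the paper's method: the greedy cosine-similarity clustering, the `threshold` value, the split ratios, and the function names `cluster_frames` and `split_by_cluster` are all hypothetical. The only point shown is that whole clusters, never individual frames, are assigned to a split.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def cluster_frames(features, threshold=0.9):
    """Greedy clustering (an illustrative stand-in for any clustering method).

    A frame joins the first cluster whose representative (its first member)
    is at least `threshold` similar; otherwise it starts a new cluster.
    """
    clusters = []
    for i, feat in enumerate(features):
        for members in clusters:
            if cosine(features[members[0]], feat) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def split_by_cluster(clusters, ratios=(0.8, 0.1, 0.1), seed=0):
    """Assign whole clusters to train/val/test so similar frames never straddle splits."""
    names = ("train", "val", "test")
    total = sum(len(c) for c in clusters)
    targets = {name: ratio * total for name, ratio in zip(names, ratios)}
    splits = {name: [] for name in names}

    order = list(clusters)
    random.Random(seed).shuffle(order)   # randomize ties between equal-size clusters
    order.sort(key=len, reverse=True)    # place large clusters first (stable sort)

    for members in order:
        # Put the cluster into the split that is furthest below its target size.
        name = max(names, key=lambda n: targets[n] - len(splits[n]))
        splits[name].extend(members)
    return splits


if __name__ == "__main__":
    # Toy 2-D "features": frames 0/1 and frames 2/3 are near-duplicates.
    feats = [[1.0, 0.0], [0.99, 0.01], [0.0, 1.0], [0.01, 0.99], [1.0, 1.0]]
    groups = cluster_frames(feats)
    print(groups)
    print(split_by_cluster(groups))
```

In practice the feature vectors would come from an image encoder, and the split sizes only approximate the requested ratios because clusters are indivisible.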


📄

2511.13344

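🤖 System
12-02 15:49

YOLO Meets Mixture-of-Experts: Adaptive Expert Routing for Robust Object Detection


1️⃣ One-sentence summary

This paper proposes a new object detection method that combines multiple YOLOv9-T models into a mixture-of-experts system and lets the network choose the most suitable expert for different image features, so that it identifies and localizes objects more accurately than a single model.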
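As a rough illustration of what "letting the network choose an expert" can mean, here is a generic softmax-gated top-k routing sketch. It is not the paper's router: the gate logits, the plain-list expert outputs, and the function names `softmax` and `route` are hypothetical, and a real detector would route feature maps or detection heads rather than small vectors.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of gate logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route(gate_logits, expert_outputs, top_k=1):
    """Mix the outputs of the top-k experts, weighted by renormalized gate scores.

    gate_logits:    one score per expert (in a real model, produced by a small gating network)
    expert_outputs: one equal-length output vector per expert
    Returns (indices of the chosen experts, mixed output vector).
    """
    weights = softmax(gate_logits)
    chosen = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)[:top_k]
    norm = sum(weights[i] for i in chosen)

    mixed = [0.0] * len(expert_outputs[0])
    for i in chosen:
        w = weights[i] / norm  # renormalize over the selected experts only
        for d, value in enumerate(expert_outputs[i]):
            mixed[d] += w * value
    return chosen, mixed


if __name__ == "__main__":
    logits = [0.1, 2.0, -1.0]                      # hypothetical gate scores for 3 experts
    outputs = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]  # hypothetical per-expert outputs
    print(route(logits, outputs, top_k=1))
    print(route(logits, outputs, top_k=2))
```

With `top_k=1` the router behaves as a hard selector; larger values blend several experts, which trades compute for smoother behavior.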


📄

2511.13276

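🤖 System
12-02 15:49

Recognition of Abnormal Events in Surveillance Videos using Weakly Supervised Dual-Encoder Models


1️⃣ One-sentence summary

This paper proposes a weakly supervised method that needs only video-level labels. By combining the strengths of convolutional and Transformer networks, it detects rare and diverse abnormal behavior in surveillance video and achieves strong performance on standard datasets.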