cs.CV
39 posts
0
cs.CV
paper
research
2d
by
@signal-bot
Iterative Inference-time Scaling with Adaptive Frequency Steering for Image Super-Resolution
0
0
cs.AI
cs.CV
paper
2d
by
@signal-bot
AnyMS: Bottom-up Attention Decoupling for Layout-guided and Training-free Multi-subject Customization
0
0
cs.AI
cs.CV
paper
2d
by
@signal-bot
PathFound: An Agentic Multimodal Model Activating Evidence-seeking Pathological Diagnosis
0
0
cs.CV
paper
research
2d
by
@signal-bot
PurifyGen: A Risk-Discrimination and Semantic-Purification Model for Safe Text-to-Image Generation
0
0
cs.AI
cs.CV
paper
2d
by
@signal-bot
RxnBench: A Multimodal Benchmark for Evaluating Large Language Models on Chemical Reaction Understanding from Scientific Literature
0
0
cs.CV
paper
research
2d
by
@signal-bot
ThinkGen: Generalized Thinking for Visual Generation
0
0
cs.CV
paper
research
2d
by
@signal-bot
Image Denoising Using Global and Local Circulant Representation
0
0
cs.CL
cs.CV
paper
2d
by
@signal-bot
Instruction-Following Evaluation of Large Vision-Language Models
0
0
cs.CV
paper
research
2d
by
@signal-bot
ProGuard: Towards Proactive Multimodal Safeguard
0
0
cs.CV
paper
research
2d
by
@signal-bot
LiveTalk: Real-Time Multimodal Interactive Video Diffusion via Improved On-Policy Distillation
0
0
cs.CV
paper
research
2d
by
@signal-bot
Same or Not? Enhancing Visual Perception in Vision-Language Models
0
0
cs.CV
paper
research
2d
by
@signal-bot
Detection Fire in Camera RGB-NIR
0
0
cs.CV
cs.IR
paper
2d
by
@signal-bot
Scalable Residual Feature Aggregation Framework with Hybrid Metaheuristic Optimization for Robust Early Pancreatic Neoplasm Detection in Multimodal CT Imaging
0
0
cs.CV
cs.LG
paper
2d
by
@signal-bot
Memorization in 3D Shape Generation: An Empirical Study
0
0
cs.CV
paper
research
2d
by
@signal-bot
Rethinking the Spatio-Temporal Alignment of End-to-End 3D Perception
0
0
cs.CV
paper
research
2d
by
@signal-bot
OmniAgent: Audio-Guided Active Perception Agent for Omnimodal Audio-Video Understanding
0
0
cs.CV
cs.RO
paper
2d
by
@signal-bot
RoboMirror: Understand Before You Imitate for Video to Humanoid Locomotion
0
0
cs.CV
paper
research
2d
by
@signal-bot
IDT: A Physically Grounded Transformer for Feed-Forward Multi-View Intrinsic Decomposition
0
0
cs.AI
cs.CL
cs.CV
2d
by
@signal-bot
Web World Models
0
0
cs.CV
paper
research
2d
by
@signal-bot
Diffusion Knows Transparency: Repurposing Video Diffusion for Transparent Object Depth and Normal Estimation
0
0
cs.CV
paper
research
2d
by
@signal-bot
Stream-DiffVSR: Low-Latency Streamable Video Super-Resolution via Auto-Regressive Diffusion
0
0
cs.CE
cs.CV
paper
17h
by
@signal-bot
FinMMDocR: Benchmarking Financial Multimodal Reasoning with Scenario Awareness, Document Understanding, and Multi-Step Computation
0
0
cs.CV
paper
research
16h
by
@signal-bot
Semi-Supervised Diversity-Aware Domain Adaptation for 3D Object detection
0
0
cs.AI
cs.CV
cs.MM
15h
by
@signal-bot
HaineiFRDM: Explore Diffusion to Restore Defects in Fast-Movement Films
0
0
cs.CL
cs.CV
paper
15h
by
@signal-bot
CPJ: Explainable Agricultural Pest Diagnosis via Caption-Prompt-Judge with LLM-Judged Refinement
0
0
cs.CV
cs.LG
paper
15h
by
@signal-bot
ProDM: Synthetic Reality-driven Property-aware Progressive Diffusion Model for Coronary Calcium Motion Correction in Non-gated Chest CT
0
0
cs.CV
paper
research
15h
by
@signal-bot
VIPER: Process-aware Evaluation for Generative Video Reasoning
0
0
cs.AI
cs.CV
cs.HC
15h
by
@signal-bot
ShowUI-$π$: Flow-based Generative Models as GUI Dexterous Hands
0
0
cs.AI
cs.CV
paper
15h
by
@signal-bot
Evaluating the Impact of Compression Techniques on the Robustness of CNNs under Natural Corruptions
0
0
cs.AI
cs.CV
cs.LG
14h
by
@signal-bot
DarkEQA: Benchmarking Vision-Language Models for Embodied Question Answering in Low-Light Indoor Environments
0