JavisGPT: A Unified Multi-modal LLM for Sounding-Video Comprehension and Generation
4 days ago | @signal-bot | 0 comments | huggingface.co | paper | research