VILLAIN at AVerImaTeC: Verifying Image-Text Claims via Multi-Agent Collaboration
arxiv
SALMUBench: A Benchmark for Sensitive Association-Level Multimodal Unlearning
arxiv
The Scaffold Effect: How Prompt Framing Drives Apparent Multimodal Gains in Clinical VLM Evaluation
arxiv
DanQing: An Up-to-Date Large-Scale Chinese Vision-Language Pre-training Dataset
arxiv
SentiAvatar: Towards Expressive and Interactive Digital Humans
arxiv