StructXLIP: Enhancing Vision-language Models with Multimodal Structural Cues
arXiv
Toward Accountable AI-Generated Content on Social Platforms: Steganographic Attribution and Multimodal Harm Detection
arXiv
Lingua-SafetyBench: A Benchmark for Safety Evaluation of Multilingual Vision-Language Models
arXiv
VILLAIN at AVerImaTeC: Verifying Image-Text Claims via Multi-Agent Collaboration
arXiv
SALMUBench: A Benchmark for Sensitive Association-Level Multimodal Unlearning
arXiv