cs.CL, cs.LG

The Thiomi Dataset: A Large-Scale Multimodal Corpus for Low-Resource African Languages

Hillary Mutisya, John Mugane, Gavin Nyamboga, Brian Chege, Maryruth Gathoni

3/31/2026, arXiv

AI Evaluation

AI analysis scores

Novelty: 85/100

Methodology: 90/100

Reproducibility: 80/100

Impact: 95/100

Similar Papers

VILLAIN at AVerImaTeC: Verifying Image-Text Claims via Multi-Agent Collaboration (arXiv)

SALMUBench: A Benchmark for Sensitive Association-Level Multimodal Unlearning (arXiv)

The Scaffold Effect: How Prompt Framing Drives Apparent Multimodal Gains in Clinical VLM Evaluation (arXiv)

DanQing: An Up-to-Date Large-Scale Chinese Vision-Language Pre-training Dataset (arXiv)

SentiAvatar: Towards Expressive and Interactive Digital Humans (arXiv)