Workshop Proceedings of the 20th International AAAI Conference on Web and Social Media

Workshop: MisD 2026: The 2nd Workshop on Misinformation Detection in the Era of LLMs

DOI: 10.36190/2026.43

Published: 2026-05-26

M4Health: A Multi-Modal, Multi-Domain, Multi-Platform, and Multi-Task Benchmark for Video-Driven Health Communication on Social Media
Raihana Zahra, Zhenghao Gong, Owen Dewing, Nicholas Aurino, Thomas Rife, Cyrus Nikzad, Daniel Rowe, Junyuan Lin, Lanyu Shang

Short-video platforms have transformed how the public consumes health information on the web and social media, where uncurated content poses significant risks to vulnerable populations. While prior work has primarily focused on text-only or single-platform analyses, comprehensive benchmarks for multi-modal health communication in short videos remain limited. In this paper, we introduce M4Health, a multi-modal, multi-domain, multi-platform, and multi-task benchmark for health communication in short videos. M4Health comprises 669,995 videos from TikTok, YouTube Shorts, and Reddit, spanning diverse health domains such as nutrition, fitness, mental health, and wellness. We provide expert annotations for a subset of videos across three interrelated tasks: credibility assessment, AI-generation detection, and theme classification. Extensive benchmarking experiments show that current state-of-the-art models, including task-specific approaches and large vision-language models (LVLMs), achieve suboptimal performance. We will share the M4Health dataset with the research community to foster collaborative research toward supporting informed health decision-making on the web and social media.