Workshop Proceedings of the 20th International
AAAI Conference on Web and Social Media

Workshop: MisD 2026: The 2nd Workshop on Misinformation Detection in the Era of LLMs

DOI: 10.36190/2026.44

Published: 2026-05-26

Health Misinformation Detection in Hidradenitis Suppurativa Communities Using Multi-Model LLM Ensembles
Jayasimha Shivanna, Shagun Saboo, Drashtiben Ankurbhai Bhavsar, Divya Chaudhary

Health misinformation in online patient communities poses significant risks for individuals managing chronic conditions, yet no detection benchmarks exist for rare dermatological diseases. We present the first unsupervised multi-model framework for detecting health misinformation in Hidradenitis Suppurativa (HS) Reddit communities, requiring zero human annotation. Our framework deploys 28 models across five experimental stages: aspect-based sentiment and emotion analysis using RoBERTa models, supervised classification via auto-labeled RoBERTa-family architectures, unsupervised pattern discovery with UMAP/HDBSCAN clustering, zero-shot detection via NLI classifiers, and prompt-based detection via open-source LLMs including Llama-3, Mistral-7B, Flan-T5, BioMistral, and GatorTron. We introduce the HS Misinformation Index (HSMI), a composite risk metric fusing multi-model consensus, domain-specific keyword heuristics, and contextual sentiment signals. Applied to 9,838 Reddit texts, our pipeline identifies recurring misinformation patterns including cure claims, pseudoscientific narratives, and commercial promotion, along with their emotional correlates. Experiments reveal that NLI classifiers exhibit more conservative detection behavior than LLMs, while a closed-source validation using GPT-4o, Claude Sonnet 4, and Gemini models on a seven-category misinformation taxonomy achieves substantially higher inter-model agreement. Emotion-risk analysis shows that texts expressing disgust and anger carry the highest misinformation risk, while joyful texts carry the lowest. General-purpose models consistently outperform domain-specific ones across all experimental stages. This work establishes the first annotated HS misinformation benchmark and demonstrates that multi-model consensus can reliably surface health misinformation without human labels, providing a scalable template for underserved chronic disease communities.
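
To make the idea of a composite risk metric concrete, the following is a minimal sketch of how three signals of the kind named in the abstract (multi-model consensus, keyword heuristics, and a sentiment signal) could be fused into a single score in [0, 1]. It is not the HSMI formula: the abstract does not state how the signals are combined, so the weighted sum, the weights, the keyword list, and all names (`TextSignals`, `composite_risk`, `RISK_KEYWORDS`) are assumptions made purely for illustration.

```python
from dataclasses import dataclass

# Hypothetical weights: the source does not say how the signals are weighted.
W_CONSENSUS, W_KEYWORD, W_SENTIMENT = 0.5, 0.3, 0.2

# Illustrative keyword list only; not the authors' domain lexicon.
RISK_KEYWORDS = ("cure", "miracle", "detox")


@dataclass
class TextSignals:
    text: str
    model_votes: list[bool]    # True = that model flagged the text
    negative_sentiment: float  # assumed to be pre-scaled to [0, 1]


def consensus(votes: list[bool]) -> float:
    """Fraction of models that flagged the text (0.0 if there are no votes)."""
    return sum(votes) / len(votes) if votes else 0.0


def keyword_score(text: str) -> float:
    """Fraction of the illustrative keywords found as substrings of the text."""
    lowered = text.lower()
    hits = sum(1 for kw in RISK_KEYWORDS if kw in lowered)
    return hits / len(RISK_KEYWORDS)


def composite_risk(signals: TextSignals) -> float:
    """Weighted sum of the three signals, clamped to [0, 1]."""
    score = (
        W_CONSENSUS * consensus(signals.model_votes)
        + W_KEYWORD * keyword_score(signals.text)
        + W_SENTIMENT * signals.negative_sentiment
    )
    return min(1.0, max(0.0, score))


example = TextSignals(
    text="This miracle cure works",
    model_votes=[True, True, False, True],
    negative_sentiment=0.5,
)
risk = composite_risk(example)
```

In this sketch a text flagged by three of four models, matching two of three keywords, and carrying moderate negative sentiment receives a mid-to-high score. A linear combination is only one plausible design; the actual index may normalize or weight its components differently.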