Workshop Proceedings of the 19th International AAAI Conference on Web and Social Media

Workshop: #SMM4H-HeaRD 2025: Joint 10th Social Media Mining for Health and Health Real-World Data Workshop and Shared Tasks

DOI: 10.36190/2025.64

Published: 2025-06-05
Gooseek@SMM4H-HeaRD 2025: Detection of Personal Adverse Reactions to Shingles Vaccines in Reddit Posts using Large Language Models
Shichao Feng, Donger Chen, Yuan Li, Bailu Zhang

Traditionally, pharmacovigilance (PV) relies on formal reporting systems or official databases, which often underreport patient experiences. Social media, particularly Reddit, offer valuable resources for real-time detection and tracking of personal adverse reactions, especially in the context of rapidly spreading epidemic diseases. However, extracting meaningful insights from Reddit posts remains challenging due to their inherent noise and unstructured nature. In this study, we leverage the Llama 3.1-8B-Instruct large language model (LLM), enhanced through supervised fine-tuning (SFT) and integrated with few-shot or zero-shot learning strategies using chain-of-thought (CoT) prompts, to automatically identify personal adverse reactions to shingles vaccines from Reddit discussions in the context of Task 6 of the SMM4H-HeaRD 2025 shared task. Experimental evaluations demonstrate the model's capability to accurately and efficiently detect mentions of adverse reactions in Reddit posts, achieving F1-score of 96.3% on the holdout dataset. Additionally, an ablation study reveals that incorporating CoT prompts significantly improves overall performance, whereas few-shot learning somewhat introduces bias. The code is publicly available at https://github.com/ShichaoFeng92/SMM4H_2025_Task6.git