DOI: 10.36190/2021.52

Published: 2021-06-01
A Content-based Approach for the Analysis and Classification of Vaccine-related Stances on Twitter: the Italian Scenario
Marco Di Giovanni, Lorenzo Corti, Silvio Pavanetto, Francesco Pierri, Andrea Tocchetti, Marco Brambilla

One year after the outbreak of SARS-CoV-2, several vaccines have been successfully developed to prevent its spread, and vaccine roll-out campaigns are taking place worldwide. However, a growing number of individuals remain hesitant about getting vaccinated, which poses a serious threat to reaching herd immunity. We collect and analyze Italian online conversations about COVID-19 vaccines on Twitter. We define a hashtag-based semi-automatic approach to label large volumes of tweets as supportive or skeptical of the vaccine. We investigate the geographical, temporal and lexical distribution of the data, and we train an accurate binary classifier that predicts the stance of tweets towards vaccines, i.e., it assigns a "Pro-vax" or "No-vax" label. This classification approach can be used, in parallel with other established techniques, to promptly detect and counter the spread of negative and misleading messages about vaccines, helping to ensure higher rates of vaccine uptake.
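
As an illustration of the hashtag-based labeling idea mentioned above, the following is a minimal sketch and not the authors' implementation. The seed hashtags are made-up placeholders (the actual lists are not given in this abstract), and the function name `label_tweet` and the rule of discarding tweets with no seed hashtag or with conflicting ones are assumptions made for the example.

```python
import re

# Placeholder seed hashtags, invented for illustration only.
PRO_VAX_HASHTAGS = {"#provax_example"}
NO_VAX_HASHTAGS = {"#novax_example"}

HASHTAG_RE = re.compile(r"#\w+")


def label_tweet(text):
    """Return "Pro-vax", "No-vax", or None if the tweet cannot be labeled.

    Assumption: a tweet is labeled only when it contains seed hashtags
    from exactly one of the two sets; otherwise it is left unlabeled.
    """
    tags = {tag.lower() for tag in HASHTAG_RE.findall(text)}
    has_pro = bool(tags & PRO_VAX_HASHTAGS)
    has_no = bool(tags & NO_VAX_HASHTAGS)
    if has_pro and not has_no:
        return "Pro-vax"
    if has_no and not has_pro:
        return "No-vax"
    return None  # no seed hashtag, or conflicting hashtags


if __name__ == "__main__":
    for tweet in ["Fatto oggi! #ProVax_Example", "Non mi fido #novax_example", "Buongiorno"]:
        print(tweet, "->", label_tweet(tweet))
```

Tweets labeled this way could then serve as training data for a binary stance classifier; the semi-automatic part, such as manual review of the seed hashtags, is not covered by this sketch.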