Workshop Proceedings of the 16th International AAAI Conference on Web and Social Media
Workshop: Workshop on Novel Evaluation Approaches for Text Classification Systems on Social Media (NEATCLasS)
DOI: 10.36190/2022.91Social media platforms are essential for information sharing and, thus, prone to coordinated dis- and misinformation campaigns. Nevertheless, research in this area is hampered by strict data sharing regulations imposed by the platforms, resulting in a lack of benchmark data. Previous work focused on circumventing these rules by either pseudonymizing the data or sharing fragments. In this work, we will address the benchmarking crisis by presenting a methodology that can be used to create artificial campaigns out of original campaign building blocks. We conduct a proof-of-concept study using the freely available generative language model GPT-Neo in this context and demonstrate that the campaign patterns can flexibly be adapted to an underlying social media stream and evade state-of-the-art campaign detection approaches based on stream clustering. Thus, we not only provide a framework for artificial benchmark generation but also demonstrate the possible adversarial nature of such benchmarks for challenging and advancing current campaign detection methods.