Workshop Proceedings of the 19th International AAAI Conference on Web and Social Media

Workshop: NLPSI 2025: First Workshop on Integrating NLP and Psychology to Study Social Interactions

DOI: 10.36190/2025.39

Published: 2025-06-05

The Utility of LLM Text Generation in Longitudinal Psychological Datasets
Jari Zegers, Bennett Kleinberg

As part of this ongoing work, we prompted an LLM with three waves of texts from a longitudinal panel dataset to generate a text for wave 4. We compared generated to ground-truth texts using cosine similarity on embeddings and tested whether text similarity was associated with psychological variables. We found limited evidence for such an association but did find differences in the topics used in generated versus ground-truth texts. An explanation for the differences in text similarity remains the subject of ongoing investigation.
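
The similarity measure named above can be illustrated with a minimal sketch. It assumes each text has already been converted to a fixed-length embedding vector; the embedding model is not specified here, so the three-dimensional vectors below are hypothetical toy values rather than data from the study, and the function name `cosine_similarity` is our own.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical toy embeddings (assumption: real embeddings would come from
# an embedding model and have hundreds of dimensions).
generated_wave4 = [0.2, 0.7, 0.1]
ground_truth_wave4 = [0.25, 0.6, 0.3]

similarity = cosine_similarity(generated_wave4, ground_truth_wave4)
print(f"cosine similarity: {similarity:.3f}")
```

A value near 1 indicates that the two embeddings point in nearly the same direction, and a value near 0 indicates that they are close to orthogonal. One such score per participant could then be related to that participant's psychological variables.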