Abstract

Objective

To mine Twitter and quantitatively analyze COVID-19 symptoms self-reported by users, compare symptom distributions across studies, and create a symptom lexicon for future research.

Materials and Methods

We retrieved tweets using COVID-19-related keywords, and performed semiautomatic filtering to curate self-reports of positive-tested users. We extracted COVID-19-related symptoms mentioned by the users, mapped them to standard concept IDs in the Unified Medical Language System, and compared the distributions to those reported in early studies from clinical settings.

Results

We identified 203 positive-tested users who reported 1002 symptoms using 668 unique expressions. The most frequently-reported symptoms were fever/pyrexia (66.1%), cough (57.9%), body ache/pain (42.7%), fatigue (42.1%), headache (37.4%), and dyspnea (36.3%) amongst users who reported at least 1 symptom. Mild symptoms, such as anosmia (28.7%) and ageusia (28.1%), were frequently reported on Twitter, but not in clinical studies.

Conclusion

The spectrum of COVID-19 symptoms identified from Twitter may complement those identified in clinical settings.