VA Covid-19 Biosurveillance NLP

This repository implements a public version of an NLP system for extracting potential positive cases of COVID-19 from clinical text. The system was deployed by the US Department of Veterans Affairs for COVID-19 surveillance and was also presented at the ACL 2020 Emergency COVID-19 Workshop.

The system is implemented as a rule-based spaCy pipeline that can be modified or customized for individual use cases and datasets. The pipeline has three high-level steps:
1. Extract mentions of COVID-19
2. Assert attributes such as confirmed, negation, and experiencer
3. Return a document classification predicting whether the document describes a positive case of COVID-19
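
The three steps above can be sketched in plain Python. This is a simplified illustration, not the package's actual spaCy components: the cue lists, the `Mention` class, and all function names are hypothetical, and sentence-level regular expressions stand in for the rule-based matching that the real pipeline performs.

```python
import re
from dataclasses import dataclass

# Hypothetical, deliberately tiny rule sets used only for illustration.
TARGET = re.compile(r"\b(covid-?19|sars-cov-2|coronavirus)\b", re.IGNORECASE)
POSITIVE_CUES = re.compile(
    r"\b(positive for|tested positive|confirmed|diagnosed with)\b", re.IGNORECASE
)
NEGATION_CUES = re.compile(
    r"\b(no|not|denies|negative for|tested negative|ruled out)\b", re.IGNORECASE
)
OTHER_EXPERIENCER_CUES = re.compile(
    r"\b(wife|husband|mother|father|son|daughter|family member|coworker)\b",
    re.IGNORECASE,
)


@dataclass
class Mention:
    """A COVID-19 mention plus the attributes asserted for it."""
    text: str
    sentence: str
    is_positive: bool = False
    is_negated: bool = False
    is_other_experiencer: bool = False


def extract_mentions(document):
    """Step 1: find COVID-19 mentions, keeping the sentence each one occurs in."""
    mentions = []
    for sentence in re.split(r"(?<=[.!?])\s+", document):
        for match in TARGET.finditer(sentence):
            mentions.append(Mention(text=match.group(0), sentence=sentence))
    return mentions


def assert_attributes(mention):
    """Step 2: set attributes from cue phrases found in the same sentence."""
    mention.is_positive = bool(POSITIVE_CUES.search(mention.sentence))
    mention.is_negated = bool(NEGATION_CUES.search(mention.sentence))
    mention.is_other_experiencer = bool(
        OTHER_EXPERIENCER_CUES.search(mention.sentence)
    )
    return mention


def describes_positive_case(document):
    """Step 3: classify the document from its mentions and their attributes."""
    mentions = [assert_attributes(m) for m in extract_mentions(document)]
    return any(
        m.is_positive and not m.is_negated and not m.is_other_experiencer
        for m in mentions
    )
```

In this sketch a document counts as positive only if at least one mention is asserted positive, is not negated, and refers to the patient rather than someone else. Because cues are scoped to the sentence containing the mention, `describes_positive_case("Denies cough. Tested positive for COVID-19 last week.")` is not affected by the negation in the first sentence.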

The GitHub repository links to the ACL paper and presentation and contains detailed examples and tutorials on how to customize the pipeline.

