What is the COVID-19 Outcome Risk Assessment system?
Developed by Bayesiant, the COVID-19 Outcome Risk Assessment (CORA) system accurately identifies individuals who are likely to develop symptomatic COVID-19 requiring hospitalization and who face a significantly higher chance of death from COVID-19. The system is unique in its ability to correlate social determinants of health (SDOH) with known COVID-19 risk factors to determine individual risk prior to infection, which allows for broad screening across population segments. For those who are symptomatic, the system evaluates reported symptoms along with other factors to determine the risk of an outcome likely to result in hospitalization or death.
The system classifies risk into four risk categories:
- LOW – The individual is at minimal risk of severe COVID-19
- MEDIUM – The individual has an average risk of severe COVID-19
- HIGH – The individual has a higher-than-average risk of severe COVID-19
- EXTREME – The individual is in the highest risk category for severe COVID-19
A limitation of the system is that it cannot determine risk for individuals under the age of 18. However, the system has determined the risk of severe COVID-19 for approximately 400,000 residents of San Joaquin County and over 21 million Californians.
Why is it important?
Using the system gives public health officials, healthcare administrators and policymakers immediate demographic and geographic insight into who is likely to develop severe COVID-19, so that they can efficiently and proactively:
- Develop equitable testing and vaccination plans based on risk
- Prepare for potential hospitalizations, and inform others of them, with specificity
- Develop and deploy highly targeted and localized public health initiatives to mitigate COVID-19 transmission
- Quantify and document the impact of SDOH in COVID-19 outcomes within specific communities
- Develop effective re-opening strategies based on individual outcome risk as opposed to arbitrary methods
What is it based on?
The ability of the system to predict COVID-19 outcomes is based on actual confirmed cases. The system used advanced machine learning techniques and artificial intelligence to classify and analyze over 11,000 confirmed COVID-19 cases in San Joaquin County, identifying correlations between reported symptoms, social determinants of health and the likelihood of an adverse COVID-19 outcome, defined as severe COVID-19 requiring hospitalization and/or resulting in death.
The resulting model used by the system is known as the San Joaquin County COVID-19 (SJCCOVID) model and has been shown to be highly predictive of outcomes within San Joaquin County.
Is it accurate?
The system currently predicts adverse COVID-19 outcomes with 77% accuracy. We believe that with more COVID-19 case data, the accuracy of the system will improve.
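For readers unfamiliar with the term, the sketch below shows how an accuracy figure of this kind is commonly computed: the share of cases where the predicted outcome matches the observed one. This is an assumption about the metric, since the text does not say how the 77% figure was measured. The function name `accuracy` and the toy labels are invented for illustration and are not drawn from the CORA system or its data.

```python
def accuracy(predicted, actual):
    """Fraction of cases where the predicted outcome matches the observed one."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one case is required")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Toy labels invented for illustration: True means an adverse outcome.
actual = [True, False, False, True, False, False, True, False, False, False]
predicted = [True, False, True, True, False, False, False, False, False, False]

# Two of the ten predictions are wrong (positions 3 and 7).
print(f"{accuracy(predicted, actual):.0%}")  # prints "80%"
```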
How do I get my risk assessment?
Currently, we do not provide assessments to individuals, though this might change. The system is available to public health officials via a secure, web-based application.
How has this been used?
- The San Joaquin County Office of Public Health used the system to identify non-English-speaking individuals at extreme risk and performed educational outreach to them.
- The system correctly identified a disproportionate risk of severe COVID-19 among Hispanic and Latinx individuals prior to an influx of hospitalizations within the community.
- Working with the San Joaquin County Office of Public Health, Lodi Memorial-Adventist Health and El Concilio, the system was used to identify geographic areas of highly concentrated risk for outreach and testing.
How was it developed?
In March 2020, Bayesiant executed a Business Associate Agreement (BAA) with San Joaquin County to provide COVID-19 data analytics and modeling support to the San Joaquin County Office of Public Health. As COVID-19 cases were reported to the county and entered into the California Reportable Disease Information Exchange (CalREDIE) database, COVID-19 case data was provided to Bayesiant for analysis and modeling.
Using advanced machine learning and artificial intelligence, Bayesiant classified and analyzed over 11,000 confirmed COVID-19 cases in San Joaquin County, identifying correlations between reported symptoms, social determinants of health and the likelihood of an adverse COVID-19 outcome, defined as severe COVID-19 requiring hospitalization and/or resulting in death.
To create the SJCCOVID model, 8,801 confirmed cases were isolated from the CalREDIE dataset of San Joaquin County cases, of which 2,227 were cross-matched with the Bayesiant Population Health Information Management System (PHIMS) dataset, a proprietary database comprising over 400 factors that include sociodemographic, psychographic, environmental and SDOH data. Using regression analysis, the relationship between the individual factors of COVID-19 cases and their outcomes was classified and modeled.
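As a rough illustration of what regression analysis on case factors and outcomes can look like, here is a minimal sketch. It assumes logistic regression, which is a common choice when the outcome is binary (adverse or not); the text above says only "regression analysis", so the actual SJCCOVID method may differ. The two features (age scaled to 0-1 and a chronic-condition flag), the toy rows, and all function names are hypothetical and are not taken from CalREDIE or PHIMS.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(weights, x):
    """Estimated probability of an adverse outcome; weights[0] is the bias."""
    z = weights[0] + sum(w * xj for w, xj in zip(weights[1:], x))
    return sigmoid(z)


def fit_logistic(rows, labels, lr=0.5, epochs=5000):
    """Fit weights by batch gradient descent on the average log loss."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            err = predict_proba(weights, x) - y
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights


# Hypothetical toy cases: [age / 100, has_chronic_condition].
rows = [[0.25, 0], [0.30, 0], [0.40, 0], [0.45, 1],
        [0.55, 0], [0.60, 1], [0.70, 1], [0.80, 1]]
labels = [0, 0, 0, 0, 0, 1, 1, 1]  # 1 = adverse outcome (invented)

weights = fit_logistic(rows, labels)
p_low = predict_proba(weights, [0.25, 0])
p_high = predict_proba(weights, [0.80, 1])
print(f"younger, no condition: {p_low:.2f}")
print(f"older, with condition: {p_high:.2f}")
```

The fitted weights indicate how strongly each factor is associated with the outcome in the toy data. A real model would be fitted on far more cases and factors and validated on held-out data.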
Is the data secure?
Per our agreement with San Joaquin County, the data is highly secure, and no personally identifiable information is made available outside of the system or online.
Where does the data come from that is not provided by the county?
The Bayesiant-supplied data comes from our carefully curated proprietary database, which comprises sociodemographic, psychographic, environmental and SDOH data. By agreement, we cannot divulge our third-party data sources, although we do leverage several government databases that have released data publicly.
You mentioned CalREDIE. Aren’t there problems with that data?
Aside from CalREDIE not being designed to handle the volume of COVID-19 cases, the main issue with CalREDIE data tends to be a lack of completeness. For instance, we initially detected that the classification of Hispanic individuals was incorrect and, in many cases, simply missing. With age and ethnicity identified as major factors, we were able to infer ethnicity by providing our system with a list of over 200 Hispanic surnames and then “training” the system to recognize individuals of potential Hispanic ethnicity by identifying those who had been properly classified.
Using this method doubled the number of Hispanic cases within our CalREDIE sample, and the result was verified by the number of hospitalizations in which the individual was of Hispanic ethnicity.
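The surname-based approach described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the actual implementation: it only fills in missing values (it does not correct wrong ones), the two-name seed list stands in for the list of over 200 surnames, and the record layout, the function names and the 0.8 threshold are all hypothetical.

```python
from collections import defaultdict


def learn_surname_rates(records, seed_surnames):
    """For each seed surname, compute the share of labeled records marked Hispanic."""
    counts = defaultdict(lambda: [0, 0])  # surname -> [hispanic, total labeled]
    for rec in records:
        surname = rec["surname"].strip().upper()
        if surname in seed_surnames and rec["ethnicity"] is not None:
            counts[surname][1] += 1
            if rec["ethnicity"] == "Hispanic":
                counts[surname][0] += 1
    return {s: h / t for s, (h, t) in counts.items()}


def infer_missing_ethnicity(records, rates, threshold=0.8):
    """Return copies of the records with missing ethnicity filled in and flagged."""
    result = []
    for rec in records:
        rec = dict(rec, inferred=False)  # copy, so the input is not modified
        surname = rec["surname"].strip().upper()
        if rec["ethnicity"] is None and rates.get(surname, 0.0) >= threshold:
            rec["ethnicity"] = "Hispanic"
            rec["inferred"] = True
        result.append(rec)
    return result


# Hypothetical seed list and records, invented for illustration.
seed_surnames = {"GARCIA", "MARTINEZ"}
records = [
    {"surname": "Garcia", "ethnicity": "Hispanic"},
    {"surname": "Garcia", "ethnicity": "Hispanic"},
    {"surname": "Garcia", "ethnicity": None},
    {"surname": "Martinez", "ethnicity": "Hispanic"},
    {"surname": "Martinez", "ethnicity": None},
    {"surname": "Smith", "ethnicity": None},
]

rates = learn_surname_rates(records, seed_surnames)
filled = infer_missing_ethnicity(records, rates)
inferred_count = sum(rec["inferred"] for rec in filled)
print(f"records inferred as Hispanic: {inferred_count}")
```

Flagging inferred values instead of silently overwriting them keeps it possible to check the inference later against an independent signal, such as the hospitalization counts mentioned above.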