Like Polling Data, Modeling Data Has Inherent Limitations When Predicting Outcomes
A recent modeling exercise, the results of which are to be published in the journal Nature, was designed to illustrate health disparities and evaluate the disparate impact of reopening restaurants on economically disadvantaged groups.
Instead, the project drew an ill-advised connection between the modeling exercise and the risk of dining at restaurants. Like polling data, modeling data cannot reliably be used to draw definitive conclusions about the actions of a group of people or to determine outcomes.
The model used anonymized data from mobile applications to study mobility patterns from March through May 2020. Researchers used this approach to study the mechanism behind disparities and to quantify how different reopening strategies impact disadvantaged groups.
Due to the complexities of the real world, predictive models are inherently fraught with error even under the best of circumstances. In this case, the researchers' model seems to fit their data, so they conclude that it is reasonably accurate. However, this is not enough to assert that restaurants are a significant source of risk across the entire United States.
Statement on “Mobility network models of COVID-19 explain inequities and inform reopening”
The National Restaurant Association has identified numerous concerns with the accuracy of the model and the determinations contained in the final report, including the following:
- Did not consider restaurant safety measures – Not all full-service restaurants are equal in relative risk. The model did not include the robust safety protocols used at restaurants throughout the country, including face-covering use and enforcement, changes in process for physical and social distancing (seating, waiting areas, lines, etc.), air flow and exchange management, indoor vs. outdoor dining, and surface disinfecting.
- Predicted transmission vs. actual contact tracing – The report determined “predicted” transmission rates based on modeling only, not on real-world contact tracing.
- Used a limited sample size – The study was conducted in just 10 major metropolitan areas and was not a representative sample of points of interest around the country.
- Unable to account for different types of movement – The study covered a period (March through May 2020) when restaurants in the studied cities were closed to indoor dining for most of that time, serving food for takeout and delivery only. Visits to restaurants to pick up takeout orders (which certainly could be captured erroneously in mobility data as on-premises visits) were not separated out in the points of interest listings.
- Unable to account for different types of restaurants – The points of interest did not distinguish between types of dining establishments or between indoor dining, takeout service, etc. Not all restaurants carry equal risk; all of the management practices that have been published as guidance will reduce individual and population risks.
- Used secondhand datasets – The datasets used to calibrate confirmed case and death counts came from the New York Times and its reporters, who in turn compiled them from public news conferences and public data releases. They were not obtained directly from public health authorities, who frequently update and revise the datasets.
- Did not include mask wearing in the analysis – The two biggest factors identified by the Centers for Disease Control and Prevention and other public health officials, use of face coverings and physical distancing, are not mentioned. The model essentially treated all levels of compliance as equal, whereas experience since the sampling period has shown that many restaurant operators are effectively managing compliance.
- The report states: “The mobility dataset we use has limitations: it does not cover all populations, does not contain all Points of Interest, and cannot capture sub-Census Block Group heterogeneity. Our model itself is also parsimonious, and does not include all real-world features relevant to disease transmission.”
We would not dispute the researchers’ conclusions that disadvantaged individuals had higher mobility during the research period and therefore may have been at higher risk of contracting SARS-CoV-2. However, absent contact tracing and public health data determining who among the groups actually contracted the virus, and without a higher level of specificity as to where they contracted it, a modeling exercise using anonymized location data pulled from mobile devices is not an acceptable way to determine that restaurants were a likely cause of virus transmission.