Rex W. Douglass PhD 1 2

[@RexDouglass]

3/30/2020

[Working Draft. COMMENTS WELCOME!]3

Introduction

How should non-epidemiologists publicly discuss COVID-19 data and models? When leaders and citizens are especially sensitive to signals on public health, what is our intellectual responsibility to defer to the analysis of more expert speakers? I argue that our responsibility during a crisis is the same as it was before: to do good work, to the best of our abilities, with the scientific principles of curiosity and honesty. Alternative shorthands like ‘staying in your lane’ are a poor decision rule for sorting good work from bad, and they ignore the very messy process that underlies real-world scientific inquiry. Lane-keeping is a poor way to learn and become a better consumer of expert findings, and gate-keeping is a missed opportunity to provide the public goods of feedback and review. To demonstrate the point, this note provides a detailed review of a recent piece, “Coronavirus Perspective” (Epstein 2020a). By applying and illustrating data science principles point for point to this non-epidemiological take on epidemiological questions, it is hoped that the reader will take away not why they should avoid working on new topics but rather how they should approach those topics in an honest, curious, and rigorous way.

Epstein (2020a, 2020b)

Epstein (2020a) argues that the U.S. ought to shift from a loose shelter-in-place style quarantine to a more limited shelter in place for just vulnerable populations. He provides two primary rationales. First, the number of cases and number of deaths both in the U.S. and worldwide are likely to be small. Second, mortality for those under 60 is relatively low. Together, the two ideas suggest that restricting all groups is overkill and that some weaker compromise position is preferred. He reiterates this position in a number of interviews. In a follow-up piece, Epstein (2020b) doubles down on the core argument that the direct health costs of the disease will be moderate and a weakened response should be preferred: “allowing the virus to run its course—is a better path forward for the economy.” After the U.K. briefly flirted with this approach before rejecting it, the option is being circulated publicly at the federal level in the U.S., and this text specifically is reportedly popular among some U.S. policy makers and referred to as a competing projection. For that reason it serves as a prime case for considering how to think about non-epidemiologists talking about scientific questions that are decidedly out of their lane.

Lesson 1: Actually Care About the Answer to a Question

Epstein (2020a) frames itself as being contrarian rather than curious about the true state of the world.

Much of the current analysis does not explain how and why rates of infection and death will spike, so I think that it is important to offer a dissenting voice.

These are deeply contrarian estimates.

Perhaps my analysis is all wrong, even deeply flawed. But the stakes are too high to continue on the current course without reexamining the data and the erroneous models that are predicting doom.

Science is about being curious about the true state of the world and, through the application of evidence and methods, forming truer beliefs than we held the day before. Contrarianism is not a search for truth; it’s a search for political influence in a market that rewards diversity of opinion for diversity’s sake. Performative controversy, fake horse races, hypotheses that don’t follow from theory, no examination of model fit or out-of-sample performance, and so on, are immediate red flags that the author doesn’t actually care what the right answer is.

As a consumer of analysis, the second I can tell the author doesn’t actually care about the answer to the underlying question, they’re dead to me.

As a producer of analysis, the struggle is how to think about and do science alongside actors who generate controversy out of self-interest using a lot of the same language as science. The only real solution is to learn how to tell good work from bad work no matter the wrapping.

Lesson 2: Pose a Question and Propose a Research Design that Can Answer It

Instead of making an assertion, we should present Epstein’s idea as a concrete research question: What will the number of deaths from COVID-19 in the United States be by, say, September 1? To be concrete, here are our outcomes: confirmed COVID-19 cases (red) and deaths (black), compiled by Johns Hopkins CSSE.

To make this easier to compare across time and across countries, let’s log transform the outcome and change the date to the number of days since the 100th reported case. This puts our forecasting horizon at about 180 days from the start of the U.S. episode.
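The transformation described above can be sketched in a few lines. This is a minimal illustration, not the code behind the figures: the function name `align_since_threshold` and the toy counts are hypothetical, the base-10 log is an assumption (the text only says "log transform"), and real input would come from the Johns Hopkins CSSE time series.

```python
import math
from datetime import date


def align_since_threshold(series, threshold=100):
    """Re-index a cumulative series to days since it first reached `threshold`.

    `series` is a list of (date, cumulative_count) pairs sorted by date.
    Returns (days_since_threshold, log10(count)) pairs; the base-10 log is
    an assumption here, and any base gives the same shape.
    """
    start = next((d for d, n in series if n >= threshold), None)
    if start is None:
        return []  # the series never reached the threshold
    return [((d - start).days, math.log10(n)) for d, n in series if d >= start]


# Made-up counts for illustration only, not real data.
toy_series = [
    (date(2020, 3, 1), 40),
    (date(2020, 3, 2), 80),
    (date(2020, 3, 3), 100),
    (date(2020, 3, 4), 200),
    (date(2020, 3, 5), 1000),
]
aligned = align_since_threshold(toy_series)
```

Aligning on a shared threshold is what lets episodes that began on different calendar dates be drawn on the same horizontal axis.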

There are two immediate things to take away. First, we are interested specifically in deaths, and are forced to understand the spread of all cases incidentally, as a means to understand deaths. Second, our forecasting horizon is far off. A lot can happen between now and then, and experts have wildly varying expectations about what will actually happen in this window. Even though there is a great deal of expert certainty about the underlying mechanics, what will happen, or more precisely what we will choose to let happen, is unknown.

Lesson 3: Use Failures of Your Predictions to Revise your Model

In the first draft of the piece, dated and posted March 16, 2020, Epstein (2020a) predicts the following about future counts of deaths:

From this available data, it seems more probable than not that the total number of cases world-wide will peak out at well under 1 million, with the total number of deaths at under 50,000 (up about eightfold). In the United States, if the total death toll increases at about the same rate, the current 67 deaths should translate into about 500 deaths at the end. Of course, every life lost is a tragedy—and the potential loss of 50,000 lives world-wide would be appalling—but those deaths stemming from the coronavirus are not more tragic than others, so that the same social calculus applies here that should apply in other cases.

This is great. It makes a sharp testable prediction that we can use to validate in a timely manner a radical alternate model of disease spread.4

When the fatality number passed 500, Epstein edited the online copy of the original March 16th piece to read 5,000 instead and added a footnote:

From this available data, it seems more probable than not that the total number of cases world-wide will peak out at well under 1 million, with the total number of deaths at under 50,000 (up about eightfold). In the United States, if the total death toll increases at about the same rate, the current 67 deaths should reach about 5000 (or twn percent of my estimated world total, which may also turn out to be low). [See correction & addendum at the end of this essay.]5

Correction & Addendum, added March 24, 2020: That estimate is ten times greater than the 500 number I erroneously put in the initial draft of the essay, and it, too, could prove somewhat optimistic. But any possible error rate in this revised projection should be kept in perspective. The current U.S. death toll stands at 592 as of noon on March 24, 2020, out of about 47,000 cases. So my adjusted figure, however tweaked, remains both far lower, and I believe far more accurate, than the common claim that there could be a million dead in the U.S. from well over 150 million coronavirus cases before the epidemic runs its course.

He then published a follow-up note saying he had really meant to type 2,500 the first time.

In my column last week, I predicted that the world would eventually see about 50,000 deaths from the novel coronavirus, and the United States about 500. These two numbers are clearly not in sync. If the first number holds, the total US deaths should be about 4 to 5 percent of that total, or about 2,000–2,500 deaths. The current numbers are getting larger, so it is possible both figures will move up in a rough proportion from even that revised estimate.

This is not great. It alters the prediction without altering the logic that produced it. Multiplying the then-global death count by 8 gives the global prediction of 50,000, and multiplying the then-U.S. death count of 67 by 8 gives 536, hence the forecast of 500. Likewise, the 50,000 global total is left unchanged. Simply adding zeros to the prediction every time it is proven wrong doesn’t alter the underlying model given in the same sentence.
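A quick arithmetic check makes the mismatch concrete. It assumes, as the quoted passage suggests, that the forecast is simply the current count times a single multiplier; the function and constant names are hypothetical.

```python
def scale_forecast(current_deaths, multiplier):
    """Project a final toll by scaling today's count by a fixed multiplier."""
    return current_deaths * multiplier


EIGHTFOLD = 8          # "up about eightfold" in the quoted passage
US_DEATHS_THEN = 67    # U.S. deaths cited in the March 16 draft

# 67 * 8, which the draft rounds to "about 500".
us_forecast = scale_forecast(US_DEATHS_THEN, EIGHTFOLD)

# Multiplier the revised 5,000 figure would need from the same starting count.
revised_multiplier = 5_000 / US_DEATHS_THEN
```

Under that reading, the revised 5,000 figure corresponds to a multiplier of roughly 75 rather than 8, while the global total stayed at the value implied by the eightfold rule.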

Don’t do this. When you feel comfortable enough to share your predictions publicly, create a concrete record and stick by it. You can always make new predictions based on new models, but don’t go back and massage past predictions after the fact. The temptation to try to gaslight others (and yourself) that you were really right the entire time is too great.6

We can visually examine this prediction in light of the data up to now, adding global counts of cases and deaths, and marking the predicted maximums in the March 16th draft and then the revised March 24th draft.

These forecasts, in light of the actual data trajectory, should immediately give you pause. For the United States, the forecast would require an imminent departure from the current exponential trend, one that wasn’t yet visible in either the confirmed-case or death trends. The U.S. trend in deaths was actually accelerating slightly here.

For the world, growth starts off exponential, levels off, and then goes exponential again as the disease reaches a new part of the world. For Epstein’s 50k estimate to be true, the trend in Europe and the U.S. would have to start leveling off now, and there would have to be no exponential growth when the disease fully hits Latin America, Africa, and Southeast Asia. As of this writing the world is already three quarters of the way to that prediction of a million confirmed cases. It’s not that these outcomes aren’t possible; it’s that it’s unclear what in the time trend up until now, or in the theory, suggests that’s what is about to happen.

Lesson 4: Form Meaningful Prior Beliefs with a Thorough Literature Review

Thanks to the growing revolution in Open Science and preprint outlets like medrxiv.org and arxiv.org, the barrier to performing a decent review of recent COVID-19 research is membership in an academic institution with Google Scholar access, like Starbucks, Southwest Airlines, or your bathroom.

Don’t Use Straw Men to Represent the State of the Art.

The piece begins with an incorrect and, on its face, unlikely summary of the current state of the art:

Right now, the overwhelming consensus, based upon the most recent reports, is that the rate of infection will continue to increase so that the most severe interventions are needed to control what will under the worst of circumstances turn into a high rate of death.

Much of the current analysis does not explain how and why rates of infection and death will spike, so I think that it is important to offer a dissenting voice.

Only two citations are provided. The first is an educational infographic in the opinion section of the NYT. It includes a direct quote from the scientific consultants warning readers not to interpret the model as a production forecast:

“The point of a model like this is not to try to predict the future but to help people understand why we may need to change our behaviors or restrict our movements, and also to give people a sense of the sort of effect these changes can have,”

The second is a Medium blog post by an MBA/engineer who runs an education website; it collects a number of plots and infographics on COVID-19 from around the web.

If these two examples are representative of something other than the scientific state of the art, e.g. media commentary, then they should be cited specifically in that context and then followed by the long list of actual state of the art academic research that ought to be in the popular discourse. Instead, these two sources are presented as straw men.

Don’t claim the state of the art ignores factors that it actually takes into account.

Epstein’s central criticism is that the epidemiological models at the heart of the scientific consensus share an absurd and incorrect assumption that the rate of the spread of the disease, R0, remains constant over time. R0 is the number of persons an infected person is likely to pass the disease on to, where above 1 means the disease will accelerate in growth in the population, 1 means it will remain constant, and below 1 means it will slowly die out. The R0 for COVID-19 is estimated to lie somewhere between 2 and 3 at the start of an outbreak in an area.
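As a rough illustration of those three regimes, the sketch below treats each infected person as producing R0 new infections per generation and ignores everything else (immunity, interventions, population limits). The function name and the example values are hypothetical, chosen only to show the threshold at 1.

```python
def cases_by_generation(r0, generations, seed=1.0):
    """Expected new cases in each generation if every case infects r0 others.

    A deliberately crude picture: it ignores depletion of susceptibles and
    any change in behavior, so it only illustrates the threshold at 1.
    """
    return [seed * r0 ** g for g in range(generations + 1)]


growing = cases_by_generation(2.0, 10)  # doubles each generation
steady = cases_by_generation(1.0, 10)   # stays at the seed value
fading = cases_by_generation(0.5, 10)   # halves each generation
```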

He believes that state of the art epidemiological models predict very high infection and death rates because they erroneously set R0 as a fixed constant and do not adjust it downward over time as individuals and governments take action to reduce the spread.

Plainly, no. That is not what state of the art epidemiological models do, and that’s not why they reach high predictions of cases and deaths.

The easiest way to tell that that interpretation is wrong is to simply think through what a time-constant R0 greater than 1 implies. It would mean the disease’s progression never slows down and the entire population of 330 million Americans eventually catches it, with a fatality rate of 1% producing 3.3 million fatalities.

The NYT’s infographic he cites suggests 100 million infected and 1 million dead. So R0 mechanically must not remain constant; it must decline from the initial value (R0=2.3 in that model).

The second easiest way to tell that that interpretation is wrong is to actually read the article, which directly specifies a schedule of R0 reduction based on “mild intervention”:

The mild intervention as modeled here is where we are now in the United States: It is a status quo in which some gatherings are canceled and there is promotion of social distancing and work from home, but with inadequate testing and unaddressed supply shortages.

The third easiest way to tell that that interpretation is wrong is to look at any of the COVID-19 pandemic models available online or in published research and see how they handle R0 over time.

For example, COVID-19 Scenarios out of the University of Basel allows you to choose between strong, moderate, weak, none, or a user-customizable schedule of R0 decline. Epidemic Envisioner, in development at MIT, has 7 different possible decay functions for R0 to choose from. The Epidemic Calculator allows you to vary the timing of an intervention and its effect on R0. The Swiss-epidemic-model explicitly compares the effect of reducing transmission rates starting today over a range of values from 0 to 100%.

The agent-based model behind the Imperial College study examines many different adaptive non-government strategies, including case isolation in the home after individual symptoms, voluntary home quarantine if anyone has symptoms, social distancing for those over 70, and social distancing for the entire population. With those in place, it still predicts an 81% infection rate and 2.2 million deaths. The Framework for Reconstructing Epidemiological Dynamics (FRED) out of the University of Pittsburgh is also agent-based and takes into account adaptive responses that drive R0 down over time. Another example is a stochastic transmission model fit to data from Wuhan, which has a time-varying R0. Prem et al. (2020) simulate lifting the control measures in Wuhan with a fixed R0, but with transmission modeled directly by intergenerational mixing as restrictions are lifted over time.

In general, COVID-19 models, certainly models in production, take into account changes in R0 over time. More precisely, the high end models simulate the direct behaviors that lead to transmissions, of which R0 is a summary statistic. Default parameters are usually set to current levels of mitigation. The completely unmitigated case is no longer a relevant counterfactual, as the world now knows about the disease, and the relevant policy choices are between greater and lesser degrees of mitigation.
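To see why the assumed path of R0 matters so much, here is a toy daily-step SIR model run under two schedules. It is a sketch under stated assumptions and not a reimplementation of any model named above: only the starting value of 2.3 comes from the text, while the five-day infectious period, the seed of 100 cases, the one-year horizon, and the drop to 0.9 after day 30 are all hypothetical.

```python
def attack_rate(r_schedule, population=330e6, infectious_days=5.0,
                days=365, seed=100.0):
    """Share of the population ever infected in a toy daily-step SIR model.

    `r_schedule(day)` returns the reproduction number in force on that day.
    All defaults are illustrative assumptions, not fitted values.
    """
    gamma = 1.0 / infectious_days  # daily recovery rate
    susceptible = population - seed
    infected = seed
    for day in range(days):
        beta = r_schedule(day) * gamma  # daily transmission rate
        new_infections = min(
            susceptible, beta * susceptible * infected / population
        )
        recoveries = gamma * infected
        susceptible -= new_infections
        infected += new_infections - recoveries
    return 1.0 - susceptible / population


def fixed(day):
    return 2.3  # the initial value cited for the NYT model


def declining(day):
    return 2.3 if day < 30 else 0.9  # hypothetical mitigation after day 30


fixed_share = attack_rate(fixed)
declining_share = attack_rate(declining)
```

In this toy setup the declining schedule ends with a far smaller share of the population infected than the fixed one. That is the qualitative point: headline numbers depend heavily on the assumed path of R0, which is why the tools above expose it as an input or model it as an outcome of behavior.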

Don’t get basic facts wrong.

The piece repeats twice the incorrect fact that COVID-19 has a “relatively short (two-week) incubation period.” Even a cursory search would point to estimates of an incubation period of about 5 days, and an even shorter serial interval of about 4.5 days.

Better yet, when starting research on a new area, reading one of the many existing literature reviews for lay audiences would explain the difference between a serial interval and an incubation period, the time until infectious and the time until symptomatic, and that COVID-19 is particularly dangerous precisely because many people can pass on the disease prior to knowing they have it.

Don’t cherry pick data.

Epstein (2020a) chooses to focus on the Case Fatality Rate of South Korea, ostensibly not because its relatively low estimate of 0.92% fits his story, but because its more comprehensive testing makes it more accurate.

It is instructive to see how this analysis fares by taking into account the Korean data, which is more complete than the American data. South Korea has been dealing with the coronavirus since January 20. Since that time, the Korean government has administered a total of 261,335 tests to its citizens. In press releases updated every day, the Korean CDC is reporting (as of March 15) 8,162 total infections against 75 deaths for an overall mortality rate of 0.92 percent.

Selecting on data quality is problematic because South Korea’s draconian testing regime is part of its success in combating COVID-19 and of its low CFR. Looking for your car keys under the streetlight, not because that’s where they are but because that’s where you can see, is a universally problematic approach to scientific inquiry, especially where measurement, governance, and disaster are all strongly related.

Cherry picking a case with good data and good news also misrepresents certainty over the measure. Estimating the Case Fatality Ratio (CFR) is difficult, and the estimate can change in either direction over time as data come in. The CFR may not be knowable for months.

Because of this, actual CFR estimates for COVID-19 vary wildly across countries. There are heroic attempts to combine several different kinds of data, taking undercounting into account, to estimate the CFR for cases we care about like China, and they are still finding CFRs north of 1% (1.6%).
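A small sketch shows how sensitive the measure is to timing. The first calculation reproduces the 0.92 percent figure from the Korean numbers quoted earlier. The second uses a made-up series and a crude lag adjustment, both of which are assumptions for illustration rather than a method taken from the cited estimates.

```python
def naive_cfr(deaths, confirmed):
    """Deaths to date divided by confirmed cases to date."""
    return deaths / confirmed


def lagged_cfr(cum_deaths, cum_cases, lag_days):
    """Deaths to date divided by cases confirmed `lag_days` earlier.

    A crude adjustment for the delay between confirmation and death; the
    choice of lag is an assumption, not an estimate.
    """
    return cum_deaths[-1] / cum_cases[-1 - lag_days]


# Figures from the quoted passage (Korean CDC, as of March 15).
korea_cfr = naive_cfr(75, 8162)

# Made-up cumulative series in which cases double daily, for illustration only.
toy_cases = [100, 200, 400, 800, 1600]
toy_deaths = [0, 1, 2, 4, 8]
toy_naive = naive_cfr(toy_deaths[-1], toy_cases[-1])        # 8 / 1600
toy_lagged = lagged_cfr(toy_deaths, toy_cases, lag_days=2)  # 8 / 400
```

While cases are still growing quickly, many confirmed patients have not yet reached an outcome, so the naive ratio in the toy series is a quarter of the lagged one. Undercounting of mild cases pushes in the opposite direction, which is why the estimate can move either way as data come in.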

Take the consensus explanation seriously.

There are at least two types of fatalities from COVID-19. The first are persons who receive adequate medical care but die anyway. The second are persons who would have survived but received insufficient or degraded medical care.

There is a vein of research on why COVID-19 actually kills you, which in addition to age finds specific vulnerabilities like hypertension, heart disease, diabetes, cardiovascular disease, cancer, chronic respiratory disease, and kidney impairment. The mechanism by which COVID-19 kills includes pneumonia as well as myocardial injury. It may also damage the liver.

Stress on the lungs necessitates oxygen, ventilation, and even intubation, which places an enormous strain on healthcare resources. Hospitalization rates in early U.S. CDC data are 25%, and are currently 12% in New York. A proper literature review would find growing evidence for a relationship between available health resources and mortality. This includes evidence from Italy, which took the start of the disease quite seriously. Further, the early U.S. data show those older than 65 accounting for 80% of deaths but only 45% of hospitalizations and 53% of ICU admissions. Young Americans are more likely to survive, but many still require medical intervention to do so.

This is the cause of the sudden spike in mortality reported in the scientific consensus. It’s why the alternative proposal of sheltering in place only older Americans wouldn’t work. Doing so wouldn’t even remove half of the current burden on the healthcare system, and would instead greatly add to it when most adults under 60 contract the disease at about the same time, and 10% of them head to the hospital at about the same time.

Lesson 5: Don’t Form Strong Prior Beliefs Based on Cherry Picked Data

In setting up our research design, we want to predict U.S. cases but we want to base that prediction on a model fit more broadly on other representative data.

For example, Epstein sets up this small comparison between early U.S. death rates in Washington and peak death rates in Italy or China.

What, then, does all of this portend for the future of COVID-19 in the United States? Good news is more likely than bad, notwithstanding the models that predict otherwise. The deaths in Washington have risen only slowly, even as the number of infections mount.

We should instead phrase this as a question: How does the U.S.’s growth rate in COVID-19 deaths compare to elsewhere? Is early data from one or more states likely to be representative of the U.S. average rate?

To answer this question, we compare each country and each U.S. state side by side, aligning their episodes by the first date each crossed 100 confirmed cases. We fit a linear trend line to the cumulative count of deaths in each country (log transformed). From the slope of that line we can calculate the day over day average percent change in deaths in each location.
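The slope-to-growth-rate step can be sketched as follows. This is a minimal standard-library version, not the code behind the figures: the function name and the toy series are hypothetical, the natural log is an assumption, and `exp(slope) - 1` is one common way to convert a log-linear slope into a day-over-day percent change.

```python
import math


def daily_growth_rate(cumulative_deaths):
    """Average day-over-day growth implied by a log-linear trend.

    Fits ordinary least squares to log(deaths) against day index and
    converts the slope to a fractional change: exp(slope) - 1.
    Days with zero deaths are skipped because log(0) is undefined.
    """
    points = [
        (day, math.log(n)) for day, n in enumerate(cumulative_deaths) if n > 0
    ]
    if len(points) < 2:
        raise ValueError("need at least two days with nonzero deaths")
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return math.exp(slope) - 1.0


# Made-up series, not real data: deaths doubling daily, and growing 10% daily.
doubling = daily_growth_rate([1, 2, 4, 8, 16])
ten_percent = daily_growth_rate([100 * 1.1 ** d for d in range(10)])
```

Applying the same function to every country and state, each aligned to its own day of crossing 100 confirmed cases, yields one comparable growth rate per location.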

Here are the raw data for countries with more than 100 cases, with coloring and trend lines fit to the 4 suggested case comparisons, Italy, China, Washington State, and the U.S. as a whole.

And here is the distribution of average growth rates in deaths in the first 30 days across countries and U.S. states.