Rex W. Douglass PhD 1 2

[@RexDouglass]

3/30/2020

[Working Draft. COMMENTS WELCOME!]3

Introduction

How should non-epidemiologists publicly discuss COVID-19 data and models? When leaders and citizens are especially sensitive to signals on public health, what is our intellectual responsibility to defer to the analysis of more expert speakers? I argue that our responsibility during a crisis is the same as it was before: to do good work, to the best of our abilities, with the scientific principles of curiosity and honesty. Alternative shorthands like ‘staying in your lane’ are a poor decision rule for sorting good work from bad, and they ignore the very messy process that underlies real-world scientific inquiry. Lane-keeping is a poor way to learn and become a better consumer of expert findings, and gate-keeping is a missed opportunity to provide the public goods of feedback and review. To demonstrate the point, this note provides a detailed review of a recent piece, “Coronavirus Perspective” (Epstein 2020a). By applying and illustrating data science principles point for point to this non-epidemiological take on epidemiological questions, it is hoped that the reader will take away not why they should avoid working on new topics but rather how they should approach those topics in an honest, curious, and rigorous way.

Epstein (2020a, 2020b)

Epstein (2020a) argues that the U.S. ought to shift from a loose shelter-in-place style quarantine to a more limited shelter in place for just vulnerable populations. He provides two primary rationales. First, the number of cases and number of deaths both in the U.S. and worldwide are likely to be small. Second, mortality for those under 60 is relatively low. Together, the two ideas suggest that restrictions on all groups are overkill and some weaker compromise position is preferred. He reiterates this position in a number of interviews. In a follow-up piece, Epstein (2020b) doubles down on the core argument that the direct health costs of the disease will be moderate and a weakened response should be preferred, “allowing the virus to run its course—is a better path forward for the economy.” After the U.K. briefly flirted with this approach before rejecting it, the option is being circulated publicly at the federal level in the U.S., and this text specifically is reportedly popular among some U.S. policy makers and referred to as a competing projection. For that reason it serves as a prime case for considering how to think about non-epidemiologists talking about scientific questions that are decidedly out of their lane.

Lesson 1: Actually Care About the Answer to a Question

Epstein (2020a) frames itself as being contrarian rather than curious about the true state of the world.

Much of the current analysis does not explain how and why rates of infection and death will spike, so I think that it is important to offer a dissenting voice.

These are deeply contrarian estimates.

Perhaps my analysis is all wrong, even deeply flawed. But the stakes are too high to continue on the current course without reexamining the data and the erroneous models that are predicting doom.

Science is about being curious about the true state of the world, and through application of evidence and methods, forming new more true beliefs than we held the day before. Contrarianism is not a search for truth, it’s a search for political influence in a market that rewards diversity of opinion for diversity’s sake. Performative controversy, fake horse races, hypotheses that don’t follow from theory, no examination of model fit or out of sample performance, and so on, are immediate red flags the author doesn’t actually care what the right answer is.

As a consumer of analysis, the second I can tell the author doesn’t actually care about the answer to the underlying question, they’re dead to me.

As a producer of analysis, the struggle is how to think about and do science alongside actors who generate controversy out of self-interest using a lot of the same language as science. The only real solution is to learn how to tell good work from bad work no matter the wrapping.

Lesson 2: Pose a Question and Propose a Research Design that Can Answer It

Instead of an assertion, we should present Epstein’s idea as a concrete research question: What will the number of deaths from COVID-19 in the United States be by say September 1? To be concrete, here are our outcomes, confirmed COVID-19 cases (red) and deaths (black) compiled by Johns Hopkins CSSE.

To make this easier to compare across time and across countries, let’s log transform the outcome and change date to number of days since the 100th reported case. This puts our forecasting horizon at about 180 days from the start of the U.S. episode.
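The transformation described above can be sketched in a few lines. This is a minimal illustration, not the code behind the original figures: the function name `align_to_threshold` is hypothetical, the choice of base-10 logs is an assumption, and the series is synthetic rather than Johns Hopkins data.

```python
import math
from datetime import date, timedelta


def align_to_threshold(dates, cumulative_cases, threshold=100):
    """Return (days_since_threshold, log10_count) pairs, starting on the
    first day the cumulative count reaches `threshold`."""
    for start, count in enumerate(cumulative_cases):
        if count >= threshold:
            break
    else:
        return []  # the series never reached the threshold
    day_zero = dates[start]
    return [
        ((d - day_zero).days, math.log10(c))
        for d, c in zip(dates[start:], cumulative_cases[start:])
    ]


# Synthetic series, NOT real data: cases doubling daily from 25.
first_day = date(2020, 3, 1)
dates = [first_day + timedelta(days=i) for i in range(6)]
cases = [25, 50, 100, 200, 400, 800]

aligned = align_to_threshold(dates, cases)
for day, log_cases in aligned:
    print(day, round(log_cases, 3))
```

On a log scale, steady exponential growth shows up as a straight line, which is what makes episodes in different countries comparable once they share a common day zero.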

There are two immediate things to take away. First, we are interested specifically in deaths, and are forced to understand the spread of all cases incidentally, as a means to understand deaths. Second, our forecasting horizon is far. A lot can happen between now and then, and experts have wildly varying expectations about what will actually happen in this window. Even though there is a great deal of expert certainty about the underlying mechanics, what will happen, or more precisely what we will choose to let happen, is unknown.

Lesson 3: Use Failures of Your Predictions to Revise your Model

In the first draft of the piece dated and posted March 16, 2020 Epstein (2020a) predicts the following about future counts of deaths:

From this available data, it seems more probable than not that the total number of cases world-wide will peak out at well under 1 million, with the total number of deaths at under 50,000 (up about eightfold). In the United States, if the total death toll increases at about the same rate, the current 67 deaths should translate into about 500 deaths at the end. Of course, every life lost is a tragedy—and the potential loss of 50,000 lives world-wide would be appalling—but those deaths stemming from the coronavirus are not more tragic than others, so that the same social calculus applies here that should apply in other cases.

This is great. It makes a sharp testable prediction that we can use to validate in a timely manner a radical alternate model of disease spread.4

When the fatality number passed 500, Epstein edited the online copy of the original March 16th piece to read 5,000 instead and added a footnote:

From this available data, it seems more probable than not that the total number of cases world-wide will peak out at well under 1 million, with the total number of deaths at under 50,000 (up about eightfold). In the United States, if the total death toll increases at about the same rate, the current 67 deaths should reach about 5000 (or twn percent of my estimated world total, which may also turn out to be low). [See correction & addendum at the end of this essay.]5

Correction & Addendum, added March 24, 2020: That estimate is ten times greater than the 500 number I erroneously put in the initial draft of the essay, and it, too, could prove somewhat optimistic. But any possible error rate in this revised projection should be kept in perspective. The current U.S. death toll stands at 592 as of noon on March 24, 2020, out of about 47,000 cases. So my adjusted figure, however tweaked, remains both far lower, and I believe far more accurate, than the common claim that there could be a million dead in the U.S. from well over 150 million coronavirus cases before the epidemic runs its course.

And then published a follow up note saying he really meant to type 2,500 the first time.

In my column last week, I predicted that the world would eventually see about 50,000 deaths from the novel coronavirus, and the United States about 500. These two numbers are clearly not in sync. If the first number holds, the total US deaths should be about 4 to 5 percent of that total, or about 2,000–2,500 deaths. The current numbers are getting larger, so it is possible both figures will move up in a rough proportion from even that revised estimate.

This is not great. This alters the prediction but does not bother to alter the logic which led to the calculation. Multiplying the then global death count by 8 would lead to a global prediction of 50,000 and so multiplying the then U.S. death count of 67 by 8 would be 536, hence the forecast of 500. Likewise the 50,000 global total is left unchanged. Simply adding zeros to the prediction every time it is proven wrong doesn’t alter the underlying model given in the same sentence.
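The arithmetic behind that point can be made explicit. The sketch below uses only figures quoted above (67 U.S. deaths, the "eightfold" multiple, the 50,000 global total, and the revised 5,000). The last quantity, a global total scaled by the same multiple as the revised U.S. number, is an illustrative extrapolation of mine and not a figure that appears in either piece.

```python
# Figures quoted in the March 16 draft and its March 24 revision.
US_DEATHS_MARCH_16 = 67
GLOBAL_FORECAST = 50_000
MULTIPLIER = 8  # "up about eightfold"
REVISED_US_FORECAST = 5_000

# The stated logic: apply the global growth multiple to the U.S. count.
us_forecast = US_DEATHS_MARCH_16 * MULTIPLIER
print(us_forecast)  # 536, reported as "about 500"

# The revision changes the output but leaves the sentence's logic intact.
implied_multiplier = REVISED_US_FORECAST / US_DEATHS_MARCH_16
print(round(implied_multiplier, 1))  # 74.6

# Global deaths at the time, as implied by "50,000 is about eightfold".
implied_global_then = GLOBAL_FORECAST / MULTIPLIER

# ASSUMPTION (illustrative only): the global total if the revised U.S.
# multiple were applied consistently, roughly 466,000.
consistent_global_forecast = implied_global_then * implied_multiplier
print(round(consistent_global_forecast))

# U.S. share of the unchanged 50,000 under each version of the forecast.
for label, us_total in [("original", us_forecast), ("revised", REVISED_US_FORECAST)]:
    print(label, f"{us_total / GLOBAL_FORECAST:.1%}")
```

The original forecast puts the U.S. at about 1% of the world total, the revision at 10%, and the follow-up note at 4 to 5 percent, all against the same 50,000.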

Don’t do this. When you feel comfortable enough to share your predictions publicly, create a concrete record and stick by it. You can always make new predictions based on new models, but don’t go back and massage past predictions after the fact. The temptation to try to gaslight others (and yourself) that you were really right the entire time is too great.6

We can visually examine this prediction in light of the data up to now, adding global counts of cases and deaths, and marking the predicted maximums in the March 16th draft and then the revised March 24th draft.

These forecasts in light of the actual data trajectory should immediately give you pause. For the United States, it would require a departure very soon from the current exponential trend, one that wasn’t yet visible in either the confirmed case or death trends. The U.S. trend in deaths was actually accelerating slightly here.

For the world, growth starts off exponential, levels off, and then goes exponential again as it reaches a new part of the world. For Epstein’s 50k estimate to be true, the trend in Europe and the U.S. would have to start leveling off now, and there would have to not be an exponential growth when the disease fully hits Latin America, Africa, and South East Asia. As of this writing the world is already three quarters to that prediction of a million confirmed cases. It’s not that these outcomes aren’t possible, it’s that it’s unclear what in the time trend up until now or in the theory suggests that’s what is about to happen.

Lesson 4: Form Meaningful Prior Beliefs with a Thorough Literature Review

Thanks to the growing revolution in Open Science and preprint outlets like medrxiv.org and arxiv.org, the barrier to performing a decent review of recent COVID-19 research is membership in an academic institution with Google Scholar access, like Starbucks, Southwest Airlines, or your bathroom.

Don’t Use Straw Men to Represent the State of the Art.

The piece begins with an incorrect and, on its face, unlikely summary of the current state of the art:

Right now, the overwhelming consensus, based upon the most recent reports, is that the rate of infection will continue to increase so that the most severe interventions are needed to control what will under the worst of circumstances turn into a high rate of death.

Much of the current analysis does not explain how and why rates of infection and death will spike, so I think that it is important to offer a dissenting voice.

Only two citations are provided. The first is an educational infographic in the opinion section of the NYT. It includes a direct quote from the scientific consultants warning against interpreting the model as a production forecast:

“The point of a model like this is not to try to predict the future but to help people understand why we may need to change our behaviors or restrict our movements, and also to give people a sense of the sort of effect these changes can have,”

The second is a Medium blog post by an MBA/engineer who runs an education website, which collects a number of plots and infographics on COVID-19 from around the web.

If these two examples are representative of something other than the scientific state of the art, e.g. media commentary, then they should be cited specifically in that context and then followed by the long list of actual state of the art academic research that ought to be in the popular discourse. Instead, these two sources are presented as straw men.

Don’t claim the state of the art ignores factors that it actually takes into account.

Epstein’s central criticism is that the epidemiological models at the heart of the scientific consensus share an absurd and incorrect assumption that the rate of the spread of the disease, R0, remains constant over time. R0 is the number of persons an infected person is likely to pass the disease on to, where above 1 means the disease will accelerate in growth in the population, 1 means it will remain constant, and below 1 means it will slowly die out. The R0 for COVID-19 is estimated to lie somewhere between 2 and 3 at the start of an outbreak in an area.

He believes that state of the art epidemiological models predict very high infection and death rates, because they erroneously set R0 as a fixed constant, and do not adjust it downward over time as individuals and government take action to reduce the spread.

Plainly, no. That is not what state of the art epidemiological models do, and that’s not why they reach high predictions of cases and deaths.

The easiest way to tell that that interpretation is wrong is to simply think through what a time-constant R0 greater than 1 implies. It would mean the disease’s progression never slows down and the entire population of 330 million Americans eventually catches it, with a fatality rate of 1% producing 3.3 million fatalities.

The NYT’s infographic he cites suggests 100 million infected and 1 million dead. So R0 mechanically must not remain constant, it must decline from the initial value (R0=2.3 in that model).
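The back-of-the-envelope reasoning above can be checked directly. This is a deliberately naive sketch of the "R0 never changes" reading, in which every case always infects the same number of others and nothing slows the spread. It is not how any of the cited models work, and the function name `generations_to_exhaust` is hypothetical. The population, fatality rate, R0, and NYT scenario figures are the ones quoted in the text.

```python
US_POPULATION = 330_000_000
FATALITY_RATE = 0.01  # the 1% used in the text
R0 = 2.3              # initial value in the NYT model, per the text

# If transmission never slows, everyone is eventually infected.
deaths_if_all_infected = round(US_POPULATION * FATALITY_RATE)
print(deaths_if_all_infected)  # 3300000


def generations_to_exhaust(population, r):
    """Count infection generations until cumulative cases reach
    `population`, if every case always infects `r` others."""
    new_cases, total, generation = 1.0, 1.0, 0
    while total < population:
        new_cases *= r
        total += new_cases
        generation += 1
    return generation


print(generations_to_exhaust(US_POPULATION, R0))

# The NYT scenario cited by Epstein stops well short of that.
NYT_INFECTED = 100_000_000
NYT_DEATHS = 1_000_000
share_infected = NYT_INFECTED / US_POPULATION
print(f"{share_infected:.0%} infected, {NYT_DEATHS / NYT_INFECTED:.0%} of those die")
```

Because the cited scenario infects under a third of the population, its transmission rate cannot have stayed at the initial value throughout.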

The second easiest way to tell that that interpretation is wrong is to actually read the article, which directly specifies a schedule of R0 reduction based on “mild intervention”:

The mild intervention as modeled here is where we are now in the United States: It is a status quo in which some gatherings are canceled and there is promotion of social distancing and work from home, but with inadequate testing and unaddressed supply shortages.

The third easiest way to tell that that interpretation is wrong is to look at any of the COVID-19 pandemic models available online or in published research and see how they handle R0 over time.

For example COVID-19 Scenarios out of the University of Basel allows you to choose between strong, moderate, weak, none, or a user customizable schedule of R0 decline. Epidemic Envisioner in development at MIT has 7 different possible decay functions for R0 to choose from. The Epidemic Calculator allows you to vary the timing of an intervention and its effect on R0. The Swiss-epidemic-model explicitly compares the effect of reducing transmission rates starting today over a range of values from 0 to 100%.

The agent-based model behind the Imperial College study examines many different adaptive non-government strategies, including case isolation in the home after individual symptoms, voluntary home quarantine if anyone has symptoms, social distancing for those over 70, and social distancing for the entire population. With those in place, it still predicts an 81% infection rate and 2.2 million deaths. The Framework for Reconstructing Epidemiological Dynamics (FRED) out of the University of Pittsburgh is also agent based and takes into account adaptive responses that drive R0 down over time. Another example is a stochastic transmission model that is fit to data from Wuhan and has a time-varying R0. Prem et al. 2020 simulate lifting the control measures in Wuhan with a fixed R0, but with transmission modeled directly by intergenerational mixing as restrictions are lifted again over time.

In general, COVID-19 models, certainly models in production, take into account changes in R0 over time. More precisely, the high end models simulate the direct behaviors that lead to transmissions, of which R0 is a summary statistic. Default parameters are usually set to current levels of mitigation. The completely unmitigated case is no longer a relevant counterfactual, as the world now knows about the disease, and the relevant policy choices are between greater and lesser degrees of mitigation.

Don’t get basic facts wrong.

The piece repeats twice the incorrect fact that COVID-19 has a “relatively short (two-week) incubation period.” Even a cursory search would point to estimates of an incubation period of about 5 days, and an even shorter serial interval of about 4.5 days.

Better yet, when starting research on a new area, reading one of the many existing literature reviews for lay audiences would explain the difference between a serial interval and an incubation period, the time until infectious and the time until symptomatic, and that COVID-19 is particularly dangerous precisely because many people can pass on the disease prior to knowing they have it.

Don’t cherry pick data.

Epstein (2020a) chooses to focus on the Case Fatality Rate of South Korea, ostensibly not because its relatively low estimate of 0.92% fits his story, but because its more comprehensive testing makes it more accurate.

It is instructive to see how this analysis fares by taking into account the Korean data, which is more complete than the American data. South Korea has been dealing with the coronavirus since January 20. Since that time, the Korean government has administered a total of 261,335 tests to its citizens. In press releases updated every day, the Korean CDC is reporting (as of March 15) 8,162 total infections against 75 deaths for an overall mortality rate of 0.92 percent.

Selecting on data quality is problematic because South Korea’s draconian testing regime is part of its success in combating COVID-19 and its low CFR. Looking for your car keys under the streetlight, not because that’s where they are but because that’s where you can see, is a universally problematic approach to scientific inquiry, especially so where measurement, governance, and disaster are all strongly related.

Cherry picking a case with good data and good news also misrepresents certainty over the measure. Estimating the Case Fatality Ratio CFR is difficult and can change in either direction over time as data come in. The CFR may not be knowable for months.

Because of this, actual CFR estimates for COVID-19 vary wildly across countries. There are heroic attempts to combine several different kinds of data, taking undercounting into account, to estimate the CFR for cases we care about like China, and they are still finding CFRs north of 1% (1.6%).

Take the consensus explanation seriously.

There are at least two types of fatalities from COVID-19. The first are persons who receive adequate medical care but die anyway. The second are persons who would have survived but received insufficient or degraded medical care.

There is a vein of research on why COVID-19 actually kills you, which in addition to age finds specific vulnerabilities like hypertension, heart disease, diabetes, cardiovascular disease, cancer, chronic respiratory disease, and kidney impairment. The mechanism by which COVID-19 kills includes pneumonia as well as myocardial injury. It may also damage the liver.

Stress on the lungs necessitates oxygen, ventilation, and even intubation, which places an enormous strain on healthcare resources. Hospitalization rates in early U.S. CDC data are 25%, and are currently 12% in New York. A proper literature review would find growing evidence for a relationship between available health resources and mortality. It would also find that Italy took the start of the disease quite seriously. Further, the early U.S. data show those older than 65 accounting for 80% of deaths but only 45% of hospitalizations and 53% of ICU admissions. Young Americans are more likely to survive, but many still require medical intervention to do so.

This is the cause for the sudden spike in mortality reported in the scientific consensus. It’s why the alternative proposal of sheltering in place only older Americans wouldn’t work. Doing so wouldn’t even remove half of the current burden on the healthcare system, and would instead greatly add to it when most adults under 60 contract the disease together at about the same time, and 10% of them head to the hospital at about the same time.

Lesson 5: Don’t Form Strong Prior Beliefs Based on Cherry Picked Data

In setting up our research design, we want to predict U.S. cases but we want to base that prediction on a model fit more broadly on other representative data.

For example, Epstein sets up this small comparison between early U.S. death rates in Washington to peak death rates in Italy or China.

What, then, does all of this portend for the future of COVID-19 in the United States? Good news is more likely than bad, notwithstanding the models that predict otherwise. The deaths in Washington have risen only slowly, even as the number of infections mount.

We should instead phrase this as a question, how does the U.S.’s growth rate in COVID-19 deaths compare to elsewhere? Is early data from one or more states likely to be representative of the U.S. average rate?

To answer this question, we compare each country and each U.S. state side by side, aligning their episodes by the first date each crossed 100 confirmed cases. We fit a linear trend line to the cumulative count of deaths in each country (log transformed). From the slope of that line we can calculate the day over day average percent change in deaths in each location.
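A minimal sketch of that calculation follows. The function names are hypothetical, natural logs are an assumption (the text does not say which base was used), and the series is synthetic rather than the data behind the reported rates. With log deaths regressed on days, a slope b converts to an average daily change of exp(b) - 1.

```python
import math


def ols_slope(xs, ys):
    """Slope of the ordinary least squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def daily_growth_percent(days, cumulative_deaths):
    """Average day-over-day percent change implied by a linear trend
    fit to log cumulative deaths (counts must be positive)."""
    log_deaths = [math.log(d) for d in cumulative_deaths]
    slope = ols_slope(days, log_deaths)
    return (math.exp(slope) - 1) * 100


# Synthetic series, NOT real data: 10 deaths growing 20% per day.
days = list(range(10))
deaths = [10 * 1.2 ** t for t in days]

growth = daily_growth_percent(days, deaths)
print(round(growth, 2))  # 20.0
```

Days with zero cumulative deaths have to be dropped before taking logs, which is one reason to start each series only after a case threshold is crossed.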

Here are the raw data for countries with more than 100 cases, with coloring and trend lines fit to the 4 suggested case comparisons, Italy, China, Washington State, and the U.S. as a whole.

And here is the distribution of average growth rates in deaths in the first 30 days across countries and U.S. states.

In answer to the first question, the U.S. growth rate in deaths is 24.48%, which is above the median across countries of 18.8%. The U.S. rate is below those of Italy (29.32%) and China (29.58%), but only just.7

In answer to the second question, Washington State was and continues to be an example of lower growth in death rates (11.7%), which at half of the average U.S. rate is not representative of the country as a whole.8

The Epstein piece goes on to say

The New York cases have been identified for long enough that they should have produced more deaths if the coronavirus was as dangerous as is commonly believed.

I don’t entirely know what this sentence is supposed to mean, but New York’s day on day growth in fatalities is the second highest in the world in these data, just behind Massachusetts at 52.28%.

The lesson here being that intentionally looking for early “good news” means necessarily ignoring actual news relevant to the research question.9

Lesson 6: Be Specific and Concrete About Your Theory

To proceed further, we need to attempt to distill Richard Epstein’s Model of Epidemiology and Disease (hereafter REMED). Getting REMED requires holding several simple but contradictory ideas at the same time.

The first idea is that the one full cycle of COVID-19 spread and decline, China, is representative of future inflection points for other countries. Overlay the China curve, the China curve breaks, so another country’s curve should break similarly too. (CHINA_INFLECTION)

Overlooked is the good news coming out of China, where the latest report shows 16 new cases and 14 new deaths, suggesting that the number of deaths in the currently unresolved group will be lower than the 5.3 percent conversion rate in the cases resolved to date. In my view, we will see a similar decline in Italy, for reasons that I shall outline in the remainder of this article.

In dealing with this point, it is critical to note that the rapid decline in the incidence of new cases and death in China suggests that cases in Italy will not continue to rise exponentially over the next several weeks.

The second is that, in contrast, the high death rate of China is not representative, it’s idiosyncratic and not what should be expected in other countries because of its high rate of smoking and pollution (SMOKERS, POLLUTION).

My own guess is that the percentage of deaths will decline in Korea for the same reasons that they are expected to decline in the United States. It is highly unlikely that there will ever be a repetition of the explosive situation in Wuhan, where air quality is poorer and smoking rates are higher.

Italy’s death rate is also not representative, it’s idiosyncratic and not what should be expected in other countries. (ITALY)10

Moreover, it is unlikely that the healthcare system in the United States will be compromised in the same fashion as the Italian healthcare system in the wake of its quick viral spread.

The third idea is that China’s heavy-handed policy response is not why its growth curve in deaths broke, and presumably neither will Italy’s heavy handed policy response be responsible when it breaks as well. Therefore he argues, the U.S. should not copy them. (POLICY)

As of March 16, the data from the United States falls short of justifying the draconian measures that are now being implemented. As of two days ago, 39 states have declared states of emergency, and they have been joined at the federal level with President Trump’s recent declaration to the same effect. These declarations are meant to endow governments with the power to impose quarantines and travel bans, close schools, restrict public gatherings, shut down major sporting events, stop public meetings, and close restaurants and bars. Private institutions are imposing similar restrictions. The one-two punch of public and private restrictions has caused a huge jolt to the economy.

The irony here is that even though self-help measures like avoiding crowded spaces make abundant sense, the massive public controls do not. In light of the available raw data, public officials have gone overboard.

My own guess is that the percentage of deaths will decline in Korea for the same reasons that they are expected to decline in the United States.

This is despite saying multiple times elsewhere that policy does have an independent effect:

Various institutional measures, both private and public, have also slowed down the transmission rate.

The amount of voluntary and forced separation in the United States has gotten very extensive very quickly, which should influence rates of infection sooner rather than later.

The fourth idea is that the growth and decline curves of COVID-19 are a function of time, and that the disease will naturally burn itself out because of:

  1. Societal Response (ADAPTATION)

But once people are aware of the disease, they will start to make powerful adaptive responses, including washing their hands and keeping their distance from people known or likely to be carrying the infection. Various institutional measures, both private and public, have also slowed down the transmission rate.

  2. Seasonality (SEASONALITY)

And finally, the model explicitly ignores the possibility that the totals will decline as the weather gets warmer.

  3. Natural selection will breed weaker COVID-19 (WEAKENING)11

At some tipping point, the most virulent viruses will be more likely to kill their hosts before the virus can spread. In contrast, the milder versions of the virus will wreak less damage to their host and thus will survive over the longer time span needed to spread from one person to another. Hence the rate of transmission will trend downward, as will the severity of the virus. It is a form of natural selection.

Given that the coronavirus can spread through droplets and contact, the consequences of selection should manifest themselves more quickly than they did for AIDS.

  4. Natural selection will remove weaker humans (SUSCEPTIBLE)

Nor does the model recognize that if the most vulnerable people are hit first, subsequent iterations will be slower because the remaining pool of individuals is more resistant to infection.

Summary

To get REMED means to believe that:

Let that sink in.

Lesson 7: Choose Enough Cases to Actually Test Your Theory

We can formalize REMED as a series of equations

The rate of deaths at time t is a function of the distance before or after when China’s inflection took place, the degree of societal adaptation, how much the disease has evolved to become less deadly, the number of susceptible people still left in the population, and the season.

\(\partial \mathrm{DEATHS}_t = F(\mathrm{CHINA\_INFLECTION}, \mathrm{ADAPTATION}, \mathrm{WEAKENING}, \mathrm{SUSCEPTIBLE}, \mathrm{SEASONALITY}, \mathrm{TIME})\)

The total number of expected DEATHS in country c, however, is just a function of the policy chosen, smoking population, pollution, and whether or not you are the country of Italy.

\(\mathrm{DEATHS}_c = F(\mathrm{POLICY}, \mathrm{SMOKERS}, \mathrm{POLLUTION}, \mathrm{ITALY})\)

Almost all social sciences grad students will be forced to read King, Keohane, and Verba (1997) or something similar in their first year which details how to do small n case selection with an eye toward having enough cases and the right cases to test your explanations. We can express this directly by coding the available cases and putting them into a table.12

Country   Deaths   Policy             Pollution   Smoking   Italy
China     3,296    Shelter in Place   High        2,043     No
Italy     9,134*   Shelter in Place   Low         1,493.3   Yes
U.S.      1,475*   Shelter in Place   Low         1,016.6   No

A number of problems should become immediately apparent. First, the number of observations with a final episode count of deaths is just 1, China. Both Italy and the U.S. still show an increasing count. Further, we have 4 different explanations to test but only 1 or at most 2 data points. We don’t have enough degrees of freedom to mathematically demonstrate a relationship between total deaths and every one of these explanations.
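The degrees-of-freedom problem can be stated as simple bookkeeping. The sketch below assumes, purely for illustration, that each explanation would enter a linear model as one coefficient alongside an intercept. The function names are hypothetical and nothing here is estimated from data.

```python
def residual_degrees_of_freedom(n_cases, n_explanations, intercept=True):
    """Observations left over after estimating one coefficient per
    explanation (plus an optional intercept) in a linear model."""
    n_parameters = n_explanations + (1 if intercept else 0)
    return n_cases - n_parameters


def minimum_cases(n_explanations, intercept=True):
    """Smallest sample that leaves at least one residual degree of freedom."""
    return n_explanations + (1 if intercept else 0) + 1


N_EXPLANATIONS = 4  # policy, smoking, pollution, Italy

for n_cases in (1, 2, 3):
    print(n_cases, residual_degrees_of_freedom(n_cases, N_EXPLANATIONS))

print(minimum_cases(N_EXPLANATIONS))  # 6
```

A negative value means there are more unknowns than observations, so any pattern in the table is compatible with many different combinations of the four explanations.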

Second, we don’t have any variation on the dependent variable, both Italy and China have a high death toll.13 The sample does not include any examples of low death toll countries for us to infer from.

Third, we don’t have variation on the independent variable of interest, policy, either. All three of these countries have shelter in place style policies of one form or another. And, both China’s and Italy’s are more extreme, not less, than the U.S. so far.

We could start to improve on this. First, collapse smoking and pollution into a single lung health index, or for convenience choose just smoking. Second, drop the idiosyncratic explanation for Italy just being different somehow. Third, expand the sample of cases to include variation on both the dependent AND independent variable. That is, we need examples of countries with low death totals. We also need examples of countries with no shelter in place. Here’s what that looks like for the set of countries that have made it to 30 days since 100 confirmed cases.

Country       Deaths at 30 Days   Shelter in Place   Smoking
U.S.          ?                   Yes                1,016.6
Italy         6,077               Yes                1,493.3
Iran          2,234               No                 936.5
China         1,766               Yes                2,043
South Korea   94                  No                 1,667
Japan         35                  No                 1,583

Smoking no longer seems to explain China’s high deaths, Iran has less smoking but a high death count, and South Korea and Japan both have high smoking but a low death count.
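One way to put a number on that eyeball comparison is a rank correlation over the five countries in the table with a known 30-day death count. This is a rough sketch, not a test of anything. Five cases cannot support inference, the helper names are hypothetical, and the simple ranking function assumes no tied values.

```python
# (country, deaths at 30 days, smoking) copied from the table above.
# The U.S. is omitted because its 30-day outcome is still unknown.
cases = [
    ("Italy", 6077, 1493.3),
    ("Iran", 2234, 936.5),
    ("China", 1766, 2043.0),
    ("South Korea", 94, 1667.0),
    ("Japan", 35, 1583.0),
]


def ranks(values):
    """Rank values from 1 (smallest) to n. Assumes no ties."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman(xs, ys):
    """Spearman rank correlation for untied data."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


deaths = [row[1] for row in cases]
smoking = [row[2] for row in cases]

rho = spearman(deaths, smoking)
print(rho)  # -0.5
```

In this small sample the countries that smoke more rank lower on deaths, the opposite of what the smoking explanation predicts.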

Likewise, two of the high death rate countries, Italy and China, have a shelter in place policy, but the third, Iran, does not. Neither of the low death rate countries, South Korea and Japan, has a shelter in place policy. Japan’s slow but increasing growth rate remains a mystery but might simply be on a delayed slope.

This stresses the limits of this kind of simple pattern matching exercise. High death rates also cause shelter in place policies. South Korea which pursued early sophisticated containment was able to forgo the need to switch to a delay policy. A simple cross-sectional comparison of countries isn’t sufficient to tell us whether a shelter in place policy worked.

In sum, cross-national analogies are insufficient to answer this research question. What we need are the fine grained time-series cross-sectional data and simulations that epidemiologists are using to construct concrete counterfactuals to guide policy makers.

Lesson 8: Convey Uncertainty with Specificity not Doublespeak

The Epstein piece presents itself as if it’s communicating confidence and uncertainty, e.g. using the words likely/unlikely/probable/guess 9 times. There are mathematical ways of explicitly communicating uncertainty, and of stating whether that uncertainty lies in measurement, parameters, or prediction. There are even ways of communicating uncertainty to a qualitative audience, e.g. the CIA’s desperate mapping from probabilities to adjectives for lay policymakers.
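As a minimal sketch of what specificity can look like, the Python below reports a forecast as a median with a central interval and translates a probability into a fixed vocabulary. The cut points and wording are illustrative assumptions in the spirit of intelligence-community estimative language, not an official scale, and the function names are hypothetical.

```python
from bisect import bisect_right
from math import ceil, floor
from statistics import median

# Illustrative cut points and phrases (assumptions for this sketch).
CUTS = [0.05, 0.20, 0.45, 0.55, 0.80, 0.95]
PHRASES = [
    "almost no chance",
    "very unlikely",
    "unlikely",
    "roughly even odds",
    "likely",
    "very likely",
    "almost certain",
]


def estimative_phrase(p):
    """Map a probability in [0, 1] to one fixed phrase."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    return PHRASES[bisect_right(CUTS, p)]


def central_interval(draws, level=0.90):
    """Empirical central interval from simulation draws, rounded outward."""
    ordered = sorted(draws)
    n = len(ordered)
    tail = (1 - level) / 2
    lower = ordered[floor(tail * (n - 1))]
    upper = ordered[ceil((1 - tail) * (n - 1))]
    return lower, upper


def describe(draws, threshold):
    """State the median, the 90% interval, and the chance of exceeding a threshold."""
    lower, upper = central_interval(draws)
    p_exceed = sum(d > threshold for d in draws) / len(draws)
    return (
        f"Median {median(draws):,.0f} (90% interval {lower:,.0f} to {upper:,.0f}); "
        f"exceeding {threshold:,.0f} is {estimative_phrase(p_exceed)} "
        f"(p = {p_exceed:.2f})."
    )
```

The point of the design is that each adjective corresponds to a stated numeric range, so a reader can hold the forecaster to it later.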

There is also a wrong way to convey uncertainty, which is to pepper your language with contradictory hedging and doublespeak that generates uncertainty in the reader’s mind about what you actually mean.

For example:

“That estimate is ten times greater than the 500 number I erroneously put in the initial draft of the essay, and it, too, could prove somewhat optimistic.”

Epstein’s first guess was immediately wrong, so here’s a new guess that might also be immediately wrong.

“Perhaps my analysis is all wrong, even deeply flawed. But the stakes are too high to continue on the current course without reexamining the data and the erroneous models that are predicting doom.”

The stakes are too high not to say completely flawed things!

Don’t do this. It’s not honestly conveying uncertainty, it’s attempting to cover your ass from being held accountable later.

Conclusion

The challenge in reviewing analyses like these lies in their incurious and insincere construction. They’re not an earnest search for scientific truth in themselves; at best they’re a Socratic call for others to point out mistakes and explain the state of the art. Epstein explicitly frames his approach to inquiry as the construction of arguments and not rigorous study.

“I don’t care who’s a professional or not you don’t win an argument by showing you got a PhD you win an argument by entering into a public debate with people who disagree.”14

That is not a viable or preferred way to collectively learn about COVID-19. It is an order of magnitude less effort to spam poorly constructed hypotheticals than it is to deconstruct them. This review took a substantial amount of time, and in the meantime the original piece was poorly revised, several interviews and a podcast were released, and a second post trying to cover for the first went live.15 More will no doubt follow, continuing to move the goalposts and the argument. In a world where actual life or death policy analysis is being treated like a high school debate round, the only strategic move is to step back, slow down, and draw methodological lessons for our students and colleagues that will apply to a broad set of current and future analyses.

Epstein (2020a,2020a2,2020b) and analysis like it shouldn’t be rejected because the author is out of their lane. It should be rejected because it’s bad work. I’m not an epidemiologist either, but even as just a lowly social scientist, I can show lots of different specific ways that the analysis fails to meet basic standards of scientific inference. An epidemiologist would have even more detailed, empirically relevant issues to point out. In this current time of crisis, we should resist the urge to gate-keep and instead encourage honesty, curiosity, high standards, and good work. Even blatantly incurious and bad work can serve as a pedagogical tool to teach young researchers what not to do. We need to take these opportunities to learn so that we are all smarter and better prepared for the next crisis.


  1. If you found this note useful, please consider donating a few minutes to contribute examples of government COVID-19 quarantine measures to the TIGR Project.

  2. Director of the Machine Learning for Social Science Lab, Center for Peace and Security Studies, University of California San Diego

  3. Acknowledgments: I thank a good chunk of the NYU Law Class of 2013 for suggesting the subject of this review, and am grateful for many helpful comments and corrections from friends and colleagues.

  4. Can you imagine if this turned out to be right? Someone has to be holding the winning lotto number, and picking long odds outcomes can be high risk high reward.

  5. Typos are verbatim.

  6. And hilarious.

  7. This plot and sentence are the most sensitive to new incoming data because the U.S. is so early in its outbreak. Between runs, the U.S. average rate of increase in deaths has risen several percent to above the median and closer to Italy and China. A few countries with only a few days of data have pulled back from outliers to within the rest of the distribution. The reader is encouraged to run the R code themselves, downloadable directly from this notebook, and regenerate the figures based on the latest data.

  8. Some of these states entered their episode after the article was posted on March 16th. Where possible I will try to give leeway in that regard, but on the other hand one of the immediate consequences of cherry picking early data from one state is that you’ll be shown immediately wrong when a few more days of data come in.

  9. Left as an exercise to the reader to see how hard exactly you have to squint before you see “good news” from these data.

  10. The implied variable is unknown because neither the sentence nor linked NYT article provides an argument for why the U.S. health system is different or should perform better than the Italian health system. The implied theoretical variable remains a mystery. https://web.archive.org/web/20200319191049/https://www.nytimes.com/2020/03/12/world/europe/12italy-coronavirus-health-care.html

  11. I defer to actual epidemiologists to tackle this idea. My current understanding is that COVID-19 is asymptomatic in as many as 80% of cases, so it’s not killing those hosts before it spreads. It’s also communicable up to a couple of days before symptoms emerge in those who are symptomatic, so it’s spreading before killing those hosts as well. And the people who it is killing are heading to hospitals first, where it’s being transmitted to healthcare workers at alarming rates. This strikes me as a basic lack of understanding about how fitness constraints encourage change in agents over time. Banking on it to occur in a timeline soon enough to avoid a peak in cases in the U.S., without citing a single empirical work, is willfully ignorant.

  12. Smoking coded as cigarette consumption per person per year from Wikipedia, https://en.wikipedia.org/wiki/List_of_countries_by_cigarette_consumption_per_capita

  13. What counts as low is a moving target because the original piece predicted only 500 for the U.S. and said it wasn’t likely to reach the high number of China’s because China had more smokers. Now Italy’s high death count is the new high. But he forecasts 5,000 for the U.S. which would then make it the new high. But this is in contrast to the NYT estimate of a million, so maybe these are all really low. In either case, there’s still no variation on the dependent variable.

  14. In this friendly interview for Reason he brags about being able to win arguments through content and not appeals to authority, but a couple of days later in an unfriendly interview with The New Yorker he literally challenges the journalist to compare resumes, “You just don’t know anything about anything. You’re a journalist. Would you like to compare your résumé to mine?” Tells like this are very convenient signals for separating speakers who are sincere but wrong from speakers who are insincere grifters. https://www.newyorker.com/news/q-and-a/the-contrarian-coronavirus-theory-that-informed-the-trump-administration

  15. And the schema for the underlying COVID-19 data changed, breaking a lot of code.

