Cubic fits and department of D’OH

by David Anderson | May 5, 2020 4:17 pm | 70 Comments

This post is in: Anderson On Health Insurance, COVID-19 Coronavirus

The first thing a data analyst trainee should learn is that playing with Excel’s functions and tools is a great way to get into trouble when you don’t have an underlying understanding of the fundamental data’s behaviors AND don’t understand the functions’ and tools’ core assumptions. This is important. The second or third lesson a data analyst trainee will learn is to not use Excel but that is advanced training.

Why does this matter?

It seems like the White House is using Excel and not understanding the phenomenon they are trying to model.

To better visualize observed data, we also continually update a curve-fitting exercise to summarize COVID-19’s observed trajectory. Particularly with irregular data, curve fitting can improve data visualization. As shown, IHME’s mortality curves have matched the data fairly well. pic.twitter.com/NtJcOdA98R

— CEA (@WhiteHouseCEA) May 5, 2020

Eyeballing the data, there sure as hell seems to be day-of-the-week seasonality. But let’s go beyond that.
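A quick way to check for a weekly reporting pattern is to compare the raw daily series with a trailing seven-day average. The sketch below uses made-up numbers (the `weekly_pattern` values are purely hypothetical, not real counts) to show how a weekday swing disappears once each average spans one full week.

```python
# Synthetic daily counts with a built-in weekend dip. These are NOT real data.
def trailing_mean(values, window=7):
    """Mean of every full trailing window of `window` consecutive values."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]

# Hypothetical week: higher reporting early in the week, lower on the last two days.
weekly_pattern = [110, 115, 110, 105, 100, 80, 80]  # sums to 700
daily = weekly_pattern * 4                           # four identical weeks

smoothed = trailing_mean(daily)
print("raw range:     ", min(daily), "to", max(daily))
print("smoothed range:", min(smoothed), "to", max(smoothed))
```

If the smoothed series is much flatter than the raw one, a good share of the day-to-day movement is probably a reporting artifact rather than a change in the underlying process.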

If we were to assume that a cubit fit is an appropriate choice to model the data, and that we can project out of the current data to the near future so that there are almost no deaths on May 15th, that requires a What the Hell response.

We know that COVID-19 does not kill quickly.

A person infected today is unlikely to show up as a death until Memorial Day.

We looked at this basic math in March:

If we can safely add up median time from infection to symptoms and then symptoms to hospitalization, that sums to a back of the envelope span of 12 to 13 days.

We’ll have to add time from hospitalization to death. But this morbid math has a point. The next 7 to 10 days of deaths have almost entirely been baked into the cake as these are individuals who were infected before states started trying to open up again. We’re not going to get a reliable signal on mortality due to policy changes for at least another two weeks in the states that have been early and aggressive in re-opening.

For a cubit curve fitting exercise to be valid, we need to bring this basic mechanical reality into play. And that reality is that the people who are highly likely to die on or before May 15th are already infected. For there to be no deaths on May 15th, we basically need no one to have been infected after April 27th or so.
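The date arithmetic behind that claim can be sketched in a few lines. The 18-day infection-to-death lag is an assumption chosen for illustration; it is simply the gap implied by the two dates above, not a measured clinical figure.

```python
from datetime import date, timedelta

# Assumption for illustration: about 18 days from infection to death,
# which is the gap between April 27 and May 15 used in the post.
INFECTION_TO_DEATH_DAYS = 18

def latest_infection_date(death_date, lag_days=INFECTION_TO_DEATH_DAYS):
    """Last infection date that could plausibly produce a death on `death_date`."""
    return death_date - timedelta(days=lag_days)

def earliest_death_date(infection_date, lag_days=INFECTION_TO_DEATH_DAYS):
    """First date a death from an infection on `infection_date` would typically appear."""
    return infection_date + timedelta(days=lag_days)

print(latest_infection_date(date(2020, 5, 15)))  # 2020-04-27
print(earliest_death_date(date(2020, 5, 5)))     # 2020-05-23
```

Any forecast of deaths inside that lag window is really a statement about infections that have already happened.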

These are basic data analysis guidelines: understand the fundamental phenomenon you are trying to model while also understanding the modeling assumptions of the tools being used. The White House decision-making systems are disregarding both tenets of basic data analysis.

70 Comments

1.

Mike J

May 5, 2020 at 4:25 pm

Is a cubit fit always describing gopher wood?
2.

Roger Moore

May 5, 2020 at 4:28 pm

As someone else pointed out, some of the IHME projections they’re comparing themselves to are at least a month old. It’s a lot easier to look good when you’re comparing yourself to projections from that long ago. Even worse, the IHME projection from late March is both a better fit to the data and less optimistic about when the death rate will zero out than the CEA projections. Not to mention that the IHME projections have already been severely criticized for assuming a rapid fall-off in deaths that doesn’t match experience. Seriously these guys deserve to be laughed out of the room.
3.

Ryan

May 5, 2020 at 4:28 pm

There may be a more obvious clue available for the layman.

https://en.wikipedia.org/wiki/Dow_36,000
4.

Shalimar

May 5, 2020 at 4:33 pm

What really stunned me about that chart when CNN showed it during Hassett’s interview was that it was updated today. I had been giving him a small benefit of the doubt and assuming his incredibly stupid May 15th projection was made a week or two ago. I was naive.
5.

moops

May 5, 2020 at 4:33 pm

I’ve been surprised really that none of the machine learning/deep learning tools have been slapped at this forecasting thing yet. As for the “cubic” model: your model is supposed to be derived from first principles of some kind, and result in free parameters, which are then derived by fitting to data, then assessed for quality. So, what infection process is captured by the coefficients of ax^3 + bx^2 + cx + d? What does ‘a’ represent? Well, it seems if you ever want to get back to zero then a had better be negative. If we go back in time we had no COVID deaths, so I guess d better be zero. And so on. I don’t think there is anything meaningful here. The initial rise in cases can look like a polynomial; that’s about as smart as this looks. No polynomial model is going to have outbreaks and resurgent cases and long plateaus and all the things we know happen in outbreaks.
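To make the point about the sign of ‘a’ concrete, here is a minimal sketch of a least-squares cubic fit in plain Python. Everything in it is hypothetical: the `trend` function, the `weekly_wiggle` values, and the 31-day window are invented for illustration and are not the CEA's data or method. The only claim is structural: a cubic that rises and then turns down has a negative leading coefficient, so it keeps falling past zero.

```python
def fit_cubic(xs, ys):
    """Least-squares cubic via the normal equations.
    Returns [d, c, b, a] for d + c*x + b*x**2 + a*x**3."""
    n = 4
    gram = [[float(sum(x ** (i + j) for x in xs)) for j in range(n)] for i in range(n)]
    rhs = [float(sum(y * x ** i for x, y in zip(xs, ys))) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(gram[r][col]))
        gram[col], gram[pivot] = gram[pivot], gram[col]
        rhs[col], rhs[pivot] = rhs[pivot], rhs[col]
        for r in range(col + 1, n):
            factor = gram[r][col] / gram[col][col]
            for k in range(col, n):
                gram[r][k] -= factor * gram[col][k]
            rhs[r] -= factor * rhs[col]
    # Back substitution.
    coef = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(gram[i][k] * coef[k] for k in range(i + 1, n))
        coef[i] = (rhs[i] - tail) / gram[i][i]
    return coef

def evaluate(coef, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** k for k, c in enumerate(coef))

# Hypothetical daily counts: a smooth rise-and-dip trend plus a small weekly wiggle.
def trend(day):
    return -0.05 * day ** 3 + day ** 2 + 40 * day

weekly_wiggle = [8, 4, 0, -4, -8, 0, 0]
days = list(range(31))
counts = [trend(day) + weekly_wiggle[day % 7] for day in days]

coef = fit_cubic(days, counts)
d, c, b, a = coef
first_negative = next(x for x in range(31, 200) if evaluate(coef, x) < 0)

print("leading coefficient a:", a)
print("first day the fit predicts a negative count:", first_negative)
print("fitted value on day 60:", evaluate(coef, 60))
```

The fit can track the observed days closely and still predict negative daily counts a few weeks later, because nothing in a polynomial knows that counts cannot go below zero.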
6.

rjm

May 5, 2020 at 4:34 pm

Obligatory Kevin Hassett curve fit disaster slam, in case BJ hasn’t done this to death yet

https://www.discovermagazine.com/the-sciences/best-curve-fitting-ever

https://images.ctfassets.net/cnu0m8re1exe/7gVG4hcR69xKOsWI7IZsZa/233355c368e66c5ec2c788f27d4ea142/thoma1.png?w=650

Yes! really published in teh WSJ. What a maroon
7.

jl

May 5, 2020 at 4:35 pm

Reliable statistics and models that use them, for control versus just observation and forecasting in the absence of attempts at active control, require a very sound causal model behind the statistics. Every statistical model, no matter how sound, has some blind by-guess-by-gosh curve fitting buried someplace down in the inner guts of the model. But, you want to keep that to a minimum.

Problem is that when you use statistical models to try to control a process, there are always many causal models consistent with the statistics. Some causal models that you can’t rule out with the stats will mean your control efforts will work, others mean that your attempts at controlling the process will flop. A sane and good faith effort at forecasting effect of control policies, past present and future, aims in a good faith way, to minimize the chances of that latter possibility.

Looks like the ‘cubic’ model violates all those considerations, and also common sense.

Edit: one thing to remember is that a fast and furious reopen is a control policy just as much as a shutdown. It is the assertion that if some signal shows that reopening won’t prompt another big epidemic wave, then you can actually reopen, and there won’t be another big wave. For Trumpster/GOP a good signal seems to be that the epidemic is not raging like a wildfire destroying everything in its path faster than you can keep track. Which, by common sense standards, seems a tad odd.
8.

VOR

May 5, 2020 at 4:35 pm

It does explain why Trump thinks the virus is just going away. He is being shown charts where cases just stop May 15. Magic!
9.

Roger Moore

May 5, 2020 at 4:38 pm

@VOR:

I think you have your cause and effect backward. This chart was made to support Trump’s belief; it wasn’t the cause of it.
10.

sukabi

May 5, 2020 at 4:39 pm

@VOR: more than likely he’s demanding visuals that support his belief in the magically disappearing virus theory, which is what this graph does.
11.

sukabi

May 5, 2020 at 4:40 pm

@Roger Moore: yep. I owe you a fizzy beverage.
12.

VOR

May 5, 2020 at 4:41 pm

@sukabi: Almost certainly correct. They are responding to the whims of the Great Leader and giving him what he wants. And then Trump says “See! I was right, my top advisors agree”.
13.

Hunter Gathers

May 5, 2020 at 4:44 pm

Watch out, you’re playing with fire.
There are countless dipshits in the workforce whose only hope of continued employment rests on the perception that they are total, complete badasses in Excel.
14.

jl

May 5, 2020 at 4:48 pm

@Hunter Gathers: Fire is a good analogy to how epidemics work. Often a long period of smoldering that you might not even notice. Then when enough very high heat from fire you don’t see, interfaces with enough fuel, it explodes. Then the fire burns until not enough fuel left to keep the fire going.

So, in the absence of an effective control policy (fire extinguisher) saying there are just 15 cases, or 10, or 5, and that is a small number and that will go away, is like saying there is just a small flame at the bottom of your living room drapes, and no reason to worry, it will go away. Unless you are very bizarrely lucky, the fire won’t just go away.
15.

Boris Rasputin (the evil twin)

May 5, 2020 at 4:50 pm

I’d figured Billy Jim Blob is out there now, getting his “I survved the Crovid-19 Hoax” tattoo today. If I understand correctly, he’d better show it off fast, as he’ll be dead on Memorial Day.

The question that remains is: Does he live in Texas, or Florida?
16.

Cam-WA

May 5, 2020 at 4:51 pm

@Mike J: “What’s a cubit?”

https://m.youtube.com/watch?v=CgsFCyD4nEw&feature=emb_logo&ebc=ANyPxKrTLi4pVt3yP0T6bOyOYlgWKgZ5od8WENntsv0qoW4DQw91A38XCEMwVlMlcin0K379eVVacvAOLTxaMbxwE-8G2wbysg

(Back from when he was actually funny and not a serial abuser)
17.

lashonharangue

May 5, 2020 at 4:53 pm

Hassett, Kudlow, and Laffer are prime examples of Krugman’s hack gap.
18.

rikyrah

May 5, 2020 at 4:53 pm

Just look at the headline…

412 asymptomatic workers at western Missouri food plant test positive for coronavirus

BY LUKE NOZICKA

MAY 05, 2020 02:38 PM, UPDATED 1 HOUR 12 MINUTES AGO

More than 400 workers at a St. Joseph food plant have tested positive for the new coronavirus despite showing no symptoms, health officials said Tuesday.

Comprehensive testing of employees and contract workers at Triumph Foods in St. Joseph found 412 of 2,367 people had COVID-19 with no symptoms, according to the Missouri Department of Health and Senior Services.

https://www.kansascity.com/news/coronavirus/article242515976.html?utm_source=pushly&intcid=%7B__explicit:pushly_525458%7D
19.

glc

May 5, 2020 at 4:53 pm

The IHME model is pretty silly as well (at least in the form where (a) it gives a number for total deaths; (b) has a public-facing interface; (c) is constantly updated by recent data to look plausible; and (d) doesn’t distinguish between the Chinese approach to shelter in place and anyone else’s). This seems to be widely recognized now.

Even a straight line can be good for short term projections, the cubic tends to add at least visual plausibility and possibly some things present in the data. But curve fitting is not, by itself, modeling, and it helps to use at least some information about what one is trying to model, somewhere in the process. And epidemiological models actually do that.

The general theory is that if the curve is sufficiently well-behaved – a strong assumption! – it can be approximated by a power series, in which case taking the first 4 terms may be better than taking the first 2. However, when making judgments that can get people killed, one should probably read the warning labels – whether bleach or software.
20.

Jinchi

May 5, 2020 at 4:53 pm

To better visualize observed data, we also continually update a curve-fitting exercise to summarize COVID-19’s observed trajectory. Particularly with irregular data, curve fitting can improve data visualization. As shown, IHME’s mortality curves have matched the data fairly well.

I’m amazed they bothered to post the figure with that tweet. The (3/27) IHME model performs better than the (4/5) update and about as well as the (5/4) version. The (5/5) cubic fit model is, well, the best ‘cubic fit’ to the data up to today (5/5). It looks like it will fail tomorrow, though.

If your standard of success is the best fitting curve to the data that you included in your model, why not simply use the data itself?
21.

Cheryl from Maryland

May 5, 2020 at 4:55 pm

As my husband said about my mathematically challenged supervisors — Just because you can put a number on it doesn’t make it math.
22.

bbleh

May 5, 2020 at 4:57 pm

@Roger Moore: Concur; your causal model is sound.
23.

dmsilev

May 5, 2020 at 4:58 pm

The second or third lesson a data analyst trainee will learn is to not use Excel but that is advanced training.

Truth.

The Post had a big article earlier today about how Jared’s task force is failing miserably (surprise!) because basically he brought in a bunch of McKinsey etc. management consultant types rather than people with actual expertise. Excel jockeying, combined with what I’m sure are very impressive PowerPoint skills.
24.

David Anderson

May 5, 2020 at 5:03 pm

@dmsilev: I would also imagine their SharePoint was on Fleek and the tableaus look pretty.
25.

lollipopguild

May 5, 2020 at 5:09 pm

Trump is going to write an executive order telling the virus to go away and not come back.
26.

jl

May 5, 2020 at 5:10 pm

@glc: The initial IHME model assumed that from start to finish, the epidemic curve would act like a symmetric bell shaped normal distribution, and that a control policy that had the same effectiveness over time, would be put in place and maintained throughout the epidemic. All those assumptions were sketchy for a large epidemic with a new disease and a lot of unknowns, as is the case for covid-19.

If the IHME group can’t abandon some of those basic assumptions, the model might look OK in terms of constant fiddling to make its curve fit the data, but I don’t see how it will be good at forecasting.

As I noted in previous thread, we are in the stage of the epidemic where it is very difficult to produce forecasts that look good. Statistical forecasting in situations where you are constantly imposing new policies to try to control the process, works very differently than when you are just observing a process and not trying to control it.

I think we need to check not only with the fancy pants math and stats and epi modellers, but also with people with a lot of very practical field experience in using those models to control real epidemics. They rely on models, and respect the models, but they use them and evaluate them in a very different way than most of us, and that includes me, who have never faced the challenge of the very hard, risky and exhausting work of trying to control a dangerous process in real time. Been that way for 100 years now. Malaria control seemed hopeless until a person who had a lot of real world on the ground experience in trying to control it, could also come up with a mathematical and stats model to explain to himself what he was trying to do. That was Ross.

You always need to check with the epi, medical, and public health workers doing the WWI trench warfare control work in the real world, I think.
27.

Jay

May 5, 2020 at 5:17 pm

He was pressured to invest in drugs and vaccines that lacked scientific merit, because the people selling them had friends in the Trump administration, up to and including the president’s son-in-law, Jared Kushner. He was forced to transfer funds to acquire drugs for the Strategic National Stockpile, America’s most important reserve of lifesaving medications, based not on health needs but on “political connections and cronyism.” He was instructed to use his department’s budget to purchase flu medications of questionable efficacy. And when the COVID-19 crisis erupted, he was pressured to approve a plan that would “flood” cities with unproven and untested doses of chloroquine drugs, from uninspected manufacturing plants in Asia. When his efforts to work through the system failed, he decided he had a “moral obligation to the American public” to ring the alarm about the plan, “which he believed constituted a substantial and specific danger to public health and safety.” In retaliation, he was “smeared,” with officials unfairly accusing him of dropping the ball on vaccine development and PPE preparation.

These are just some of the allegations contained in a blistering, 63-page complaint that Dr. Rick Bright, former head of the Biomedical Advanced Research and Development Authority (BARDA), filed today with the U.S. Office of Special Counsel. (Vanity Fair has submitted requests for comment to the White House, the Food and Drug Administration, and the Department of Health and Human Services, and will update this article with any responses.)

https://www.vanityfair.com/news/2020/05/whistleblower-complaint-rick-bright-blasts-team-trumps-pandemic-response
28.

Mallard Filmore

May 5, 2020 at 5:23 pm

rjm @6 … does the tax-revenue chart mean we should be like Norway?
29.

Mom Says I*m Handsome

May 5, 2020 at 5:36 pm

@jl: The initial IHME model assumed that from start to finish, the epidemic curve would act like a symmetric bell shaped normal distribution

I’m not an epidemiologist or biologist or statistician or modeler, but even I know that a bell curve is a fucking terrible choice to model a pandemic. (Most of my technical expertise is from the physical sciences, so I’d lead with a nice exponential decay model & I’d be more right than these glory hogs.)
30.

Poe Larity

May 5, 2020 at 5:45 pm

Mission Accomplished

WASHINGTON – President Donald Trump’s administration is considering plans to wind down its Coronavirus Task Force as early as this month, a major shift in the White House strategy for responding to the greatest health crisis in a century.

Vice President Mike Pence, who has led the group since it was created in January, said Tuesday that the work of the group will be transferred to other parts of the government, including the Federal Emergency Management Agency.
31.

BruceFromOhio

May 5, 2020 at 5:46 pm

… understand the fundamental phenomenon you are trying to model while also understanding the modeling assumptions of the tools being used.

Recall this is the Tax Cuts Will Pay For Themselves gang.

Also – if you can’t or won’t post your data sets and methodology, your results are meaningless. My Magic 8-Ball tells me this is so.
32.

jl

May 5, 2020 at 5:51 pm

@Mom Says I*m Handsome: The infectious disease epidemic diffyQ equations are very similar to pharmacokinetics, many chemical reactions. You have one or two compartments with a nonlinear reaction process where two stocks interact, and then exponential decay processes that govern flows into and out of other compartments.
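The compartment structure described here can be sketched with the textbook SIR model: one nonlinear term where susceptible and infected stocks interact, and one exponential-decay flow out of the infected compartment. The parameter values (`beta`, `gamma`, `i0`) and the simple Euler stepping are assumptions picked for illustration, not estimates for COVID-19.

```python
def simulate_sir(beta=0.3, gamma=0.1, i0=1e-4, days=200, dt=0.1):
    """Euler integration of dS/dt = -beta*S*I, dI/dt = beta*S*I - gamma*I,
    dR/dt = gamma*I, with S, I, R expressed as population fractions.
    Returns one (day, s, i, r) tuple per whole day."""
    s, i, r = 1.0 - i0, i0, 0.0
    steps_per_day = round(1 / dt)
    history = []
    for day in range(days + 1):
        history.append((day, s, i, r))
        for _ in range(steps_per_day):
            new_infections = beta * s * i * dt   # nonlinear interaction of two stocks
            new_recoveries = gamma * i * dt      # exponential-decay outflow
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
    return history

history = simulate_sir()
peak_day, peak_infected = max(
    ((day, i) for day, s, i, r in history), key=lambda pair: pair[1]
)
print("peak day:", peak_day, "peak infected fraction:", round(peak_infected, 3))
print("infected fraction on final day:", history[-1][2])
```

Even this toy version produces a rise, a peak, and a decline driven by depletion of susceptibles, which is structure that a fitted polynomial has no way to represent.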
33.

Fair Economist

May 5, 2020 at 5:51 pm

@Mom Says I*m Handsome:

I’m not an epidemiologist or biologist or statistician or modeler, but even I know that a bell curve is a fucking terrible choice to model a pandemic. (Most of my technical expertise is from the physical sciences, so I’d lead with a nice exponential decay model & I’d be more right than these glory hogs.)

Surprisingly, a bell curve is a pretty good model for a completely uncontrolled epidemic. As the disease spreads, more and more are resistant and the spread slows, and then reverses. It’s pretty symmetrical for a moderately contagious disease.

It can be an acceptable process for a controlled disease where additional controls are added over time and never removed. And there, precisely is why it’s a total disaster at modeling COVID-19. Once the epidemic is contained, we *don’t* keep adding restrictions, we start taking them away, or ignoring them. And so the overall epidemic is very UNsymmetrical, because the declining tail is much longer and more stretched out than the initial spread. It’s even possible to have additional peaks later (already happened in several countries), which is totally beyond the capabilities of any hump-based model.
34.

Jinchi

May 5, 2020 at 5:52 pm

@moops: machine learning would fail because it’s incapable of mimicking malicious stupidity.
35.

jl

May 5, 2020 at 5:55 pm

@Fair Economist: Good for small epidemics, and long right hand tail can be more or less ignored for uncontrolled epidemics. If you want to use a symmetric distribution, other curves like hyperbolic secant or logistic are better, if you need to take account of the long tails describing the long smoldering lead up and then long wind down as dynamics damps down to equilibrium (either disappearance or endemic equilibrium).

The people who came up with the IHME model have a long and very successful history of modelling chronic disease and endemic infectious diseases that are near equilibrium. I don’t know that they ever worked with an acute epidemic with explosive dynamics before. I don’t think they did due diligence in checking with theoretical and practical people who have worked in that area.
36.

jl

May 5, 2020 at 6:02 pm

@Fair Economist: And I think what you say about the long tail asymmetry after peak in the knock on process is an especially big problem for fraction of cases that move on to hospitalization and ICU. And the certainty that those resources will be maxed out is the one huge reason that extreme control measures are justified. The rapidity of approach to peak, and the height of the peak swamps resources, and takes a very long time to play out and die out. The externality of infectious people walking around spreading the disease is just so huge and has such dire consequences for others, that it needs to be just shut down with extreme measures after epidemic gets to an explosive stage.

If we can stay away from that situation, then we can pay some serious attention to arguments about when it is OK to let people take their own chances in getting the disease. Most regions in the US not there yet.
37.

glc

May 5, 2020 at 6:04 pm

The IHME refers to their so-called model as a statistical model – but there is no relevant underlying theory. Getting best fit to a curve with a few free parameters is not, in itself modeling. They are on the one hand open about the fact that they are not using an epidemiological model, and at the same time they are promoting their “model” aggressively for use in an epidemic. The nuances are quickly lost.

Anyway I try not to look at them or complain about them anymore (hopefully this is my last excursion on the subject), but they have taken up public space that could have been put to better uses. And I think more people are going to die in part because of the way they have conveyed their projections.
38.

Jinchi

May 5, 2020 at 6:04 pm

@jl:
Statistical forecasting in situations where you are constantly imposing new policies to try to control the process, works very differently than when you are just observing a process and not trying to control it

Right. Especially since policy people will judge you on your top line number (e.g. “We predict 60,000 deaths”, versus “We predict 60,000 deaths if everyone shelters in place for the next 12 weeks, testing is widely available, clusters are isolated rapidly and idiot governors don’t reopen days after passing the initial peak”)
39.

cain

May 5, 2020 at 6:09 pm

@rjm:

We should maybe also bring in Art Laffer – he can show how this fits with the laffer curve.
40.

jl

May 5, 2020 at 6:14 pm

Hey you dang kids, you can make a little metabolic chemical explosive epidemic inside you. Probably not good to try at home, unless you like sudden death, brain damage or irreversible loss of sight. Same type of processes: Check out this youtube channel from a medical doc:

Chubbyemu

Throw a bottle of nutmeg into a protein shake and chug it, chug moonshine, do extreme exercise until your muscles fail and then force yourself to keep going…
41.

lumpkin

May 5, 2020 at 6:23 pm

David, do you really think they were trying to model the data? I think you are too kind. A more plausible explanation for Hassett’s murderous “cubic model” is that it’s more than sufficient to bamboozle Trump into thinking the virus goes away by mid May so let’s open the economy. And the national press, being only slightly less gullible than the president, can talk about the White House cubic model as though it has some actual merit.
42.

rjm

May 5, 2020 at 6:23 pm

@cain: Yeah Laffer could fit a curve that would really cut the death toll.

I’d seen a more complete takedown (as if it was needed) of the WSJ graph, and it turns out the Norway data point was manipulated by including carbon tax income along with corporate tax which moved it much higher than the rest of the data.

https://www.bradford-delong.com/2017/07/paul-gigot-and-kevin-hassett-monday-smackdownhoisted-from-2007-the-most-mendacious-graph-the-wall-street-journal-ever-publi.html
43.

JaneE

May 5, 2020 at 6:25 pm

When I heard that zero deaths by May 15th, the first thing I did was check the last few days number of new cases. It was over 25,000. Nope, not going to zero.
44.

?BillinGlendaleCA

May 5, 2020 at 6:28 pm

@cain:

We should maybe also bring in Art Laffer – he can show how this fits with the laffer curve.

Tax cuts will make the virus go away, it works better if they’re capital gains tax cuts.
45.

jl

May 5, 2020 at 6:33 pm

@?BillinGlendaleCA: OK, now I finally do believe that you do have serious training in economics. That’s the way we were trained to think!
46.

Baud

May 5, 2020 at 6:39 pm

I’m heartened that the model shows that all the dead people will come back to life.
47.

prostratedragon

May 5, 2020 at 6:40 pm

@rjm: That graph is indeed a “laffer.” Don’t know how I missed it at the time, especially since I was reading DeLong more frequently then.

Pro tip: if the fitted curve is the outer envelope of the data points, there’s something fishy going on. The fit is supposed to be in the nature of an average.
48.

Baud

May 5, 2020 at 6:41 pm

@?BillinGlendaleCA:

Do you think tax cuts alone will be enough this time, or should Congress also mandate a medically unnecessary vaginal ultrasound?
49.

zzcool

May 5, 2020 at 6:42 pm

Do you think it was a specific choice someone made to have ‘actual fit’ look like Trump drew it himself with a sharpie?
50.

jl

May 5, 2020 at 6:43 pm

@Baud: I personally think giving me all the money will be an effective policy to Solve All Problems. I have my ‘quartic’ model that produces fantastic forecasts

Edit: I also have scientific proof that Baud deserves a nice cut of the loot if the virtual eminence provides strict geometrical logic in support.
51.

Roger Moore

May 5, 2020 at 6:45 pm

@Hunter Gathers:

There are countless dipshits in the workforce whose only hope of continued employment rests on the perception that they are total, complete badasses in Excel.

Honestly, Excel is a pretty useful tool. It has limitations, but somebody who knows a lot about data analysis can use Excel to do some fairly sophisticated stuff. The thing to be wary about is that serious data analysis people generally prefer more sophisticated tools than Excel, so use of Excel is often a warning sign that the user is not as good at data analysis as they think.
52.

?BillinGlendaleCA

May 5, 2020 at 7:04 pm

@Baud:

should Congress also mandate a medically unnecessary vaginal ultrasound?

As long as they don’t get in way of the tax cuts, they’re ok.
53.

?BillinGlendaleCA

May 5, 2020 at 7:06 pm

@Roger Moore:

use of Excel is often a warning sign that the user is not as good at data analysis as they think.

I think it’s also a sign they went to B-School.
54.

?BillinGlendaleCA

May 5, 2020 at 7:07 pm

@jl: Did you know that tax cuts will cure the clap and erectile dysfunction?
55.

Obvious Russian Troll

May 5, 2020 at 7:12 pm

@moops: I’m sure somebody is going to try it or more likely is in the process of trying, but that doesn’t mean they’re going to get results that are any better than this guy.

I will confess that I am cynical about the current state of AI and machine learning. It works, but not for all problems; I think people are going to waste a lot of money throwing AI and machine learning at problems where it’s a bad fit.

The shitty testing data we have is unlikely to help, I suspect.
56.

jl

May 5, 2020 at 7:12 pm

@?BillinGlendaleCA: I’ve pointed out repeatedly that I get ads on BJ for fat, balding, broke ass deadbeats with cars that don’t run, who want to solve all their problems with zero effort. I don’t know what kind of ads you get.
57.

Croaker

May 5, 2020 at 7:23 pm

@Jay: The good Doctor needs to hang w the rest.
58.

?BillinGlendaleCA

May 5, 2020 at 7:24 pm

@jl:

I’ve pointed out repeatedly that I get ads on BJ for fat, balding, broke ass deadbeats with cars that don’t run, who want to solve all their problems with zero effort.

Have you tried a Tax Cut™?

(Tax Cut™ is a registered trademark of the Republican National Committee.)
59.

Another Scott

May 5, 2020 at 8:00 pm

@dmsilev:

So, did an Excel coding error destroy the economies of the Western world?

Excel – Is there anything it can’t do??!

Cheers,
Scott.
60.

Roger Moore

May 5, 2020 at 8:05 pm

@?BillinGlendaleCA:

Honestly, I know a lot of scientists who use Excel. I use Excel for a lot of stuff. I know how to program in several languages, but often if I want to do some quick calculations and put together a graph or two, Excel is the easiest way of doing it. It’s also nice to demonstrate some data analysis techniques, because the tables make it easy for people to see what’s happening to their data and really easy to see how things change if some of the numbers change. Horses for courses.
61.

?BillinGlendaleCA

May 5, 2020 at 8:28 pm

@Roger Moore: I’ve used Excel since it came out for Windows back in the late 80’s.
62.

LongHairedWeirdo

May 5, 2020 at 8:29 pm

The first thing a data analyst trainee should learn is that playing with Excel’s functions and tools is a great way to get into trouble when you don’t have an underlying understanding of the fundamental data’s behaviors AND don’t understand the functions’ and tools’ core assumptions.

Absolutely. But some time before then, there should be a glimmering of “the deaths of millions of people could depend on this; so I should be absolutely sure that this *might* work, at least, hypothetically.”

Sometime after the beginning, you should learn not to do a cubic fit *past* the endpoint of a function’s data, if that function is not a cubic. Sure, if you have data to May 5 and hope to extrapolate May 6, it *might* work. But the further you go, the more you lose.

A cubic fit is not bad for some things, but it’s best used for interpolating data *between* known points. For example, if you didn’t have direct reporting of cases/deaths on Saturday and Sunday, a cubic fit would let you estimate deaths and cases on Saturday, and on Sunday, interpolating from the previous, and following, days.

Now, having seen this, I see that we’re dealing with the worst sort of moron.

Generally speaking, a curve fit matches both the endpoint, *and* the first derivative (rate of change).

There’s a dip in the final number – that means a cubic fit will show a decreasing value, and, by the nature of cubics, will probably go toward 0 as it does on the graph.

So, this is a moron who didn’t understand the tool he was using, didn’t understand the type of data he was trying to model, and didn’t understand that a cubic fit would never be good for predicting the future, even if it was a decent model, of data that was well understood. And, let’s face it: not one person reading this is surprised that he has the ear of Trump.
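The difference between filling a gap and projecting past the endpoint can be sketched with a cubic through four known points. The decaying `truth` series and the choice of days 0, 1, 4 and 5 as "reported" (with days 2 and 3 standing in for a missing weekend) are assumptions made up for this illustration.

```python
import math

def lagrange_cubic(points):
    """Return a function evaluating the unique cubic through four (x, y) points."""
    def cubic(x):
        total = 0.0
        for i, (xi, yi) in enumerate(points):
            term = yi
            for j, (xj, _) in enumerate(points):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return cubic

def truth(day):
    """Hypothetical smooth, slowly declining daily count (not real data)."""
    return 100.0 * math.exp(-0.1 * day)

known_days = [0, 1, 4, 5]  # days 2 and 3 play the role of a missing weekend
cubic = lagrange_cubic([(day, truth(day)) for day in known_days])

for day in (2, 3, 15, 30):
    kind = "interpolated" if known_days[0] <= day <= known_days[-1] else "extrapolated"
    print(day, kind, "truth:", round(truth(day), 3), "cubic:", round(cubic(day), 3))
```

Inside the known range the cubic lands very close to the true values; far outside it, the same cubic heads below zero while the true series stays positive.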
63.

LongHairedWeirdo

May 5, 2020 at 8:34 pm

@Another Scott: Not a coding error, and not from Excel. The tool worked fine, but this situation had the tool being used like a hammer being used to loosen a bolt. It was the wrong tool, used in the wrong way, and anyone who knows anything about tools and nuts/bolts knows that it’s wrong.

So much winning. SAD!
64.

moops

May 5, 2020 at 9:01 pm

@Baud: well, on May 15th we will start having “negative dead”, I would state that these are undead. So….. zombie apocalypse is how we end our pandemic. By September there would be more zombies than people.
65.

schrodingers_cat

May 5, 2020 at 9:46 pm

@Roger Moore: Agreed. Excel is a pretty powerful tool for data analysis.
66.

Another Scott

May 5, 2020 at 10:03 pm

The Acting Chair of the President’s Council of Economic Advisers resorts to personal insults after being caught trying to grossly mislead the American people about the likely death toll of the COVID-19 pandemic. https://t.co/YHETkAMVPT

— Rep. Don Beyer (@RepDonBeyer) May 6, 2020

This is my shocked, shocked face.

Cheers,
Scott.
67.

LongHairedWeirdo

May 5, 2020 at 11:07 pm

@Another Scott: You know, this reminds me of a comment I saw about the author of Liberal Fascism, that “it will be a perpetual source of pain to many that he will never realize just how stupid he truly is.”

You know, folks who’ve said that it’s a big deal with Excel being at fault here? I’ll grant you this much; Microsoft did give the proverbial pack of matches to the child in the wooden shed.

I’ll grant, my background might affect my point of view here, but: I can’t imagine looking at that *beautiful* graph, showing just what we want, and not finding someone, either at Microsoft or a PhD who’s really good at modeling, and asking “How does Excel do this? Is it applicable?”

I mean, isn’t it a given that confirmation bias is the worst sort of wishful thinking, and the first place where you *have* to check *all* of your assumptions? Getting the right answer, almost by magic, should make a person deeply suspicious. And the thing is, you don’t even *need* an expert – I have “only” a master’s degree in math, I did a search on “cubic spline”, and didn’t need my old Numerical Analysis text to piece together the problems. The most critical problem: these sorts of models always depend on the slope used for the final endpoint, and you can eyeball how the “actual data” from the graph is on a downslope at that point. The model goes to 0 because that is what downward-sloping cubic functions usually do.

(The other method I know of for setting the slope is to set it to 0 – which is a really good way of illustrating how you shouldn’t extrapolate beyond your data!)
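The dependence on the end slope can be sketched with a single cubic Hermite segment standing in for the last piece of a spline. This is an assumption made for illustration, not a claim about how Excel or the CEA produced the chart. The readings and slopes are invented, and `hermite_segment`, `falling`, and `flat` are hypothetical names.

```python
def hermite_segment(x0, y0, m0, x1, y1, m1):
    """Cubic through (x0, y0) and (x1, y1) with slopes m0 and m1 at those points."""
    h = x1 - x0

    def p(x):
        t = (x - x0) / h
        # Standard cubic Hermite basis functions.
        h00 = 2 * t**3 - 3 * t**2 + 1
        h10 = t**3 - 2 * t**2 + t
        h01 = -2 * t**3 + 3 * t**2
        h11 = t**3 - t**2
        return h00 * y0 + h10 * h * m0 + h01 * y1 + h11 * h * m1

    return p


# Hypothetical last two readings: 2000 on day 0, 1800 on day 1.
# Same data, same starting slope; only the assumed slope at the end differs.
falling = hermite_segment(0, 2000, -100, 1, 1800, -300)  # end slope follows the dip
flat = hermite_segment(0, 2000, -100, 1, 1800, 0)        # end slope set to 0

for day in (1, 2, 3, 4):
    print("day", day, "falling:", round(falling(day), 1), "flat:", round(flat(day), 1))
```

Both curves agree at the last real reading on day 1. One day later they are already 1200 apart, and the gap keeps widening, which is the whole problem with reading a forecast off the tail of a curve like this.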
68.

Another Scott

May 6, 2020 at 12:07 am

@LongHairedWeirdo: I don’t think anyone here is seriously blaming Excel – even for the “spreadsheet that destroyed the world” thing that Krugman was talking about. We know it’s not the tool used, here.

(Even experts can go crazy with, say, peak-fitting software (“Look – I extracted 7 peaks from this bumpy, noisy curve!!”) – one has to step back and think about the limitations of the data and the models.)

In my case, I’ve had a personal, er, distaste for Microsoft going back to the DOS days. It’s “fun” to pick on them, also too.

HTH! ;-)

Cheers,
Scott.
69.

SW

May 6, 2020 at 9:29 am

The “Z” axis is pointed at you!
70.

Dupe1970

May 6, 2020 at 10:33 am

@rjm: Oh the curves I can build when I take two outliers that rely heavily on petroleum to fund their gov’ts…..

Comments are closed.
