Somebody posts this comment somewhere on Fb and I see it in my feed: “Because you are more likely to be killed by lightning then police.”

To which I immediately call bullshit. I don’t find this absurd because of the specific nature of the things being compared (police- versus lightning-caused deaths), but because of the stupidity of the statistics involved. Note that this is a post about statistics, not police killings.

The attached graphic shows NOAA data on lightning deaths by state for the 10 years from 2005 to 2014. You can see that in Delaware, New Hampshire, North Dakota, Washington DC, Alaska, Hawaii, Puerto Rico, the Virgin Islands, and Washington there were no lightning-caused deaths. This means that if there were any killings by police in any of those states during any year in that time frame, YOU WERE INFINITELY MORE LIKELY TO BE KILLED BY POLICE THAN BY LIGHTNING in those states during that time frame.

[Figure: lightning deaths by state, 2005-2014]
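
The "infinitely more likely" claim is nothing more than a ratio with zero in the denominator. Here is a minimal Python sketch of that arithmetic. The counts are made up for illustration (they are not NOAA or police data), and `relative_likelihood` is a hypothetical helper name, not anything from the source data.

```python
import math


def relative_likelihood(deaths_a: int, deaths_b: int) -> float:
    """Ratio of deaths from cause A to deaths from cause B.

    Both counts are assumed to cover the same place and period, so the
    population cancels out of the ratio and only the two counts matter.
    Returns math.inf when B has zero deaths and A has at least one, and
    math.nan when both counts are zero (the ratio is undefined).
    """
    if deaths_b == 0:
        return math.inf if deaths_a > 0 else math.nan
    return deaths_a / deaths_b


# Hypothetical counts, NOT real data.
print(relative_likelihood(1, 0))   # a single death from A is enough: inf
print(relative_likelihood(3, 12))  # 0.25
```

One death from cause A against zero from cause B is all it takes to reach infinity, which is a hint that the ratio says very little on its own.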

Wow. Infinitely more likely…that’s major. Except it probably isn’t, because it’s not descriptive of anything that people should actually care about, nor is it at all predictive. For example (and this is just a hypothetical that is meant to be plausible, not presented here as fact), if you live in Florida but you make a point of never being outside when there is a storm, then you are probably never going to have any chance of dying from a lightning strike.

People generally seem to think that if X people die from some activity (in some place, for some period of time), and there are Y people in the target demographic, then X/Y is the probability of some other person dying from the same activity. But this is not at all what anybody actually cares about (and it’s wrong, but more on the wrong-ness in the next paragraph). As individuals at risk of death we don’t care about the aggregate; inasmuch as we think about it at all, we care about our own particular situation in any given place at any given time.

The notion that X/Y is the probability of some other person dying from some cause for which X people did die is completely wrong. Why is it wrong? Part (or all) of the reason most (or all) death rate statistics are wrong or meaningless is that we have absolutely no notion of how often the death-causing activity makes an attempt on any particular individual’s life. If you get a kick out of playing golf during thunderstorms in Florida, then regardless of whatever X/Y is, you’re more likely to get struck by lightning than anybody who stays inside during storms. As a scientist, I want to specify all of the parameters that go into a particular calculation (X/Y). As soon as I do that for an individual death, however, what I discover is that I have described only how one individual died. X/Y no longer tells me anything about how some other person in some other situation (similar though it might be) may die.
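
To see how badly X/Y can misdescribe any one person, here is a rough Python sketch with two invented groups. Everything in it is an assumption made for illustration: the group sizes, the number of storms spent outdoors, the `RISK_PER_EXPOSURE` constant, and the simplification that risk scales linearly with exposure (reasonable only when the per-exposure risk is tiny). None of it is real lightning data.

```python
from dataclasses import dataclass


@dataclass
class Group:
    name: str
    people: int            # number of people in the group
    storms_outside: float  # storms per person per year spent outdoors


# Hypothetical chance of a fatal strike per storm spent outdoors.
RISK_PER_EXPOSURE = 1e-5

# Hypothetical population: most people stay in, a few golf through storms.
groups = [
    Group("stays inside", people=990_000, storms_outside=0),
    Group("storm golfers", people=10_000, storms_outside=20),
]


def individual_risk(g: Group) -> float:
    """Approximate yearly risk for one person in the group (linear in exposure)."""
    return g.storms_outside * RISK_PER_EXPOSURE


# X: expected deaths per year across everyone; Y: total population.
expected_deaths = sum(g.people * individual_risk(g) for g in groups)
population = sum(g.people for g in groups)
aggregate_rate = expected_deaths / population  # the X/Y figure

print(f"aggregate X/Y: {aggregate_rate:.1e}")
for g in groups:
    print(f"{g.name}: {individual_risk(g):.1e}")
```

Under these made-up numbers the aggregate works out to roughly 2 in a million, while the storm golfers face about 2 in 10,000 and the people who stay inside face zero. X/Y describes neither group, which is the point of the paragraph above.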

-JD Cross