One Million Attaboys
Why cool videos about computer drivers avoiding crashes have essentially no predictive value for safety.
People love to share video clips of an automated driving feature making a dramatic crash avoidance maneuver on the road. Look at that safety move! So cool! To be sure, they are cool, and it’s fine to celebrate those videos as a sign that the technology is so much more capable than it was 25+ years ago when I started working in this field. But don’t confuse those anecdotes with anything having to do with proving safety.
On a good day computer drivers can do amazing things, and it’s fun to point that out when it happens. But those videos have almost no practical meaning for establishing safety. It is important to understand why, and to decouple the false conclusion that “this means they are safe” from those anecdotal stories. A similar situation exists for videos of a single amazing robotaxi ride: cool, but not predictive of safety. The reasoning goes back to my military career, in which I became acquainted with the concept of an “Attaboy.”
At one point in my military career I was given a handsome photocopied certificate with the title “Attaboy” for having done something worthy of minor praise. (Wish I’d kept it so I could put a copy of it here, but I didn’t.) This was a pretty typical experience at some point in one’s career and it was appreciated at the time. But the certificate also contained a bit of standard military wisdom that has stuck with me to this day. The general wording varies, but those certificates say something like this: “You are hereby awarded ONE ATTABOY for _________. One thousand Attaboys qualifies you to be a leader of men <and so on>. Note: One Awshit wipes out one thousand Attaboys.”
Much military work is mission-critical, and this concept extends well to safety. In particular, the wins are nice, and celebrating them builds morale. However, the wins must be ultra-consistent to result in long-term mission success. Individual wins help the team, but are only very weakly predictive of avoiding mission failures in the long haul.
Rare but high-consequence losses are pretty much all that matter for safety outcomes.
It is a simple matter of numbers. Concentrating on the “saves” looks at the numbers backwards if what you want is safety. Adding a few saves makes no difference. Adding even one loss dramatically changes the story.
Consider a ballpark goal of matching a human driver rate of one fatality per 100 million miles of travel (including all the drunk drivers). Let’s say you deploy a robotaxi and after a few billion miles of experience you get statistically significant fatality data that can be summarized in two ways:
For every hundred million miles you have 99,999,998 fatality-free miles instead of the required 99,999,999. This is a super-tiny difference of 1 part in 100 million in non-fatality rates.
You got 2 fatal crashes instead of the 1 fatality per 100 million miles that was your target goal — double the fatality rate.
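To make the two framings concrete, here is a minimal sketch using the illustrative numbers above (2 fatal crashes observed versus a target of 1 per 100 million miles, counting one mile per fatal crash). The same data looks like a negligible change when expressed as fatality-free miles, and like a doubling when expressed as a fatality rate.

```python
# Two framings of the same hypothetical outcome: 2 fatal crashes in
# 100 million miles versus a target rate of 1 per 100 million miles.
MILES = 100_000_000
target_fatalities = 1
observed_fatalities = 2

# Framing 1: fraction of fatality-free miles (one mile per fatal crash).
target_free_fraction = (MILES - target_fatalities) / MILES
observed_free_fraction = (MILES - observed_fatalities) / MILES
print(f"Change in fatality-free fraction: "
      f"{target_free_fraction - observed_free_fraction:.0e}")
# -> 1e-08, i.e. a difference of 1 part in 100 million

# Framing 2: fatality rate relative to the target.
print(f"Fatality rate vs. target: {observed_fatalities / target_fatalities:.1f}x")
# -> 2.0x, i.e. double the target fatality rate
```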
Every high-consequence crash matters. A lot. Boasting about how many crash-free miles have been driven is essentially meaningless.
To be useful, safety metrics need to focus on how often losses happen. Talking about millions of miles without a crash (10 million, or even 30 million miles) says nothing particularly useful about fatality risk, because those numbers are a factor of 10 to 100 smaller than the mileage needed to know how the fatality rate will turn out. (One might argue that low-severity crash rates could help predict high-severity crashes, but given that computer drivers fail for different reasons than human drivers, I find that argument unpersuasive. Mass common-cause failures of computer-based systems cannot be ignored as a possibility, and it will only take one such problem to wipe out billions of miles of fatality rate budget. We’re just going to have to see how it turns out.)
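As a rough illustration of why 10 or 30 million crash-free miles tells us so little, here is a sketch that treats fatal crashes as a Poisson process (an assumption made for illustration, not a claim about how real crashes arrive) and asks how likely a zero-fatality record is even for a hypothetical system several times worse than the 1-per-100-million-mile human benchmark.

```python
import math

# Assumed Poisson model, for illustration only: fatal crashes arrive at a
# constant per-mile rate. Human benchmark: ~1 fatality per 100 million miles.
HUMAN_RATE = 1 / 100_000_000  # fatalities per mile

def p_zero_fatalities(miles: float, rate_multiplier: float) -> float:
    """Probability of zero fatal crashes in `miles` of driving if the true
    rate is `rate_multiplier` times the human benchmark."""
    expected = miles * HUMAN_RATE * rate_multiplier
    return math.exp(-expected)  # Poisson P(k = 0)

for miles in (10_000_000, 30_000_000):
    for worse in (1, 3, 10):
        p = p_zero_fatalities(miles, worse)
        print(f"{miles:>12,} miles, {worse:>2}x human rate: "
              f"P(zero fatalities) = {p:.2f}")
```

Under these assumptions, even a system ten times worse than the human benchmark would still have roughly a one-in-three chance of logging 10 million fatality-free miles, which is why mileage milestones at that scale say almost nothing about fatality risk.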
To put things in context, if there is a video clip of a computer driver doing a nice job driving down the road, we might be inclined to give the system an Attaboy – praise for being amazing technology. Fair enough, because the idea of cars driving themselves down the road is truly amazing. To pick a number, perhaps we find reasons to award 1 Attaboy per 100 miles on average. (If it is more often than that, we need to start asking ourselves why the car is driving in a way that it needs to be saved so often! To be sure, this is a very approximate number.)
However, a single fatal crash for an AV is a big deal, because someone has died. Such events must be exceedingly rare to meet the average safety levels of human drivers. (Don’t forget that the average includes all the drunk drivers too!) If we call such an event an Awshit, then using the 1 Attaboy per 100 miles rate we get the ratio for automated vehicle features:
==> One Awshit wipes out a million Attaboys <==
With that kind of ratio, any individual example of safe driving has a vanishingly small, essentially insignificant statistical contribution to the outcome. Near-perfect safe driving (no fatality) is expected essentially all the time. The thing that matters is not the wins (the good miles), but rather the losses (the bad miles).
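For completeness, the ratio in the banner above is just the two illustrative rates divided out, using the assumed 1 Attaboy per 100 miles and the human benchmark of 1 fatality per 100 million miles:

```python
# Illustrative rates from the text above (both are rough, assumed numbers).
MILES_PER_ATTABOY = 100            # ~1 "nice driving" moment per 100 miles
MILES_PER_FATALITY = 100_000_000   # human benchmark: 1 fatality per 100M miles

attaboys_per_awshit = MILES_PER_FATALITY // MILES_PER_ATTABOY
print(f"One Awshit wipes out {attaboys_per_awshit:,} Attaboys")  # 1,000,000
```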
Drunk drivers are mentioned as being included in the human-based fatality rate that (unlike for automated systems) is actually available. Isn't another relevant point that robotaxi companies currently limit their operation to low-speed zones, so the comparable human fatality rate (in the same operational design domain) is even lower?
“Mass common-cause failures of computer-based systems cannot be ignored as a possibility…”
I agree, and think that this is a big point. If a human driver behaves in a way which leads to catastrophe, society limits blame to that driver, and does not presume that their actions reflect onto other drivers. (Hence the term “accident”.)
If a robot driver commits the same action, leading to catastrophe, what would distinguish that robot from all other robots, at least of a certain class? After the pedestrian dragging incident in SF, many/all Cruise vehicles were removed from service, not just the one which dragged the pedestrian. I don’t know that we label a robot harming a human as an “accident”. (How often does SkyNet, of Terminator fame, appear in your lectures? It appears in some of mine! 😁)
But Waymo was seen as distinct from Cruise, and let’s imagine there is some objective criterion by which Waymo is “better”. Should any AV class be allowed to operate if it is not the best? The pedestrian is not a party to the market contract between AV provider and passenger, yet still faces a risk over which they have no control. We have a legal system in place that addresses the heinous actions of individual human drivers, but I don’t see that carrying forward to individual robots. I posit that Asimov’s First Law of Robotics will describe the cultural, and eventually legal, norm.
This week I’m giving a speech regarding the organizational dysfunction that led to the (compared to AV) minor software lapses behind the GM Ignition Switch and Boeing 737 Max MCAS crashes. Imagine a sociopath introducing a virus into a control system that randomly and rarely killed people. If caught, we have a means to deal with this person as an individual. But if organizational dysfunction of a similar magnitude is found, all products of that class are removed from operation. (GM and Boeing had recalls/groundings, and Boeing has yet to recover.)
Is there a viable business model in AVs? How would you calculate this risk?