All Robotaxis Have Remote Drivers
The challenge is how to minimize their involvement while remaining acceptably safe.
So much hype has been spread about robotaxis completely getting rid of human drivers. That is not going to happen this decade, and probably not the next decade either. Rather, the challenges will be reducing the time a human driver needs to be involved and ensuring safety when that happens. Reduce the human involvement enough and you might be profitable. Reduce it too much or in the wrong way, and you might go out of business after a high-profile crash.
The reason we will need human drivers for the foreseeable future is simple: machine learning-based technology is vulnerable to rare but high-consequence edge cases. Newer developments such as the use of foundation models might help chase further into the heavy tail of edge cases, but they won’t get us to acceptably safe all on their own anytime soon. Barring some major autonomous driving innovation, we’re going to need humans to clean up the loose ends.
The story of robotaxis is not really the story of making cars that can drive themselves. That part was 98.2% handled thirty years ago. That last 2% is a real bear — one that is still being worked on today — currently at 98.52%.
The story of robotaxis is the gradual weaning of them off human driver oversight, from 2% down to essentially zero. First by letting people steer with automated speed control (which apparently dates as far back as 1948). More recently, lots of car owners have taken to watching while the car steers itself. In a few cars the driver can look away from the road with a promise from the car to call them back when needed. (In other cars the drivers look away from the road for extended intervals even though they are not supposed to; sometimes with catastrophic results.) With some vehicles the driver can be telepresent from a remote location to perform all or just some of the driving task.
We’ve been at this removal of human oversight for more than a century now. Anyone who thinks this will somehow, magically, be 100% solved next year is delusional. Yet somehow people seemed shocked when it was revealed that a robotaxi company needed frequent human driver intervention. This was treated as something of a scandal for Cruise — but it shouldn’t have been. Other companies do this too, and for good reason. (Waymo and Cruise both coyly call their remote support personnel “remote operators”. But they are drivers in a practical sense. We’ll get to that.)
Below I identify five challenges in this area. All five need to be resolved if we are to have robotaxis or robotrucks operating safely at scale.
Knowing when the robotaxi does not know what to do
At some point, any computer driver will encounter something it cannot handle. That on its own is not necessarily a safety issue, although the frequency with which this happens will have a significant influence on the viability of its operational business model.
However, if the robotaxi does not know it is outside its area of competence, truly bad things can happen. If an unusual clothing color defeats its pedestrian detector, and the system does not recognize that the color falls outside the scope of its pedestrian training data, the vehicle might simply not see that person. Bad things can happen from there.
Challenge #1 is ultra-reliable detection of situations the system has not been trained to handle.
The Achilles’ heel of machine learning is over-confidence in situations it does not recognize as being beyond the scope of its training data. Situations that cause this issue are commonly known as edge cases. Another edge case example, which has now bitten two robotaxi companies, is encountering freshly poured concrete with construction zone markings that are not sufficiently close to the training data to guide the robotaxi away from plunging into the mess.
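To make Challenge #1 concrete, here is a minimal sketch (in Python) of one generic approach: treating disagreement among an ensemble of independently trained detectors, plus a confidence floor, as a signal that the current scene may be outside the training distribution. The thresholds, scores, and function name are illustrative assumptions, not any company’s actual pipeline.

# Illustrative sketch only: flag detections that may be outside the training
# distribution, using ensemble disagreement plus a confidence floor.
from statistics import pstdev

CONFIDENCE_FLOOR = 0.70   # below this average score, treat the detection as unreliable
DISAGREEMENT_CAP = 0.15   # above this spread, the ensemble does not really agree

def looks_out_of_distribution(ensemble_scores: list[float]) -> bool:
    """Return True if the detection should be treated as 'do not know'.

    ensemble_scores: per-model confidence that a pedestrian is present,
    e.g. [0.92, 0.88, 0.31] from three independently trained detectors.
    """
    mean_conf = sum(ensemble_scores) / len(ensemble_scores)
    spread = pstdev(ensemble_scores)
    return mean_conf < CONFIDENCE_FLOOR or spread > DISAGREEMENT_CAP

# Two detectors are confident, one is not: the safe call is to treat this as
# an edge case rather than trust the majority vote.
print(looks_out_of_distribution([0.92, 0.88, 0.31]))  # True
print(looks_out_of_distribution([0.95, 0.93, 0.91]))  # False

Of course, the hard part is that this check itself has to be ultra-reliable; an edge case detector that misses edge cases simply moves the problem.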
Reacting to a situation it cannot handle
Once the system detects it is not designed to handle a situation, it needs to respond to ensure safety. It is OK for a robotaxi to not be able to continue operation when encountering edge cases. But it must respond in a way that avoids unreasonable risk.
An example of getting this wrong is the Cruise robotaxi pedestrian dragging mishap, in which a robotaxi that had just collided with a pedestrian did not account for her being trapped underneath it, and dragged her when it moved. What was intended to be a vehicle movement to improve post-crash safety ended up causing serious further injury to the entrapped pedestrian.
Challenge #2 is responding safely to untrained situations. This is nearly paradoxical, and yet is required for safety.
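As a toy illustration of Challenge #2, here is a sketch of how a fallback selector might rank responses once the system has decided it is out of its depth. The maneuver names and the handful of conditions are invented for this example; a real system would have to weigh far more factors.

# Toy sketch: pick a fallback response after the system has concluded it
# cannot safely continue normal operation. Conditions and maneuvers are
# invented for illustration.
from dataclasses import dataclass

@dataclass
class Situation:
    possible_entrapment: bool    # e.g., a collision just occurred and what is under the car is unknown
    shoulder_available: bool
    fast_traffic_behind: bool

def choose_fallback(s: Situation) -> str:
    if s.possible_entrapment:
        # Moving the vehicle could make things worse; hold position and
        # escalate to a remote assistant instead of acting autonomously.
        return "stay_stopped_and_request_remote_help"
    if s.shoulder_available:
        return "pull_over_to_shoulder"
    if s.fast_traffic_behind:
        return "slow_gradually_then_stop_in_lane_with_hazards"
    return "stop_in_lane_with_hazards"

print(choose_fallback(Situation(True, True, False)))
# -> stay_stopped_and_request_remote_help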
Knowing when to ask for help
As part of the reaction to a situation it cannot handle, the robotaxi must determine if and when to ask for remote human assistance. There are various approaches to this with different tradeoffs.
Asking for help more often than needed requires additional remote assistance resources, increasing costs and potentially delaying trips. But failing to ask for help risks doing something dangerous under automated control. Metrics kept on how often a robotaxi asks for help are counterproductive to safety if there is pressure to minimize the number of remote assistant requests.
Again referring to the Cruise pedestrian dragging mishap, the robotaxi did not wait for remote help before initiating the dragging portion of the mishap.
Challenge #3 is making sure to ask for remote help when it is needed for risk mitigation — even if that makes efficiency metrics look worse.
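One way to see why efficiency pressure on help requests is dangerous is to compare the cost of asking with the expected cost of not asking. The dollar figures below are hypothetical, chosen only to show the asymmetry; they do not come from any company.

# Hypothetical numbers only: the cost of an unnecessary help request is small
# and bounded, while the cost of a bad autonomous decision is enormous.
COST_OF_HELP_REQUEST = 2.0           # dollars: a few minutes of remote assistant time
COST_OF_SERIOUS_MISHAP = 5_000_000   # dollars: injury, litigation, lost public trust

def should_request_help(p_mishap_if_autonomous: float) -> bool:
    """Ask for help whenever the expected mishap cost exceeds the cost of asking."""
    return p_mishap_if_autonomous * COST_OF_SERIOUS_MISHAP > COST_OF_HELP_REQUEST

# Even a one-in-a-million chance of a serious mishap justifies asking:
print(should_request_help(1e-6))  # True  (expected cost $5.00 > $2.00)
print(should_request_help(1e-8))  # False (expected cost $0.05 < $2.00)

With numbers anything like these, the rational policy asks for help far more often than a raw "minimize interventions" metric would reward.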
Providing safe human oversight
Even when a human driver is asked to intervene before it is too late to avoid a mishap, we cannot simply assume the outcome will be perfect every time. That driver, whether in the vehicle or remote, needs time to shift attention away from whatever task they were performing, gain situational awareness, understand what hazards might be in play, and create a plan to react to them.
An oft-quoted 10-second response time might work for drivers who are mentally ready to jump in so long as nothing unusual is happening. “Here, take the wheel to keep driving straight on this empty highway” might work out well enough in 10 seconds for many drivers much of the time.
However, if the computer driver is asking for help because all hell is breaking loose right in front of the car, we should expect the human driver to need a lot more time to get up to speed and be able to react. Maybe 45 seconds. Maybe a minute. If it is a remote driver who has reduced sensory capabilities (e.g., lack of directional audio, inability to feel bumps and shakes in the vehicle, poor video quality at night), it will be even more challenging in some circumstances.
Challenge #4 is providing enough safe driving time and sensory inputs to let the human backup driver properly engage with the driving task.
This implies that by the time something dangerous starts happening on the road, it can easily be too late to ask a human driver to jump in to save the day. In practice, it might well be that all an episodic driver can reliably do is help out after the computer driver has gotten itself stable in a still-messy situation. Even doing that might require significant limits on vehicle speed and operational concept to assure safety under remote assistant supervision, given communication channel latency and drop-outs.
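Some back-of-envelope arithmetic shows why the timing matters. The speeds, takeover delays, and network latency below are illustrative assumptions, not measurements from any deployed system.

# Illustrative arithmetic: how far does a vehicle travel while waiting for a
# human to get up to speed, or for a remote command to make a network round trip?
MPH_TO_MPS = 0.44704  # miles per hour to meters per second

def distance_traveled_m(speed_mph: float, delay_s: float) -> float:
    return speed_mph * MPH_TO_MPS * delay_s

# A 10-second takeover vs. a 45-second "all hell broke loose" takeover at highway speed:
print(round(distance_traveled_m(65, 10)))   # ~291 meters
print(round(distance_traveled_m(65, 45)))   # ~1308 meters

# Even a modest 0.5-second round-trip network latency at 40 mph city speed:
print(round(distance_traveled_m(40, 0.5), 1))  # ~8.9 meters, roughly two car lengths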
Perhaps remote driving can be made viable for small, highly curated physical spaces with super-robust dedicated high-speed connectivity. But most of the world does not fit that description.
Driving vs. Assisting
Many discussions of the role of a human driver in automated vehicle safety revolve around drawing a distinction between “driving” and “assisting”. The idea is that someone who is “driving” is responsible for safety, but someone who is “assisting” is not. The reality is not clear-cut; rather, there is a spectrum of different roles and responsibilities, and we are still trying to figure out where the cutoff should be. For current systems, all remote assistance activities that involve vehicle motion probably should be called “driving”.
Example: a Waymo remote assistant is said not to be a driver, and the car is said to be able to ensure driving safety all on its own. However, in San Francisco a remote assistant told a Waymo robotaxi that a traffic light was green when it was really red. (Link to my commentary on the original data report.) The robotaxi entered the intersection and this resulted in a mishap involving a crossing vehicle that had the right of way with a green light. Fortunately there was no serious injury involved. But if there had been, would that remote assistant have been sued? Usually whoever makes the decision to enter an intersection against a red light would be considered the driver, and blamed accordingly if that provokes a loss event. If the remote assistant was not driving, what exactly were they doing? In some sense, anyone who can provide an input to a robotaxi that proximately causes a mishap is a driver, whether the company wants to admit that or not.
Challenge #5 is properly characterizing the safety role that any remote operator plays in any particular system.
The autonomous vehicle industry narrative is that remote assistants are just there to lend a helping hand. But it is clear that what they do can compromise safety, meaning they share some driving responsibility even if they are not continuously monitoring for safety issues. That role might be as a partial contributor to safety rather than primary safety oversight, but the safety aspects of the job cannot be ignored. On the other hand, the goal of improving road safety is unlikely to be served by blaming all crashes on semi-disposable remote assistants. A more nuanced approach that aligns incentives with safety outcomes is required here.
Wrapping Up
The true race is not to perfect autonomy. The race that matters for robotaxis, robotrucks, or any other application is getting the amount of driver attention low enough to become economically viable — without having a catastrophic mishap caused by mismanaging how that is done.
As long as there are heavy-tail edge cases, there will need to be people helping robotaxis operate on public roads. But that is not an inherently bad thing. After all, these for-profit companies are there to make a buck. As long as the cost of supervision is more than offset by the net savings from removing in-vehicle drivers, things could work at scale. What won’t work is pretending that remote assistants aren’t needed, or pretending that they have no role to play in safety.
Summary of challenges:
Challenge #1 is ultra-reliable detection of situations the system has not been trained to handle.
Challenge #2 is responding safely to untrained situations. This is nearly paradoxical, and yet is required for safety.
Challenge #3 is making sure to ask for remote help when it is needed for risk mitigation — even if that makes efficiency metrics look worse.
Challenge #4 is providing enough safe driving time and sensory inputs to let the human backup driver properly engage with the driving task.
Challenge #5 is properly characterizing the safety role that any remote operator plays in any particular system.
Great overview thank you.
Thanks.
Another systemic problem might be that there is no digital pucker factor functionality in neural networks. Humans, at least in my personal experience, have serious pucker factors and adrenal pumps that kick in when something fails or the driver senses an approaching dangerous situation. Increased alertness and situational awareness rapidly follow. That’s because I (and likely you) care deeply about the outcome and likely consequences of the rapidly deteriorating driving safety margins and adjust not only due to what we see but also to what we expect to see, hear, or feel, and project that onto other vehicles or threatened road users. Sure, AVs don’t get drunk, but they really don’t care.