6 Comments
Mike Smitka

A more diverse sensor suite potentially extends the operational domain – microwave (radar), infrared (lidar) and visible light (RGB cameras). I don't know of any of the big players using infrared cameras, but I have seen a demo and talked to a German OEM that was contemplating adoption. The infrared camera offered camera-quality resolution instead of a lidar point cloud, using a set of signal-enhancement techniques (which I've seen used with radar) to cut through fog.

That's a different set of issues from the software/statistical issues you address. What I don't know is whether Waymo's operational experience validates that, or whether the additional sensors add too little to offset the conditions that degrade RGB camera performance.

Another related issue is that as a practical matter any commercial ride hailing service requires a high underlying demand density to achieve a viable level of capacity utilization. That means that geofencing is not the barrier that many claim, because service will never be available in a rural area such as where I live, where there are no taxis or Ubers for a population of 35,000. Manhattan is a great market for ride hailing. Get much into Long Island or Westchester County or the Jersey suburbs and that's no longer the case.

Roy White

The tradeoff is also compute power vs. sensors. Maybe, just maybe, Tesla can get to FSD with only cameras, but at what cost in time and compute? Tesla has already had to upgrade for free those with Hardware 2.0 or 2.5 who purchased “FSD”. Musk has said those with Hardware 3 may also require an upgrade TBD, and if so I assume it has to be a free upgrade for FSD purchasers. Even if camera-only is the ultimate goal, you can make a pretty good argument that in the early training and reinforcement-learning development phases the additional sensors provide multi-modal feedback that makes the camera approach better and cuts training time. And the argument that human drivers use their eyes without radar/lidar is flawed, because humans make a lot of mistakes and have a lot of accidents. Yes, you can fly a plane on VFR, but the ones with radar and a bevy of instruments are much safer. There’s a time to use a Cortés “burn the boats” strategy to inspire technology advancement. But there are also times when you are being obstinate just to be “right”. The camera-only approach of Musk seems closer to the latter.

Jack Browne

A larger fleet of "limited vision" vehicles gives more opportunities for the statistical accident to occur, so more wrecks, property damage, and loss of life.

What is a life worth? The makers of the Corvair and the Pinto made similar arguments that "crashes" were exceptions.

Jack

The author was being kind to Tesla. They will never have viable robotaxis without lidar. Look at poor-weather events that can fog or blur a camera lens, or at terrain that is changing or dynamic, and construction zones. Having deep data is not going to help much in those environments. Many people have been killed driving their Teslas because they trusted the autonomy tech. A LendingTree study found Tesla had the highest crash rate of 30 car brands.

Robert Thibadeau

I personally think these systems need to be constructed as fundamentally multimodal, where the modality modules can plug and play as needed when confidence in one configuration fails. The full panoply of modalities is certainly in the thousands, as with human brains. Each is a data source and sink. I think what Tesla is learning with their humanoids will feed back, in an engineering sense, into the relatively impoverished sensory and decision-action modalities available to network-connected vehicles. Clearly Tesla has bought into various modalities of communicating with people in natural language, which is particularly well suited to the use case associated with robotaxis and cybercabs. Also, local intelligence may change more quickly than global intelligence.
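The plug-and-play idea above could be sketched roughly as follows: each modality self-reports a confidence score, and the system reconfigures when a modality's confidence fails. This is a minimal illustrative sketch, not any vendor's actual architecture; all names, scores, and the threshold are hypothetical.

```python
# Hypothetical sketch: confidence-gated selection among sensing modalities.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Modality:
    name: str
    confidence: Callable[[], float]  # e.g. degraded by fog, glare, occlusion

def select_configuration(modalities: List[Modality],
                         threshold: float = 0.6) -> List[str]:
    """Keep only modalities whose self-reported confidence clears the bar."""
    active = [m.name for m in modalities if m.confidence() >= threshold]
    if not active:
        # Total sensing failure: degrade gracefully rather than drive blind.
        return ["minimal-risk-maneuver"]
    return active

# Example: fog cripples the camera, but radar still sees through it.
sensors = [
    Modality("camera", lambda: 0.3),  # blinded by fog
    Modality("radar", lambda: 0.9),
    Modality("lidar", lambda: 0.5),   # partially degraded
]
print(select_configuration(sensors))  # ['radar']
```

The point of the sketch is only that the fallback logic lives above the individual modules, so configurations can be swapped without retraining any single sensor pipeline.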

I go back to Freud's Ego, Id, and Superego as not a bad way to think about configuration management among the modalities. https://medium.com/liecatcher/freud-and-mendaciology-5bff28651972 Current technology has all three at work, but perhaps not as explicitly as natural-language systems might find useful. Being able to understand and tell stories is essential to modality management. And cars and taxis ought to be able to dream when they need to reorganize modal use.

People are just things in the modal mix. Teslas do have microphones and speakers already. Robotaxis and cybercabs should never go silent.