There are deep questions being asked in the area of AI safety, and numerous AI incidents to ponder, with most current examples coming from the non-embodied AI world.(1) We will likely be working on those questions for decades. But much of the discussion is based on speculation about the potential future capability of these systems.
A super-intelligence that deviously revolts and starts killing off humans on a global scale remains science fiction that assumes capabilities simply not present in current systems. Nor does it seem credible with technologies under development, despite what scaremongers are promoting.
Rather, fears of existential threats from an Artificial General Intelligence (AGI) system are a distraction from the very real and present ethical threats presented by this technology – all of which have to do with misuse and abuse by people deploying the technology rather than the technology itself developing a malicious will of its own.
We can break down practical near-term threats into four categories: evading accountability, evading transparency, creating harmful content, and polluting the information space.
Evading accountability
Deploying an opaque computer-based capability is one way of evading accountability for harm done by a defective design or nefarious intent. Keeping a conventional software design secret makes it difficult for anyone to prove there is a defect, bias, or other harmful behavior. This is especially true if the defect is a subtle one that is highly sensitive to timing or to seemingly innocuous aspects of input data. Even better if there is a human readily available to act as a Moral Crumple Zone.
Using AI/ML amplifies this effect, because even if someone has access to all the details of the implementation, it might still be difficult or impossible to explain why such a function is misbehaving. In these circumstances it can easily turn out that the burden of proof is put on the victims of occasional misbehavior rather than on the design team of a system that seems to work most of the time.
While putting the burden of proof on victims seems like something that should not happen, that is precisely what tends to happen in real life in product liability cases involving computer safety, as well as in situations in which computer software is presumed to be correct. The British Post Office / Horizon IT scandal is a case study in just this issue.(2) Even if system designers become aware that their system is causing problems, they might choose to wait and see whether victims can mount a convincing campaign to find the faults and prove that those faults caused harm.(3)
One way accountability issues might manifest for robotaxis is that a computer driver might perform a dangerous maneuver while being supervised by a human. If the person has succumbed to automation complacency, or if the maneuver is too dramatic for a person to react to in time, a crash might occur that is then blamed on that person for failure to properly supervise. Anyone trying to prove there was a computer driver design defect might well face a huge burden of time and expense to demonstrate a software defect. If the crash harms someone other than the driver, especially a pedestrian, the consequences could include significant civil liability or even criminal liability.(4)
Given the immaturity of AI/ML technology, we should expect that such systems will have various types of design defects and misbehaviors. A presumption that such a system is non-defective provides protective cover for anyone who wants to evade accountability by putting the burden of proof on victims. This especially includes systems for which shortcuts have been taken in safety engineering if there is no externally imposed requirement to conform to industry safety standards.
While identifying defects in software-based systems is a thorny problem, basing accountability on behavior rather than on a requirement to find and prove a particular design defect can help in this area. If the system displays some adverse behavior, proof that the behavior occurred should in itself be sufficient for a victim to seek compensation, without the additional burden of having to find the technical cause of the misbehavior. This is a primary motivation for the duty of care proposal in section 5.16.
Computer software, and especially AI/ML functionality, should not be given a strong presumption of correctness. Rather, the possibility that a computer system has produced incorrect or unsafe behavior should be routinely considered, just as human error is routinely considered a possibility.
Next posting: Evading transparency
This post is a draft preview of a section of my new book that will be published in 2025.
(1) For numerous examples, see the AI Incident Database: https://incidentdatabase.ai/
(2) See section 9.1.2.1 for a discussion of the Horizon IT scandal. (This is a reference to another section in the final book, not included in this post.)
(3) This video explains the numerous technical issues found with an automotive electronic throttle control system that put crash victims in a similar position of having to prove that equipment was defective:
(4) See Smiley 2022: https://www.wired.com/story/uber-self-driving-car-fatal-crash/