Another issue with adopting AI/ML technology is that it can be misused or abused to create harmful outcomes by evading transparency. The harm is often indirect, such as biased decisions that affect some portion of a population. Or the harm might be statistical, such as increasing a health plan's profits by denying a slightly higher fraction of claims without leaving any obvious pattern in the denials.1
The transparency evasion happens when the AI/ML system makes a decision based on criteria that are either unstated or presented in a way that makes meaningful review impractical. Biases in the training data can easily produce biases in decision-making that remain invisible, because the learned behavior corresponds well to that same training data. Only an external analysis of the training data, or of statistical patterns in the outcomes, might reveal the issues.2
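To make the idea of an external outcome analysis concrete, here is a minimal sketch of the kind of statistical check an outside auditor might run on logged decisions, flagging any group whose approval rate falls well below that of the best-treated group. The decision log, group labels, and four-fifths threshold are hypothetical choices for illustration, not a prescribed method.

```python
from collections import defaultdict

# Hypothetical decision log an outside auditor might obtain: (group, approved).
decisions = [
    ("A", True), ("A", True), ("A", False), ("A", True),
    ("B", True), ("B", False), ("B", False), ("B", False),
    # ...many more records in a real audit
]

counts = defaultdict(lambda: [0, 0])  # group -> [approved, total]
for group, approved in decisions:
    counts[group][0] += int(approved)
    counts[group][1] += 1

rates = {g: approved / total for g, (approved, total) in counts.items()}
best = max(rates.values())

for group, rate in sorted(rates.items()):
    ratio = rate / best if best else 0.0
    flag = "REVIEW" if ratio < 0.8 else "ok"  # common "four-fifths" guideline
    print(f"group {group}: approval rate {rate:.2f}, ratio vs. best {ratio:.2f} -> {flag}")
```

Note that a check like this only sees outcomes; it says nothing about why the rates differ, which is exactly why access to training data and process matters.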
A related issue is how decision thresholds are set. In any decision-making system there will be situations near the decision boundary, and any individual decision near that boundary will be difficult to evaluate. A much larger evaluation would be required to detect that the decision boundary has been pushed slightly in some direction to benefit the system owner, such as by using training data that is biased in that direction.3
The issue with transparency is similar to the issue with accountability. The system makes the decisions it makes, and in practice the onus is on anyone harmed by those decisions to prove that biased training data or some other systematic problem caused the harm. The situation is exacerbated when the company operating the system claims that its techniques are proprietary trade secrets that should not be subject to independent review.
Problems of transparency, especially with analyzing proprietary software, are not new. However, with AI/ML techniques, even if the software is made available for analysis, it might be impractical to conclude much about bias from inspecting the software itself without also being able to review the training data, validation data, and engineering process used to create the AI/ML system. That makes analysis significantly more expensive, with the burden placed on anyone who believes they have been harmed by the system under investigation.
The obvious way to combat evasion of transparency is to require system designers to prove a lack of bias in the dimensions that are relevant to stakeholders. In other words, designers should do up-front the analysis that would otherwise have to be done by someone trying to prove they had been harmed by bias. That burden should not fall on those who have been harmed. The process used to prove a lack of bias should be independently reviewed, along the lines of independent review of a safety case.
Next posting: Creating harmful content
This post is a draft preview of a section of my new book that will be published in 2025.
Hypothetically, imagine an ML system is used to estimate the probability that a claim is not covered. The threshold for denial is changed from an ML-estimated 95% probability of non-coverage to 90%, increasing profits by paying out fewer claims. The reasons given for denial involve subjective judgment, or opaque policy statements that amount to “claim not covered because our AI says it should not be covered.” How would an individual claimant, or even a small group of claimants, determine that this change had been made without insider information?
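A toy numerical sketch of this hypothetical: the score distribution and claim volume below are invented, but they show how moving the denial threshold from 0.95 to 0.90 can substantially increase the number of denials while each individual near-boundary denial still looks defensible.

```python
import random

random.seed(0)

# Invented score distribution: most claims clearly covered (low scores),
# a minority plausibly non-covered (high scores).
scores = [random.betavariate(2, 8) if random.random() < 0.9
          else random.betavariate(8, 2)
          for _ in range(100_000)]

def denied(threshold):
    return sum(score >= threshold for score in scores)

print("denied at 0.95 threshold:", denied(0.95))
print("denied at 0.90 threshold:", denied(0.90))
# Every newly denied claim still has a score of at least 0.90, so each
# individual denial looks defensible; only the aggregate shift stands out.
```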
For more on algorithms/AI that deny care, see: Talia 2025, https://doi.org/10.1109/MC.2024.3387012
And even then, the bias might be coincidental rather than causal. For example, an AI/ML training data set that hypothetically resulted in discrimination against people whose surnames start with the letters C, L, W, Y, and Z would likely be heavily biased against people of Chinese heritage, without directly encoding ethnicity, due to the prevalence of those letters in common Chinese surnames.
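A small synthetic sketch of that proxy effect: the decision rule below never sees ethnicity, only the surname initial, yet it produces sharply different denial rates by heritage. The population proportions and correlation strength are made-up numbers for illustration only.

```python
import random

random.seed(1)
PROXY_INITIALS = set("CLWYZ")

def synthetic_person():
    # Made-up population in which surname initial correlates with heritage.
    if random.random() < 0.10:
        heritage = "Chinese"
        initial = random.choice("CLWYZ") if random.random() < 0.8 else random.choice("ABDEFG")
    else:
        heritage = "other"
        initial = random.choice("CLWYZ") if random.random() < 0.1 else random.choice("ABDEFG")
    return heritage, initial

people = [synthetic_person() for _ in range(50_000)]

# A decision rule that only looks at the surname initial, never at heritage.
def denied(initial):
    return initial in PROXY_INITIALS

for group in ("Chinese", "other"):
    outcomes = [denied(initial) for heritage, initial in people if heritage == group]
    print(f"{group}: denial rate {sum(outcomes) / len(outcomes):.2f}")
```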
See: https://en.wikipedia.org/wiki/List_of_common_Chinese_surnames
Consider a hypothetical outpatient medical monitoring device that alerts the patient when it is time to go to the emergency room, provided by a health insurance company to reduce needless hospital visits. People come to rely on that embodied AI device’s advice over their own common sense, and perhaps they are financially penalized for going to the hospital when the device has not registered an alert. What if the company retrains the AI/ML functionality to give slightly fewer alarms and thus save on claim costs?
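One way an outside reviewer with access to fleet-wide logs might notice such a quiet retraining is a simple before/after comparison of alert rates, sketched below as a two-proportion z-test on hypothetical device-day counts.

```python
from math import sqrt

# Hypothetical fleet-wide alert counts per device-day, before and after an update.
before_alerts, before_days = 1_200, 300_000
after_alerts, after_days = 1_050, 300_000

p1, p2 = before_alerts / before_days, after_alerts / after_days
p_pool = (before_alerts + after_alerts) / (before_days + after_days)
se = sqrt(p_pool * (1 - p_pool) * (1 / before_days + 1 / after_days))
z = (p1 - p2) / se

print(f"alert rate before: {p1:.5f}, after: {p2:.5f}, z = {z:.2f}")
# |z| > ~2 suggests the drop is unlikely to be random noise, although the test
# alone cannot say whether the change was clinically justified.
```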