Independence and Safety
We are unlikely to get robotaxi safety without effective independent oversight
Safety does not happen without independent oversight. It would be great if the world were otherwise, but that is simply the way it is. There is always a combination of incentives to cut corners on safety due to cost/schedule pressure, complacency, and a simple lack of understanding as to why some safety-related requirements are in place. This is generally true of all computer-based systems.
Robotaxi design is no different. The design team needs independent oversight to ensure they get safety right. The manufacturer needs independent oversight to ensure that the safety team did not succumb to incentives or biases to issue an unjustified release approval. And so on.
This is not a matter of good intentions or discipline. It is a basic case of a need for checks and balances. No system can be expected to work well without checks and balances.
The road to unacceptable risk is paved with good intentions
Independence can be a slippery thing. Statements of good intentions might carry some weight, and one would hope that anyone performing independent confirmation of some aspect of safety would in fact have good intentions.
However, it is wise to assume that outcomes will ultimately align with incentives in any given system. No matter the intentions, it is unreasonable to expect anyone to routinely work against their own incentivized interests.
To motivate the discussion, let us consider a situation in which someone is asked to independently determine whether a robotaxi is safe enough to operate on public roads. That person works directly for the head of engineering. The head of engineering gets a huge bonus if the deployment sign-off occurs on time. The head of engineering makes it clear that if the independent assessor does not sign off, they will be immediately fired and replaced by someone who will. Moreover, the head of engineering will make it a personal mission to ensure the safety assessor suffers reputational damage so severe that they never work in the industry again. The assessor, meanwhile, might have personal circumstances (health issues, family needs, etc.) that make the prospect of losing employment an existential threat.
Regardless of good intentions, there is tremendous pressure on the assessor to sign off on safety whether or not the system is actually ready to deploy.[1] One might argue that the head of engineering would not want an unsafe system to deploy because it will just be found out later. However, in practice, everyone up the chain of command is in a similar position, all the way up through the CEO, who will get fired by the Board of Directors if the next funding round is not unlocked by an on-time deployment.
Absent some enforced requirement for independent assessment, for a somewhat safe — but not-really-safe-enough — system, internal stakeholders are likely to be highly incentivized to roll the safety dice and hope the Big Mishap will happen after the relevant liquidity event. Or at least after their performance bonuses pay off. Without independent review, there might be nobody who is properly incentivized to call out that a deployment deadline must be missed to ensure acceptable safety.
If we want to get a safe outcome, we should not be putting people in an untenable position, and then demanding they provide an “independent” opinion in a situation which has significant incentives to do otherwise. While individual cases might still work out OK, expecting a system with misaligned incentives to produce safe outcomes consistently is a mindset more aligned with fantasy than reality.
Defining independence
The ISO 26262 safety standard defines four levels of independence for safety confirmation measures, I0 through I3. The person doing the confirmation might be:
(I0, I1): a different person than the one who created the work product being reviewed;
(I2): someone independent from the team that created the work product; or
(I3): someone independent from the department that created the work product.[2]
The ISO 26262 independence levels are a start, but make some assumptions about how traditional automotive companies run, and do not consider plausible dysfunctions of higher-level management.
We define independence for safety review as follows:
A safety review is independent if the reviewers are technically competent to perform the review and there are no substantive incentives to state a particular outcome beyond an obligation to exercise sound professional judgment.
This definition exposes some underlying assumptions in the ISO 26262 definitions that are not always true – especially for an organization such as an AV manufacturer that is not even attempting to conform to ISO 26262.
You know what happens when you ass/u/me
The first assumption is that a company will assign a competent reviewer. This might not be the case if the goal is to get approval. It can be easy for technical experts to tell a story that someone without sufficient technical expertise cannot find fault with. Consider, for example, a technical review board for a Machine-Learning-based AI system whose members have strong operational or system-level safety skills, but only shallow knowledge of the unique challenges presented by AI/ML used in the relevant domain.[3] Assessor team competence is not just a matter of each member's competence in their own specialty. It also requires coverage of all relevant topics across the assessment team.
The second assumption is that there are no indirect influences on a person that might compromise independence. Fear of being fired is an obvious influence, but close personal relationships between the reviewer and the person whose work is being reviewed could also be an issue. Something as indirect as the effect on company valuation might matter if the reviewer has substantive exposure to gain or loss in that valuation.[4]
While it is difficult to create a compact summary of all the ways independence might be compromised, UL 4600 sets a number of requirements, including disclosure of any potential conflicts of interest, as well as disclosure of the basis for asserting the technical competence of any reviewers. Other requirements include not having been involved in engineering development,[5] and having a management chain sufficiently independent of the design team. Another requirement for life-critical systems is that independence must be accredited by an external entity.[6]
The idea is that there should be no substantive issues regarding a feeling of ownership for the work product (not grading one’s own homework), management pressure, or other reasons to compromise the exercise of sound technical judgment in assessing a safety-relevant work product.
What happens if the answer is no?
Beyond ensuring that an independent assessment is sound, there is the issue of what action is taken on the basis of the assessment. Is the independent assessment opinion written down to form a paper trail, or kept verbal to avoid a paper trail in case things go wrong later? Does a negative assessment amount to a veto of a release to public roads? Is it treated as a mandatory action item list to resolve before deployment? Or is it more of a soft guideline that can be (and regularly is) ignored by upper management?[7]
Independent safety assessments are of limited value if the results are ignored. However, they can provide important protective cover for high-level management if company stakeholders require a positive independent assessment result before release. That converts pressure to meet the release date into pressure to succeed at independent assessment, better aligning incentives with acceptable safety outcomes.[8]
Internal vs. external independent oversight is an important but different discussion, covered in a different section of the book.
This post is a preview draft of a section of my new book (currently undergoing final editing) that will be published in September 2025.
1. In a real-world situation, even a small fraction of the conditions mentioned would likely be sufficient to corrupt independence. The long list is given to illustrate various factors that might come into play.
2. See ISO 26262-2:2018 section 6.5.9.1, notes in Table 1.
3. The reader is invited to review the membership of technical review boards for their favorite robotaxi or robotruck company and see how many members have relevant experience in AI/ML technology, software safety, ground vehicle safety, and other relevant technical areas. We are not implying that every single member should have all those skills. But a board completely missing any of those skills should ring alarm bells.
4. Consider a situation in which an independent assessor is not being told by management to say "yes," but is vested in a few million dollars in stock options that become worthless if they say "no" and the company fails to deliver on time.
5. This imposes a critical limitation on external consultants. They can help prepare for an assessment, or they can do an assessment. But if the same external team does both, they have lost independence for the assessment.
6. This means an external organization, which has itself been accredited as competent and independent for the purpose of evaluating assessors, says that the assessor is in fact independent. See ANSI/UL 4600 section 17.3.2 for more on this topic.
7. Journalist exercise: When a company argues that an independent advisory board ensures safety, ask whether that advisory board has absolute veto power over a product release due to safety concerns (unless/until they agree all their concerns have been adequately addressed). Such an advisory board might provide substantive value, but it is probably not a substitute for an independent safety assessment.
8. Of course any incentive system can be dysfunctional. But incentivizing a successful independent assessment of safety seems more productive than incentivizing based on a deadline, if the desired outcome is acceptable safety.