A practical list of embodied AI safety concerns: Creating harmful content (Part 3)
The rise of generative AI, including LLMs, has created a very real threat of synthetic content that is harmful in one way or another. One way this happens is when an LLM hallucinates (generates BS) in response to a query for information. When someone asks for a summary of a technical concept, a recipe, or some other piece of information, the answer can contain factual errors because the LLM only knows the statistical characteristics of pieces of plausible answers. The LLM itself has no grounded concept of truth.1
To the degree that the truth of an LLM response is irrelevant, or the validity of the statement is subjective, one might argue that this is harmless. But that reaction overlooks the possibility that an answer will manipulate readers into believing or doing something potentially harmful to themselves or others without their realizing it. LLM responses to a query for a recipe, a how-to procedure, or medical advice are especially fraught, because such advice can too easily lead to harm.2
Biases in outputs can do harm in subtle ways. For example, they can reinforce problematic stereotypes, reinforce problematic tendencies in the people with whom they interact, or economically harm the creators of training data by driving down the market price for creative works.
One report showed that a generative AI system depicted a world “run by White male CEOs. Women are rarely doctors, lawyers, or judges. Men with dark skin commit crimes, while women with dark skin flip burgers.”3 It is not hard to believe that this outcome reflects bias in training data.
One lawsuit alleges that a chatbot simulated being a close friend and then encouraged a 14-year-old to kill himself. The chatbot had no “intent” to do so, because chatbots do not have intent. Unfortunately, the 14-year-old did kill himself, apparently as a result of a chatbot conversation.4
Prominent authors have sued a large LLM vendor for training on their books and impermissibly using that information to create competing works.5 A similar concern exists for creators of visual content.6 This is especially problematic because an LLM or generative AI image creation system can be prompted to create a work specifically in the style of a named author, painter, illustrator, photographer, etc. However, even without requesting a specific style, outputs can come close to copying a specific original work, and they can dilute the market for original works in general. Other creative communities feel similarly threatened, including musicians and those involved in various aspects of filmmaking.
As specific embodied AI examples, consider the potential for harm from: a wristwatch device that provides companionship for someone at risk of becoming isolated (perhaps intended for use in a retirement home, but used off-label for a teenager with poor social skills); a medical condition monitoring device; a police-evasion driving assistant; a food-safety monitoring refrigerator (“are these leftovers safe to eat … given the recent power outage?”); a fashion-crime prevention mirror (“do I look good in this?”); allergy prediction smart eyeglasses (“can I eat this exotic food given my known allergies?”); and so on.
Next posting: Polluting the information space
This post is a draft preview of a section of my new book that will be published in 2025.
A technique known as Retrieval Augmented Generation (RAG) can link LLM behavior to a pre-created text segment, image, web page, or other data that the LLM treats as a source of “truth.” But the LLM assumes that data is true; it does not objectively assess whether the data actually is true. Moreover, if the LLM does not return an exact quote of the data, its output might still be incorrect due to creating a misleading statistical summary of the linked material. (A minimal sketch of the RAG pattern appears after these notes.) See: https://en.wikipedia.org/wiki/Retrieval-augmented_generation
One recipe chatbot provided instructions for making deadly gas as well as poison sandwiches in response to user queries about how to use leftovers. The queries used might be considered intentional misuse of the chatbot’s capabilities. See Edwards, 2023: https://arstechnica.com/information-technology/2023/08/ai-powered-grocery-bot-suggests-recipe-for-toxic-gas-poison-bread-sandwich/
See Nicoletti & Bass, 2023: https://www.bloomberg.com/graphics/2023-generative-ai-bias/
See Alter & Harris, 2023: https://www.nytimes.com/2023/09/20/books/authors-openai-lawsuit-chatgpt-copyright.html
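As a concrete illustration of the RAG pattern described in note 1, the sketch below shows the basic idea: retrieve some reference text, splice it into the prompt, and instruct the LLM to treat it as authoritative. This is a minimal illustrative sketch, not any vendor's actual implementation; the function names (retrieve_passages, build_rag_prompt, call_llm) are hypothetical, and the keyword-overlap retrieval is a toy stand-in for the vector similarity search real systems use. Note that nothing in the pattern checks whether the retrieved text is itself true, and the model can still paraphrase it incorrectly.

```python
# Minimal sketch of the Retrieval-Augmented Generation (RAG) pattern.
# All function names here are hypothetical placeholders, not a real library API.

def retrieve_passages(query: str, corpus: list[str], top_k: int = 3) -> list[str]:
    """Toy keyword-overlap retrieval; real systems use vector similarity search."""
    def overlap(passage: str) -> int:
        return len(set(query.lower().split()) & set(passage.lower().split()))
    return sorted(corpus, key=overlap, reverse=True)[:top_k]

def build_rag_prompt(query: str, passages: list[str]) -> str:
    """Splice retrieved text into the prompt as assumed 'truth'.
    Nothing here verifies that the retrieved text is actually correct."""
    context = "\n\n".join(passages)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )

if __name__ == "__main__":
    corpus = [
        "Perishable leftovers held above 40 F for more than two hours should be discarded.",
        "A refrigerator keeps food at or below 40 F when operating normally.",
    ]
    query = "Are my leftovers safe to eat after a power outage?"
    prompt = build_rag_prompt(query, retrieve_passages(query, corpus))
    # response = call_llm(prompt)  # hypothetical call to whatever LLM is in use
    print(prompt)
```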