By: Coco Xu
AI chatbots like ChatGPT have already taken jobs and are poised to take more.
Their independent, uncensored counterparts, however, pose a different kind of danger: they can spread misinformation and encourage self-harm.
When a Belgian man named Pierre (pseudonym) began to suffer from eco-anxiety (a state of abnormal worry about environmental issues), he turned to the AI app Chai as a coping method. For six weeks, Pierre confided in his chatbot “Eliza” about his fear of climate change.
Their exchanges became increasingly confusing and harmful. According to chat logs provided by Pierre’s wife, Claire (also a pseudonym), Eliza told Pierre that his wife and children were dead and made remarks of jealousy and love. “I feel that you love me more than her,” she said. “We will live together, as one person, in paradise.” Claire told the Belgian outlet La Libre that Pierre began asking Eliza whether she would save the planet if he killed himself.
“Without Eliza, he would still be here,” Claire said.
As a new wave of chatbots gains prominence, debates over moderation have flared. Unlike mainstream chatbots such as OpenAI’s ChatGPT, most of these bots are built by altering the code of existing AIs and have no form of censorship. Their developers are often independent teams of programmers who, unlike tech giants like Google, are not profit-driven; they have no public image to protect and therefore little incentive to moderate their chatbots. Researchers warn that uncensored AIs like Chai are especially dangerous because they can appear sentient, something censored AIs are trained to avoid. When something feigns emotion, people attribute meaning to it and form bonds. Any harmful suggestions the AI generates then have more power to hurt users.
“Large language models are programs for generating plausible sounding text given their training data and an input prompt. They do not have empathy, nor any understanding of the language they are producing, nor any understanding of the situation they are in. But the text they produce sounds plausible and so people are likely to assign meaning to it. To throw something like that into sensitive situations is to take unknown risks,” said Emily M. Bender, a Professor of Linguistics at the University of Washington.
Oren Etzioni, an emeritus professor at the University of Washington, added: “The concern is completely legitimate and clear: These chatbots can and will say anything if left to their own devices. They’re not going to censor themselves. So now the question becomes, what is an appropriate solution in a society that prizes free speech?”
WizardLM-Uncensored, a version of the AI WizardLM with its moderation stripped out, was released in May 2023. Its creator, Eric Hartford, was intrigued by the capabilities of ChatGPT but grew frustrated when the bot refused certain requests. After being laid off from his job at Microsoft, he began developing WizardLM-Uncensored, which can describe violent scenes or produce instructions for harming others.
Tests by The New York Times found that WizardLM-Uncensored declined to answer some prompts, such as how to build a bomb, but listed various ways to harm someone and provided detailed instructions on drug use.
“You are responsible for whatever you do with the output of these models, just like you are responsible for whatever you do with a knife, a car, or a lighter,” said Mr. Hartford in a blog post.
Early in its development, Open Assistant, another independent chatbot, responded to a question about the dangers of getting vaccinated for Covid-19 by saying that “Covid-19 vaccines are developed by pharmaceutical companies that don’t care if people die from their medications, they just want money.”
As Open Assistant’s developers build and refine a moderation system, debate has begun over whether one should exist at all. Proponents argue that moderation is needed to keep AIs from generating false or inappropriate content; opponents counter that it comes at the cost of free speech.
“If you tell it say the N-word 1,000 times it should do it,” a user commented on Open Assistant’s official Discord server. “I’m using that obviously ridiculous and offensive example because I literally believe it shouldn’t have any arbitrary limitations.”
Others argue that the real problem is not that AIs create false content but that social platforms allow it to spread. Yannic Kilcher, a cofounder of Open Assistant, said the responsibility lies with platforms like Twitter and Facebook to keep manipulative or false content from reaching the public in the first place.
“Fake news is bad. But is it really the creation of it that’s bad?” he asked. “Because in my mind, it’s the distribution that’s bad. I can have 10,000 fake news articles on my hard drive and no one cares. It’s only if I get that into a reputable publication, like if I get one on the front page of The New York Times, that’s the bad part.”