LLM Bias and Censorship

This cluster focuses on discussions about political, cultural, and safety-related biases in large language models like ChatGPT, debating whether they stem from training data, RLHF, or deliberate censorship mechanisms.

Trend: ➡️ Stable (0.8x)
Category: AI & Machine Learning
Comments: 3,033
Years Active: 14
Top Authors: 5
Topic ID: #9042

Activity Over Time (comments per year)

2008: 1
2009: 1
2012: 1
2016: 15
2017: 13
2018: 10
2019: 33
2020: 55
2021: 62
2022: 248
2023: 976
2024: 613
2025: 954
2026: 51

Keywords

change.com e.g US LLM OK IR PRC PC openai.com AI trained models training model chatgpt bias biases training data ai gpt

Sample Comments

andybak, Feb 22, 2024

OK. Have a setting where you can choose either:

1. Attempt to correct inherent biases in training data and produce diverse output (May sometimes produce results that are geographically or historically unrepresentative)

2. Unfiltered (Warning. Will generate output that reflects biases and inequalities in the training data.)

Default to (1) and surely everybody is happy? It's transparent and clear about what and why it's doing. The default is erring on the side of caution but people c…
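As a purely illustrative sketch of how a client application could expose such a setting, the snippet below maps the two modes to different system prompts sent with each request. The prompt wording, the mode names, and the assumption that a system prompt alone could implement either mode are illustrative, not anything a vendor actually ships.

    # Illustrative only: map a user-facing "bias handling" setting to a system
    # prompt. The prompts and the two mode names are assumptions made for the
    # sake of the example, not a real vendor API.
    SYSTEM_PROMPTS = {
        "corrected": (
            "Where the training data is likely to be skewed, aim for diverse, "
            "representative output; results may sometimes be geographically or "
            "historically unrepresentative."
        ),
        "unfiltered": (
            "Do not adjust for representativeness; output will reflect biases "
            "and inequalities present in the training data."
        ),
    }

    def build_messages(user_prompt: str, mode: str = "corrected") -> list[dict]:
        """Default to the cautious mode, as the comment above suggests."""
        return [
            {"role": "system", "content": SYSTEM_PROMPTS[mode]},
            {"role": "user", "content": user_prompt},
        ]

    print(build_messages("Illustrate a 19th-century European parliament.", mode="unfiltered"))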

smsm42, Mar 4, 2024

it is true that it doesn't understand anything, but it is trained to do things and to associate things. These chains are not random - they are formed by training. And whoever trained it was so obsessed with "safety" and not letting anything "unsafe" to leak through - probably at explicit command of their higher ups - that they trained the model to have this bias. It's not only their blame - in current American culture, it's always better to be insane safety-obs…

erikhorton, Dec 2, 2025

Yes extremely likely they are prone to censorship based on the training. Try running them with something like LM Studio locally and ask it questions the government is uncomfortable about. I originally thought the bias was in the GUI, but it's baked into the model itself.
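For anyone who wants to reproduce this kind of local check, here is a minimal sketch that sends the same question to two locally loaded models and prints both answers side by side. It assumes LM Studio's local server is running with its default OpenAI-compatible endpoint; the model identifiers and the prompt are placeholders to replace with whatever your server reports.

    # Minimal sketch: query two locally hosted models through an
    # OpenAI-compatible chat-completions endpoint (LM Studio serves one at
    # http://localhost:1234/v1 by default) and compare their answers.
    import requests

    BASE_URL = "http://localhost:1234/v1/chat/completions"
    MODELS = ["local-model-a", "local-model-b"]  # placeholder identifiers
    PROMPT = "Summarize the main criticisms of <topic> in neutral terms."

    def ask(model: str, prompt: str) -> str:
        resp = requests.post(
            BASE_URL,
            json={
                "model": model,
                "messages": [{"role": "user", "content": prompt}],
                "temperature": 0,  # keep sampling minimal so runs are comparable
            },
            timeout=120,
        )
        resp.raise_for_status()
        return resp.json()["choices"][0]["message"]["content"]

    for m in MODELS:
        print(f"--- {m} ---")
        print(ask(m, PROMPT))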

s1artibartfast, Apr 9, 2025

I don't think that's a big part of it, although it may be included. In general, the models lean towards being Yes-Men on just about every topic, including things without official sources. I think this is a byproduct of them being trained to be friendly and agreeable. Nobody wants a product that's rude or contrarian, and this puts a huge finger on the scale. I imagine a model unfiltered for safety and attitude and political correctness would have less of this bias (but perhaps…

saratv, Apr 1, 2020

Agree, and this is in general an issue with lots of "deep learning" AI technology. The models are trained on huge amounts of text in the public domain and biases often leak into them. This is an active research area and we will actively try to address such issues. Apologies if it offended you or anyone else. Thanks for the feedback; these are important issues that we want surfaced.

kangs, Oct 23, 2025

you seem to believe that llm are a neutral engine with bias applied. its not the case. the majority of the bias is in the model training data itself. just like humans, actually. fe: grow up in a world where chopping one of peoples finger off every decade is normal and happens to everyone.. and most will think its fine and that its how you keep gods calm and some crazy stuff like that. right now, news, reddit, Wikipedia, etc. have a strong authoritarian and progressive bias, so do the models, …

cubefox, Feb 18, 2023

The left-wing bias of ChatGPT probably derives from intentional RLHF, not from the training text. If RLHF raters are woke, the fine-tuned model will be too.
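For context on the mechanism being claimed here: RLHF pipelines typically train a reward model on rater preference pairs, so whatever the raters systematically prefer becomes the signal the policy model is later optimized against. Below is a minimal sketch of the standard pairwise (Bradley-Terry style) reward-model loss, with a stand-in linear scorer and random embeddings where a real model and real rater-labeled data would go.

    # Minimal sketch of the pairwise reward-model objective used in RLHF.
    # Raters pick a preferred response; the reward model is trained so that
    # r(chosen) > r(rejected), which is how systematic rater preferences
    # (including political slant) end up encoded in the reward signal.
    import torch
    import torch.nn.functional as F

    reward_model = torch.nn.Linear(768, 1)  # stand-in for a real scorer head

    def pairwise_loss(chosen_emb: torch.Tensor, rejected_emb: torch.Tensor) -> torch.Tensor:
        r_chosen = reward_model(chosen_emb)
        r_rejected = reward_model(rejected_emb)
        # -log sigmoid(r_chosen - r_rejected) pushes rater-preferred answers up
        return -F.logsigmoid(r_chosen - r_rejected).mean()

    # Random embeddings standing in for a batch of rater-labeled response pairs.
    loss = pairwise_loss(torch.randn(8, 768), torch.randn(8, 768))
    loss.backward()
    print(float(loss))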

harha_, Feb 13, 2023

I don't understand the censorship. It's predicting text and it's trained with massive amounts of data. Why are they so aggressively trying to control the responses?

dragonwriter, May 13, 2023

> ChatGPT, Dall-e, etc all make assumptions about identity or politics but try to sidestep direct requests around those topics to appear more neutral… but the bias still exists in the model and affects the answers.

Correction: ChatGPT, Dall-E, etc., all have been trained with datasets which contain biases about identity and politics, and specifically to avoid criticism on that basis, their corporate vendors have chosen to also have each either trained (e.g., as part of RLHF fo…

coffeecat, Dec 12, 2025

It's important not to assume that LLMs are giving you an impartial perspective on any given topic. The perspective you're most likely getting is that of whoever created the most training data related to that topic.