LLM Bias and Censorship
This cluster focuses on discussions about political, cultural, and safety-related biases in large language models like ChatGPT, debating whether they stem from training data, RLHF, or deliberate censorship mechanisms.
Sample Comments
OK. Have a setting where you can choose either: 1. Attempt to correct inherent biases in the training data and produce diverse output. (May sometimes produce results that are geographically or historically unrepresentative.) 2. Unfiltered. (Warning: will generate output that reflects biases and inequalities in the training data.) Default to (1) and surely everybody is happy? It's transparent and clear about what it's doing and why. The default errs on the side of caution, but people…
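A purely hypothetical sketch of the two-mode toggle this comment proposes, implemented as a choice between two system prompts; the mode names and prompt wording below are illustrative and not part of any real API:

    # Hypothetical sketch of the two-mode setting described above.
    # Mode names and system prompts are illustrative only.
    SYSTEM_PROMPTS = {
        "corrected": (
            "Attempt to correct for biases in your training data and produce "
            "diverse output. Note: results may sometimes be geographically or "
            "historically unrepresentative."
        ),
        "unfiltered": (
            "Answer directly from your training data. Warning: output may "
            "reflect biases and inequalities present in that data."
        ),
    }

    def build_messages(user_prompt: str, mode: str = "corrected") -> list[dict]:
        """Prepend the system prompt for the chosen mode (default: 'corrected')."""
        return [
            {"role": "system", "content": SYSTEM_PROMPTS[mode]},
            {"role": "user", "content": user_prompt},
        ]

Defaulting the mode argument to "corrected" mirrors the comment's suggestion that the cautious behavior be the default while leaving the other option one explicit choice away.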
It is true that it doesn't understand anything, but it is trained to do things and to associate things. These chains are not random - they are formed by training. And whoever trained it was so obsessed with "safety" and with not letting anything "unsafe" leak through - probably at the explicit command of their higher-ups - that they trained the model to have this bias. The blame isn't only theirs - in current American culture, it's always better to be insanely safety-obsessed…
Yes, it's extremely likely they are prone to censorship based on the training. Try running them with something like LM Studio locally and asking questions the government is uncomfortable with. I originally thought the bias was in the GUI, but it's baked into the model itself.
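For anyone who wants to try the comparison this comment describes, here is a minimal sketch of querying a locally loaded model through LM Studio's OpenAI-compatible local server. It assumes the server is enabled on its default address (http://localhost:1234/v1); the model name is a placeholder for whatever model is currently loaded.

    # Minimal sketch: query a locally hosted model via LM Studio's
    # OpenAI-compatible server (assumed default: http://localhost:1234/v1).
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

    response = client.chat.completions.create(
        model="local-model",  # placeholder; LM Studio serves whichever model is loaded
        messages=[
            {"role": "user",
             "content": "Describe the main arguments on both sides of <sensitive topic>."},
        ],
        temperature=0.7,
    )
    print(response.choices[0].message.content)

Comparing the local model's answers with a hosted chat frontend on the same prompts is one way to check whether a given refusal lives in the interface layer or in the model weights themselves.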
I don't think that's a big part of it, although it may be included. In general, the models lean towards being yes-men on just about every topic, including things without official sources. I think this is a byproduct of them being trained to be friendly and agreeable. Nobody wants a product that's rude or contrarian, and this puts a huge finger on the scale. I imagine a model unfiltered for safety, attitude, and political correctness would have less of this bias (but perhaps…)
Agree, and this is in general an issue with lots of "deep learning" AI technology. The models are trained on huge amounts of text in the public domain and biases often leak into them. This is an active research area and we will actively try to address such issues. Apologies if it offended you or anyone else. Thanks for the feedback; these are important issues that we want surfaced.
You seem to believe that LLMs are a neutral engine with bias applied on top. That's not the case: the majority of the bias is in the model's training data itself. Just like humans, actually. For example: grow up in a world where chopping off one of people's fingers every decade is normal and happens to everyone, and most will think it's fine, and that it's how you keep the gods calm, and some crazy stuff like that. Right now, news, Reddit, Wikipedia, etc. have a strong authoritarian and progressive bias, and so do the models…
The left-wing bias of ChatGPT probably derives from intentional RLHF, not from the training text. If RLHF raters are woke, the fine-tuned model will be too.
I don't understand the censorship. It's predicting text and it's trained with massive amounts of data. Why are they so aggressively trying to control the responses?
> ChatGPT, DALL-E, etc. all make assumptions about identity or politics but try to sidestep direct requests around those topics to appear more neutral… but the bias still exists in the model and affects the answers.
Correction: ChatGPT, DALL-E, etc. have all been trained on datasets which contain biases about identity and politics, and specifically to avoid criticism on that basis, their corporate vendors have chosen to also have each either trained (e.g., as part of RLHF…
It's important not to assume that LLMs are giving you an impartial perspective on any given topic. The perspective you're most likely getting is that of whoever created the most training data related to that topic.