AI Content Moderation

This cluster covers the use of AI, including LLMs and other machine-learning models, to moderate online content, for example by flagging suspicious, abusive, or AI-generated posts. Discussions also cover hybrid human-AI approaches, training-data challenges, and risks such as bias or mis-training.

📉 Falling 0.4x AI & Machine Learning
Comments: 2,130
Years Active: 20
Top Authors: 5
Topic ID: #9408

Activity Over Time

Comments per year:

2007: 1
2008: 6
2009: 11
2010: 4
2011: 23
2012: 21
2013: 29
2014: 37
2015: 44
2016: 94
2017: 136
2018: 123
2019: 155
2020: 154
2021: 231
2022: 247
2023: 343
2024: 227
2025: 222
2026: 22

Keywords

theverge.com, LAION, AI, XKCD, IMO, LLM, ML, BS, openai.com, xkcd.com, ai, moderation, filter, human, data, train, ml, flag, machine learning, moderators

Sample Comments

mikefarah Apr 3, 2018 View on HN

you may be able to use AI to flag suspicious content. There probably wouldn't be enough readory posts yet to train it on, you may be able to train it on similar data...

captn3m0 Jan 26, 2025 View on HN

Clearly the answer is to throw an LLM to filter AI posts.

tyingq Jul 23, 2022 View on HN

It's telling that the AI doesn't just flag something, without banning it, and let a human do a proper review.
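The flag-then-review flow this comment asks for can be sketched roughly as follows. This is a minimal illustration, not any platform's actual pipeline: `ReviewQueue`, `triage`, `toy_score`, and the `0.5` threshold are all hypothetical names and values, and `toy_score` is a keyword stand-in for a real classifier.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ReviewQueue:
    """Hypothetical holding area for posts awaiting human review."""
    pending: List[str] = field(default_factory=list)


def triage(post: str,
           score: Callable[[str], float],
           queue: ReviewQueue,
           threshold: float = 0.5) -> str:
    """Flag a suspicious post for a human instead of banning it outright."""
    if score(post) >= threshold:
        # The model only flags; the final decision is left to a moderator.
        queue.pending.append(post)
        return "flagged"
    return "published"


def toy_score(post: str) -> float:
    """Stand-in for a real classifier (assumption): a crude keyword check."""
    return 1.0 if "buy now" in post.lower() else 0.0


queue = ReviewQueue()
print(triage("BUY NOW cheap pills", toy_score, queue))
print(triage("Interesting article, thanks", toy_score, queue))
print(queue.pending)
```

The point of the design is that the model's output never maps directly to a ban; it only decides what a human looks at.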

srameshc Jun 6, 2023 View on HN

Can AI not moderate posts? asking for a friend.

krzat Feb 23, 2022 View on HN

Oh boy, we need AI powered blockers that filter out this stuff.

andsoitis Oct 28, 2022 View on HN

Interesting. https://openai.com/blog/new-and-improved-content-moderation-...

mudkipdev Oct 6, 2025 View on HN

Wouldn't it be best for them to strip that out of the training data for moderation reasons?

chrisdengso Oct 7, 2021 View on HN

Haha, I call this the "Clbuttic" problem. But nowadays it can be solved with machine learning fairly easily: https://moderationapi.com/blog/moderate-text-automatically-u...
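For readers unfamiliar with the "Clbuttic" problem, it can be reproduced with a naive substring filter, as in the sketch below. The one-entry word list and both function names are hypothetical. The word-boundary variant is only a simple mitigation shown for contrast; it is not the machine-learning approach the comment refers to.

```python
import re

# Hypothetical one-entry word list, chosen only to reproduce the well-known example.
REPLACEMENTS = {"ass": "butt"}


def naive_filter(text: str) -> str:
    """Replace every occurrence of a listed string, even inside longer words."""
    for bad, good in REPLACEMENTS.items():
        text = text.replace(bad, good)
    return text


def word_boundary_filter(text: str) -> str:
    """Replace a listed string only when it appears as a whole word."""
    for bad, good in REPLACEMENTS.items():
        text = re.sub(rf"\b{re.escape(bad)}\b", good, text)
    return text


print(naive_filter("a classic mistake"))          # a clbuttic mistake
print(word_boundary_filter("a classic mistake"))  # a classic mistake
```

Word boundaries stop the substring false positive but still miss context, misspellings, and deliberate obfuscation, which is where learned classifiers are usually proposed.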

EVa5I7bHFq9mnYK Nov 16, 2024 View on HN

Might be a job for an LLM personal moderator? Give it a prompt, what kind of content you want to see, and what to filter out?

Gigachad Jan 26, 2023 View on HN

Seems like the more likely option. They could be used to live scan every post to work out what it's about and its sentiment. Similar to how ChatGPT can work out if you are asking for something it won't answer, they could be used to work out if you are saying something not allowed.