AI Model Evaluations

This cluster collects discussions about evaluating, comparing, and using AI/ML models, including debates over their quality, accuracy, and usability for specific tasks, and calls for more transparency or improvement.

➡️ Stable 1.2x AI & Machine Learning
Comments: 5,076
Years Active: 20
Top Authors: 5
Topic ID: #2579

Activity Over Time

Comments per year:

2007: 2
2008: 18
2009: 25
2010: 25
2011: 46
2012: 66
2013: 48
2014: 73
2015: 94
2016: 143
2017: 200
2018: 218
2019: 238
2020: 310
2021: 321
2022: 349
2023: 798
2024: 712
2025: 1,307
2026: 89

Keywords

IMO OP ML O1 USD RAB models model frontier run eyes output bug feature o1 booster stuff want

Sample Comments

spookthesunset Jul 6, 2023

If it’s based on models, it’s only as good as the model

compumike Mar 16, 2023

Curious, what would have been a better model?

eurekin Jul 26, 2023

Hi, so what are you using the models for?

vidarh Apr 30, 2023

I have nothing to gain from spending time testing models for you because whatever I pick will just seem like cherry picking to you, and it doesn't matter to me whether or not you agree on the usability of these models. They work for me, and that's all that matters to me. Try a few completions instead of a question. Or don't.

skp1995 Nov 6, 2024

Honestly we can, I haven't prompted it enough. What do you want to use the model for?

stainablesteel Jan 17, 2024

so the takeaway is basically, don't run a model if you don't know where it came from

osigurdson Mar 9, 2024

What is going to happen? Please publish your models so we can run them ourselves.

DoofusOfDeath Sep 11, 2018

Not sure it's possible. The more accurate the models get, the buggier they are.

aaronblohowiak Oct 25, 2025

what's the most effective model you've seen?

politelemon Sep 27, 2025

I'm not seeing the equivalence. Isn't the announcement here to let you run any model?