Open Sourcing Datasets

Commenters repeatedly ask that datasets collected from sources such as Hacker News, GitHub, and StackOverflow be released publicly, open sourced, and made freely downloadable so the community can analyze and experiment with them.

📉 Falling 0.5x Open Source
Comments: 4,838
Years Active: 20
Top Authors: 5
Topic ID: #2419

Activity Over Time

Comments per year:
2007: 8
2008: 30
2009: 143
2010: 160
2011: 201
2012: 172
2013: 265
2014: 237
2015: 234
2016: 344
2017: 339
2018: 333
2019: 274
2020: 360
2021: 295
2022: 352
2023: 380
2024: 354
2025: 323
2026: 34

Keywords

AI StackOverflow US HN www.bbc gwosc.org opengovernmentdata.org GIS youtube.com NYC data datasets dataset open public raw data data available releasing collected available

Sample Comments

cjonas May 12, 2023

Would be nice if there was an open source version of this, where the data was published for the public to learn from

jruohonen Jan 30, 2024

It would be interesting to have a public good dataset. Ref.: https://news.ycombinator.com/item?id=38149802

Keyframe Oct 27, 2022

Fantastic! Thanks for this. Do you plan to open the raw data you collected for others to tinker with it?

sitkack Nov 5, 2015

You could crawl github and bitbucket, make a public dataset available. Others could slice and dice.

darawk Sep 24, 2018

Datasets like this need to be public. They're too valuable to be silo'd by some little AI startup.

jessekirchner Jul 6, 2009

It is always good to make raw data available to the masses. Nice work!

shubb Nov 14, 2019

Probably money to be made grabbing hard to get hold of open datasets and listing them here until someone complains?

donw Jun 25, 2021

It’s not very “open source” when your datasets aren’t available for download (unless that has changed?)

krishna2 Mar 30, 2016

Super interesting. Thanks for all the curation and relevant set of links that you are promptly posting/replying. StackOverflow data is available for free too - that would be one awesome dataset to let folks get their hands on. Any plans for that?

xwvvvvwx Dec 10, 2020

Good. Hopefully this data will be open sourced.