Open Sourcing Datasets
Commenters repeatedly ask that datasets collected from sources such as Hacker News, GitHub, and StackOverflow be publicly released, open sourced, and made freely downloadable so the community can analyze and experiment with them.
Sample Comments
Would be nice if there was an open source version of this, where the data was published for the public to learn from
It would be interesting to have a public good dataset. Ref.: https://news.ycombinator.com/item?id=38149802
Fantastic! Thanks for this. Do you plan to open the raw data you collected for others to tinker with it?
You could crawl github and bitbucket, make a public dataset available. Others could slice and dice.
Datasets like this need to be public. They're too valuable to be silo'd by some little AI startup.
It is always good to make raw data available to the masses. Nice work!
Probably money to be made grabbing hard to get hold of open datasets and listing them here until someone complains?
It’s not very “open source” when your datasets aren’t available for download (unless that has changed?)
Super interesting. Thanks for all the curation and relevant set of links that you are promptly posting/replying. StackOverflow data is available for free too - that would be one awesome dataset to let folks get their hands on. Any plans for that?
Good. Hopefully this data will be open sourced.