Wikipedia Database Dumps

Comments primarily suggest using official Wikipedia database dumps, Wikidata, DBpedia, and tools like Kiwix for offline access instead of scraping Wikipedia.
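Downloading rather than scraping is straightforward in practice. Below is a minimal Python sketch that streams the English Wikipedia pages-articles dump from dumps.wikimedia.org; the "latest" filename follows the published naming scheme, but check the dump index for the current run before depending on it.

```python
import requests

# Official dump host; "latest" is a symlink to the most recent complete run.
DUMP_URL = ("https://dumps.wikimedia.org/enwiki/latest/"
            "enwiki-latest-pages-articles.xml.bz2")

def download_dump(url: str = DUMP_URL, dest: str = "enwiki.xml.bz2") -> None:
    """Stream the dump to disk so the multi-GB file never sits in memory."""
    with requests.get(url, stream=True, timeout=60) as resp:
        resp.raise_for_status()
        with open(dest, "wb") as out:
            for chunk in resp.iter_content(chunk_size=1 << 20):  # 1 MiB chunks
                out.write(chunk)

if __name__ == "__main__":
    download_dump()
```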

➡️ Stable (0.5x) · Databases
1,856 comments · 20 years active · 5 top authors · Topic ID #7732

Activity Over Time

2007: 5 · 2008: 20 · 2009: 38 · 2010: 36 · 2011: 56 · 2012: 82 · 2013: 110
2014: 80 · 2015: 68 · 2016: 70 · 2017: 93 · 2018: 66 · 2019: 89 · 2020: 140
2021: 225 · 2022: 181 · 2023: 195 · 2024: 111 · 2025: 170 · 2026: 21

Keywords

kiwix.org, WordNet, WikiData, ZIM, wikidata.org, kaggle.com, robots.txt, GB, Q2567859, americantheatre.org, wikipedia download, just download, dumps, database dump, offline archive, rdbms, crawl

Sample Comments

0x457 · Aug 25, 2025

So weird to scrape Wikipedia when you can just download a DB dump from them.

stickfigure · Aug 25, 2025

There are downloadable, offline versions of wikipedia.
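The offline versions commenters point to are mostly Kiwix's ZIM archives from kiwix.org. A hedged sketch of reading one, assuming the python-libzim package and following its documented reader API; the ZIM filename is illustrative:

```python
from libzim.reader import Archive  # pip install libzim

def show_main_page(path: str = "wikipedia_en_all_nopic.zim") -> None:
    """Open a Kiwix ZIM archive and print the start of its main page."""
    zim = Archive(path)
    item = zim.main_entry.get_item()  # main_entry resolves to the front page
    print(bytes(item.content).decode("utf-8")[:500])

if __name__ == "__main__":
    show_main_page()
```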

Phithagoras · Dec 10, 2021

Why not just download Wikipedia from Wikipedia? https://en.wikipedia.org/wiki/Wikipedia:Database_download
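Once a dump is on disk, it can be read without decompressing it in full. This standard-library sketch streams page titles straight out of the bz2-compressed XML export (the local filename is an assumption):

```python
import bz2
import xml.etree.ElementTree as ET

def iter_titles(path: str = "enwiki.xml.bz2"):
    """Yield page titles from a pages-articles dump, streaming the XML."""
    with bz2.open(path, "rb") as f:
        for _event, elem in ET.iterparse(f, events=("end",)):
            tag = elem.tag.rsplit("}", 1)[-1]  # strip the MediaWiki namespace
            if tag == "title":
                yield elem.text
            elif tag == "page":
                elem.clear()  # drop the finished page subtree to bound memory

if __name__ == "__main__":
    for i, title in enumerate(iter_titles()):
        print(title)
        if i >= 9:
            break
```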

gruez · Apr 15, 2024

Can't you just download a Wikipedia archive?

mcjiggerlog · Sep 26, 2018

All from Wikipedia/Wikidata. Now they're served from my own database.
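For structured facts, the Wikidata Query Service can stand in for both scraping and a local database. A sketch against the public SPARQL endpoint, using the stock "house cats" example query from the service's own documentation:

```python
import requests

ENDPOINT = "https://query.wikidata.org/sparql"

# Ten items that are instances of house cat (Q146), with English labels.
QUERY = """
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P31 wd:Q146 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 10
"""

def run_query(query: str = QUERY) -> list[dict]:
    resp = requests.get(
        ENDPOINT,
        params={"query": query, "format": "json"},
        headers={"User-Agent": "dump-digest-example/0.1"},  # WDQS asks for a UA
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["results"]["bindings"]

if __name__ == "__main__":
    for row in run_query():
        print(row["item"]["value"], row["itemLabel"]["value"])
```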

bllguo · Oct 20, 2020

There are many ways to download Wikipedia: https://en.wikipedia.org/wiki/Wikipedia:Database_download

TheMichaelJohn · Jul 12, 2021

In lieu of scraping Wikipedia, could this project be sped up by downloading the instance of Wikipedia itself? It's not that jumbo of a file size.

Rebelgecko · Mar 25, 2020

I'm not sure what your use case is, so maybe this isn't helpful, but Wikipedia has roughly weekly database dumps that you can download, as well as static HTML (although that might be more out of date): https://en.wikipedia.org/wiki/Wikipedia:Database_download
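To see which runs are currently published, the dump index can be listed directly. A sketch that pulls the dated run directories from the enwiki index page; the href pattern is an assumption about the index's current HTML and may need adjusting:

```python
import re
import requests

INDEX = "https://dumps.wikimedia.org/enwiki/"

def list_dump_runs() -> list[str]:
    """Collect the dated (YYYYMMDD) run directories from the dump index."""
    html = requests.get(INDEX, timeout=60).text
    return sorted(set(re.findall(r'href="(\d{8})/"', html)))

if __name__ == "__main__":
    runs = list_dump_runs()
    print("latest run:", runs[-1] if runs else "none found")
```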

amrrs · Jul 2, 2020

I'm sorry if I didn't understand. Wouldn't a JSON or XML type data structure (where some Wikipedia data is already stored) support this?
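On the JSON side, Wikidata publishes full-entity dumps (latest-all.json.bz2 under dumps.wikimedia.org/wikidatawiki/entities/). The file is a single JSON array written one entity per line, so it can be streamed without parsing the whole document; a sketch:

```python
import bz2
import json

def iter_entities(path: str = "latest-all.json.bz2"):
    """Yield Wikidata entities from the JSON dump, one line at a time."""
    with bz2.open(path, "rt", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line in ("[", "]"):  # array brackets sit on their own lines
                continue
            yield json.loads(line.rstrip(","))  # entity lines end with a comma

if __name__ == "__main__":
    for i, entity in enumerate(iter_entities()):
        label = entity.get("labels", {}).get("en", {}).get("value")
        print(entity["id"], label)
        if i >= 4:
            break
```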

jenno · Feb 18, 2014

You can download Wikipedia data; with only articles, it's around 20 GB.