Big Data Size Debates

Discussions center on where to draw the thresholds between small, medium, large, and huge datasets, on data volumes ranging from 500MB to multiple terabytes in databases and analytics, and on when data fits in RAM versus when it requires a distributed system.

📉 Falling 0.4x Databases

Comments: 4,327

Topic ID: #1917

Activity Over Time

[Chart: yearly activity, 2007 to 2026. Value labels in source order: 101, 172, 232, 169, 259, 325, 314, 239, 259, 301, 327, 339, 374, 378, 289.]

Top Contributors

jandrewrogers (22) riku_iki (18) gaius (16) hobs (14) sgarland (13)

Keywords

RAM GB DB DBA DOS XML ELT SQL AWS i.e data ram datasets rows nodes size dataset fit large db

Sample Comments

nikita • Mar 20, 2018

Can you share the whole use case as well as data size?

morcus • Sep 15, 2025

Surely most queries should process much less than 1 TB of data?

outside1234 • Oct 31, 2019

I see you haven't worked with a truly titanic amount of data then :)

Xorlev • Nov 12, 2014

You probably aren't the target audience if your database fits in RAM.

vicaya • May 22, 2009

Sorry, but 500MB DB size is a tiny dataset these days (anything < 1GB is tiny, < 4GB is small, < RAM on a single node (~8GB-64GB) is medium, < Disks on a single node (~128GB to a few TB) is large, huge dataset requires multiple nodes and typically above 128TBs.)
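The tiers in the comment above can be sketched as a small classifier. This is only an illustration of that one commenter's 2009 rule of thumb, not an agreed standard: the function name `classify_dataset` and the default node sizes (64GB of RAM, 4TB of disk) are assumptions picked from the upper end of the ranges the comment mentions, and binary units (1GB = 1024^3 bytes) are assumed as well.

```python
GB = 1024 ** 3  # assumption: binary units
TB = 1024 ** 4


def classify_dataset(size_bytes, node_ram_bytes=64 * GB, node_disk_bytes=4 * TB):
    """Map a dataset size to the tiers described in the comment.

    node_ram_bytes and node_disk_bytes are hypothetical defaults; pass
    the real capacity of a single node to get a meaningful answer.
    """
    if size_bytes < 1 * GB:
        return "tiny"
    if size_bytes < 4 * GB:
        return "small"
    if size_bytes < node_ram_bytes:
        return "medium"  # still fits in RAM on one node
    if size_bytes < node_disk_bytes:
        return "large"  # fits on one node's disks, but not in RAM
    return "huge"  # needs multiple nodes


if __name__ == "__main__":
    for label, size in [("500MB", 500 * 1024 ** 2), ("1TB", 1 * TB)]:
        print(label, "->", classify_dataset(size))
```

Because the RAM and disk limits are parameters, the same dataset can move between tiers as hardware changes, which is much of what the thread argues about.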

delive • Nov 8, 2024

What would you consider to be small or medium? I have a use case for analytics on ~1 billion rows that are about 1TB in postgres. Have you tried on that volume?

alexchamberlain • Oct 7, 2013

Maybe they are dealing with a few gig of CSV time series?

subleq • Aug 11, 2017

I don't have any experience with this type of thing, so that sounds like an incredibly large amount of data. What are you doing that requires it? What type of useful queries are you even able to perform over 432 billion records?

dvirsky • Apr 18, 2014

9M rows? Luxury! Big data starts at 100M rows! /s

kcorbitt • Nov 22, 2013

Data too large to fit in memory; anything that you're going to want to query relationally (as opposed to just key/value) in the future.