Big Data Size Debates
Discussions revolve around where the thresholds lie between small, medium, large, and huge datasets, covering volumes from 500 MB databases to multi-TB analytics workloads, and when data still fits in RAM on a single node versus requiring a distributed system.
Sample Comments
Can you share the whole use case as well as data size?
Surely most queries should process much less than 1 TB of data?
I see you haven't worked with a truly titanic amount of data then :)
You probably aren't the target audience if your database fits in RAM.
Sorry, but a 500MB DB is a tiny dataset these days (anything < 1GB is tiny, < 4GB is small, < RAM on a single node (~8GB-64GB) is medium, < disks on a single node (~128GB to a few TB) is large; a huge dataset requires multiple nodes and is typically above 128TB).
What would you consider to be small or medium? I have a use case for analytics on ~1 billion rows that are about 1TB in postgres. Have you tried on that volume?
Maybe they are dealing with a few gig of CSV time series?
I don't have any experience with this type of thing, so that sounds like an incredibly large amount of data. What are you doing that requires it? What type of useful queries are you even able to perform over 432 billion records?
9M rows? Luxury! Big data starts at 100M rows! /s
Data too large to fit in memory; anything that you're going to want to query relationally (as opposed to just key/value) in the future.
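The size buckets proposed in the thresholds comment above can be summarized as a minimal sketch. The cutoffs are the commenter's; the function name and the specific per-node RAM and disk figures chosen here (64 GB RAM, 2 TB disk) are illustrative assumptions, not a standard.

```python
def classify_dataset(size_bytes: int,
                     node_ram_bytes: int = 64 * 2**30,    # assumed ~64 GB RAM per node
                     node_disk_bytes: int = 2 * 2**40):   # assumed ~2 TB local disk
    """Bucket a dataset size using the thresholds from the comment above."""
    GB = 2**30
    if size_bytes < 1 * GB:
        return "tiny"
    if size_bytes < 4 * GB:
        return "small"
    if size_bytes < node_ram_bytes:
        return "medium"   # fits in RAM on a single node
    if size_bytes < node_disk_bytes:
        return "large"    # fits on local disk, but not in RAM
    return "huge"         # needs multiple nodes, typically above 128 TB


# Examples drawn from the thread: the 500 MB database and the ~1 TB Postgres case.
print(classify_dataset(500 * 2**20))  # -> "tiny"
print(classify_dataset(1 * 2**40))    # -> "large" (under the 2 TB disk assumption)
```

For scale, the analytics case mentioned above (about 1 billion rows at roughly 1 TB in Postgres) works out to an average of about 1 KB per row.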