Database Indexing Performance
This cluster covers discussions of indexes in relational databases, particularly clustered and covering indexes in PostgreSQL versus MySQL and SQL Server, table scans, columnar versus row storage, and the performance implications.
Sample Comments
You absolutely need a covering index in a relational DB. This way the data is read entirely from the index. The table essentially just goes along for the ride. The extra storage is a little distasteful from a conceptual / academic perspective, but it works OK below a certain scale (low billions of rows). Beyond that, use ClickHouse (but understand how it works - don't treat it as a "magic data box").
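To make the covering-index point concrete, here is a minimal sketch using Python's built-in `sqlite3` module. SQLite is only a stand-in engine chosen because it runs anywhere; the `orders` table, its columns, and the index name are hypothetical, and planner output in PostgreSQL, MySQL, or SQL Server will look different.

```python
import sqlite3

# In-memory database; table and index names are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL, note TEXT)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total, note) VALUES (?, ?, ?)",
    [(i % 100, i * 1.5, "x" * 50) for i in range(10_000)],
)
# The index holds both columns the first query needs.
conn.execute(
    "CREATE INDEX idx_orders_customer_total ON orders (customer_id, total)"
)


def plan(sql):
    """Return the planner's description of how it would run `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 is the detail text


# Every selected column is in the index, so the table is never touched.
covered_plan = plan(
    "SELECT customer_id, total FROM orders WHERE customer_id = 42"
)
# `note` is not in the index, so each match needs an extra table lookup.
uncovered_plan = plan(
    "SELECT customer_id, total, note FROM orders WHERE customer_id = 42"
)

print(covered_plan)
print(uncovered_plan)
```

In SQLite the first plan reports a covering index and the second does not, which is the "read entirely from the index" behavior described in the comment.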
PostgreSQL does not have a real clustered index, one that is maintained with each change. That alone makes it worse than MySQL for many workloads.
Unlikely - Postgres stores data row-wise, and this assumes sequentially stored columns. They even mention that issue in the blog post. It would be more likely to show up in something like Apache Arrow, which is designed to be columnar to leverage these sorts of tricks.
Columnstores don't use indexes, and many don't even support them (like BigQuery). You may be talking about clustering, which you can use to improve compression and scan speed by sorting data by commonly queried columns, but it's unnecessary, and even table scans are fast in modern columnstores that can prune partitions and use sophisticated metadata to calculate your answers. Also, it's SQL, what is preventing anyone from searching on any field they need? You don't need i
What about a DB table scan vs. an index lookup? That happens all the time.
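The scan-versus-index difference can be observed directly in a query plan. A minimal sketch, again assuming SQLite via Python's `sqlite3` as a stand-in engine, with a hypothetical `users` table and index name:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)"
)
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(5_000)],
)


def plan(sql):
    """Return the planner's description of how it would run `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


query = "SELECT name FROM users WHERE email = 'user42@example.com'"

# No index on `email` yet: the engine has to read every row.
plan_before = plan(query)

# With an index on `email`, the engine can seek straight to the match.
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = plan(query)

print(plan_before)
print(plan_after)
```

The first plan is a full scan of `users`; the second is a search that uses `idx_users_email`.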
It would be if it had real clustered indexes like MS SQL, i.e. tables stored as B-trees that are maintained with each update.
Is it? He doesn't mention aggregate queries, but seems to want to load a single row without unnecessary columns. Columnar DBs would spread a row out across the disk which would probably be less efficient.
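A toy illustration of that trade-off in plain Python. This only models the two layouts with in-memory lists; the sample data and function names are invented, and real engines add pages, compression, and caching on top.

```python
# Row-wise layout: each record is stored together.
rows = [
    (1, "alice", 30),
    (2, "bob", 25),
    (3, "carol", 41),
]

# Columnar layout: each column is stored together.
columns = {
    "id": [1, 2, 3],
    "name": ["alice", "bob", "carol"],
    "age": [30, 25, 41],
}


def fetch_row_rowstore(i):
    # One access returns the whole record.
    return rows[i]


def fetch_row_colstore(i):
    # One access per column: the record is reassembled from separate arrays.
    return tuple(col[i] for col in columns.values())


def avg_age_rowstore():
    # Has to walk every full record just to read one field.
    return sum(r[2] for r in rows) / len(rows)


def avg_age_colstore():
    # Reads only the single column the aggregate needs.
    ages = columns["age"]
    return sum(ages) / len(ages)


print(fetch_row_rowstore(1), fetch_row_colstore(1))
print(avg_age_rowstore(), avg_age_colstore())
```

Both layouts give the same answers; what differs is how many separate locations are touched. Single-row fetches favor the row layout, while aggregates over one column favor the columnar one.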
Does pg have the concept of a clustered index? If so, for frequent inserts/updates it could actually matter.
It is if your table is just a datastore with no foreign key links or indexes. See another reply in this thread: https://news.ycombinator.com/item?id=20841814
Doesn't that make sense if data is clustered by the primary key, as InnoDB does?