Network Partition Handling

Comments focus on how distributed systems, clusters, and databases like Redis and Cassandra handle network partitions, node failures, and split-brain scenarios, and how they maintain consistency using Raft, Paxos, quorums, and replication.

📉 Falling 0.5x · Databases
Comments: 3,473 · Years Active: 19 · Top Authors: 5 · Topic ID: #8534

Activity Over Time (comments per year)

2008: 5 · 2009: 35 · 2010: 110 · 2011: 110 · 2012: 131 · 2013: 186 · 2014: 250 · 2015: 324 · 2016: 265 · 2017: 292 · 2018: 221 · 2019: 189 · 2020: 224 · 2021: 262 · 2022: 216 · 2023: 221 · 2024: 236 · 2025: 189 · 2026: 7

Keywords

ZK TFA HA US CAP e.g MTTF OK zawodny.com SWIM nodes partition network cluster clusters node servers redis datacenters relay

Sample Comments

sl_ (May 4, 2010)

Partitioning also happens when a node fails. But it shouldn't be that much of a problem, since they probably use an N=3/W=2 setup.
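The N=3/W=2 setup the commenter mentions is quorum replication: with N replicas, a write quorum W, and a read quorum R, any W + R > N guarantees that every read overlaps the latest acknowledged write. A minimal sketch of that check (function names are mine, not from any particular database):

```python
def quorum_overlaps(n: int, w: int, r: int) -> bool:
    """True if every read quorum intersects every write quorum (W + R > N)."""
    return w + r > n

# The commenter's N=3, W=2 setup: a write still succeeds with one
# node down, and pairing it with R=2 keeps read/write overlap.
assert quorum_overlaps(3, 2, 2)
# With W=1, R=1 a read can miss the replica that took the write.
assert not quorum_overlaps(3, 1, 1)
```

This is why a single failed node is tolerable here: writes only need 2 of the 3 replicas to acknowledge.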

DanWaterworth (Jul 26, 2012)

You can use Paxos. There will be some partitions that can cause an outage, but this is only the case when there is no majority of nodes that can communicate with each other, which in practice means you are exposing yourself to a vanishingly small risk.
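The majority condition the commenter describes is simple arithmetic: a Paxos (or Raft) group of n nodes makes progress only while a strict majority can communicate, so it tolerates losing at most (n - 1) // 2 nodes to a partition. A hedged sketch (names are mine, not from any Paxos library):

```python
def tolerated_failures(n: int) -> int:
    """Nodes that can be cut off while a strict majority remains."""
    return (n - 1) // 2

def has_majority(n: int, reachable: int) -> bool:
    """A majority-quorum group makes progress iff more than n/2 nodes can talk."""
    return reachable > n // 2

# A 5-node group rides out any partition that isolates up to 2 nodes.
assert tolerated_failures(5) == 2
assert has_majority(5, 3) and not has_majority(5, 2)
# Even sizes buy no extra tolerance: 4 nodes still tolerate only 1.
assert tolerated_failures(4) == 1
```

This is the "vanishingly small risk": an outage requires a partition pattern that leaves no majority side at all.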

jjoergensen (Oct 25, 2012)

Do not forget about the netsplit scenarios. If you have two datacenters and one has network problems, you will end up in a split-brain scenario: each datacenter may believe the other is down while considering itself available. One way to solve this is to use more boxes and majority voting, with something like Apache ZooKeeper. But none of this comes out of the box.
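The majority-voting fix works because a symmetric two-datacenter split leaves neither side with a majority, while a tiebreaker voter in a third site breaks the symmetry. A hypothetical sketch of that arithmetic (the ZooKeeper-style approach the commenter suggests; names are mine):

```python
def writable(total_voters: int, reachable_voters: int) -> bool:
    """Only a side holding a strict majority of voters may accept writes."""
    return reachable_voters > total_voters // 2

# Two datacenters with 2 voters each: a netsplit leaves each side with
# 2 of 4 voters -- neither has a majority, so neither accepts writes
# (no split-brain, but also no availability).
assert not writable(4, 2)

# Add one tiebreaker voter in a third site: the side that can still
# reach it holds 3 of 5 voters and keeps serving; the other side stops.
assert writable(5, 3)
assert not writable(5, 2)
```

The design choice is that losing availability on the minority side is the price of never having two writable leaders at once.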

cjg (Mar 29, 2022)

And then when two of the nodes die at the same time...?

oomkiller (Nov 10, 2009)

So this isn't a shared-nothing system? How do they handle failing nodes, and prevent data loss?

01HNNWZ0MV43FF (May 1, 2024)

Tbf, isn't that equivalent to a network partition followed by rebooting or replacing one node? The network can always go down in the middle of an operation.

Alifatisk (May 7, 2023)

That's interesting, I thought a Raft cluster would prevent just that!

atombender (Feb 28, 2019)

How do you do that in a distributed cluster with HA failover using Sentinel, considering that Sentinel is susceptible to partitions and drops?

purpleblue (Aug 16, 2022)

Excellent! A couple of questions:

1) In the case of a network partition, does the client that is currently connected to the leader get notified that there's a partition, or that the cluster is not in a healthy state?

2) If a client writes to the partition that will get rolled back, and all their transactions get rolled back after the partition heals, do they get notified that their data was rolled back?

imtringued (Jun 18, 2018)

1. Run two beefy nodes and accept data loss of unreplicated changes during failover.
2. Use eventual consistency (not suitable for every workload).
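One common way to realize option 2 is a last-writer-wins register: each side keeps accepting writes during the split, and replicas converge on the newest timestamp once they can talk again. A hypothetical sketch (this class is mine, not from any library), which also shows the trade-off: the older concurrent write is silently discarded.

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class LWWRegister:
    """Last-writer-wins register: keep the value with the newest timestamp."""
    value: Any = None
    timestamp: float = 0.0

    def write(self, value: Any, timestamp: float) -> None:
        if timestamp > self.timestamp:
            self.value, self.timestamp = value, timestamp

    def merge(self, other: "LWWRegister") -> None:
        """Merging is just a write carrying the other replica's state."""
        self.write(other.value, other.timestamp)

a, b = LWWRegister(), LWWRegister()
a.write("from-dc1", timestamp=1.0)  # each side accepts writes during the split
b.write("from-dc2", timestamp=2.0)
a.merge(b); b.merge(a)              # after healing, replicas exchange state
assert a.value == b.value == "from-dc2"  # both converge; the older write is lost
```

Convergence is guaranteed because merge is commutative, associative, and idempotent, but "from-dc1" disappears without any notification, which is exactly why this is not suitable for every workload.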