Server Redundancy and Failover
Discussions center on the need for redundancy, failover systems, multiple servers, and multi-provider setups to mitigate outages from hardware failures, datacenter issues, or maintenance.
Sample Comments
Just pay 2x for the hardware and have a hot standby, 1990s-style. Practice switching between the boxes every month or so; should be imperceptible for the customers and a nearly non-event for the ops.
Isn't it about time they supported a multi-regional failover system?
(outside of GP's reply) Generically, life is messy and unpredictable; never put all your eggs in one basket. Your cloud server is sitting on a physical hypervisor which will need maintenance or go down, or something in your VM will go wrong or need maintenance. Using a basic N+1 architecture allows for A to go down and B to keep running while you work on A, whether that's DNS, HTTP, SQL, etc.
"Busting your ass" doesn't make up for the lack of failover servers.
If one machine goes down (updates, hacked, hardware failure), I still need to have a second one that can handle the incoming requests.
True. But a failure of a redundant server (say, 1 out of 3 application servers) would then not force you to cancel your night/weekend/vacation.
Because your own datacenters can't go down?
Wouldn't you also need two load balancers then? Otherwise you've still got a single point of failure. And how do you keep the failover system in sync? It's a whole can of worms to promise 100% uptime. It's super rare that a server physically breaks and suddenly goes dark with less than a day's warning, and for most applications such a once-in-5-years event is tolerable if that means the hosting costs are divided by five as well (so far I'm ~10 years in and haven
Good to have third-party redundancy; I'd think it's time to fail over to something else now, though.
Why not use this as an opportunity to make your server redundant? If you don't have the time, DRBD + front-end load balancers (e.g. on EC2, but just sending 403's, nothing else), for example. Really interesting to set up, and there are plenty of cases where good support can't really help you either.
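The N+1 idea raised in the comments above (A goes down, B keeps running while you work on A) can be sketched as a simple ordered failover loop. This is only a minimal illustration, not anyone's production setup: the names `call_with_failover`, `AllBackendsDown`, and `fake_send` are hypothetical, and the transport is a stand-in function rather than a real DNS, HTTP, or SQL client.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


class AllBackendsDown(Exception):
    """Raised when every backend in the pool has failed (hypothetical name)."""


def call_with_failover(backends: Sequence[str], send: Callable[[str], T]) -> T:
    """Try each backend in order and return the first successful result."""
    errors = []
    for backend in backends:
        try:
            return send(backend)
        except Exception as exc:
            # Any failure moves us on to the next backend in the list.
            errors.append((backend, exc))
    # Nothing answered: surface every recorded failure to the caller.
    raise AllBackendsDown(errors)


# Stand-in transport (assumption): "A" is down for maintenance, "B" answers.
def fake_send(backend: str) -> str:
    if backend == "A":
        raise ConnectionError("A is down for maintenance")
    return f"handled by {backend}"


result = call_with_failover(["A", "B"], fake_send)
print(result)  # handled by B
```

As one comment points out, whatever runs this loop (a load balancer, a client library) is itself a single point of failure unless it is also duplicated, and the sketch says nothing about keeping A and B in sync.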