Backblaze Drive Reliability
The cluster focuses on Backblaze's hard drive failure statistics, their strategy of buying cost-effective drives and tolerating somewhat higher failure rates because software redundancy protects the data, and debates on price vs. reliability trade-offs in large-scale storage.
Sample Comments
Disclaimer: I work at Backblaze. > No wonder Backblaze has replaced most of their inventory with these two brands. We Reed-Solomon encode each file across multiple drives in multiple machines, and we always use "enough parity" that the failure rate won't result in data loss. So for us, we have a (pretty simple) little spreadsheet that takes into account drive failure rate as a cost, along with drive density (more dense drives means renting less data center space) and we l
It's 135TB worth of drives, but with RAID don't you see a far less useable amount? Also, considering saving money on hardware costs is a key factor in Backblaze staying competitive, they must be saving money elsewhere and/or have other competitive advantages. Otherwise releasing this information would be akin to publishing a restaurant's 'secret sauce'.
I think this comes down to price. They are constantly buying drives, and to avoid supply issues it is acceptable to have a hard drive that fails to a certain degree, if the price is balanced. It would be interesting if these reports included prices, but it might be a problem for Backblaze to reveal so much about their operational costs.
Backblaze's usage pattern is NOT the same as yours with 99+% certainty. Their drive conditions are also unlikely to match yours. They have a solid system set up for the installation and replacement of large numbers of drives on a regular schedule and the way they segment data would require multiple drive failures to cause data loss (I think it used to be 3 drives out of 20, but don't quote me on that). A slightly higher failure rate on a noticeably cheaper drive still works out in th
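To illustrate why needing several simultaneous failures makes data loss unlikely, here is a minimal sketch using the commenter's unconfirmed "3 drives out of 20" figure. It assumes drive failures are independent, and the 3% annual failure rate and 7-day replacement window are made-up inputs, not Backblaze numbers. The function names are hypothetical.

```python
from math import comb


def window_failure_prob(annual_rate: float, days: float) -> float:
    """Chance one drive fails within `days`, assuming a constant failure rate."""
    return 1 - (1 - annual_rate) ** (days / 365)


def prob_at_least_k_failures(n: int, k: int, p: float) -> float:
    """Binomial tail: chance that k or more of n independent drives fail."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    # Assumed inputs: 3% annual failure rate, 7 days to replace a failed drive.
    p = window_failure_prob(annual_rate=0.03, days=7)
    # Assumed layout from the comment: data is lost only if 3 of 20 drives fail.
    print(prob_at_least_k_failures(n=20, k=3, p=p))
```

Under these assumptions the chance of losing a group in any one window is tiny, so a modest increase in the per-drive failure rate changes the outcome very little. Correlated failures (same batch, same enclosure) would make the real figure worse than this model suggests.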
Yev from Backblaze here -> Most of the comments have it. They tend to be more affordable and the failure rates aren't that much different from their counterparts. Since we account for failure with our software layer, a small failure rate delta isn't a big deal and is eclipsed by our cost savings, which we pass on to our customers!
I just bought the cheapest "Grade A" drives I could find from eBay. This is not the reliable way to do it, but as I have a 3 layer backup solution anyway, I don't really mind the risk of a drive failure. It depends on what your plans for the storage are. If you're going to fill it with bulk data that gets accessed sequentially (think media files), then performance will be fine with basically any topology or drive choice. If you are going to fill it with data for training ML
Backblaze employee here. We have about 900+ pods, each with 45 drives, so that makes a little less than 45,000 drives. We publish our annual failure rates, but let's say blended between drive types it's like 2-3 percent. Think we replace 4-6 drives every day? But the main reason we buy drives is we deploy 30 new pods a month right now, and it's increasing. So we deploy 1,400 drives a month or so in new pods. Sometimes we buy in 3 month batches (to get a break on price) so
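The fleet arithmetic in the comment above can be checked with a short sketch. It assumes failures are spread evenly over the year and uses exactly 900 pods (the comment says "900+"). The function name `fleet_estimates` is illustrative only.

```python
def fleet_estimates(pods, drives_per_pod, annual_failure_rate, new_pods_per_month):
    """Return (total drives, expected failures per day, new drives per month)."""
    total_drives = pods * drives_per_pod
    # Even-spread assumption: annual failures divided across 365 days.
    failures_per_day = total_drives * annual_failure_rate / 365
    new_drives_per_month = new_pods_per_month * drives_per_pod
    return total_drives, failures_per_day, new_drives_per_month


if __name__ == "__main__":
    # Figures quoted in the comment: 900 pods, 45 drives each, 2-3% AFR, 30 pods/month.
    for rate in (0.02, 0.03):
        total, per_day, per_month = fleet_estimates(900, 45, rate, 30)
        print(f"AFR {rate:.0%}: {total} drives, {per_day:.1f} failures/day, "
              f"{per_month} new drives/month")
```

With these inputs the even-spread model gives roughly 2 to 3 replacements a day and 1,350 new drives a month. The commenter's higher replacement guess and "1,400 or so" figure are informal and may reflect a fleet larger than exactly 900 pods.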
Yev from Backblaze here -> We have relationships with manufacturers but don't directly buy from them (yet - fingers-crossed). It really is a combination of price/reliability. At our scale we can afford slightly higher failure rates if it means less expensive drives, so our #1 data point for purchases is the cost per gigabyte. After that we do look at failure rates, but unless it's something wildly out of "normal" we tend to live with the occasional failures. Thanks to
Not necessarily. You lose a bit of data reliability (which is an important quality for a backup provider) by using a drive that fails more often, and you have to pay wages to the workers who replace those failing drives. At scale, drives failing 5% sooner means additional employees just to replace them.
He's saying that the performance gain may not be worth the huge price tag. It's similar to Backblaze, which decided to use consumer-grade hard drives for their storage solution rather than pro-grade drives because the cost increase was not worth it.