Backblaze Drive Reliability

The cluster focuses on Backblaze's hard drive failure statistics, its strategy of buying cheaper drives with somewhat higher failure rates (which its software redundancy makes tolerable), and debates over price vs. reliability trade-offs in large-scale storage.

➡️ Stable 0.7x Hardware
Comments: 2,189
Years Active: 19
Top Authors: 5
Topic ID: #4971

Activity Over Time

2008: 10
2009: 57
2010: 38
2011: 80
2012: 78
2013: 106
2014: 143
2015: 107
2016: 133
2017: 120
2018: 101
2019: 122
2020: 196
2021: 199
2022: 145
2023: 173
2024: 151
2025: 222
2026: 8

Keywords

LOT SSD SATA ML TCO ZFS HD NOT OS RAID drives backblaze drive hard drives failure storage cost disks failure rate grade

Sample Comments

brianwski Aug 6, 2019

Disclaimer: I work at Backblaze.
> No wonder Backblaze has replaced most of their inventory with these two brands.
We Reed-Solomon encode each file across multiple drives in multiple machines, and we always use "enough parity" that the failure rate won't result in data loss.
So for us, we have a (pretty simple) little spreadsheet that takes into account drive failure rate as a cost, along with drive density (more dense drives means renting less data center space) and we l
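
The trade-off this comment describes can be sketched as a small cost model. This is only an illustration, not Backblaze's spreadsheet: the `DriveOption` fields, the `cost_per_tb` function, and every price, failure rate, and overhead figure below are assumptions chosen for the example. Warranty replacements are ignored.

```python
from dataclasses import dataclass


@dataclass
class DriveOption:
    name: str
    price_usd: float            # purchase price per drive (hypothetical)
    capacity_tb: float          # capacity per drive
    annual_failure_rate: float  # e.g. 0.02 means 2% of drives fail per year


def cost_per_tb(drive, years=5.0, swap_cost_usd=20.0, slot_cost_usd_per_year=30.0):
    """Rough multi-year cost per TB for one drive slot (all inputs are assumptions)."""
    expected_failures = drive.annual_failure_rate * years
    # Each failure costs a replacement drive plus the labor to swap it.
    failure_cost = expected_failures * (drive.price_usd + swap_cost_usd)
    # Space and power are charged per slot, so denser drives spread
    # that cost over more terabytes.
    slot_cost = slot_cost_usd_per_year * years
    return (drive.price_usd + failure_cost + slot_cost) / drive.capacity_tb


cheap = DriveOption("cheap", price_usd=200.0, capacity_tb=12.0, annual_failure_rate=0.03)
premium = DriveOption("premium", price_usd=260.0, capacity_tb=12.0, annual_failure_rate=0.01)

for option in (cheap, premium):
    print(f"{option.name}: ${cost_per_tb(option):.2f} per TB over 5 years")
```

With these made-up inputs the cheaper drive comes out ahead even at three times the failure rate; different inputs can reverse that result.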

rdoherty Jul 20, 2011

It's 135TB worth of drives, but with RAID don't you see a far less useable amount? Also, considering saving money on hardware costs is a key factor in Backblaze staying competitive, they must be saving money elsewhere and/or have other competitive advantages. Otherwise releasing this information would be akin to publishing a restaurant's 'secret sauce'.
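
The usable-capacity question comes down to the ratio of data drives to total drives in each redundancy group. A minimal sketch, assuming a hypothetical layout of 13 data drives plus 2 parity drives per group (the comment does not state the actual layout) and ignoring filesystem overhead:

```python
def usable_tb(raw_tb, data_units, parity_units):
    """Usable capacity when each group stores data on data_units of its
    data_units + parity_units drives."""
    return raw_tb * data_units / (data_units + parity_units)


# Hypothetical 13 data + 2 parity layout applied to the 135TB in the comment.
example = usable_tb(135, 13, 2)
print(example)  # 117.0
```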

xvolter Aug 18, 2020

I think this comes down to price. They are constantly buying drives and to avoid supply issues, it is acceptable to have a hard drive that fails to a certain degree, if the price is balanced. It would be interesting if these reports included prices, but that might be a problem for Backblaze to reveal so much about their business operational costs.

fencepost Oct 20, 2020

Backblaze's usage pattern is NOT the same as yours with 99+% certainty. Their drive conditions are also unlikely to match yours. They have a solid system set up for the installation and replacement of large numbers of drives on a regular schedule and the way they segment data would require multiple drive failures to cause data loss (I think it used to be 3 drives out of 20, but don't quote me on that). A slightly higher failure rate on a noticeably cheaper drive still works out in th
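
Why a slightly higher per-drive failure rate matters little once several drives must fail together can be illustrated with a binomial estimate. This is a sketch under strong assumptions: failures are treated as independent, `p` is a hypothetical per-drive failure probability within one rebuild window, and the group size of 20 with 3 tolerated failures is one reading of the commenter's own hedged recollection, not a confirmed figure.

```python
from math import comb


def p_group_loss(n, tolerated, p):
    """Probability that more than `tolerated` of `n` drives fail in one window,
    assuming independent failures with per-drive probability `p`."""
    survive = sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(tolerated + 1)
    )
    return 1.0 - survive


# Compare a hypothetical "better" and "worse" drive in the same group layout.
for p in (0.001, 0.002):
    print(f"p={p}: group loss probability ~ {p_group_loss(20, 3, p):.2e}")
```

Under these assumptions the loss probability stays tiny for both drives, which is the intuition behind accepting the cheaper one. Correlated failures (a bad batch, a shared power event) would break the independence assumption.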

atYevP Oct 20, 2020

Yev from Backblaze here -> Most of the comments have it. They tend to be more affordable and the failure rates aren't that much different from their counterparts. Since we account for failure with our software layer a small failure rate delta isn't a big deal and is eclipsed by our cost savings which we pass on to our customers!

aftbit Sep 18, 2023

I just bought the cheapest "Grade A" drives I could find from eBay. This is not the reliable way to do it, but as I have a 3 layer backup solution anyway, I don't really mind the risk of a drive failure. It depends on what your plans for the storage are. If you're going to fill it with bulk data that gets accessed sequentially (think media files), then performance will be fine with basically any topology or drive choice. If you are going to fill it with data for training ML

brianwski Dec 16, 2014

Backblaze employee here. We have about 900+ pods, each with 45 drives, so that makes a little less than 45,000 drives. We publish our annual failure rates, but let's say blended between drive types it's like 2-3 percent. Think we replace 4-6 drives every day? But the main reason we buy drives is we deploy 30 new pods a month right now, and it's increasing. So we deploy 1,400 drives a month or so in new pods. Sometimes we buy in 3 month batches (to get a break on price) so
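
The figures in this comment can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted above; taking exactly 900 pods is an assumption, since the comment says "900+".

```python
PODS = 900            # lower bound; the comment says "900+"
DRIVES_PER_POD = 45
NEW_PODS_PER_MONTH = 30

fleet = PODS * DRIVES_PER_POD


def failures_per_day(fleet_size, annual_failure_rate):
    """Average daily drive failures implied by an annual failure rate."""
    return fleet_size * annual_failure_rate / 365


low = failures_per_day(fleet, 0.02)
high = failures_per_day(fleet, 0.03)
new_drives_per_month = NEW_PODS_PER_MONTH * DRIVES_PER_POD

print(f"fleet: {fleet} drives")
print(f"failures per day at 2-3%: {low:.1f} to {high:.1f}")
print(f"new drives per month: {new_drives_per_month}")
```

For this lower-bound fleet, a 2-3 percent blended rate implies roughly 2 to 3 replacements a day, a little under the comment's offhand "4-6" guess, and 30 pods a month works out to 1,350 drives, consistent with "1,400 drives a month or so".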

atYevP Oct 6, 2017

Yev from Backblaze here -> We have relationships with manufacturers but don't directly buy from them (yet - fingers-crossed). It really is a combination of price/reliability. At our scale we can afford slightly higher failure rates if it means less expensive drives, so our #1 data point for purchases is the cost per gigabyte. After that we do look at failure rates, but unless it's something wildly out of "normal" we tend to live with the occasional failures. Thanks to

ignaloidas Aug 6, 2019

Not necessarily. You lose a bit of data reliability (which is an important quality for a backup provider) by using a drive that fails more often, and you have to pay wages to workers that have to replace those failing drives. At a scale 5% sooner means additional employees just to replace those drives.

8draco8 Sep 27, 2019

He's saying that performance gain may not be worth huge price tag. It's similar to Backblaze which decided to use consumer grade hard drives for their storage solution rather than pro grade drives because cost increase was not worth it.