The author quickly dismisses hard drives because, at the time of the Glacier launch, SMR drives were too expensive because of the Thai flood. But after a few years of running S3 and EC2, Amazon must have tons of left-over hard drives which are now simply too old for a 24/7 service.
So what do you do with those three-year-old 1 TB hard drives where the power-consumption-to-space ratio is not good enough anymore? You can of course destroy them. Or you actually do build a disk drive robot, fill the disks with Glacier data, simply spin them down and store them away. Zero cost to buy the drives, zero cost for power consumption. Then add a 3-4 hour retrieval delay to ensure that those old disks don't have to spin up more than 6-8 times a day, even in the worst case.
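The 6-8 spin-ups a day figure follows from simple arithmetic if requests are batched. A minimal sketch, assuming every request that arrives within one retrieval-delay window is served by a single spin-up (the batching model and the function name are my assumptions, not anything Amazon has confirmed):

```python
def max_spinups_per_day(retrieval_delay_hours: float) -> float:
    """Upper bound on daily spin-ups for one disk.

    Assumption: all requests arriving within one delay window are
    collected and served together by a single spin-up.
    """
    if retrieval_delay_hours <= 0:
        raise ValueError("retrieval delay must be positive")
    return 24 / retrieval_delay_hours


for delay in (3, 4):
    print(f"{delay} h delay: at most {max_spinups_per_day(delay):.0f} spin-ups per day")
```

A 4 hour delay caps a disk at 6 spin-ups a day and a 3 hour delay at 8, which is where the range comes from.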
The problem with this theory, however, is that tape still would have been cheaper with roughly the same footprint. Tape also has the benefit of often being forward-compatible: as the drives improve, so does the storage capacity of the tape.
The author dismisses the idea that Amazon is using tape, but I haven't seen much evidence to support that dismissal.
The reality is probably that they use a bit of everything: a robotic disk library, a robotic tape library, maybe a robotic optical library. Maybe they're secretly ahead of the rest of current tape technology and are getting 20TB out of a single tape. They could be using custom hard drives, or even have a robotic platter library.
One thing is for sure: it's not active disk drives. And I don't believe they keep all the data on optical disks alone, given that optical disks degrade much quicker than magnetic storage.
I don't understand how tape infrastructure would be cheaper than free used disks that are on only 4-5 times a day, mostly for very short periods of time (actually, some disks would be off for months). Disks are easy to use, they are not that easy to damage, and you can easily keep each copy of the data synchronised across 2-3 different DCs using Amazon's network at night (when network usage is low).
Wouldn't it be possible to just stack them in custom/modified racks and connect them to the tape and a custom switch? Then get very cheap servers to switch through them? The hard drives don't generate much heat anyway, so only basic ventilation would be needed. Using enterprise servers for Glacier sounds like an expensive luxury. Most disks would be mostly offline anyway (small amount of changes) and I bet more than half of the disks would be used less than once a week so cost of. Then, if you want to increase the speed of the whole structure, put in a 1U server with 4 x 4TB SATA3 disks in RAID10 for caching the most used/changed parts, connect the server to a 1Gbps line and you are all done on budget. The cost is just 1 simple server, modifications to the racks (which would cost very little), tapes for the disks and a custom HDD switch.
Disks internally have lots of extra space to handle errors and are not pushed to the limits of their physical storage. Amazon could have partnered with a disk manufacturer or written their own firmware to get disks that are somewhat lossy but offer 3x the capacity that reliable drives provide.
> HDDs are not designed for long term archive. They simply fail hard when being moved or switched off for a long time.
This is also surprisingly easy to do with tape - simply leaving an LTO tape on its side can push failure rates to significant levels within a year.
If you care about data, there's no substitute for multiple copies which are regularly verified. The idea that you can leave something on the shelf and expect to reliably read it is a dangerous myth. If you care about archival, build a system with the staffing and procedures needed to make that happen. This is enormously easier and cheaper to do with spinning disk below a certain data volume, but if you have enough data the lower media cost of tape will balance out the increased overhead.
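"Regularly verified" can be as simple as re-hashing every copy against a checksum recorded when the data was written. A minimal sketch, assuming the copies are ordinary files and that a SHA-256 digest was stored at write time (the function names and the file-based layout are hypothetical, chosen only for illustration):

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives don't need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_bad_copies(paths, expected_sha256):
    """Return the copies that are unreadable or no longer match the checksum."""
    bad = []
    for path in paths:
        try:
            ok = sha256_of(path) == expected_sha256
        except OSError:
            # A copy that cannot be read at all counts as failed.
            ok = False
        if not ok:
            bad.append(path)
    return bad
```

Run on a schedule, anything this returns gets rewritten from a good copy before the remaining copies have a chance to fail too.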
Thanks to TheCondor for posting something more official.
I'd never heard about that before, either (not having had tape storage as a primary job), but a colleague was quite surprised by 20-30% failure rates for tapes first used within a year and asked our drive vendor about it. At the time, there was nothing mentioned on the media we bought and the tape vendor didn't have any official docs but the drive technician we dealt with said that it was well known among the support group - apparently a lot of customers either didn't know or didn't assume it'd be so dramatic.
> HDDs are not designed for long term archive. They simply fail hard when being moved or switched off for a long time.
A single disk certainly is bad for long term archive. But when you have thousands of disks and use multiple copies for all data with some fancy error correction, you can still get extremely high reliability; you just need to factor in the expected failure rate.
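To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes disk failures are independent and that no failed copy is repaired during the period, and the 5% failure probability is a made-up illustrative figure, not a measured rate:

```python
def loss_probability(disk_failure_prob: float, copies: int) -> float:
    """Probability that every copy of an object is lost in the same period.

    Assumptions: failures are independent and nothing is repaired
    within the period, so the per-disk probabilities simply multiply.
    """
    if not 0.0 <= disk_failure_prob <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if copies < 1:
        raise ValueError("need at least one copy")
    return disk_failure_prob ** copies


# Hypothetical 5% chance that a given old disk fails in the period.
for copies in (1, 2, 3):
    print(f"{copies} copies: {loss_probability(0.05, copies):.6f}")
```

Correlated failures (same batch of drives, same rack, same datacenter) make the real number worse, which is one reason to spread copies across locations and re-replicate as soon as a copy fails.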
You don't even need a robot. The electronics needed to digitally "switch" which disk is connected to the server would be extremely inexpensive. I'm not sure why the original article even considers the idea of a disk drive robot.
The main problem I see with this is that if you have old disks the last thing you want to be doing is spinning them up and down all the time.
The drives and firmware you want for 24/7 use are not the same as what you want for intermittent use. On big old RAID arrays, the scariest thing was powering them down because, for sure, some of the disks would not come back again.
I was under the impression that old disks generally fail by getting stuck. If they spin them up a few times a day, that would seem to solve that problem.
They could even have them just plugged in to power and lying in racks, with custom firmware that triggers the drives to automatically spin up every few hours/days. Then they could just manually plug them in to retrieve data when data from a drive is requested.
AWS does not re-use disks under any circumstance. It is strictly forbidden, in order to protect customer data. Disks also do not leave the datacenter until they have been degaussed AND crushed.
So what do you do with those three-year-old 1 TB hard drives where the power-consumption-to-space ratio is not good enough anymore? You can of course destroy them. Or you actually do build a disk drive robot, fill the disks with Glacier data, simply spin them down and store them away. Zero cost to buy the drives, zero cost for power consumption. Then add a 3-4 hour retrieval delay to ensure that those old disks don't have to spin up more than 6-8 times a day, even in the worst case.
But that's just my personal theory anyway.