I've been meaning to use AWS Glacier Deep Archive for years because it's so inexpensive. I just uploaded 780.7 GB of data into S3 using the Glacier Deep Archive storage class, which will cost me 77¢/month, compared to $17.96/month had I stored the same data in S3 Standard.
The tradeoff is that retrieving my data from Glacier Deep Archive takes either 9 – 12 hours or up to 48 hours, depending on how much I'm willing to spend on the retrieval ($5.07 vs. $1.56, respectively).
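The two monthly figures can be reproduced with a little arithmetic. This is a rough sketch: the per-GB rates below are assumptions (they are the list prices that happen to reproduce the numbers in this post), so check current AWS pricing for your region before relying on them.

```shell
# Assumed per-GB monthly rates; not taken from the post, verify before use.
size_gb=780.7
deep_archive_rate=0.00099
standard_rate=0.023

# $1 = size in GB, $2 = price per GB-month; prints the cost rounded to cents.
monthly_cost() {
  awk -v gb="$1" -v rate="$2" 'BEGIN { printf "%.2f\n", gb * rate }'
}

monthly_cost "$size_gb" "$deep_archive_rate"   # Glacier Deep Archive
monthly_cost "$size_gb" "$standard_rate"       # S3 Standard
```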
This type of storage is ideal for backing up large amounts of data. In my case, I backed up my entire Apple Photos library consisting of 176,470 Photos and 6,184 Videos.
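For completeness, here is a minimal sketch of how a restore could be requested later with `aws s3api restore-object`, choosing between the faster and the cheaper retrieval tier. The helper names `restore_request` and `restore_backup` and the 7-day availability window are made up for illustration; the bucket and key are the ones used later in this post.

```shell
# Build the JSON body for a restore request.
# $1 = days to keep the restored copy available, $2 = tier (Standard or Bulk)
restore_request() {
  printf '{"Days":%d,"GlacierJobParameters":{"Tier":"%s"}}' "$1" "$2"
}

# Hypothetical wrapper: ask S3 to restore the archive using the given tier.
restore_backup() {
  aws s3api restore-object \
    --bucket joe-apple-photos \
    --key Photos_Library_Backup.tar \
    --restore-request "$(restore_request 7 "$1")"
}

# Usage:
#   restore_backup Bulk       # cheaper, slower tier
#   restore_backup Standard   # faster, more expensive tier
```

The object only becomes downloadable once the restore job finishes, so this is a request, not a download.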
Challenges
Copying hundreds of gigabytes over the Internet is not a trivial matter. Not only does it require a stable Internet connection for a dozen hours, but you must also consider whether your ISP imposes a monthly data cap; Cox, for example, has a 1,024 GB monthly limit. (ISPs that use fiber optics generally don't have a monthly data cap.)
Initially, I copied my Photos Library.photoslibrary to my iCloud Drive in hopes that it would be synced and backed up to iCloud. But macOS exempts this file from being synced to iCloud, since the iCloud Photos service is specifically designed to back up photos and share them across multiple devices.
A simple alternative solution, in theory, would be to compress (ZIP) my Photos Library.photoslibrary "file" (it's technically not a single file, but many files packaged up to look like a single file in Finder). This would have been my first option, except my computer's hard drive doesn't have an extra 800 GB to store the zipped-up file prior to uploading it.
The best option I found, without the need to install any other software, was to use the Unix tar command while piping the output directly to my AWS bucket.
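Both concerns are easy to estimate before starting. The sketch below assumes decimal gigabytes and a sustained upload speed in Mbit/s; the 150 Mbit/s figure is a made-up example, not a measurement from this upload.

```shell
# Estimate transfer time in hours.
# $1 = size in GB (decimal), $2 = sustained upload speed in Mbit/s (assumed)
upload_hours() {
  awk -v gb="$1" -v mbps="$2" \
    'BEGIN { printf "%.1f\n", gb * 8000 / mbps / 3600 }'
}

# Succeeds if an upload of $1 GB fits under a monthly cap of $2 GB.
fits_cap() {
  awk -v gb="$1" -v cap="$2" 'BEGIN { exit !(gb <= cap) }'
}

upload_hours 780.7 150   # example speed only
fits_cap 780.7 1024 && echo "fits under the cap"
```

Keep in mind that a failed attempt counts against the cap too, so a retry of an upload this size could exceed a 1,024 GB limit within the same month.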
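A quick way to check whether a local archive would even fit is to compare the library's size with the free space on the disk. This is a minimal sketch; `has_room_for_copy` is a hypothetical helper, and it assumes the archive would be about as large as the library itself, since photos and videos are already compressed and tend not to shrink much.

```shell
# Succeeds if the filesystem holding the current directory has at least as
# much free space as the directory tree $1 occupies (both in 1 KiB blocks).
has_room_for_copy() {
  needed_kb=$(du -sk "$1" | cut -f1)
  free_kb=$(df -Pk . | awk 'NR == 2 { print $4 }')
  [ "$free_kb" -ge "$needed_kb" ]
}

# Usage:
#   has_room_for_copy "./Photos Library.photoslibrary" \
#     || echo "not enough room for a local archive; stream it instead"
```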
tar -cf - "./Photos Library.photoslibrary" | \
aws s3 cp - \
s3://joe-apple-photos/Photos_Library_Backup.tar \
--storage-class DEEP_ARCHIVE
It took me two attempts to pipe the data to S3. The first attempt failed with the following error:
An error occurred (InvalidArgument) when calling the UploadPart operation: Part number must be an integer between 1 and 10000, inclusive
I fixed this issue by running the following AWS CLI command to change the multipart chunk size:
aws configure set \
default.s3.multipart_chunksize 512MB
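The arithmetic behind the error is worth spelling out. When the CLI reads from a pipe it cannot know the total size in advance, so it keeps using the configured chunk size until it runs out of part numbers. The sketch below assumes the CLI's default chunk size is 8 MB and that "MB" means MiB; both are assumptions to verify against the AWS CLI documentation.

```shell
MAX_PARTS=10000   # multipart upload limit, per the error message above

# Largest stream (in GiB, rounded down) that fits in MAX_PARTS parts of $1 MiB.
max_stream_gib() {
  echo $(( $1 * MAX_PARTS / 1024 ))
}

# Smallest whole-MiB part size that fits $1 bytes into MAX_PARTS parts.
min_chunk_mib() {
  per_part=$(( ($1 + MAX_PARTS - 1) / MAX_PARTS ))
  echo $(( (per_part + 1048575) / 1048576 ))
}

max_stream_gib 8             # assumed default chunk size
max_stream_gib 512           # the 512MB setting used above
min_chunk_mib 780700000000   # 780.7 GB, taken as decimal gigabytes
```

Under those assumptions the default tops out well below the size of this library, which would explain the failure, and 512MB leaves a very wide margin. The AWS CLI also documents an `--expected-size` option for `aws s3 cp` when uploading a large stream, which may be an alternative to changing the global setting.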
Bonus Points
One downside to this technique is that I could only guesstimate how long the upload would take since there's no progress indicator. Next time I do this, I'll install and use the pv (pipe viewer) command:
tar -cf - "./Photos Library.photoslibrary" | \
pv -s 700g | \
aws s3 cp - \
s3://joe-apple-photos/Photos_Library_Backup.tar \
--storage-class DEEP_ARCHIVE
I haven't used pv before, but the progress indicator should look something like this:
142GiB 3:15:00 [12.8MiB/s]
[==============> ] 20% ETA 12:08:45
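Rather than guessing the size passed to pv, it could be measured with du first. This is an untested sketch: `pv_size_arg` is a hypothetical helper, and it relies on pv accepting a trailing "k" as a multiplier of 1,024 in its -s option, which is worth confirming in the pv man page for your version.

```shell
# Print a size argument for `pv -s`, measured with du in 1 KiB blocks.
# $1 = directory to measure
pv_size_arg() {
  printf '%sk\n' "$(du -sk "$1" | cut -f1)"
}

# Usage (same pipeline as above, with a measured size):
#   lib="./Photos Library.photoslibrary"
#   tar -cf - "$lib" | pv -s "$(pv_size_arg "$lib")" | \
#     aws s3 cp - s3://joe-apple-photos/Photos_Library_Backup.tar \
#     --storage-class DEEP_ARCHIVE
```

The tar stream adds a small header for every file, so the real total will run slightly above what du reports and the percentage may drift a little past 100% at the end.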