Busy photographers generate a lot of data and, much as we might wish for a single genie of a photographic data management solution, there just isn’t one. At least not without putting all your photographic eggs in one basket – never a good idea.
This article is not intended to be a full review of each of the 3 elements of data management (storage, backup and access), but rather to show how easily the 3 components can work together when the right individual solutions have been identified for your needs.
Since my primary workstation is an aging 2008 Mac Pro 3.1 (still waiting for Apple to announce the new model in 2018) I simply cannot afford to rely upon internal drives where my data cannot be quickly accessed in the event of a computer failure. Secondly, the process of managing data across multiple matching pairs of external drives has become too cumbersome and the data isn’t exactly what I call accessible when I am in the field.
Galloping to the rescue is Network Attached Storage (NAS) but as you will read, it’s only part of the picture.
After lengthy research, enter the QNAP TS-653A NAS, and the first question: why QNAP? The choice ultimately came down to Synology versus QNAP. I did briefly look at Drobo, but stories of proprietary file systems and expensive data recoveries turned me off. The QNAP won due to the slightly better price point and better hardware specification for my needs.
The 6 in the TS-653A is the number of hard drive bays, which can be configured into a RAID array to maximize the availability of my data. At this point, it is critical to mention that data availability is simply that: its availableness. No measure of availableness is a backup; a backup is something else, somewhere else. Availableness can be defined as the ability to survive 1 or more failures and still be available.
The QNAP was very easy to configure. I simply installed 5 Western Digital RED 6 TB drives, started up the unit and configured RAID 10 with a hot spare.
The result: 10.62 TB of data availableness.
Is my storage and availability problem solved now? Almost, but not quite!
Enter another QNAP, this time a (2 bay) TS-253A unit with a single 8 TB WD RED drive – located securely off-site. The 2 QNAP units synchronize in real time over the Internet via an encrypted tunnel. Why? Because in the unfortunate event of a house fire, burglary or hurricane etc., my data is still available and relatively close.
Warning, techie bit! I wrote that I installed the drives and configured RAID 10; in reality, I did some testing beforehand. RAID, or Redundant Array of Independent Disks, is a storage virtualization technology which combines multiple disks into one or more logical volumes to achieve better performance, resiliency (the ability of the volume, and therefore my data, to survive a drive failure) or both. Which of these you get is a balancing act of prioritizing the features most important to you.
With a 6 bay NAS and 5 drives, I had 2 options: RAID 6 or RAID 10. I initially built a RAID 6 array with all 5 drives. The result was 17 TB of storage, compared to 11 TB with RAID 10, the latter based on 4 active drives and 1 hot spare. In my testing, the write speed of RAID 6 was 50% slower than RAID 10 and read was 25% slower. The RAID 6 array rebuild time (what happens after a drive fails) was also significantly longer, in my case over 24 hours.
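For anyone who wants to check the capacity arithmetic for their own drive count, here is a minimal sketch. The function name `usable_tb` is made up for illustration, and the figures are nominal decimal terabytes. The NAS reports somewhat lower numbers (the 17 TB and 11 TB above), most likely because of binary units and file system overhead, which this sketch does not model.

```python
def usable_tb(level, drives, size_tb):
    """Nominal usable capacity of an array, ignoring formatting overhead.

    RAID 6 spends two drives' worth of space on parity.
    RAID 10 mirrors every drive, so half the raw space is usable.
    """
    if level == "raid6":
        if drives < 4:
            raise ValueError("RAID 6 needs at least 4 drives")
        return (drives - 2) * size_tb
    if level == "raid10":
        if drives < 4 or drives % 2:
            raise ValueError("RAID 10 needs an even number of drives, 4 or more")
        return drives // 2 * size_tb
    raise ValueError(f"unsupported RAID level: {level}")


# 5 x 6 TB in RAID 6, versus 4 x 6 TB in RAID 10 (the 5th drive is the hot spare)
print(usable_tb("raid6", 5, 6))   # 18
print(usable_tb("raid10", 4, 6))  # 12
```

The nominal gap of 6 TB between the two layouts matches the gap between the reported 17 TB and 11 TB.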
RAID 6 is technically more survivable, since it tolerates the failure of any 2 drives, and that matters most if you only have a single NAS, and only then if you can accept the slower write performance. Given that all of my data resides on the NAS, read/write performance is very important to me, so RAID 10 is what I decided upon. In my opinion, the 2nd NAS in real time synchronization mode provides the additional resiliency I lost by choosing 10 over 6, as well as giving me the added benefit of off-site security.
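To make "more survivable" concrete, the sketch below counts how many simultaneous 2-drive failures each layout survives. It assumes a 4-drive RAID 10 built from two mirror pairs and a 5-drive RAID 6, and it deliberately ignores the hot spare and rebuild windows. The drive numbering and function names are invented for the example.

```python
from itertools import combinations


def raid10_survives(failed, mirror_pairs):
    """RAID 10 is lost only when both drives of the same mirror pair fail."""
    failed = set(failed)
    return not any(set(pair) <= failed for pair in mirror_pairs)


def raid6_survives(failed):
    """RAID 6 keeps working with any two drives missing."""
    return len(failed) <= 2


raid10_drives = [0, 1, 2, 3]
mirror_pairs = [(0, 1), (2, 3)]          # assumed pairing, for illustration
raid6_drives = [0, 1, 2, 3, 4]

raid10_cases = list(combinations(raid10_drives, 2))
raid6_cases = list(combinations(raid6_drives, 2))

raid10_ok = sum(raid10_survives(f, mirror_pairs) for f in raid10_cases)
raid6_ok = sum(raid6_survives(f) for f in raid6_cases)

print(f"RAID 10 survives {raid10_ok} of {len(raid10_cases)} double failures")  # 4 of 6
print(f"RAID 6 survives {raid6_ok} of {len(raid6_cases)} double failures")     # 10 of 10
```

Both layouts survive any single drive failure. The difference only shows up when a second drive dies before the first has been rebuilt.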
If I need more than 11 TB in storage, I can simply buy another 6 TB drive, configure it and the current hot spare into the array for 18 TB in total. Or I could buy another pair of 6 TB drives, still get 18 TB, but keep a cold spare standing by.
There are so many cloud-based backup solutions that it is impossible to list them all. However, a solution exists close to home: Amazon, or Amazon Web Services (AWS) S3 to be precise. S3 is Amazon’s enterprise-class cloud storage solution.
AWS offers a Free Tier to new accounts (it is not tied to Amazon Prime membership), and this comes with a 5 GB storage limit, more than enough storage to do some serious testing.
QNAP offers 2 native apps to use S3 for backup: Hybrid Backup Sync and S3 Plus. Choosing S3 Plus was easy as Hybrid Backup Sync was still in beta at the time. Configuration was straightforward and the backup happened seamlessly. Over a 1 week period, I regularly changed the contents of the local backup folder and S3 Plus/AWS never faltered. The restoration process was also very intuitive.
S3 offers 3 storage classes: Standard, Standard_IA (Infrequent Access) and Glacier, each with a different data retrieval and pricing structure. My data is backed up first to the Standard class. The object lifecycle policy I created within S3 will move my data automatically to the Standard_IA class if it hasn’t changed within 30 days. Cost is the primary reason for the change of storage class after 30 days: Standard_IA is roughly half the price of Standard per GB, per month. Plus, my data is easy to get back in case I need either a full or partial restoration. Glacier is something I don’t currently use, as this is a deep freeze backup and the data isn’t accessible in real time.
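I set my lifecycle policy up in the S3 web console, but the same rule can be written down in code. The sketch below builds a rule in the shape that the AWS SDK for Python (boto3) accepts for `put_bucket_lifecycle_configuration`. The rule ID, the empty prefix (meaning the whole bucket) and the helper names are assumptions made for illustration, not a copy of my actual policy.

```python
def build_lifecycle_config(days=30, prefix=""):
    """Build a rule that moves objects to Standard_IA once they are `days` old.

    S3 measures age from when the object was written, so a file that is
    re-uploaded after a change starts its clock again.
    """
    return {
        "Rules": [
            {
                "ID": f"standard-to-ia-after-{days}-days",  # hypothetical name
                "Filter": {"Prefix": prefix},               # "" applies to every object
                "Status": "Enabled",
                "Transitions": [
                    {"Days": days, "StorageClass": "STANDARD_IA"},
                ],
            }
        ]
    }


def apply_lifecycle(bucket_name, config):
    """Send the rule to S3. Needs the third-party boto3 package and AWS credentials."""
    import boto3  # imported here so the rest of the sketch runs without it

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket_name,
        LifecycleConfiguration=config,
    )


config = build_lifecycle_config()
print(config["Rules"][0]["Transitions"])
```

Calling `apply_lifecycle("my-photo-backup", config)` would apply it, where the bucket name is a placeholder for your own.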
The first backup of 650 GB took 74 hours to complete and incremental backups occur automatically each night. The first monthly bill was a very reasonable $7.77. Backing up only my image folders plus a handful of other data directories to S3 gives me the best ratio of backup security to cost. The entire contents of the NAS (images, documents, iTunes library etc.) are replicated to the secondary unit.
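If you want to estimate how long your own first upload might take, it helps to turn those two numbers into an average upload rate. This is a small sketch that assumes decimal gigabytes and a steady transfer, and the function name is made up for the example.

```python
def average_throughput(gigabytes, hours):
    """Return the average transfer rate as (megabytes per second, megabits per second)."""
    bytes_per_second = gigabytes * 1e9 / (hours * 3600)
    return bytes_per_second / 1e6, bytes_per_second * 8 / 1e6


mb_per_s, mbps = average_throughput(650, 74)
print(f"{mb_per_s:.2f} MB/s, {mbps:.1f} Mbps")  # 2.44 MB/s, 19.5 Mbps
```

In other words, the first backup is limited by the upload speed of the Internet connection, not by the NAS.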
Now that I have a proper storage and backup solution, what is the best way to use it?
The speed of accessing my data is very important. This is why my Mac Pro is connected to the NAS by Gigabit Ethernet cable. This gives a read/write speed of 109 MB/s or 872 Mbps, which is close to the practical ceiling of Gigabit Ethernet (the raw line rate is 1000 Mbps, or 125 MB/s, before protocol overhead) – which is another way of saying, very good and very fast.
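The conversion between the two units is just a factor of 8 (8 bits per byte). Here is a short sketch, with names chosen for illustration, that also shows how the measured speed compares with the raw Gigabit line rate.

```python
GIGABIT_LINE_RATE_MBPS = 1000  # raw signalling rate, before any protocol overhead


def megabytes_to_megabits(mb_per_s):
    """Convert a transfer rate from megabytes per second to megabits per second."""
    return mb_per_s * 8


measured_mbps = megabytes_to_megabits(109)
share_of_line_rate = measured_mbps / GIGABIT_LINE_RATE_MBPS

print(measured_mbps)                 # 872
print(f"{share_of_line_rate:.1%}")   # 87.2%
```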
Another techie interlude. My NAS is actually connected to my home network with 4 x 1 Gbps interfaces in a single LACP (Link Aggregation Control Protocol) channel. No, this doesn’t give a single device 4 Gbps of bandwidth to my NAS! It does allow up to 4 devices to each access the NAS at maximum single Gigabit Ethernet speeds, 872 Mbps. My Mac is synchronizing data to my NAS over 1 interface. The NAS is syncing to the secondary unit over a 2nd interface. I can then go and stream an HD movie from my NAS using Apple TV on the 3rd interface. All this can occur at the same time without any degradation of performance.
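Why can't one device use all 4 links at once? With link aggregation, each conversation is pinned to one physical link, typically by hashing the addresses involved. The sketch below is a simplified illustration of that idea, not the algorithm my switch or the QNAP actually uses. The MAC addresses are invented and `pick_link` is a hypothetical helper.

```python
import zlib


def pick_link(src_mac, dst_mac, link_count):
    """Choose a link for a flow by hashing its source and destination addresses.

    The same pair of addresses always lands on the same link, which is why
    a single device never gets more than one link's worth of bandwidth.
    """
    key = f"{src_mac.lower()}|{dst_mac.lower()}".encode()
    return zlib.crc32(key) % link_count


NAS_MAC = "00:11:22:33:44:55"           # made-up addresses for the example
clients = {
    "mac-pro": "aa:aa:aa:00:00:01",
    "secondary-nas": "aa:aa:aa:00:00:02",
    "apple-tv": "aa:aa:aa:00:00:03",
}

for name, mac in clients.items():
    print(f"{name} -> link {pick_link(mac, NAS_MAC, 4)}")
```

One caveat this makes visible: because the link is chosen by a hash, two devices can occasionally land on the same link and share its bandwidth, so "up to 4 devices at full speed" is the best case.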
A network is fast, but using SSDs in the computer is faster still. 2 SSDs in RAID 0 (striping) allow the highest read/write speeds that the computer can achieve, in my case 500 MB/s or 4000 Mbps. For this reason, all new project images are edited directly on the local Mac SSDs.
Yes, there are NAS units which can directly attach to the Mac via USB 3 or Thunderbolt. As I don’t need all of my data to be available at such high speeds, the significant increase in cost was not justifiable. Gigabit Ethernet speed is fast enough for me.
By now you might be thinking aha, Mac / Time Machine / Hourly Backup! In 1 hour, you can make many image changes. That is a lot of work to lose – got you!
A wonderful application called GoodSync solves that problem. Under local Documents, I have a folder called Working Images (a folder which Time Machine ignores) with further folders underneath. GoodSync is configured to perform a constant, real-time synchronization to a folder on the NAS. GoodSync is exceptionally fast.
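For the curious, the basic idea behind a one-way sync can be sketched in a few lines. This is emphatically not how GoodSync is implemented. It is a minimal illustration that copies any file which is missing from the target, or whose size or modification time differs. The function name and folder paths are assumptions.

```python
import shutil
from pathlib import Path


def sync_one_way(source, target):
    """Copy new or changed files from source to target and return what was copied."""
    source, target = Path(source), Path(target)
    copied = []
    for src in source.rglob("*"):
        if not src.is_file():
            continue
        dst = target / src.relative_to(source)
        if dst.exists():
            s, d = src.stat(), dst.stat()
            # Unchanged if same size and the target copy is not older (whole seconds)
            if s.st_size == d.st_size and int(s.st_mtime) <= int(d.st_mtime):
                continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also preserves the modification time
        copied.append(dst)
    return copied
```

A call such as `sync_one_way("Documents/Working Images", "/Volumes/NAS/Working Images")` would do one pass, where the NAS mount path is a placeholder. A real tool adds file system change notifications so it does not have to rescan, plus conflict handling, deletions and retries.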
NAS is like having your own Private Cloud, securely accessible from anywhere in the world with a data connection.
Photographic Data Management – Conclusion – 3 Cowboys defeated!
Upon finishing an event or trip with hundreds, more often thousands, of images, the summary process is as follows:
- Copy images to Mac
- Real-time sync from Mac to NAS. NAS to NAS replication
- NAS to Amazon S3 for backup
As soon as the images hit the Mac, the sync to the NAS begins automatically. Every time GoodSync detects a new or changed file, it syncs it. NAS to NAS replication is constant. At the same time, Adobe Bridge begins generating all of the image preview thumbnails so I can start editing immediately. All of the data synchronizations happen transparently in the background.
This is the storage, backup and access solution working in harmony for me, not me for it.
I hope this article was useful. Please leave a comment if you liked it. Feel free to also ask a question if I need to elaborate on something for you.