Every now and then, I’ve talked about how one can leverage storage-based snapshots on Pure Storage against SQL Server databases to do cool things like orchestrate object-level restores. We also have a number of example scripts published on our GitHub that you can use as building blocks to orchestrate time-saving workflows, replacing the old-fashioned routine of restoring Prod backups to non-Prod a bazillion times.
Almost all of those examples had one thing in common: they were written with vVols in mind. Why? Because storage array snapshots are volume-based and vVols are awesome for a number of reasons. Then Broadcom decided to deprecate them in a future VCF release (note that the article no longer explicitly states VCF 9.1 as it did originally), but that’s a different topic for a different time.
TL;DR – Where Can I Get Code Examples?
Back to VMFS & VMDK files
Many folks who virtualize their SQL Servers on VMware choose to rely on VMFS datastores with VMDK files underneath. And that poses an interesting challenge when it comes to taking storage array snapshots.
Remember how I said that our snapshots are volume-based? That means that a storage volume corresponds to a VMware VMFS datastore. And what’s inside a VMFS datastore? A filesystem containing a bunch of VMDK files, each representing a virtual disk that is presented to a virtual machine. Thus, if one takes a snapshot of a datastore, one could be snapping a dozen SQL Server VMs and ALL of their underlying disks! Fortunately, FlashArray is space efficient because it deduplicates data globally, but this nuance still presents a headache when one just wants to clone some SQL Server databases from one machine to another.
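To make that granularity mismatch concrete, here is a minimal Python sketch. It is only a toy model: the VM names, drive letters, and the `ds-prod` datastore are hypothetical, and nothing here calls a FlashArray or VMware API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Vmdk:
    vm: str          # virtual machine that owns the disk
    drive: str       # Windows drive letter the disk is mounted as
    datastore: str   # VMFS datastore (one array volume) holding the file

# Hypothetical inventory: two SQL Server VMs sharing one VMFS datastore.
INVENTORY = [
    Vmdk("sql-prod-01", "C", "ds-prod"),
    Vmdk("sql-prod-01", "D", "ds-prod"),
    Vmdk("sql-prod-01", "L", "ds-prod"),
    Vmdk("sql-prod-02", "C", "ds-prod"),
    Vmdk("sql-prod-02", "D", "ds-prod"),
]

def captured_by_snapshot(datastore, inventory):
    """A volume snapshot captures every VMDK on the datastore, wanted or not."""
    return [d for d in inventory if d.datastore == datastore]

def wanted(vm, drives, inventory):
    """The disks we actually want to clone: one VM's database drives."""
    return [d for d in inventory if d.vm == vm and d.drive in drives]

snap = captured_by_snapshot("ds-prod", INVENTORY)
need = wanted("sql-prod-01", {"D", "L"}, INVENTORY)
print(f"snapshot holds {len(snap)} VMDKs, refresh needs {len(need)}")
```

In this toy inventory, the snapshot carries five virtual disks even though the refresh only cares about two of them.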
With vVols, RDMs, and bare metal, we can isolate our snapshot workflows to just snap volume(s) containing our databases. If you’ve segregated out your user databases into individual volumes, like D:\ and L:\ for data files and log files respectively, that means we can skip your C:\ with OS, T:\ with TempDB, S:\ with system databases, M:\ with miscellaneous, etc.
To accomplish the same with a VMFS datastore, we need to work with the underlying VMDK files to achieve virtual machine disk level granularity.
Explaining the Workflow
First, if you want to follow along in my example code, hop over to our GitHub and pull up the PowerShell script. Each discrete step is commented, so I’ll just give a higher-level overview here.
In this example scenario, there’ll be a Production SQL Server (source) and a non-Production SQL Server (target). The source will reside on one VMFS datastore and the target will reside on a different datastore. In this example, the target will also reside on a different array, to showcase asynchronous snapshot replication.
Like our normal workflow, we will take a crash-consistent snapshot of the source datastore, then clone it into a new volume on the array. That new volume will then be attached/presented to the ESXi host that our target SQL Server VM resides on. Next (after prepping the SQL & Windows layers), we will detach the target VM’s existing VMDK file(s) that contain the database(s) we want to refresh, and discard them from the target VMFS datastore. Don’t panic! Remember, we don’t need them anymore because we’re refreshing them with newer cloned ones from our source! After discarding the old VMDKs, we connect the VMDK files that we want, which reside in the cloned datastore. Once that’s done, we can bring our disks and databases back online in Windows and SQL Server, and work can resume on the target SQL Server instance.
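The ordering matters here, so it may help to see it as a dry run. The following Python sketch is not the published PowerShell script and makes no real VMware or FlashArray calls. `TargetVm`, `refresh`, and the datastore names are hypothetical, and the position of the HBA rescan and the reading of "prepping" as taking things offline are my assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class TargetVm:
    """Toy stand-in for the target SQL Server VM (not a VMware or Pure API)."""
    online: bool = True
    # Drive letter -> datastore that currently backs that drive's VMDK.
    disks: dict = field(default_factory=lambda: {"D": "ds-target", "L": "ds-target"})

def refresh(vm, drives, clone_ds):
    """Walk the refresh in order, mutating the toy VM and returning a step log."""
    steps = [
        "snapshot the source datastore volume (crash-consistent)",
        f"clone the snapshot to a new volume and present it to the target ESXi host as {clone_ds}",
        # Assumption: the rescan happens here so the host can see the clone.
        "rescan storage HBAs on the target ESXi host",
    ]
    # Assumption: "prepping" means taking databases and disks offline first.
    vm.online = False
    steps.append("take the databases and their disks offline in SQL Server and Windows")
    for drive in drives:
        old_ds = vm.disks.pop(drive)
        steps.append(f"detach and discard the old {drive}: VMDK on {old_ds}")
    for drive in drives:
        vm.disks[drive] = clone_ds
        steps.append(f"attach the cloned {drive}: VMDK from {clone_ds}")
    vm.online = True
    steps.append("bring the disks and databases back online")
    return steps

vm = TargetVm()
for number, step in enumerate(refresh(vm, ["D", "L"], "ds-clone"), start=1):
    print(f"{number}. {step}")
```

After the run, the toy VM's D: and L: drives are backed by `ds-clone`, which is exactly the state the cleanup that follows has to deal with.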
But we’re not quite done. At this point, the target SQL Server VM has disks on two datastores: the original datastore it resided on plus the newly cloned datastore with the newly cloned VMDKs from our source. We don’t want to leave those VMDK files there. This is where we leverage VMware Storage vMotion! We’ll conduct an online Storage vMotion operation that’ll swing the VMDK files over from the cloned datastore to the primary datastore. Even better, VMware should leverage XCOPY under the covers, meaning the FlashArray won’t actually copy the VMDK files, but just update pointers, which’ll save a tremendous amount of time! Once the Storage vMotion is completed, we will go through the steps to disconnect and then discard the cloned datastore because we no longer need it.
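The cleanup rule is simply "move first, then remove", which this small Python sketch illustrates. The function names and datastore names are hypothetical, and `storage_vmotion` only rewrites a dictionary; it does not perform a real Storage vMotion.

```python
def storage_vmotion(disks, source_ds, destination_ds):
    """Return a new drive -> datastore map with disks on source_ds relocated."""
    return {
        drive: (destination_ds if ds == source_ds else ds)
        for drive, ds in disks.items()
    }

def safe_to_remove(datastore, disks):
    """A datastore can be disconnected only when no VMDK still lives on it."""
    return datastore not in disks.values()

# State right after the refresh: D: and L: sit on the cloned datastore.
disks = {"C": "ds-target", "D": "ds-clone", "L": "ds-clone"}
print("before move, safe to remove clone:", safe_to_remove("ds-clone", disks))

disks = storage_vmotion(disks, "ds-clone", "ds-target")
print("after move, safe to remove clone:", safe_to_remove("ds-clone", disks))
```

A real script would want a guard like `safe_to_remove` before disconnecting the cloned volume, since removing a datastore that still backs an attached disk would take that disk away from the VM.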
Voila, now we’re done!
From an operational standpoint, this will take longer than “just a few seconds,” mostly because of the new VMware steps involved, like re-scanning storage HBAs (twice). But all in all, it should only take about a minute or so, end to end. That’s still a tremendous time savings versus copying a full backup file to your non-prod environment and then executing a traditional restore.
What about T-SQL Snapshot Backup?
Great question! T-SQL Snapshot Backup is one of my favorite features in SQL Server, so I also put together an example that leverages that capability with a VMFS datastore! You can find it in the Point in Time Recovery – VMFS folder! It’s fundamentally the same workflow as before, plus the extra VMFS/VMDK steps I outlined above.
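For readers new to the feature, here is a rough Python sketch that builds the T-SQL statements wrapped around the array snapshot. It assumes SQL Server 2022 or later, where T-SQL Snapshot Backup is available. The helper name, database name, metadata file path, and snapshot label are all hypothetical, and this is not the code from the repository folder mentioned above.

```python
def snapshot_backup_statements(database, metadata_file, snapshot_name):
    """Build the T-SQL issued before and after the array snapshot is taken."""
    db = database.replace("]", "]]")            # escape for a bracketed identifier
    path = metadata_file.replace("'", "''")     # escape for a string literal
    label = snapshot_name.replace("'", "''")
    return [
        # Freeze write I/O on the database so the snapshot is application-consistent.
        f"ALTER DATABASE [{db}] SET SUSPEND_FOR_SNAPSHOT_BACKUP = ON;",
        "-- take the storage array snapshot here while writes are suspended",
        # Write the small metadata file; this also resumes write I/O.
        f"BACKUP DATABASE [{db}] TO DISK = N'{path}' "
        f"WITH METADATA_ONLY, MEDIADESCRIPTION = N'{label}';",
    ]

for statement in snapshot_backup_statements(
    "SalesDB", r"\\backupshare\SalesDB.bkm", "ds-prod.snap-001"
):
    print(statement)
```

Recording the snapshot name in `MEDIADESCRIPTION` is one way to tie the metadata file back to the array snapshot it describes. On the target, the matching restore would read that metadata file with `RESTORE DATABASE ... WITH METADATA` after the cloned VMDKs are attached.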
Thanks for reading – hope you enjoy your time saved by using FlashArray snapshots!