Updated: Trouble with Hyper-V’s Snapshot Feature
January 30th, 2009 by Paul Sterley | 1 Comment | Filed in Hyper-V, Virtualization
I’ve just received this update from my friend, who I will call “Spleen” here to protect the “innocent”.
When you make a snapshot you go into “differencing disk” territory. Well, it turns out that even if you delete all your snapshots, those differencing disks could hang around. They get cleaned up “automatically” when you leave the machine turned off long enough. (Yeah, that’s how you activate that feature: sit around and wait to see whether it decides to start up.)
Long story short, before you know it your 40 GB VM happens to occupy 400 GB on disk. You’re out of space, and of course rumor has it (I haven’t seen it yet myself to confirm or deny) that “applying” a differencing disk to the base disk to make it go away requires as much free disk space as the sum of the base disk and the differencing disk.
Of course, you notice this when your differencing disks have soaked up ALL your free space, so unless you happen to have 50% of your hard drive’s entire capacity taken up by something else, you’re in deep, deep shit.
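To put numbers on that rumor, here is a minimal Python sketch of the arithmetic. It assumes the unconfirmed "free space = base disk + differencing disk" rule is accurate; the function names and the example sizes are made up for illustration and are not anything Hyper-V provides.

```python
GB = 1024 ** 3  # bytes per gigabyte


def merge_space_needed(base_bytes, diff_bytes):
    """Free space a merge would need, ASSUMING the rumored rule
    (sum of base disk and differencing disk) is accurate."""
    return base_bytes + diff_bytes


def can_merge(base_bytes, diff_bytes, free_bytes):
    """True if the drive has enough free space under that same assumption."""
    return free_bytes >= merge_space_needed(base_bytes, diff_bytes)


def min_drive_size(base_bytes, diff_bytes, other_bytes=0):
    """Smallest drive that holds the disks, the merge working space,
    and whatever else (other_bytes) lives on the drive."""
    return 2 * (base_bytes + diff_bytes) + other_bytes


# Hypothetical example: a 40 GB base disk with a 60 GB differencing disk.
print(can_merge(40 * GB, 60 * GB, free_bytes=99 * GB))   # False
print(can_merge(40 * GB, 60 * GB, free_bytes=100 * GB))  # True
```

If the rule holds, a VM that fills its drive can never be merged in place, which is exactly the "you need a drive twice the size" problem.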
The way out, of course, is shuffling dozens or even hundreds of gigabytes of data all around hither and yon, until you have enough free space to fix your problem. (Ironically, this is where having two or more VMs on a single drive will save your bacon. If you only had one VM, and it filled the drive, you’re going to need a new drive that’s twice the size or so…)
I’m about to look into manually forcing it to apply the disks. You do this by finding out what the precise chain-order of the differencing disks is, and you take the first differencing disk (i.e. the one right “below” the base disk in the chain) and rename it from .avhd to .vhd, and then you use the “Edit Disk” feature in Hyper-V to squish the two disks together. Then you watch to see whether you have enough free disk space for this to succeed, and if it does then you win because you’ve just made some fresh empty space. Yay!
(This information is unverified, and comes from here: http://itproctology.blogspot.com/2008/06/how-to-manually-merge-hyper-v-snapshots.html )
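The fiddly part is getting the chain order right before renaming anything. As a rough sketch (not a tested procedure), here is how you could sort the chain in Python once you have written down which disk is the parent of which. The file names are hypothetical, and the parent mapping is assumed to have been collected by hand; this code does not read VHD files.

```python
def order_chain(parent_of):
    """Order disks from the base disk down to the newest differencing disk.

    parent_of maps each disk file name to its parent's file name,
    with None as the parent of the base disk.
    """
    child_of = {}
    base = None
    for disk, parent in parent_of.items():
        if parent is None:
            if base is not None:
                raise ValueError("more than one base disk")
            base = disk
        elif parent in child_of:
            raise ValueError(f"{parent} has two children: not a simple chain")
        else:
            child_of[parent] = disk
    if base is None:
        raise ValueError("no base disk found")

    chain = [base]
    while chain[-1] in child_of:
        chain.append(child_of[chain[-1]])
    if len(chain) != len(parent_of):
        raise ValueError("some disks are not connected to the base disk")
    return chain


# Hypothetical names, noted down by hand.
disks = {
    "web01.vhd": None,
    "web01_A.avhd": "web01.vhd",
    "web01_B.avhd": "web01_A.avhd",
}
chain = order_chain(disks)
print(chain)     # ['web01.vhd', 'web01_A.avhd', 'web01_B.avhd']
print(chain[1])  # per the linked write-up, the first disk to rename and merge
```

It refuses to answer if the disks form a tree instead of a single chain, since in that case the rename-and-merge trick described above does not obviously apply.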
Seems like a shit-ton of work just because there’s no button there that says “please actually do this incredibly important task for me RIGHT FSCKING NOW because this is an important VM that really can’t sit around all weekend turned off while I pull my ass hairs and wonder whether some service will decide completely on its own to do what I need, or not, and why.”
My day would have been chock-full of work, start to end, if I had not discovered this issue late one evening when some people complained that some of the servers had stopped responding, and I was awakened by the pages. I cleared a meager 10 GB of space that was unused stuff and went to sleep knowing that would get us through until 8 am. (That technique ultimately helped us limp through until 6 pm, when we’re allowed to shut down the VMs.)
Hyper-V is definitely at around the “Version 2.0” phase: it does some stuff, and it does some stuff really well. But the warts are so vastly terrible, you can go blind just wondering what the hell happened there. You know, sort of like Internet Explorer 2.0 was.
UPDATE:
I thought about some simple math last night / this morning, regarding how exporting a VM is kinda slow and takes up a lot of disk space. Like, 10 or 30 GB average for our machines. (10 is more normal while we are building them up, before users get on them.)
I bet that if my team had done an “export” instead of a snapshot every single time we actually did a snapshot, and then gone through the trouble of restoring on the rare occasions we needed to “apply” a snapshot, we would have used 10% of the time and 10% of the overall disk space that digging out of a snapshot hole has caused us.
Furthermore, all our “wasted time” would have happened _before_ deployment, not during deployment.
Changing gears only slightly, you can apparently make backups of your VHD if you know what you’re doing, then at a later date tell Hyper-V “drop that hard drive from the image and use this one (an old copy) instead”. Bam – now we’re effectively ghosting. And copying a VHD to a backup folder may even be significantly faster than exporting the whole VM.
Now, there may be a downside of having to shut down the VM to export it, and I’m certain you need to shut down the VM in order to just file-copy it. And, this requires research/training to achieve proficiency and confidence that you can do it without fscking up.
But you MUST shut down a machine to collapse out an applied snapshot anyway, and that will probably always be slower than copying an old VHD from the backup location. Snapshots keep biting us in the ass; I think it’s time to give shut-down-and-ghost a try instead.
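For what it's worth, the "ghost" half of that plan is just a file copy. Here is a small Python sketch of what it might look like, assuming the VM is already shut down; the function name, the paths, and the timestamp naming scheme are my own inventions. It does not talk to Hyper-V at all, so pointing the VM at the old copy later is still a manual step.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_vhd(vhd_path, backup_dir, now=None):
    """Copy a VHD into backup_dir under a timestamped name.

    Assumes the VM that owns the VHD is shut down. Returns the new path.
    """
    src = Path(vhd_path)
    dest_dir = Path(backup_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)

    # Refuse to start a copy that cannot fit on the backup drive.
    needed = src.stat().st_size
    free = shutil.disk_usage(dest_dir).free
    if free < needed:
        raise OSError(f"need {needed} bytes, only {free} free in {dest_dir}")

    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    dest = dest_dir / f"{src.stem}-{stamp}{src.suffix}"
    shutil.copy2(src, dest)  # copies the data and the file timestamps
    return dest


# Hypothetical usage:
# backup_vhd(r"D:\VMs\web01\web01.vhd", r"E:\VhdBackups\web01")
```

Restoring would then mean shutting the VM down again and swapping its hard drive for the saved copy, as described above.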
Further Update:
I have a 300+GB bloated VM (should be more like 30 to 50 GB) that is merging 5 separate differencing disks at the speed of an arthritic old man frozen to death in a glacier without his walker, and every so often I take a screenshot of the files in the snapshots folder and save it.
That will allow us to look more precisely at the “A + B free disk space required” problem.
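Screenshots work, but a script could record the same thing in a form you can graph afterwards. This is only a sketch of the idea in Python: the folder path, log file name, and function names are placeholders, and it simply records the size of every .vhd/.avhd file plus the drive's free space each time you call it.

```python
import csv
import os
import shutil
import time

DISK_SUFFIXES = (".vhd", ".avhd")


def snapshot_sizes(folder):
    """Return {file name: size in bytes} for the disk files in folder."""
    sizes = {}
    for entry in os.scandir(folder):
        if entry.is_file() and entry.name.lower().endswith(DISK_SUFFIXES):
            sizes[entry.name] = entry.stat().st_size
    return sizes


def append_sample(folder, log_path, now=None):
    """Append one timestamped CSV row per disk file, plus a free-space row."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    sizes = snapshot_sizes(folder)
    free = shutil.disk_usage(folder).free
    with open(log_path, "a", newline="") as log:
        writer = csv.writer(log)
        for name in sorted(sizes):
            writer.writerow([stamp, name, sizes[name]])
        writer.writerow([stamp, "<free space>", free])
    return sizes


# Hypothetical usage: sample every five minutes while the merge runs.
# while True:
#     append_sample(r"D:\VMs\web01\Snapshots", r"C:\Temp\merge-log.csv")
#     time.sleep(300)
```

The lowest free-space reading in the log, compared against the disk sizes at the start, should show how much working room the merge really took.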
Tags: Differencing, Hyper-V, Snapshot

