Thursday, January 13, 2011

Magical Snapshotty-things

I spent some time with a storage vendor recently. Vendors kill me. No matter what you say, they still cannot for the life of them understand why you are not using their product. And if you are, they are mystified by the fact that you're not using every last bell and whistle. In this case, the vendor was questioning why we weren't using their magical-snapshotty backup solution. Basically, the way their backup feature works (similar to most snapshotty types of features) is that when a backup occurs, only the deltas are written out. Then, pointers/vectors (vectors sounds cooler) act as reference points to the delta blocks. If a recovery has to occur, the product is smart enough to follow those pointers and put the data back together. The upshot of stuff like this is that recoveries are blazingly fast and the amount of data written is extremely small.
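A toy sketch in Python makes the pointer idea concrete. It is not the vendor's implementation: the class name SnapshotStore and all of its methods are hypothetical, and it assumes a redirect-on-write design in which a snapshot is nothing more than a saved list of pointers into one shared pool of blocks.

```python
class SnapshotStore:
    """Toy model (hypothetical, not any real product's API).

    Every block, original or delta, lives in one shared pool.
    A snapshot stores pointers into that pool, never a second copy.
    """

    def __init__(self, blocks):
        self.pool = {}        # block id -> data: the only physical copy
        self.next_id = 0
        self.live = [self._put(b) for b in blocks]  # live volume as pointers
        self.snapshots = []   # each snapshot is just a list of pointers

    def _put(self, data):
        # Store one block in the shared pool and return its pointer.
        block_id = self.next_id
        self.pool[block_id] = data
        self.next_id += 1
        return block_id

    def snapshot(self):
        # "Backup": copy the pointers only, so no data blocks are written.
        self.snapshots.append(list(self.live))
        return len(self.snapshots) - 1

    def write(self, index, data):
        # Redirect-on-write: the changed block (the delta) goes to a new
        # pool slot; older snapshots keep pointing at the old block.
        self.live[index] = self._put(data)

    def read_live(self):
        return [self.pool[p] for p in self.live]

    def restore(self, snap_no):
        # Recovery is fast because it only follows pointers.
        return [self.pool[p] for p in self.snapshots[snap_no]]

    def lose_pool(self):
        # Simulate losing the underlying disks (for example, more failed
        # disks in the RAID group than there are parity disks).
        self.pool.clear()


if __name__ == "__main__":
    store = SnapshotStore(["a", "b", "c"])
    snap = store.snapshot()
    store.write(1, "B2")
    print("live:    ", store.read_live())
    print("snapshot:", store.restore(snap))
    print("blocks physically stored:", len(store.pool))

    store.lose_pool()
    try:
        store.restore(snap)
    except KeyError:
        print("pool lost: the snapshot has nothing left to point at")
```

In this model the snapshot costs one extra block for one changed block, which is where the speed and space savings come from. The flip side is that the snapshot and the live data share the same pool, so losing the pool takes both with it.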

Very attractive - too good to be true, right? Maybe a little - which takes me to my conversation with the vendor and my point about the inability of vendors to see past their product.

Me: So, your solution doesn't actually copy the data anywhere, except for the deltas?
Them: Yes, that makes it extremely fast and it uses tiny amounts of space.
Me: Okay, but that means there's not a complete copy of the data on a physically separate part of the SAN?
Them: Yes, and it's extremely fast by the way.
Me: Um, yeah. So what if something radical happens? What if you lose more disks in the RAID group than you have parity disks?
Them: --Laughs--. That never happens.
Me: Really? I've seen RAID5 groups where two disks failed simultaneously.
Them: No way. Really?
Me: Yep. I've seen it happen three different times.
Them: --dumbfounded look. crickets chirping--
Me: So, you're willing to sign a form that guarantees that your storage system will never have a failure of that nature?
Them: --exasperated look-- Well, we can't really do that.
Me: Hmm. That's a shame.

In the end, they were probably frustrated with me, and I didn't intend to make them mad, but sometimes a DBA has to call BS on things. There's nothing wrong with their product. It's a very good product and we may end up making use of it in some way. The problem is that they're proceeding from a false assumption: namely, that unlikely failures are impossible failures. They're not.

In my last post, I mentioned that I would talk about the second common problem I see in the DBA world with backups, and that is "shortcuts" – ways to make things better, faster, stronger that ultimately leave you with a noose around your neck. The skeptic in me says, if it sounds too good to be true, it probably is – or at least there are probably some strings attached. If these guys were selling a magical-performance-tuney thing, it would be different. But as a DBA, you need to understand that there is no area where your fannie is on the line more than the recoverability of the data. If you lose data and can't recover - it's gone - and you may be too.

With all apologies to Harry Potter, the trouble with magic is that it isn't real. Database administration isn't an art – it's a hard, cold science. In the end, there aren't many shortcuts to doing your job. If you're going to use a magical backup solution, you have to be dead sure 1) that you know the exact process as to how you're going to magically recover that data and 2) that you've taken every eventuality into consideration.

So in the end, problem #2 is similar to problem #1. Test things and make sure you know what you're doing. If red flags go up, stop and think. I don't want to see you in the unemployment line.
