T-SQL Tuesday #12: Why are DBA Skills Necessary?

TSQL Tuesday 12

T-SQL Tuesday #12 hosted by Paul Randal

T-SQL Tuesday is being hosted this month by the great Paul Randal (blog | @PaulRandal). This is awesome, because it’s Paul, but also not so awesome, because it means Paul is guaranteed to read my stuff. I’m not nervous about this at all, I don’t know what you’re talking about.

I had a couple ideas for this, and if I had gotten started on this sooner, I probably would have written two different posts. Too many things didn’t work out right for that, so there’s only this one. Hopefully I didn’t choose poorly. Here we go…

So…Why are DBA skills necessary?

Well…because they are!

I mean… Are car mechanic skills necessary? Are pilot skills necessary? Are business skills necessary?

OK, you’re right, it depends. It depends on if you’re planning on rebuilding a Rochester carburetor and having the engine idle afterwards. Or landing a 747 on an actual runway and not bending anything. Or being the CEO of a multinational company and it continue to grow, prosper, and make money.

I don’t think those are extreme examples. Just like the above, if you want to build, support, and continue to improve a highly available, scalable, and performing database system, you need DBA skills to make that happen. Sure, even if you don’t know what you’re doing, you might get lucky and find a washer in the carb’s air horn that had locked the air valves on the secondary bores closed, the removal of which led to unprecedented quantities of burning gasoline, but you can’t run on luck forever.

But…Why?

All these things are hard. I know that rebuilding a Quadrajet is hard, because those things are a pain in the ass. I know that landing a 747 is hard, because it’s a giant airplane that isn’t slow and might have lots of people in the back. OK, technically I don’t know that it’s hard to run a multinational corporation, but I’m pretty sure it is.

Know what else is hard? Databases. Databases are hard.

This isn’t about platforms or anything like that. Databases are hard when they’re Oracle DBs or when they’re mySQL DBs. Databases are hard when they’re Access “databases”, although in a different way (and in a way that’s not hard if your name is Brent Ozar).

OK, why are databases hard?

They require concentration. They require a wide range of technical skills from all parts of IT to accomplish successfully. They require you to know where to look and who to ask when there are things that you don’t know. They require being able to deal with the pressure put on you when things aren’t going right.

Perhaps more so than other areas of IT, when things aren’t going right, lots of people aren’t happy. Databases contain one of the most important assets of their owners: their data. Having this data safe and available in a timely fashion is a requirement if that data is going to be useful at all. Being able to troubleshoot and fix problems while juggling those unhappy people isn’t always an easy task.

Databases are hard because there are a lot of places for things to go wrong, and a DBA needs to be able to deal with all of those. There’s data modeling, which can cause never-ending problems when done incorrectly. There are server administration tasks, which can have fundamental performance impacts. There is a need for security skills to keep that all-important data safe from all kinds of bad guys. The list goes on.

Why is this point missed?

I don’t know. Wish I did, pretty sure I could make a lot of money 😉

The problem isn’t with us. For the most part, IT folks already know that DB work is hard. We don’t need to convince ourselves. All manner of IT management and/or business owners need to know that DB work is hard, just like most other forms of IT work. This is where the problem is.

It seems like DBs get short-changed a lot, doesn’t it? When things start out, the little Access DB works fine for the few users that there are. As the business grows, either a Dev or Sysadmin that knows a bit about what they’re doing gets a hold of a SQL Server license and migrate the system over. This is completely fine for small to even medium-sized shops. Hardware and SQL itself will run really well out of the box for the vast majority of applications out there.

The problem is when you cross that line. The above-mentioned migrated Access app will run fine for probably a long time, but the next thing anyone knows, it’s five years down the road and the blocking is so bad in the poorly designed & maintained database that the users just know to get coffee across the street when they’re doing certain tasks because it’s going to be a while.

Pain like that can be avoided by having those DBA skills around from the very beginning. They don’t have to be FTEs. They don’t even need to be dedicated resources. They are, however, necessary skills, and every business that has a database (that should be a pretty high percentage) should have someone available to take care of these tasks at least on a part-time basis, even if it’s that Sysadmin who wants to learn the right way to do at least a few things.

Business owners might not think they need to spend the money now, and that very possibly may be correct. However, if they don’t, they need to at least know that the need will be coming someday. It’s only going to be more expensive later, and hopefully that day won’t be a time when a disk has died and the backup job hasn’t been working properly for three weeks 🙁

T-SQL Tuesday #11: The Misconception of Active/Active Clustering

TSQL Tuesday

October's TSQL2sDay: Hosted by Sankar Reddy

I was expecting my inaugural T-SQL Tuesday post to not happen this month… Until just yesterday, when I found myself helping explain to someone what an Active/Active SQL cluster was and what it wasn’t. Ah hah! I expected this to be a topic already covered fairly well, and as expected, I found a couple of really good articles/posts by others on this topic. I will reference those in a bit, but I can cover some more general ground first.

The misconception that was I was explaining yesterday morning is the same one that I’ve explained before and have seen others do the same: An Active/Active SQL Server Cluster is not a load-balancing system. It is, in fact, purely for High-Availability purposes. If what you are really after is true load-balancing, unfortunately, you will have to go elsewhere, to something like Oracle RAC.

When I say “true load-balancing”, I mean being able to load-balance a specific set of data across multiple pieces of hardware. What you can do with Active/Active clustering is load-balance a set of databases across multiple pieces of hardware. This is useful if you have a SQL Instance with multiple very heavily-utilized databases—one or more of these DBs could be pulled out to its own instance on separate hardware. This is not done without potentially serious tradeoffs, however, which will be covered in a bit.

The underlying reason for this is that Windows Clustering is only designed for HA purposes. With its “shared nothing” approach to clustering, any given resource within the cluster (say, the storage LUN that holds your accounting software’s MDF file) can only be controlled by one node of the cluster at a time. If the accounting DB’s home LUN is currently attached to the first node in a cluster, it’s going to be pretty hard for the second node to service requests for that database.

OK, what is Active/Active Clustering, then?

I like to think of it as two Active/Passive clusters that happen to live on the same set of hardware. To illustrate:

Active-Passive Cluster Example 1

This example cluster is a two-node Active/Passive cluster that contains a single SQL Server Instance, CLUS1\ACCOUNTING. This Instance has its own set of drive letters for its DB, Log, and TempDB storage (say, E:, F:, and G:). If both nodes are healthy, Node2 never has anything to do. It sits around twiddling its bits, just waiting for that moment when a CPU melts down in Node1, and it can take over running the ACCOUNTING instance to show how awesome it is.

Active-Passive Cluster Example 2

This is another two-node Active/Passive cluster, running a single instance, CLUS2\TIME (you know, because internal time tracking software is that important). Like CLUS1\ACCOUNTING above, it has its own set of drive letters for its storage needs (M:, N:, and O:)In this case, the SQL Server Instance is on Node2, leaving Node1 with nothing to do.

What do we get if we mashed these two clusters together?

Active-Active Cluster Example

We would get this, an “Active/Active” SQL cluster. It’s logically the same as the two previous clusters: it just so happens that instead of the second node doing nothing, it is running another independent instance. Each instance still has its own set of drive letters (E:, F:, and G: plus M:, N:, and O:). Just like in the “Accounting Cluster” above, if Node1 were to fail for some reason, the CLUS1\ACCOUNTING instance would be started up on Node2, this time, alongside CLUS2\TIME.

So, is doing this a good idea?

Maybe. Above, I talked about how a single instance Active/Passive cluster with more than one heavily-used DBs could be re-architected into an Active/Active cluster with some of the heavy DBs on one instance living on one node and the rest of them on a second instance living on the other node in the cluster. This gives you both High Availability and you’ve spread the load out between two servers, right? It does, but what happens if one of the nodes fails and the two instances “clown-car” onto the remaining node? Are there enough CPU & RAM resources on the remaining server to support the combined load?

That illustrates one of the major things to consider when thinking about deploying an Active/Active cluster. Tim Ford (Blog | @sqlagentman) has an article over at MSSQLTips that goes into detail about making this decision.

In addition, Aaron Bertand (Blog | @AaronBertrand) has a sweet little piece of automation to help manage resources when trying to squeeze the maximum amount or size of clowns into a single car without someone’s foot sticking out in the breeze. Of course, it’s something that you hope never actually gets used, but it’s a good way to wring the most performance out of an Active/Active cluster.

Conclusion

I don’t know if this is the best way to explain this concept, but I like it. The Active/Passive cluster concept is usually understood, so I believe it is a nice stepping stone to get to the truth about Active/Active clusters. As with everything, there are tradeoffs and gotchas that need to be considered when designing server architectures like this, and the above-mentioned others have gone a long way to cover those topics.

There we have it, my first T-SQL Tuesday post. Hopefully it is useful and not from too far out of left field 😉