Help, Comcast’s DNS is sucking again!

This past Sunday, if you lived in parts of the Midwest (I know IN & IL were on the list, but I don’t remember who else was) and have Crapcast Comcast as your ISP, you might have had problems surfing the tubes, because they were having trouble with their DNS servers again. News of this spread through Twitter (at least that’s how I heard about it, since I wasn’t affected directly), in a bizarre twist of news-spreading similar to people sending email at work to let everyone know that the mail server is down… Obviously, with smartphones & such, this is still a good way to spread that kind of information, but it still strikes me as a bit funny 😉

(I’m not going to go into what DNS is here. I assume most people reading this are already going to know what it is, what it’s for, and why it’s important, so I’m going to skip this whole part.)

The main way to take care of this is to set your machine to use other DNS servers. The usual suspects recommended during times like this are Google’s two public servers, aptly named google-public-dns-a.google.com and google-public-dns-b.google.com (-c resolves to a server too, but its IP isn’t as easy to just remember (more on that in a sec); or maybe it’s a Google honeypot for smarty pants 😀 ). There are also the OpenDNS servers, which apparently do a bunch of other stuff, like redirecting you to the right place when you misspell something. I don’t really know how well this actually works; this is just what their Wikipedia page says.

Google’s own page explaining how to set these up is a good reference for what to do if you don’t know. They’re pretty good instructions, so knock yourself out there. These servers’ IPs are 8.8.8.8 and 8.8.4.4, which are fairly easy to remember (and type), making it easy to set up on machines on-the-fly, which is important if you’re looking to set this up while DNS is down and you can’t get to google.com…

Here’s what I prefer to do

Google’s instructions are fine to set up a machine or two, but what if you’ve got half a dozen boxes in your house and/or you want a bit of a more robust solution? Glad you asked!

Most likely, the machines on your home network are serviced by your router’s built-in DHCP server. At the same time it provides the machines with IP addresses to use, it also provides them with one or two DNS servers. It should be getting these addresses from your ISP in basically the same way—it’s a DHCP client instead of a server on its external-facing port.

Obviously, under normal circumstances, your ISP is going to be providing their own DNS servers for this use, as it allows them to load custom search pages and stuff if they want to. (Did Comcast ever go through with doing that?) If your router supports it, however, you can override these settings, and that is where the more robust solution to this DNS problem comes in.

Somewhere in your router’s settings is probably a section called “Basic Setup”, “Basic Settings”, or something similar. This section will include things like the router’s name, authentication settings in case your ISP requires that, and possibly more advanced things like MAC address spoofing. What you’re looking for here is a section that refers to “Static” or “Manual” DNS servers like these:

Old Linksys WRT54G settings screen

Netgear settings screen

These are a couple of the routers that I have available to look at. The first one is an old Linksys WRT54G (if you aren’t actively using one of these at home…well…you have my sympathy) and the second one is our travel router, some random Netgear thing. These are the general settings areas that you’re looking for.

The main idea here is to put Google’s DNS servers’ IPs into those manual configuration boxes. You could forego your ISP’s servers altogether, put in both of Google’s servers, and call it a day. For some reason, though, that strikes me as something that isn’t the best idea. I don’t have a really good reason for it; it’s just one of those things that doesn’t feel right.

What I would do is get one of your ISP’s DNS servers and put that in as your primary, then one of Google’s servers as the secondary (or tertiary, if you have an option for 3). That strikes me as a safer option, but there is one thing to keep in mind: if your ISP ever changes their DNS servers, since your settings are manually configured, that change won’t be picked up automatically like it otherwise would. Obviously, with one of Google’s servers in there, things will still work, but name resolution may be a bit slower if your machine picks the now-broken IP first and has to wait for that to time out before it hits the second server. This is just something to weigh; I don’t know that there’s a good reason to not just put Google’s DNS servers in there and go on with life, other than it’s almost definitely an “unsupported” configuration in your ISP’s eyes.

The reason I advocate this method of setup is simple: When you do this, these addresses will be passed on to your workstations when they get IPs to use from the router. This will happen for every machine that connects to your network, so you only need to set this up once to use these other servers.

While you’re in here messing around, it is important to leave the router set to get its external IP via DHCP (or whatever else your ISP has told you it needs to be set to). If you switch that to a static address instead, things may work OK for now, but at some point in the near future, it’s probable that they will stop working altogether. If your router won’t let you manually set DNS servers while leaving the IP address on Automatic/DHCP, then you won’t be able to set DNS up this way, and you’ll need to use Google’s instructions to set up each of your machines individually.

Nutjob solution

Or, forget all this, run your own DNS server, point it at the Root Hint servers, and thumb your nose at Comcast’s inability to do simple things like keep some DNS servers happy.

I have no idea who would do such crazy things, though <.< >.>

Maintenance Plans: They don’t like it when DBAs leave

Discovered some…interesting…information about Maintenance Plans a few weeks ago when we had a DBA leave. I don’t think it warrants being called a “bug”, but it is behavior that I don’t think is ideal, and it definitely can cause issues for us and others. The good news here is that I did get a workaround nailed down (thanks to the help of Random Internet Guy), although we haven’t decided whether or not we’re going to use it.

How this all started

Recently, we had one of our DBAs leave. (Incidentally, this is the first time ever that I’ve had this happen to me and also the first time one has left this company.) After he left at the end of his last day, I pulled all of his access to the Production systems. We knew that any Agent jobs that he happened to own would start failing, and sure enough, there were a few failures that evening and over the course of the weekend.

One of these failures turned into a bit more of an involved situation: the Maintenance Plan on the only production SQL 2008 box he set up failed. Being used to SQL 2000, we expected this to be an easy fix and life would go on. Except it wasn’t.

Toto, I’ve a feeling this isn’t SQL 2000 any more

The first attempt to fix this was to change the owner of the SQL Agent Job to ‘sa’. That’s our standard job owner, for consistency and because it makes the job impervious to people leaving. The job still failed, although the error was different this time. Obviously the Agent Job owner had some role in all of this, but it wasn’t the whole story, as things weren’t fixed.
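(For reference, the job-owner change itself is a one-liner. The job name below is just a placeholder for whatever your plan’s Agent Job is actually called:)

-- Sketch: re-point the Agent Job at 'sa' (job name is a placeholder)
EXEC msdb.dbo.sp_update_job
    @job_name = N'MaintenancePlanName.Subplan_1',
    @owner_login_name = N'sa';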

After changing the Job owner, this is the error that occurred:

Executed as user: <SQL Service acct>, [blah blah blah]  Started:  11:25:38 AM
Error: 2010-10-06 11:25:39.76     Code: 0xC002F210     Source: <not sure if this GUID is secure or not> Execute SQL Task     Description: Executing the query “DECLARE @Guid UNIQUEIDENTIFIER      EXECUTE msdb..sp…” failed with the following error: “The EXECUTE permission was denied on the object ‘sp_maintplan_open_logentry’, database ‘msdb’, schema ‘dbo’.”. Possible failure reasons: Problems with the query, “ResultSet” property not set correctly, parameters not set correctly, or connection not established correctly.  End Error
Warning: 2010-10-06 11:25:39.79     Code: 0x80019002     Source: OnPreExecute      Description: SSIS Warning Code DTS_W_MAXIMUMERRORCOUNTREACHED.  [blah blah blah]
Error: 2010-10-06 11:25:40.58     Code: 0xC0024104     Source: Check Database Integrity Task      Description: The Execute method on the task returned error code 0x80131501 (An exception occurred while executing a Transact-SQL statement or batch.). The Execute method must succeed, and indicate the result using an “out” parameter.  End Error
Error: 2010-10-06 11:25:40.64     Code: 0xC0024104     Source: <another GUID>      Description: The Execute method on the task returned error code 0x80131501 (An exception occurred while executing a Transact-SQL statement or batch.). The Execute method must succeed, and indicate the result using an “out” parameter.  End Error
Warning: 2010-10-06 11:25:40.64     Code: 0x80019002     Source: OnPostExecute      Description: SSIS Warning Code DTS_W_MAXIMUMERRORCOUNTREACHED. [blah blah blah] End Warning
DTExec: The package execution returned DTSER_FAILURE (1).  Started:  11:25:38 AM  Finished: 11:25:40 AM  Elapsed:  2.64 seconds.  The package execution failed.  The step failed.

A Permissions error in MSDB? Huh?

I dug a little, and the short of it is that the Maintenance Plan’s connection string was still set to the departed Admin’s sysadmin account (a SQL login). All of the work that the MP was trying to do, it was still attempting as the other DBA. This, of course, isn’t going to get very far.

Maintenance Plans have connection strings?

This seems like a dumb question to ask, but it’s one of those things that I had glossed over in my head and not actually thought about. The answer: Sure they do! MPs are basically SSIS packages these days, and of course those have connection strings/settings. MPs are no different. Seems like a “DURRR” now, but hindsight, yadda, yadda.

This connection string is controlled by the “Manage Connections” button in the Maintenance Plan toolbar. If you never mess with this when you create an MP, it uses an auto-generated connection to the Local Server, using whatever name you used to connect to it (there’s another topic right there; someday I’ll talk about some quirks with Aliases I’ve run into, and I’ll address that then). This auto-gen’d connection also uses the same auth credentials you are currently using. If you only use Windows Auth, you are stuck with that; if you’re in Mixed Mode, a SQL auth account is an option, and you can set it to another account. Of course, you would need to know that account’s password.

To fix our MP’s connection string, I could change it to my own credentials, but that of course doesn’t actually fix anything, and will leave an even bigger mess for whoever’s left if/when I leave or responsibilities change. I can’t set it to Windows Auth, because when you select that, it validates the auth right then and there. Since we don’t have Windows Auth SA accounts, that won’t work for us: the AD account that I’m logged on with doesn’t have enough rights. Something else needed to be done.

At this point I realized that this is a fair-sized problem and we can’t be the first shop to run into this; I mean, SQL 2005/2008 have been out for a while, and I’m sure lots of DBAs have left shops that heavily utilize MPs. Time to ask the Senior DBA.

Going nowhere fast

For as common as I expected this to be, I wasn’t finding much helpful information. There were some references to this error message, and even someone in the same situation I was in, but most of what I found were dead-end threads.

At one point, I thought for sure that @SQLSarg had it. This blog post contains the revelation that MPs have Owners! His main point of changing the MP owner is to change the owner that the associated Agent Job(s) are set to when you save the MP. Even though I had already changed the job owner to a valid account, I hoped that the MP’s owner had something to do with the credentials that were used in the Plan’s connection string.

Of course, it didn’t. No behavior change.

Aside: Even though it didn’t help with what I was trying to fix at the time, I believe that changing an MP’s owner to a shared or central service account is a good idea. Every time an MP gets modified, the Agent Job’s owner is changed to the same account as the Plan’s Owner. This might not even be you, and when that person leaves the company, you are guaranteed to have execution problems. There’s a Connect item open asking to make it easier to change the owner, so if you like Maintenance Plans and don’t hate yourself, it’s probably a good idea to log in over there and mash the green arrowhead.
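For the record, the owner change itself boils down to a one-row update in msdb. I won’t steal SQLSarg’s thunder by reposting his exact statements, but on SQL 2008, where the plans live in msdb.dbo.sysssispackages, the general shape is something like this (plan name and target login are placeholders; try it on a Dev box before you trust it):

-- Sketch only: re-point a Maintenance Plan's owner at a shared account ('sa' here)
UPDATE msdb.dbo.sysssispackages
SET    ownersid = SUSER_SID(N'sa')
WHERE  name = N'MaintenancePlanName';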

The more I dug, the more dead-end threads I read, the more almost-but-not-quite blog posts I ran across, the more I became resigned to one simple fact:

It is impossible to set the Connection String Credentials in a Maintenance Plan to an account that you don’t know the password for.

As that started to sink in, I asked #SQLHelp if that were, in fact, true. I got a couple responses that yes, that was the case (I would be specific, but this was a number of weeks ago, and since Twitter sorta sucks for finding things like that and I can’t remember who it was, we’re out of luck). Although not the news I was looking for, I was glad to know that the deal was pretty much sealed—we were going to need to figure out something else to do.

One last gasp

Since I don’t feel like I’m doing everything I can while troubleshooting a problem if I don’t grab for at least one straw, I decided to investigate something that was described in one of the dead-end threads that I had found while searching.

This thread on eggheadcafe is a conversation between “howardwnospamnospa”* and Aaron Bertrand, who we all know and love. This one “howardw” was in a pretty similar boat to what I was in, and was really close to getting it to work. He was only thwarted because of an apparent quirk when an Instance is in Windows Auth-only mode.

In the last post in the thread, he reports that when he assigns the Local System account (“NT AUTHORITY\SYSTEM”) as owner of the MP, and sets the connection properties of the plan to use Windows Auth, the plan runs. To attempt to duplicate this, I started with an existing MP on a Dev instance that I had set up (therefore my SQL auth account was the MP owner and the user listed in the Connection Properties) and proceeded with the following steps:

  1. Using the statements in SQLSarg’s post, I changed the owner of the MP to Local System.
  2. I added my Domain account to the Instance with SA rights.
  3. I opened the MP in question with SSMS, switched its connection to use Windows Auth (while logged in with the Domain account used in Step 2), and saved it.
  4. I checked the MP’s auto-gen’d Agent Job to confirm that its owner was NT AUTHORITY\SYSTEM.
  5. Using a connection with my Domain account, I pulled my SQL account’s SA rights. This would previously make the MP sub-plan fail. (A rough sketch of the statements behind steps 2, 4, and 5 follows this list.)
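Since I had to look some of these up myself, here’s roughly what steps 2, 4, and 5 look like in T-SQL on a 2008-era instance. The domain account, old login, and plan name are all made-up placeholders, so treat this as a sketch rather than a copy/paste recipe:

-- Step 2 (sketch): add a Domain account and give it SA rights (placeholder names)
CREATE LOGIN [DOMAIN\MyAccount] FROM WINDOWS;
EXEC sp_addsrvrolemember @loginame = N'DOMAIN\MyAccount', @rolename = N'sysadmin';

-- Step 4 (sketch): confirm the MP's auto-gen'd Agent Job is owned by NT AUTHORITY\SYSTEM
SELECT name, SUSER_SNAME(owner_sid) AS job_owner
FROM   msdb.dbo.sysjobs
WHERE  name LIKE N'MaintenancePlanName%';

-- Step 5 (sketch): pull the old SQL account's SA rights (placeholder name)
EXEC sp_dropsrvrolemember @loginame = N'OldDBALogin', @rolename = N'sysadmin';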

I ran the job and, holy cow, it worked! 😀

That’s good, right?

Ehhhh, maybe. Setting the owner to Local System seems pretty dirty. I haven’t thought of a way that this would be a security problem, but it still feels like a stupid hoop to jump through. For us, there’s the extra hoop of having to add our Domain accounts to the instances every time we want to set up or change connection info on an MP (once the Windows Auth connection is set, it doesn’t re-validate when you save the plan, so you only have to do it that one time), then take them back out when we’re done.

Those bits aside, this workaround (if you will let me get by without calling it a “hack”) does allow setting up Maintenance Plans that will survive the DBAs who set them up leaving the company. On a technicality, I achieved my goal.

We’ve talked about this setup very briefly at the office, but haven’t made a final decision on whether or not to go down this road or to abandon MPs for DB maintenance altogether. Hopefully we will get that decision made before someone else leaves.

But at what cost?

The main cost here is a lot of destruction to my affinity for Maintenance Plans.

Using MPs in general, love them or hate them, is pretty much a religious debate. Usually I fall on the proponent side of those arguments. I think they’re easy to set up, they’re pretty flexible, they can dynamically pick up new DBs that get added to the system (not always an advantage), you can cook up some really crazy dependencies within them, and I like pictures & flowchart-y things :-).

After this whole ordeal? Pretty sure I’ve changed my mind. The default behavior of breaking as soon as the account that originally set them up is no longer an SA is way too high a price to pay. What happens when you’re a single-DBA shop and you get honked off and bail one day? Near as I can tell, if your account gets restricted as it should, backups will stop working, but there won’t be anyone left to notice!

In lieu of MPs, there’s at least one good option out there that has most of the desired functionality. While this was going on, @TheSQLGuru posted a link to this set of SPs. I haven’t had time yet to really dive into them, but on the surface, they look pretty good.

Final Comment(s)

This entire thing bugs me. I’m almost certain I’m missing something. This should be a huge problem for lots of people. I feel like I’m missing a detail or it’s because we use SQL logins for the most part or some other thing. But… in all of the testing that I’ve done, bad things happen when the initial creating user gets SA pulled. Period. I can’t get past the feeling that I’m doing something wrong, but I haven’t been able to figure it out yet.

In the meantime… I think Maintenance Plans are evil after all…

* Apparently not to be confused with howardwSOLODRAKBANSOLODRAKBANSO[LODRA] (Eve-Online thing; it’s honestly better if you don’t ask)

T-SQL Tuesday #11: The Misconception of Active/Active Clustering

October's TSQL2sDay: Hosted by Sankar Reddy

I was expecting my inaugural T-SQL Tuesday post not to happen this month… until just yesterday, when I found myself helping explain to someone what an Active/Active SQL cluster was and what it wasn’t. Ah hah! I figured this was a topic already covered fairly well, and as expected, I found a couple of really good articles/posts by others on it. I will reference those in a bit, but I can cover some more general ground first.

The misconception that I was explaining yesterday morning is the same one that I’ve explained before and have seen others explain as well: an Active/Active SQL Server Cluster is not a load-balancing system. It is, in fact, purely for High-Availability purposes. If what you are really after is true load-balancing, unfortunately, you will have to go elsewhere, to something like Oracle RAC.

When I say “true load-balancing”, I mean being able to load-balance a specific set of data across multiple pieces of hardware. What you can do with Active/Active clustering is load-balance a set of databases across multiple pieces of hardware. This is useful if you have a SQL Instance with multiple very heavily-utilized databases—one or more of these DBs could be pulled out to its own instance on separate hardware. This is not done without potentially serious tradeoffs, however, which will be covered in a bit.

The underlying reason for this is that Windows Clustering is only designed for HA purposes. With its “shared nothing” approach to clustering, any given resource within the cluster (say, the storage LUN that holds your accounting software’s MDF file) can only be controlled by one node of the cluster at a time. If the accounting DB’s home LUN is currently attached to the first node in a cluster, it’s going to be pretty hard for the second node to service requests for that database.

OK, what is Active/Active Clustering, then?

I like to think of it as two Active/Passive clusters that happen to live on the same set of hardware. To illustrate:

Active-Passive Cluster Example 1

This example cluster is a two-node Active/Passive cluster that contains a single SQL Server Instance, CLUS1\ACCOUNTING. This Instance has its own set of drive letters for its DB, Log, and TempDB storage (say, E:, F:, and G:). If both nodes are healthy, Node2 never has anything to do. It sits around twiddling its bits, just waiting for that moment when a CPU melts down in Node1, and it can take over running the ACCOUNTING instance to show how awesome it is.

Active-Passive Cluster Example 2

This is another two-node Active/Passive cluster, running a single instance, CLUS2\TIME (you know, because internal time-tracking software is that important). Like CLUS1\ACCOUNTING above, it has its own set of drive letters for its storage needs (M:, N:, and O:). In this case, the SQL Server Instance is on Node2, leaving Node1 with nothing to do.

What do we get if we mashed these two clusters together?

Active-Active Cluster Example

We would get this, an “Active/Active” SQL cluster. It’s logically the same as the two previous clusters: it just so happens that instead of the second node doing nothing, it is running another independent instance. Each instance still has its own set of drive letters (E:, F:, and G: plus M:, N:, and O:). Just like in the “Accounting Cluster” above, if Node1 were to fail for some reason, the CLUS1\ACCOUNTING instance would be started up on Node2, this time, alongside CLUS2\TIME.

So, is doing this a good idea?

Maybe. Above, I talked about how a single-instance Active/Passive cluster with more than one heavily-used DB could be re-architected into an Active/Active cluster, with some of the heavy DBs on one instance living on one node and the rest of them on a second instance living on the other node. This gives you High Availability and spreads the load out between two servers, right? It does, but what happens if one of the nodes fails and the two instances “clown-car” onto the remaining node? Are there enough CPU & RAM resources on the remaining server to support the combined load?

That illustrates one of the major things to consider when thinking about deploying an Active/Active cluster. Tim Ford (Blog | @sqlagentman) has an article over at MSSQLTips that goes into detail about making this decision.

In addition, Aaron Bertrand (Blog | @AaronBertrand) has a sweet little piece of automation to help manage resources when trying to squeeze the maximum number (or size) of clowns into a single car without someone’s foot sticking out in the breeze. Of course, it’s something that you hope never actually gets used, but it’s a good way to wring the most performance out of an Active/Active cluster.

Conclusion

I don’t know if this is the best way to explain this concept, but I like it. The Active/Passive cluster concept is usually understood, so I believe it is a nice stepping stone to get to the truth about Active/Active clusters. As with everything, there are tradeoffs and gotchas that need to be considered when designing server architectures like this, and the above-mentioned others have gone a long way to cover those topics.

There we have it, my first T-SQL Tuesday post. Hopefully it is useful and not from too far out of left field 😉

Alright, So Maybe I Get the Cloud…

In general, I’m a cynical, Negative Poo-Poo Head. In the last few years, I’ve become more and more this way when it comes to technology trends. My first reflex as to why is the increased time I spend working with developers on a day-to-day basis, and my perception of how development is susceptible to Flavor-of-the-Week-itis. I don’t think that’s really the cause, though; I think I’m just getting more cynical as I get older in general. I know, that isn’t necessarily a good thing, but that’s a different blog post. The good news here is that I can at least still turn it off when I need to.

Anyway, the Cloud and my negative poo-pooness.

I’ve basically spent more time making fun of Cloud Computing than I have learning about it. “Oh, it’s the dumb terminal/mainframe paradigm all over again”, “There’s something wrong with this whole thing when you can buy a shipping container full of servers, put it out back, and call it a ‘private cloud’”, “It’s just the latest buzzword/flavor-of-the-week”, on and on. Honestly, even though Brent, Buck, and others have been saying it’s the Way of the Future™, I didn’t buy it.

Well…I think I’m on board now.

What changed my mind?

I don’t think I can nail it down to one thing, but I can point to a couple things that have been going on lately that played into this change of heart.

Last Friday was the Nashville SQL Users Group Meeting. Presenting was Brian Prince (Twitter | Blog (lookit dat Gamerscore) ) on SQL Azure. I was pretty interested in this topic due to wanting to actually learn something about this stuff that I make fun of. OK, that and I am a little interested to see what all the fuss is about.

Although I had to leave early due to an unfortunately-scheduled meeting, I did get to see most of Brian’s presentation. I was impressed with a lot of what I saw. I was impressed by the shipping containers full of servers that make up open-air datacenters. I was impressed by how they each can have their own genset backup power sitting next to them, how the servers in the containers live at 95° F, are cooled primarily by what basically amount to Swamp Coolers, and pretty much never get touched once they’re racked up. During this part of the presentation, the Sysadmin in me was clamoring to give up the DBA thing and figure out how to get a job building/supporting this stuff, because it looks just that awesome…

Brian then went into the components of the Azure platform, where SQL Azure sits in it, and how it is supported, replicated, and made available. We also got to see how you actually use the system, along with some good arguments about DBs that we all have in our environments that would be good candidates to move to the cloud. The prices seem pretty good (I haven’t done any extensive calculations) when compared with running small but still mission-critical DBs in-house… It all actually sets up a pretty good argument.

All of this good info and eyes-on examples went a long way to getting me over whatever I was afraid of or didn’t buy about the system.

Buck has been going on about this whole thing for a while, and even put his money where his mouth is and is moving over to the Azure team. At first, I thought that was a loss for the SQL Community, but I think with my improving outlook, I don’t feel that way anymore. I believe it will be for the better, and I can’t wait to hear more from him in the future.

These are a couple of the things that have been going on lately that are beginning to change my mind about the cloud. I’m looking forward to learning more about it and considering it as an implementation option in the future.

“Well I made a mistake today”

Getting mails from a Developer that start like this almost always leads to awesome. This turned out to be one of the times when it wasn’t as awesome as it could have been, but did give me the opportunity to spread some knowledge (which I don’t get to do very often, because, well, I’m not that smart).

This situation was the old, “oops, WHERE clauses are a good idea with DELETE statements.” The good news is that this was in Development, so it wasn’t a giant fire. Although I didn’t see the message right when it came in, I did see it in time to get to it before that night’s backup ran (we just keep one backup file in Dev and overwrite it every evening). I probably could have pulled from Production or Test instead of restoring a 140+ gig DB for a 299 row table, but we’ve got the space, more IO than God, and it was a Friday night where nothing else was going on out of the ordinary. Table restored, life goes on.
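(If you’re curious, the general shape of the “restore a copy and pull the table back” move looks something like the sketch below. Every name and path here is made up, a real restore would have the usual WITH MOVE dance for however many files the DB has, and identity columns or constraints would need the normal care with explicit column lists and IDENTITY_INSERT.)

-- Sketch only: restore last night's backup under a different name...
RESTORE DATABASE DevDB_Restore
FROM DISK = N'D:\Backups\DevDB.bak'
WITH MOVE N'DevDB_Data' TO N'D:\Data\DevDB_Restore.mdf',
     MOVE N'DevDB_Log'  TO N'D:\Logs\DevDB_Restore_log.ldf',
     RECOVERY;

-- ...then copy the wiped-out rows back into the original table
INSERT INTO DevDB.dbo.SomeTable
SELECT * FROM DevDB_Restore.dbo.SomeTable;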

Actually, there were a couple points that I was able to make with this situation.

First: Tell your DBA when things go bad!

In our situation, with the backup file getting overwritten every night, if a Developer makes a mistake like this, they have to let us know before 8:00 the day of in order for us to be able to do anything about it. The guys/gals have to first realize something bad happened, and then get to us right then in order to recover. If they sit on it until the next day, it is too late.

Second: BEGIN TRAN is your best friend.

When running DML, manually start and end transactions. Sure, SQL Server has nice, easy autocommit behavior that you don’t have to worry about, but it can become your worst enemy very easily. All it takes is either missed highlighting before mashing F5 or an unfortunately-placed closing paren.

BEGIN TRAN? (Skip this paragraph if you already know.) By default, SQL Server runs in autocommit mode: even though you don’t type it out, when you run a statement, SQL Server begins a transaction, runs your stuff, and then commits it for you. By manually putting BEGIN TRAN in front of your UPDATE, DELETE, or whatever, you retain control of this instead of letting the engine do it for you. This means you can run your statement(s), check the results, and then COMMIT or ROLLBACK yourself. In short, this is manual transaction control.

This one takes some diligence, because it’s easy to be complacent. I’m doing a simple little UPDATE statement, I didn’t make any mistakes, everything will be fine. Of course you think that—you wouldn’t run any statements that you didn’t think were right, would you? This is why you have to tell yourself to type BEGIN TRAN every time. It only takes once to really ruin your day.
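The habit itself looks something like this; the table, column, and values are all made up, but the pattern is the point:

BEGIN TRAN

UPDATE dbo.SomeTable
SET    SomeColumn = 'NewValue'
WHERE  ID = 42

-- @@ROWCOUNT still reflects the UPDATE that just ran; make sure it's the number you expected
SELECT @@ROWCOUNT AS RowsTouched

-- Happy with it? COMMIT TRAN. Not happy? ROLLBACK TRAN.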

OK, Third: COMMIT TRAN until it throws an error

This is another tip that I learned from our senior DBA on probably my first or second day on the job. Basically, when you commit your user transaction, keep trying to commit it until SSMS reports an error (trying to commit a transaction when there isn’t one open). Why? Glad you asked!

Create a table & put a couple rows of data in it:
CREATE TABLE TransTest (
    ID      INT     IDENTITY(1,1),
    Name    VARCHAR(20)     NOT NULL
)

INSERT INTO TransTest
    SELECT 'Smythee'

INSERT INTO TransTest
    SELECT 'Bob'

Next, say you want to delete Bob from the table. Bob was never any fun anyway, was he? Because you’re heeding the above advice, you are going to wrap this simple one-row delete in a Transaction. You run the following:

BEGIN TRAN

DELETE
    FROM TransTest
    WHERE Namee = 'Bob'

Whoops, you fat-fingered the column name and didn’t notice until you ran it, and it threw an error.

Fix it and run it again:

BEGIN TRAN

DELETE
    FROM TransTest
    WHERE Name = 'Bob'

This runs OK, you double-check the contents of the table, everything looks fine.

Next step is to run COMMIT TRAN. That runs without error, and you go on your merry way.

But, there’s a problem: run SELECT @@TRANCOUNT and see what you get. You should see one transaction still open. Why is that?

When the first statement was run, a transaction was opened. Even though the statement itself bombed because of the bogus column name, that transaction is still there. When you fix it and run BEGIN TRAN again, you now have a nested, second transaction. Running a single COMMIT decrements the transaction count, but it doesn’t actually make your changes permanent, because the outer transaction is still open. Since that transaction still has locks, it will block other statements looking to operate in the TransTest table.

Moral of the story? Mash F5 on COMMIT TRAN until SSMS throws an error.
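If mashing F5 feels inelegant, the same idea can be written as a tiny loop (just a sketch) that commits until nothing is left open on the connection:

-- Keep committing until @@TRANCOUNT hits zero
WHILE @@TRANCOUNT > 0
    COMMIT TRAN

Either way, the point is the same: don’t walk away from a session while @@TRANCOUNT is above zero.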

What was I talking about again?

Oh right, our poor developer.

In the mail I sent back to him, I commended him for being smart about letting us know right away when a mistake was made, as it allowed us to actually get the data back (or mostly so). I also recommended manual transactions, because they can save your tail.

I don’t know if he’ll take the advice to heart, but he at least has the tools available to him now if he wants to use them.