Wednesday, December 10, 2014

SQL Agent Connection Failure after Cluster Reconfiguration

We had some unfortunate event befall one of our SQL clusters lately, and it’s led to some interesting fallout.

First, the TempDB LUN was unpresented accidentally during some SAN maintenance.  As I’m not a SAN admin, I’m not yet sure what all happened with that, but of course SQL Server crashed immediately.  It also appears that the other DB LUNs were briefly unavailable to the server, which is where things get interesting.

We had a new TempDB LUN presented in reasonably short order, and SQL Server was running again without any apparent problems.

Then patch week came.  When the active node rebooted after patching, SQL Server would not start on the secondary node.  Which meant SQL Server was offline pending manual intervention.

It took awhile to figure out why SQL Server wasn’t starting:  all of the role resources were online, and everything looked fine.  These errors in the event log gave us some clues:

[sqsrvres] CheckAndChangeVirtualServerName: Could not obtain the Virtual Server Name (status 138f)
[sqsrvres] GetVirtualServerName, Unable to obtain next Resource from Cluster Resource Enumerator. Error: 0.


Those are both event ID 19019.  So I wondered:  what might have changed about the cluster resources?

The answer?  Dependencies.  When the LUNs were unpresented, all of the dependencies for those also disappeared.  Add those dependencies back, such that the SQL Server service resource depends on the appropriate resources (drives, for instance):

image

You’ll also want to make sure the appropriate network names are set as dependencies.

Wednesday, November 12, 2014

Who is using this database? Is it in use at all?

This is a question we get as SQL Server DBAs all the time:  is this database in use?  Most of us are in the habit of putting this question back on the application owners, or of using a simple scream test to see if it’s important to anyone. 
Until SQL Server 2008, there weren’t a lot of great options:  triggers would work, but they have a lot of downsides, and they’re reasonably resource-intensive for busy servers.
Extended events offer a much better solution, and we’ll use this question as a way to ease into their use; this will be a beginner’s introduction to SQL Server extended events (on both SQL 2008 R2 and SQL 2012).  We’ll end with a pretty cool (and portable) solution to the question, “is this database in use?”

Thursday, February 20, 2014

SQL Server Reporting Services Service Start Timeout

SSRS will often timeout when starting under Windows Server:
The error is:  Windows could not start the SQL Server Reporting Services (MSSQLSERVER) service on local computer.  Error 1053:  The service did not respond to the start or control request in a timely fashion.

This problem is detailed here: http://support.microsoft.com/kb/2745448, and it's a result of the SSRS service checking for certificate revocation lists (CRL).  If you're like most people, you've got a firewall blocking Internet access from your server, so each call for a CRL has to time out before SSRS will start.

The above KB article mentions a handful of ways of dealing with this problem; the most-frequently used method is to increase the service start timeout in the registry.  This works, but it requires a server reboot.  And it doesn't really get to the root of the problem, that SSRS is taking a long time to start.

Now, generally a CRL is a good thing, even if only in theory.  But if your server can't get to the Internet, anyway, there's precious little danger of it getting to a revoked certificate.  After all, you're not using the server to browse the web.  Right?

So:  disable CRL checking within SSRS.  This is done with the generatePublisherEvidence element in the ReportingServicesService.exe.config file.

By default, this file is in the <drive>:\Program Files\Microsoft SQL Server\MSRSxxx.MSSQLSERVER\Reporting Services\ReportServer\bin directory.

To disable CRL checking, add the following just before the </runtime> tag at the bottom of the ReportingServicesService.exe.config file:
<generatePublisherEvidence enabled="false"/>

If SSRS is slow to start as a result of CRL checking, this will dramatically speed the service startup.  One benefit to doing this, too, is that it doesn't require a server reboot or anything to take effect:  it'll work at the next SSRS start.

Creating a Self-Signed SSL Certificate in Windows without IIS (for SSRS, for instance)

Sometimes you have need for a SSL certificate on a Windows server when you don't have IIS installed.  Like when you want to install SQL Server Reporting Services (SSRS).

You could use OpenSSL, or download the 1+GB Windows SDK to get the makecert utility.  But those are troublesome, each in its own way.

Happily, it's still possible to use the IIS 6 resource kit's SelfSSL, even without having IIS installed.  And that's only a 5MB download.  Plus, you can install only the SelfSSL piece, which is something like 50kB.  Much, much better.

So, download it from this link.  Install it on the server you want, and then run selfssl as below:

Note two things:
  1. Run the command prompt as administrator
  2. You'll get an error, on account of not having IIS running on the server.
The error is OK:  the certificate is installed in the computer personal certificate store.  You can view it this way.

If trust is important on the server, you can use the /T switch for selfssl, and it will automatically add the certificate to the trusted root CA certificate folder, as well.