The environment is new and secure.
Assumptions: You are a DBA who knows how to build WSFCs and AGs. If you need help with that, then you’re in the wrong place for this post.
The windows 2016 / SQL Server 2016 environment has been built, SQL Server 2016 Enterprise Edition installed, the Windows Failover Cluster configured, HA has been enabled on the SQL Servers, the Skype for Business server software has been installed (but not configured) and all is well.
Now to configure / create the Skype For Business topology. How hard can it be?
NBThis is being done from memory, so the actual implementation detail may be a little bit hazy, and I haven’t got any screenshots or actual error messages as this was done on someone else’s computer.
What went wrong, though that’s real. And the reasoning is also as near as I can remember.
Creating the Topology Configuring for Availability GroupsLaunch the Topology Configuration wizard/app. By default it wants to go with mirroring, so you tell it to use Availability Groups. This causes the app to ask up front for the availability group name, which doesn’t exist, and then ask for a server name.
Don’t assume that this is a mistake and give it the server name. Alternatively, don’t get confused and give it a server name. You’ve got to get this right up front. If you don’t, you’ll have to strip out the whole thing and start over.
Use PowerShell rather than the GUIThe GUI feels nicer (barring obvious confusion above), but doesn’t adequately report error / warning / problems. We were experiencing problems with the installation, and all the messages that the skype engineers were seeing led them to view it as a “problem with the database server”.
Using the Install-CsDatabase command gives a whole heap of more useful information, including showing that a connection was made to the server and configuration settings (for database file paths) were read correctly. It’s not a problem with the server.
Oh, and use the -UseDefaultSQLPaths parameter, assuming you’ve configured your server up correctly, as it’s easier than trying to figure out the various other path options.
What $ share?The output from Install-CsDatabase showed that the installer was attempting to copy database files to the appropriate locations, but had translated the value retrieved from SQL Server, say, E:\mssql\SQLData to \\SkypeServer1\E$\MSSQL\SQLData the transaction log path had suffered similar indignities. Yes, that’s using the $ / “admin” shares. The problem is that, for various security reasons, these shares have been removed.
If you try creating the blank databases, the installer complains that the database is there, but it’s version 0, and that the installer can’t upgrade from version 0 to version 12. So drop those databases as they’re not doing any good.
The only way forward on this appeared to be to get an exemption logged and create those shares. Installation can (finally) proceed. Hurrah.
When the databases have all been created, create AG, failover, rerun topology-push to create users & jobs on the secondary, and failback.
That was a painful afternoon.
Still, all is not entirely well.
The Next DayThe following day, you’ll notice that various SQLAgent jobs have failed. Checking the server, you will notice that there are the following jobs:
LcsCDR_Purge LcsCDR_UsageSummary LcsLog_Purge QoEMetrics_Purge QoEMetrics_UsageSummaryThese jobs have been set up in a sensible way step 1 is a check to see if the database / server is the primary or the secondary, and step 2 actually does the work (if this is the primary). Here’s that step 1 in full:
declare @_MirroringRole intset @_MirroringRole = (
select mirroring_role
from sys.database_mirroring
where database_id = DB_ID('LcsCDR')
)
if (@_MirroringRole is not NULL and @_MirroringRole = 2) begin
raiserror('Database is mirrored', 16, 1)
end
else begin
print ('Database is not mirrored')
end
That’s great, if you’re using mirroring. But we’re not. And Skype-for-Business *knows* that we’re not, as we told it that right up front. What this check should be looking at is the sys.dm_hadr_database_replica_state DMV and working against that. This code seems to work:
declare @_AGPrimary bitif exists (select * from sys.dm_hadr_database_replica_states where database_id = DB_ID('LcsCDR'))
BEGIN
set @_AGPrimary = (SELECT is_primary_replica FROM sys.dm_hadr_database_replica_states WHERE database_id = DB_ID('LcsCDR') and is_local = 1)
IF (@_AGPrimary = 0)
BEGIN
RAISERROR('Database is AG secondary',16,1);
END
BEGIN
print ('Database is AG primary');
END
END
Your choice either replace the existing step 1, or add this code to it, and the jobs now work. Depending on how nit-picky you are about these things, you might want to change the two references to ‘LcsCDR’ to match the database mentioned in the job… Given that the databases are / should be in the same AG, though, this might be overkill/paranoia. I’m paranoid. It goes with the territory.
Creating the Listener Active Directory PermissionsStepping back a bit, one other thing that bit us was that we had problems creating the availability group listener the cluster didn’t have the rights to do this. This is fixed by either:
(Preferred) granting the failover cluster virtual server object permission to create objects in the OU in which it lives. (alternatively, should work, untested by me) create the listener computer object first, and grant the failover cluster virtual server CNO rights to control / edit that object. All doneThe engineers can now carry on with configuring / building out the skype-for-business environment. Rather them than me. I need a beer after all that!