(also known as Monitoring Lync with Open Source Tools)
Microsoft's Lync Server 2013 includes a number of reporting options; however, they are server-focused and don't lend themselves well to external queries. Many network and telecommunications providers and technology organizations use MRTG and RRD to monitor key aspects of their networks, devices and environments.
In fact, I use MRTG and RRD to monitor all aspects of our infrastructure, from hypervisor metrics (CPU, RAM, HDD, etc.) and individual guest virtual machine metrics to network infrastructure devices such as our routers, firewalls, load balancers and various other appliances.
MRTG simply takes a number of values, typically 2, and stores the result in a local dataset. Results are then graphed using MRTG's graphing tool, or using RRD for more precise output.
Data is normally provided to MRTG via SNMP counters, although it can take arbitrary values from any data source, with the right approach.
Lync Monitoring Reports
Lync Server 2013 includes a set of standard reports that are published by Microsoft SQL Server Reporting Service, within a hosted or on-premises environment. These reports, which are accessible by using a web browser, provide usage, call diagnostic information, and media quality information on an overall or per-user basis.
Hosting and Service Providers
Hosted Lync environments are typically deployed in a multi-tenant state, meaning the service can be provided to a number of customers simultaneously from the one environment, with data and usage segregated between customers.
The Lync Monitoring Reports, whilst able to report per user, have no visibility of tenancy groups or customers, which are typically based on individual SIP domains (such as customer1.com, customer2.co.uk, customer3.com.au, etc.).
It is quite possible that Microsoft will release this capability in some form in future Lync releases (2014? 2015?), however today this is not available. There are a few commercial solutions available, but these are typically expensive, proprietary and standalone solutions.
So let's roll our own!
Simplified Topology
For the purposes of simplification, we'll assume a relatively flat topology, with the Monitoring Server role separated from the Front End server.
Extracting Lync Statistics - Users Online
There are a number of ways to extract statistics, such as WMI counters and by querying the Central Management Store directly. The latter is the method I have chosen, as the database contains additional valuable information such as client version details.
Any of your Front End servers can be queried, as the databases are replicated between all Front End instances.
In a default Standard Edition installation, the Central Management Store is a SQL Express instance called [servername]\rtclocal, whilst the Enterprise Edition can utilize a full SQL instance for this store.
In any case, the table structures remain the same. The first step is to return the recordset of current active Lync sessions. This is useful to test your connectivity, and forms the basis for further analysis.
-- One row per registered endpoint (active session)
Select cast(RE.ClientApp as varchar(100)) as ClientVersion,
       R.UserAtHost as UserName,
       Reg.Fqdn
From rtcdyn.dbo.RegistrarEndpoint RE
Inner Join rtc.dbo.Resource R on R.ResourceId = RE.OwnerId
Inner Join rtcdyn.dbo.Registrar Reg on Reg.RegistrarId = RE.PrimaryRegistrarClusterId
Order By ClientVersion, UserName
Lync defines an actual user session as an Endpoint, hence the table names. This query returns a recordset of the currently active sessions, and is updated based on Lync's session polling interval.
This is useful information, but let's refine the results so as to see the total number of users online, and also the total number of unique users.
Select count(*) as totalonline, count(distinct R.UserAtHost) as totalunique
From rtcdyn.dbo.RegistrarEndpoint RE
Inner Join rtc.dbo.Resource R on R.ResourceId = RE.OwnerId
Inner Join rtcdyn.dbo.Registrar Reg on Reg.RegistrarId = RE.PrimaryRegistrarClusterId
Why are unique users important? This gives administrators visibility of how many users are using multiple devices. Where there are more online sessions than unique users, one or more users are logged on simultaneously on different devices.
This example shows a difference of 1 between total online users and total unique users, so 1 user is logged on with two devices.
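If you'd rather see who those multi-device users are, a rough sketch along these lines could work. It assumes you have already pulled the UserName column from the first query into a list; the function name and the sample addresses are made up for illustration, and lowercasing the addresses is my assumption about how you'd want to compare them.

```python
from collections import Counter

def multi_device_users(endpoints):
    """Return (total_online, total_unique, users_with_multiple_endpoints).

    `endpoints` holds one sign-in address per registered endpoint,
    e.g. the UserName column returned by the first query.
    """
    # Assumption: addresses are compared case-insensitively.
    counts = Counter(addr.lower() for addr in endpoints)
    multi = {user: n for user, n in counts.items() if n > 1}
    return len(endpoints), len(counts), multi

# Hypothetical sample data: alice is signed in on two devices.
sample = ["alice@customer1.com", "bob@customer1.com", "alice@customer1.com"]
total, unique, multi = multi_device_users(sample)
print(total, unique, multi)
```

The difference between the first two numbers is the same gap the query shows; the dictionary simply names the users behind it.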
By integrating into MRTG, results like this can be generated (see below for instructions on how to do this).
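Since tenants are usually told apart by SIP domain, the same list of sign-in addresses can also be rolled up per customer. This is only a sketch under that assumption: the helper name and the sample addresses are hypothetical, and it simply splits each address at the last "@".

```python
from collections import defaultdict

def per_domain_counts(endpoints):
    """Map each SIP domain to (endpoints_online, unique_users)."""
    users_by_domain = defaultdict(list)
    for addr in endpoints:
        # Everything after the last "@" is treated as the tenant's SIP domain.
        user, _, domain = addr.lower().rpartition("@")
        users_by_domain[domain].append(user)
    return {d: (len(u), len(set(u))) for d, u in users_by_domain.items()}

# Hypothetical sample data for two tenants.
sample_endpoints = [
    "alice@customer1.com",
    "alice@customer1.com",
    "bob@customer1.com",
    "carol@customer2.co.uk",
]
print(per_domain_counts(sample_endpoints))
```

Each pair could then be fed to MRTG as Value1 and Value2 for a per-customer graph.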
Extracting Lync Statistics - Media & Application Usage
Whilst the number of users online at any given time is very useful, sometimes you'd like to see _what_ your users are doing at any given time.
Activity usage is recorded in the Monitoring Server (typically a dedicated SQL server, or sometimes collocated on a Front End server in smaller installations). Rather bafflingly, or perhaps as a reminder of the legacy of Lync, one of the key Monitoring databases is still referenced by its old Live Communications Server prefix: LcsCDR.
Within this database lies SessionDetails, a logging table of, you guessed it, sessions.
The MediaTypes column in this table is a bit set that indicates the media types used in a session. The defined values are:
| Media Type        | Bit Set |
|-------------------|---------|
| IM                | 1       |
| FILE_TRANSFER     | 2       |
| REMOTE_ASSISTANCE | 4       |
| APP_SHARING       | 8       |
| AUDIO             | 16      |
| VIDEO             | 32      |
| APP_INVITE        | 64      |
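Because several bits can be set at once, a single MediaTypes value may describe more than one kind of activity. As a small illustration (the class and function names are mine, not anything from Lync), the values in the table can be decoded like this:

```python
from enum import IntFlag

class MediaType(IntFlag):
    # Values copied from the table above.
    IM = 1
    FILE_TRANSFER = 2
    REMOTE_ASSISTANCE = 4
    APP_SHARING = 8
    AUDIO = 16
    VIDEO = 32
    APP_INVITE = 64

def media_names(media_types):
    """List the name of every media type whose bit is set in `media_types`."""
    return [m.name for m in MediaType if media_types & m]

print(media_names(17))   # ['IM', 'AUDIO']
print(media_names(48))   # ['AUDIO', 'VIDEO']
```

This is the same test the SQL performs with (MediaTypes & 1)=1, just applied to every bit in turn.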
As we're looking to graph various usage statistics, we only need to return values for sessions within the polling period of MRTG or RRD: 5 minutes, or 300 seconds.
The example below returns the number of sessions with the IM (Instant Message) bit set.
SELECT count(*)
FROM [LcsCDR].[dbo].[SessionDetails] s
where (MediaTypes & 1)=1
Of course, this will return the count for every session, so we need to narrow it down to the last 5 minutes.
SELECT count(*)
FROM [LcsCDR].[dbo].[SessionDetails] s
where (MediaTypes & 1)=1
AND s.SessionIdTime>=dateadd(minute,-5,getdate())
Important Hint: SessionIdTime is recorded in UTC. For those of us not living in the UTC timezone, you will need to transform s.SessionIdTime into your local timezone, or compare it against getutcdate() rather than getdate(). If you live in the USA, Australia or any large country with multiple timezones, you may also need to consider which zone applies. Google the solution with Bing. :)
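To make the timezone arithmetic concrete, here is a minimal sketch using only Python's standard library. The function names are hypothetical, and the UTC+11 offset is purely an example; a fixed offset like this ignores daylight saving changes, so treat it as a starting point rather than a finished solution.

```python
from datetime import datetime, timedelta, timezone

def utc_window_start(now_utc, minutes=5):
    """Earliest UTC timestamp that still falls inside the polling window."""
    return now_utc - timedelta(minutes=minutes)

def to_local(ts_utc, offset_hours):
    """Convert a naive UTC timestamp to a fixed-offset local time."""
    local_tz = timezone(timedelta(hours=offset_hours))
    return ts_utc.replace(tzinfo=timezone.utc).astimezone(local_tz)

now = datetime(2014, 2, 3, 12, 23, 38)   # example "current" time, in UTC
print(utc_window_start(now))              # 2014-02-03 12:18:38
print(to_local(now, 11))                  # 2014-02-03 23:23:38+11:00
```

Keeping the cut-off in UTC and converting only for display tends to be the least error-prone arrangement.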
But what if you want to query this data for a specific userID or userIDs? Simply perform an outer join on the user table, to see this information.
SELECT count(*)
FROM [LcsCDR].[dbo].[SessionDetails] s
left outer join [LcsCDR].[dbo].[Users] u1 on s.User1Id = u1.UserId
left outer join [LcsCDR].[dbo].[Users] u2 on s.User2Id = u2.UserId
where (MediaTypes & 1)=1
AND s.SessionIdTime>=dateadd(minute,-5,getdate())
-- add a condition on u1 and/or u2 here to restrict the count to specific users
This is a simplified approach, and there are a number of performance improvements you can make to the queries above.
Passing Lync Statistics to MRTG
Once you have the metrics you require, it's time to pass these to MRTG. An almost identical process is used for RRD, so in the interest of brevity, we'll focus on MRTG in this example.
MRTG uses a tool called rateup, which takes 4 inputs and stores them into a local, text-based flatfile database before generating the graphs based on historical data.
MRTG itself is a Perl script, to which we pass configuration data, including where to gather statistical data.
Important Hint: A *PERL* based tool used to monitor a Windows-based server environment? Sure! MRTG runs perfectly fine on any Windows platform running a Perl interpreter such as ActivePerl. Learn more here. The examples below are based on MRTG installed on a Windows Server 2012 R2 machine with ActivePerl 5.16 Community Edition.
In my environment, due to my old-school approach, I use a small VBS script to query the LcsCDR database mentioned above, returning the values required by MRTG.
The values required are rather simple.
Value1
Value2
Uptime
Device Name
MRTG was originally designed to monitor network traffic, and Value1 and Value2 were originally in and out values measured in bytes. These can, of course, be any values, representing any metric.
Uptime isn't overly relevant to monitoring Lync sessions, although if you'd like to see your server uptime status in the reports, by all means.
Similarly, Device Name is no longer relevant, as hopefully you're actually running Lync across multiple devices!
A typical example of my VBS script would return results similar to this (the last four lines are the results):
C:\web\mrtg>cscript lyncactivity.vbs
Microsoft (R) Windows Script Host Version 5.8
Copyright (C) Microsoft Corporation. All rights reserved.
313
55
3/02/2014 11:23:38 PM
mylinkservice
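The examples in this post use VBS, but any script that prints those four lines in that order should do. As a rough sketch, here is what the output step might look like in Python. The function name is made up, and the numbers are just the sample values shown above rather than anything queried from a live database.

```python
def mrtg_lines(value1, value2, uptime, name):
    """Format two readings, an uptime string and a target name as four lines."""
    return "\n".join([str(int(value1)), str(int(value2)), str(uptime), str(name)])

# Sample values copied from the console output above.
output = mrtg_lines(313, 55, "3/02/2014 11:23:38 PM", "mylinkservice")
print(output)
```

In a real script the two readings would come from the SQL queries shown earlier.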
You then point your MRTG instance at a configuration file, which takes this information, parses it in the context of the settings you specify, and returns a graph.
[pathtoperl] [path to mrtg perl script] [your lync configuration file]
Example:
C:\perl64\bin\perl c:\mrtg\bin\mrtg c:\mrtg\configs\lync.cfg
The lync.cfg file is a text file containing the configuration of each MRTG graph you want to create. A comprehensive listing of all the options available can be found on the MRTG web site.
Target[lync_sessions]: `cscript //nologo c:\web\mrtg\lyncactivity.vbs`
MaxBytes[lync_sessions]: 128
YLegend[lync_sessions]: Users
ShortLegend[lync_sessions]: Users
Legend1[lync_sessions]: Total Users
Legend2[lync_sessions]: Unique Users
LegendI[lync_sessions]: IM:
LegendO[lync_sessions]: File Sharing:
Options[lync_sessions]: growright,integer,noinfo,gauge,withzeroes
Title[lync_sessions]: IM and File Sharing Sessions
The key is the Target[name] line - this points to the source of your metrics, in this example, my VBS file that returns the 4 data components shown earlier.
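Every per-graph setting follows the same Keyword[name]: value pattern, with the name in brackets tying the lines together. If you ever want to sanity-check a configuration from a script, a very simplified reader could look like the sketch below. It is my own illustration, not part of MRTG, and it deliberately ignores global options and multi-line values.

```python
import re

# Matches lines such as "MaxBytes[lync_sessions]: 128".
LINE = re.compile(r"^(\w+)\[(\w+)\]:\s*(.*)$")

def parse_targets(cfg_text):
    """Group `Keyword[target]: value` lines by target name."""
    targets = {}
    for line in cfg_text.splitlines():
        m = LINE.match(line.strip())
        if m:
            keyword, target, value = m.groups()
            targets.setdefault(target, {})[keyword] = value
    return targets

# A few lines copied from the lync.cfg example above.
sample_cfg = """
MaxBytes[lync_sessions]: 128
YLegend[lync_sessions]: Users
Options[lync_sessions]: growright,integer,noinfo,gauge,withzeroes
"""
cfg = parse_targets(sample_cfg)
print(cfg["lync_sessions"]["MaxBytes"])   # 128
```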
By parsing your configuration (.cfg) file every 300 seconds, MRTG will now poll Lync for your preferred metrics and generate graphs.
By default, MRTG returns 4 graphs (Last Day, Last Week, Last Month, Last Year) each building up over time as more and more data is stored.
The Results
MRTG can be used to show current and historical metrics, is extremely flexible, and is a great way of building your own monitoring and reporting capability, with just a few lines of script.
Final Steps - What To Do With This Data?
Depending on your own monitoring environment, the data generated by MRTG and/or RRD can be incorporated easily. Graphs are individual PNG files, and raw data is in a simple flat-file format.
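To give an idea of how simple that flat file is, here is a hedged sketch of a reader for an MRTG .log file. It relies on my understanding of the layout: the first line holds the last timestamp and the two raw values, and each later line holds a timestamp followed by average in, average out, maximum in and maximum out. The function name and the sample data are invented for the example.

```python
def parse_mrtg_log(text):
    """Return (latest, history) parsed from the text of an MRTG .log file."""
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    # First line: timestamp, last in value, last out value.
    stamp, last_in, last_out = (int(x) for x in lines[0][:3])
    latest = {"time": stamp, "in": last_in, "out": last_out}
    # Remaining lines: timestamp, avg in, avg out, max in, max out.
    history = [
        {"time": int(f[0]), "avg_in": int(f[1]), "avg_out": int(f[2]),
         "max_in": int(f[3]), "max_out": int(f[4])}
        for f in lines[1:] if len(f) >= 5
    ]
    return latest, history

# Made-up sample data in the assumed layout.
sample_log = """1391430218 313 55
1391430218 313 55 313 55
1391429918 301 52 301 52
"""
latest, history = parse_mrtg_log(sample_log)
print(latest["in"], len(history))
```

From there the values can be pushed into whatever dashboard or reporting tool you already use.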
Many Service Providers utilize the CACTI framework to group and display various device and service reports. This is a great way to start grouping your Lync usage reports for integration into your existing monitoring.
Coming Soon
My next post will be a how-to on integrating with RRDtool, as well as adding a few more interesting metrics.
Links
- MRTG by Tobias Oetiker. MRTG is free software, download it, and if you use it, consider supporting Tobias!
- RRD by Tobias Oetiker
- CACTI - the complete rrdtool-based graphing solution