Download.microsoft.com




This document is provided “as-is”. Information and views expressed in this document, including URL and other Internet Web site references, may change without notice.

Some examples depicted herein are provided for illustration only and are fictitious. No real association or connection is intended or should be inferred.

This document does not provide you with any legal rights to any intellectual property in any Microsoft product. You may copy and use this document for your internal, reference purposes.

Copyright © 2011 Microsoft Corporation. All rights reserved.

Microsoft, Active Directory, Excel, Lync, Outlook, PowerPoint, SQL Server, Visual C++, Windows, Windows Media, and Windows PowerShell are trademarks of the Microsoft group of companies. All other trademarks are property of their respective owners.

This chapter is part of the Microsoft Lync Server 2010 Resource Kit book that is currently being developed. Chapters will be available for download while this book is being completed. To help us improve it, we need your feedback. You can contact us at nexthop@. Please include the chapter name.

For information about the continuing release of chapters, check the DrRez blog at .

Contributors

Project Manager: Susan S. Bradley

Content Architect: Rui Maximo

Chapter Lead: Byron Spurlock

Writers: Keith Hanna

Sidebar Contributor: Byron Spurlock

Technical Reviewers: Gang Peng, Jeffery Reed, Lei Hua, Thomas Lee, Weiming Shen, Xu Liu

Lead Editor: Alexandra Lise

Art Manager: Jim Bradley

Production Editor: Kelly Fuller Blue

Table of Contents

Contributors

Introduction

PART 1: ARCHIVING

New Archiving Features

Archiving Infrastructure

Archiving Policy

Notes from the Field

Running Archiving Server in Critical Mode

Archiving Components and Dependencies

IM and Web Conferencing Archiving Integration

Web Conferencing Archiving Topology

Scaling and Performance

Archiving Cmdlets

Retrieving Records by Using the Export-CsArchivingData Cmdlet

PART 2: MONITORING

New Monitoring Features

End-to-End Scenario and Health Monitoring

Call Reliability Monitoring

Media Quality Monitoring

Service Health Monitoring

Call Quality

Listening MOS

Sending MOS

Network MOS

Conversational MOS

Interpreting MOS Values

Troubleshooting MOS issues

Monitoring Server Reports

User Registration Report

Call Reliability Summary Report

Report Customization

High-Level Performance Counters

Monitoring Database Sizing

Summary

Additional Resources

Introduction

For compliance purposes, many organizations and most financial companies, pharmaceutical companies, financial departments, and law departments require instant messaging (IM) archiving. Some of these organizations also require web conferencing archiving. Microsoft® Lync™ Server 2010 provides a way to archive information from IM conversations, conferences, or both and to customize how information is archived, for example, for which users. This chapter describes archiving in Lync Server 2010.

This chapter also describes monitoring in Lync Server. Lync Server provides new and improved monitoring features and reports that you can use to monitor system health and usage. Use these tools to discover hardware failures, network segment outages, and other system problems early, which can help minimize their impact, and to collect data about media quality, Lync Server workload use, and errors. This data can help with troubleshooting and with planning your training and capital investment.

PART 1: ARCHIVING

New Archiving Features

There are a few changes to the archiving infrastructure and the way archiving policy works in Lync Server. In Lync Server, the archiving agent runs on the Front End pool or on a server running Microsoft® Lync™ Server 2010, Standard Edition.

Archiving Infrastructure

If your organization has a mission-critical requirement, such as a compliance regulation, that all instant messages are archived, you need a plan in case Archiving Server becomes unavailable. Your plan should be to enable critical mode.

Note. Critical mode is not enabled by default.

In previous versions (Microsoft® Office Communications Server 2007 R2 and Microsoft® Office Communications Server 2007), when critical mode was enabled, the Front-End service stopped if Archiving Server became unavailable, which made IM unavailable for the entire organization. In addition, the Registrar shut down, so voice calls were affected as well.

Critical mode works differently in Lync Server. In Lync Server, if, for any reason, the archiving database is unavailable, the Lync Server Front-End service continues to run until the message queue reaches capacity (1 GB by default). When capacity is reached, IM and web conferencing functionality become unavailable (IM and web conferencing are coupled in Lync Server Archiving Server) because IM and web conferencing content and details can no longer be archived. However, other workloads, such as Enterprise Voice, will continue to operate as usual, so users can still do things like make and receive calls.

Note. If a conference participant uploads a file, but the file cannot be copied to the archiving file store, web conferencing functionality is blocked until the problem is resolved, but IM functionality is not blocked.

When the Lync Server archiving agent fails to archive an instant message (one that is queued in Message Queuing, also known as MSMQ, or in the local database), it sends a SIP error response. The response includes an ms-diagnostics header that indicates the failure, so clients can respond appropriately.
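The critical-mode behavior described in this section can be modeled as a small state machine. The following Python sketch is purely illustrative and makes several assumptions: the class, field, and workload names are hypothetical, it does not call any Lync Server API, and the only figure taken from the text is the 1 GB default queue capacity.

```python
from dataclasses import dataclass

GB = 1024 ** 3  # bytes

# IM and web conferencing are coupled for archiving purposes.
COUPLED_WORKLOADS = ("im", "web_conferencing")


@dataclass
class FrontEndModel:
    """Toy model of a Front End Server in critical mode (hypothetical names)."""
    queue_capacity: int = 1 * GB          # default queue size stated in the text
    queued_bytes: int = 0
    archiving_db_available: bool = True
    im_blocked: bool = False

    def archive(self, message_bytes: int) -> bool:
        """Try to archive one message; return True if the message may proceed."""
        if self.archiving_db_available:
            # The database is reachable: the queue drains and nothing is blocked.
            self.queued_bytes = 0
            self.im_blocked = False
            return True
        if self.queued_bytes + message_bytes > self.queue_capacity:
            # The queue cannot hold the message: block the coupled workloads.
            self.im_blocked = True
            return False
        self.queued_bytes += message_bytes
        return True

    def workload_available(self, workload: str) -> bool:
        """Only IM and web conferencing are affected; other workloads keep running."""
        if workload in COUPLED_WORKLOADS:
            return not self.im_blocked
        return True
```

In this model, a Front End Server whose archiving database is down keeps accepting messages until the queue is full, after which `workload_available("im")` returns False while `workload_available("enterprise_voice")` still returns True.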

Archiving Policy

Many organizations find it useful to keep an archive of all IM conversations that their users take part in; other organizations are legally required to keep such an archive. In order to archive IM conversations with Lync Server, you must perform two steps. First, you need to enable archiving at the global scope, the site scope, or both by using the Set-CsArchivingConfiguration cmdlet. This gives you the ability to archive IM conversations. It does not, however, automatically begin archiving those conversations.

Instead, to actually save transcripts of your IM conversations, you must complete a second step: create one or more IM archiving policies that determine which users will have their IM conversations recorded and which type of IM conversations (internal, external, or both) will be archived. Internal IM conversations are sessions where all the participants are authenticated users who have Active Directory® Domain Services accounts in your organization; external IM conversations are sessions where at least one participant is an unauthenticated user who does not have an Active Directory account in your organization.

As noted, archiving policies can be assigned to the global scope or to the site scope. In addition, these policies can be assigned to the per-user scope and then applied to a specific user or a specific set of users. For example, suppose your global policy archives only internal IM conversations for all of your users. In that case, you might create a second policy, one that archives both internal and external conversations, and apply that policy to only your sales staff. Because per-user policies take precedence over global and site policies, members of the sales staff will have all their IM conversations archived. Other users (users who are not part of the sales department and are not affected by the sales policy) will have only their internal IM conversations archived.

You can create new archiving policies (at either the site or the per-user scope) by using the New-CsArchivingPolicy cmdlet. If you create a policy at the site scope, it will automatically be applied to the site at the time the policy is created. If you create a policy at the per-user scope, that policy will not be used until you explicitly assign it to a user or set of users by using the Grant-CsArchivingPolicy cmdlet. You cannot create a new policy at the global scope.
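The precedence rule described in this section can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how Lync Server resolves policies internally: the function names and the policy name SalesPolicy are hypothetical, and the ordering of site over global is an assumption (the text states only that per-user policies take precedence over both).

```python
from typing import Optional

# Each policy records which conversation types it archives.
GLOBAL_POLICY = {"name": "Global", "archive_internal": True, "archive_external": False}
# Hypothetical per-user policy for the sales staff example in the text.
SALES_POLICY = {"name": "SalesPolicy", "archive_internal": True, "archive_external": True}


def effective_policy(per_user: Optional[dict],
                     site: Optional[dict],
                     global_policy: dict) -> dict:
    """Return the policy that applies to a user: per-user, then site, then global."""
    if per_user is not None:
        return per_user
    if site is not None:  # assumption: a site policy overrides the global policy
        return site
    return global_policy


def is_archived(policy: dict, has_external_participant: bool) -> bool:
    """Decide whether a conversation is archived under the given policy."""
    if has_external_participant:
        return policy["archive_external"]
    return policy["archive_internal"]


sales_user = effective_policy(SALES_POLICY, None, GLOBAL_POLICY)
other_user = effective_policy(None, None, GLOBAL_POLICY)
print(is_archived(sales_user, has_external_participant=True))
print(is_archived(other_user, has_external_participant=True))
```

With these sample policies, an external conversation is archived for a member of the sales staff but not for a user who falls back to the global policy.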

Notes from the Field

Running Archiving Server in Critical Mode

Byron Spurlock

Founder and Principal Architect of Quadrantechnologies

If you enable critical mode in Lync Server to make sure no conversations or conferences go unarchived when and if the Lync Server Archiving service or the archiving database becomes unavailable, the Front-End service will not come to a screeching halt, like it did in previous versions. Instead, it will continue to run as usual until the queue that contains the instant messages is not able to hold any more messages. After this occurs, both instant messaging and web conferencing functionality are blocked until the issue has been resolved. (Similarly, if a conference participant uploads content that cannot be copied to the archiving location, web conferencing is blocked for the organization when you’re running in critical mode.)

There is a queue on the Front End Server that will hold all the instant messages that are destined for the archiving database queue and that has a default size of 1 GB, which you can adjust. So what happens when this queue reaches its capacity? This is when users can no longer send or receive instant messages or participate in conferences.

This isn’t cause for great alarm, because unlike in Communications Server 2007 R2, in Lync Server, the blocking of instant messaging and web conferencing does not affect any other Lync Server features or functionality; things like voice will continue to operate as normal.

Key Takeaway: If critical mode is enabled, increase the size of the Message Queuing storage location.

Archiving Components and Dependencies

Before deploying Archiving Server, you must install the following software:

A Windows® operating system and required Windows updates on supported hardware for each server that you want to deploy archiving components on, including the Archiving Server, archiving database, and archiving file store. For details about the hardware and software requirements for Lync Server and database servers, see Determining Your System Requirements at .

Software prerequisites for Lync Server, including Microsoft® .NET Framework 3.5 with Service Pack 1 (SP1), the Microsoft® Visual C++® Redistributable, the Microsoft® Visual J# Redistributable, the URL Rewrite Module version 2.0 Redistributable, Windows Media® Format Runtime, Windows PowerShell™ version 2.0, and Windows Installer version 4.5. For details about all prerequisites, see Additional Software Requirements at .

Message Queuing with Active Directory Integration enabled on the Archiving Server and on each server running Microsoft® Lync™ Server 2010, Front End Server or Lync Server 2010 Standard Edition that hosts users who will have IM archived. For details about Message Queuing requirements, see Additional Software Requirements at .

Microsoft® SQL Server® 2008 or Microsoft® SQL Server® 2005 with Service Pack 2 (SP2) (required) or the latest service pack (recommended) on the computer that will host the archiving database. For details about supported versions, see Database Software and Clustering Support at .

The following diagram shows the dependencies of the Lync Server Archiving service and other services and stores.

[pic]

Figure 1. Archiving components and dependencies

The dependencies of the archiving components are as follows:

Note. The Lync Server Archiving service is not part of the Front End cluster in a consolidated topology.

The IM archiving agent runs on the User Services cluster, and the web conferencing archiving agent runs on the Conferencing services cluster.

Multiple User Services (IM archiving agents) can point to one Lync Server Archiving service.

Multiple Conferencing services (web conferencing archiving agents) can point to one Lync Server Archiving service.

The Lync Server Archiving service depends on both a Microsoft® SQL Server® database and a file store.

Multiple archiving services can share a single archiving SQL Server database.

Multiple archiving services can share a single archiving file store.

The following diagram looks more closely at dependencies of the Lync Server Archiving service.

[pic]

Figure 2. Lync Server Archiving service dependencies

The dependencies for the Lync Server Archiving service are as follows:

The mandatory dependency between the Archiving Server and Lync Server Archiving service provides the backend database location.

Multiple Archiving Servers cannot share the same archiving database.

The mandatory dependency between the Archiving Server and the file store provides the place where the file content is archived.

Multiple Archiving Servers can share the same file store.

IM and Web Conferencing Archiving Integration

In Lync Server, as in previous versions, IM content and web conferencing content are archived separately. IM content, along with conference metadata, such as participation records, is archived in the archiving database. Conference content, along with metadata about the content, such as the time stamp, is archived in a file store. The integration between IM and web conferencing happens as follows:

The same global or per-user archiving policy is enforced on both IM content archiving and on web conferencing content archiving.

The Web Conferencing service has a built-in archiving agent that interfaces with Archiving Server by using Message Queuing.

The Web Conferencing service archiving agent writes web conferencing content events (metadata) into the archiving database (by using Message Queuing). For performance optimization, the Web Conferencing service batches data events and writes them every X minutes, so that only the last X minutes' worth of content is lost if web conferencing fails.

The Web Conferencing service writes uploaded content to a disk.

Archiving Server creates a folder for a given web conferencing instance and stores it in the archiving database.

Archiving Server enforces the purging and archiving logic. If purging is on, an archiving record is deleted at the time of purging when one of the following conditions is met:

The record is marked safe to delete. This could happen when a transcript export tool has been run.

The record is older than the number of days the archive is kept.

Archiving Server enforces the per-user policy. All conferences are logged in the archive when logging is on. When web conferencing archiving is enabled, the Archiving Server reviews the records in the archive and keeps records only for conferences that meet the following conditions:

There are participants who need to be archived for internal communications, and more than one internal user is participating.

There are participants who need to be archived for external communication, and at least one internal user is participating.

Purging records for per-user policy enforcement happens regardless of whether purging is enabled and should occur more frequently than the default purging process, which is 60 days.
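The two purge conditions listed above (a record is marked safe to delete, or it is older than the retention period) can be expressed as a single predicate. The following Python sketch is illustrative only: the function and parameter names are hypothetical, the retention value is passed in explicitly rather than assumed, and the sketch does not cover the separate per-user policy cleanup.

```python
from datetime import datetime, timedelta


def is_purgeable(created: datetime,
                 safe_to_delete: bool,
                 keep_days: int,
                 now: datetime,
                 purging_enabled: bool = True) -> bool:
    """Return True if a record would be deleted when purging runs.

    A record qualifies when purging is on and either condition holds:
    it is marked safe to delete (for example, after an export), or it is
    older than the number of days the archive is kept.
    """
    if not purging_enabled:
        return False
    too_old = (now - created) > timedelta(days=keep_days)
    return safe_to_delete or too_old


now = datetime(2011, 6, 1)
# An exported record is purgeable even though it is only a week old.
print(is_purgeable(datetime(2011, 5, 25), True, keep_days=60, now=now))
# An unexported record from March is purgeable because it exceeds 60 days.
print(is_purgeable(datetime(2011, 3, 1), False, keep_days=60, now=now))
```

The 60-day value in the usage lines is only a sample input.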

Web Conferencing Archiving Topology

In Lync Server, the web-conferencing archiving topology is integrated into the IM topology. The topology for web conferencing archiving works as follows:

The Web Conferencing service writes the uploaded content to a network file store that you can configure.

The Web Conferencing service writes events to Message Queuing.

The Lync Server Archiving service reads the events from Message Queuing to put them into the archiving database.

These components are shown in the following figure.

[pic]

Figure 3. Web Conferencing service archiving components

The Archiving Server is responsible for cleaning up the archiving database.

The following diagram gives an overview of how data is stored in the archiving database.

[pic]

Figure 4. Data storage in the archiving database

Conference activation works as follows:

1. A user joins a conference.

2. The IM archive logs a conference join event.

3. If it is the first join event, the IM archive logs the exact location of the web conferencing archive for this conference, and the Web Conferencing service creates the archive folder for this conference.

4. The user adds a data modality to the conference.

File upload works as follows:

1. A user uploads a file to the Web Conferencing service.

2. The Web Conferencing service logs the file upload event in the Archiving Server.

3. The Web Conferencing service stores the uploaded file.

Scaling and Performance

When deploying Archiving Server, you must associate it with the servers running Lync Server 2010, Front End Server in a pool. Archiving Server then collects IM content from communications involving the users who are homed in the pool.

For the best scalability, do not collocate the Archiving Server with another server role. However, you can host the archiving databases on the Archiving Server; hosting the archiving databases on a separate computer does not significantly improve performance.

Note. One Archiving Server can support up to 500,000 users. If you have multiple pools that support fewer than 500,000 users total, we recommend that you associate all these pools with a single Archiving Server to simplify administration and data retrieval.

For optimal performance, we recommend that you put each of the following items on a separate physical disk:

System file and Message Queuing file

Archiving database data file

Archiving database log file

You can collocate the archiving database with other databases, but we recommend carefully evaluating performance impact before you do. If you do collocate the archiving databases with other databases on the same server, you should run the archiving database in a separate instance from other databases.

You can base your performance evaluation on the following assumptions and example. The example imagines an organization with 100,000 IM and web conferencing users. It assumes that, for peer-to-peer IM:

Each user has an average of two IM conversations per hour. Because each peer-to-peer conversation involves two users, this means that for 100,000 users, there are 100,000 IM conversations per hour, or 27.8 IM conversations per second.

Each IM conversation consists of an average of 10 instant messages.

Each instant message consists of a) an average of 200 characters and b) metadata (details such as who sent the instant message and when) that is, on average, 100 characters. This means that the transcript for each IM conversation contains 3,000 characters on average.

For group IM, the example assumes that:

There is a five percent concurrent conferencing user rate, that is, in every work hour, for 100,000 users, there are 5,000 users participating in a group IM conversation or a conference.

Thirty percent (1,500 in our example) of concurrent conferencing users are part of a group IM conference.

The average group IM conference size is three, which means there are 500 group IM conferences per hour in our example, or 0.14 conferences per second.

Each group IM conference consists of an average of 15 instant messages.

Each instant message consists of a) an average of 200 characters and b) metadata that is, on average, 100 characters. This means that the transcript for each group IM conference contains 4,500 characters on average. We will round down to 4,000 for our example.

For conferences, the example assumes that:

There is a five percent concurrent conferencing user rate for group IM and conferences combined, that is, in every work hour, for 100,000 users, there are 5,000 users participating in a group IM conversation or a conference.

Twenty-five percent (1,250 in our example) of concurrent conferencing users are part of a conference.

The average conference size is six, which means there are 208.3 conferences per hour in our example, or 0.058 conferences per second.

The content size distribution for each conference is, on average, two 10 MB Microsoft® PowerPoint® presentations and one 5 MB handout. This means 25 MB of content per conference per 5,000 concurrent users.

So for a 100,000 user deployment, you could plan on there being:

27.8 + 0.14 + 0.058 = Approximately 28 transcripts per second

And:

(27.8 *3) + (0.14 *4) + (0.058 *5) = 84.25 KB of metadata per second

And, if you are including web conferencing data as part of the meeting transcript:

(27.8 *3) + (0.14 *4) + (0.058 *5) + (25 *1024) = 25684.25 KB total data per second
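The transcript rate and metadata rate above can be reproduced with a short calculation. The following Python sketch uses only the figures stated in this example; the variable names are hypothetical, and the 5 KB of metadata per conference is taken directly from the formula because the text does not derive it. The sketch stops at the metadata figure and does not model uploaded content.

```python
USERS = 100_000
SECONDS_PER_HOUR = 3600

# Peer-to-peer IM: the text states 100,000 conversations per hour.
p2p_per_hour = 100_000

# Group IM and conferences share a five percent concurrent conferencing rate.
concurrent_users = USERS * 5 / 100                 # 5,000 users per work hour
group_im_per_hour = concurrent_users * 30 / 100 / 3    # 30 percent, 3 users each
conferences_per_hour = concurrent_users * 25 / 100 / 6  # 25 percent, 6 users each

# Per-second rates, rounded the same way as in the text.
p2p_rate = round(p2p_per_hour / SECONDS_PER_HOUR, 1)
group_im_rate = round(group_im_per_hour / SECONDS_PER_HOUR, 2)
conference_rate = round(conferences_per_hour / SECONDS_PER_HOUR, 3)

# Transcript sizes in KB: 3 (peer-to-peer), 4 (group IM, rounded down in the
# text), and 5 (conference, taken from the formula as given).
transcripts_per_second = p2p_rate + group_im_rate + conference_rate
metadata_kb_per_second = p2p_rate * 3 + group_im_rate * 4 + conference_rate * 5

print(f"Transcripts per second: {transcripts_per_second:.3f}")
print(f"Metadata KB per second: {metadata_kb_per_second:.2f}")
```

Changing USERS or the percentages gives a first estimate for a deployment of a different size, under the same per-user assumptions.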

Note. If archiving is mission-critical for your organization, you should enable critical mode in case archiving fails. If you enable critical mode, it applies to only the failed workload. For example, if the failure affects only web conferencing, web conferencing is blocked until the problem is resolved, but other workloads, such as Enterprise Voice, are not blocked. For details about critical mode, see the section “Archiving Infrastructure,” earlier in this chapter.

Archiving Cmdlets

With Lync Server Archiving Server, you can use the various CsArchivingConfiguration cmdlets to enable and disable IM archiving and to manage your archiving database. You can also suspend IM should archiving fail, which helps ensure that you keep a record of all your electronic communications. For details about the CsArchivingConfiguration cmdlets, see Archiving and Monitoring Cmdlets at .

One useful archiving cmdlet is Export-CsArchivingData. This cmdlet extracts all the records or a selected subset of records from the archiving database and saves them as a Microsoft® Outlook® Express Electronic Mail (EML) file so that you can view them.

Retrieving Records by Using the Export-CsArchivingData Cmdlet

The Export-CsArchivingData cmdlet scans archiving data in a given archiving database and constructs per-conference transcripts for all conferences that happened in a specified date range. Those transcripts are exported and saved in the specified output folder. Data records used in transcripts can optionally be marked as exported or safe to purge. A period of time, indicated by a start time and an end time, is required.

Note. By default, only members of RTCUniversalServerAdmins are authorized to run Export-CsArchivingData locally.

The following table describes the parameters for Export-CsArchivingData.

Table 1. Export-CsArchivingData parameters

|Parameter |Description |Type |

|StartDate |Start date (required) and time for export. Inclusive. Local time. MMDDYYYY [hh:mm:ss] format. Defaults to 12:00 A.M. if no time is specified. |Date |

|EndDate |End date (required) and time for export. Exclusive. Local time. MMDDYYYY [hh:mm:ss] format. Defaults to the current date and time if not specified. |Date |

|UserURI |User whose conference involvement will be exported. |Uniform Resource Identifier (URI) |

|DBInstance |Database instance of the archiving database. |String |

|ExcludeWebConfArchive |Web conferencing archiving data will be excluded, if present. |Boolean |

|OutputFolder |Folder in which to output transcripts (required). |N/A |

|Purge |Record is safe to purge, if present. |String |

|WhatIf |Description of what will happen if you run this command. |Boolean |

Example Exporting Commands

The following command exports all conferences that begin between 12:00 A.M. on March 1, 2011 and the current date. The conference transcript includes web conferencing data, and all records used are marked as safe to purge.

Export-CsArchivingData -StartDate 03012011 -DBInstance Server1\rtc -OutputFolder \\Server1\export -Purge

The following command exports all conferences that happened during March 2011. The conference transcript does not include web conferencing data, and none of the records used are marked as safe to purge.

Export-CsArchivingData -StartDate 03012011 -EndDate 04012011 -DBInstance Server1\rtc -OutputFolder \\Server1\export -ExcludeWebConfArchive
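Because StartDate is inclusive and EndDate is exclusive, a range of 03012011 to 04012011 covers exactly the month of March. The following Python sketch illustrates that date handling outside the cmdlet; the helper names are hypothetical, and the parsing follows the MMDDYYYY [hh:mm:ss] format described in Table 1, with the time defaulting to midnight.

```python
from datetime import datetime


def parse_export_date(value: str) -> datetime:
    """Parse 'MMDDYYYY' or 'MMDDYYYY hh:mm:ss'; the time defaults to midnight."""
    fmt = "%m%d%Y %H:%M:%S" if " " in value else "%m%d%Y"
    return datetime.strptime(value, fmt)


def in_export_range(timestamp: datetime, start: datetime, end: datetime) -> bool:
    """Apply the inclusive start and exclusive end described in Table 1."""
    return start <= timestamp < end


start = parse_export_date("03012011")
end = parse_export_date("04012011")

# The last second of March is inside the range; midnight on April 1 is not.
print(in_export_range(datetime(2011, 3, 31, 23, 59, 59), start, end))
print(in_export_range(datetime(2011, 4, 1, 0, 0, 0), start, end))
```

Using an exclusive end date means consecutive monthly exports never overlap and never skip a conference that starts exactly at midnight.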

Export-CsArchivingData Output

When the parameter WhatIf is present, the output is the number of sessions that would be exported. Otherwise, output from Export-CsArchivingData looks like this:

Starting to export transcripts

Session ‘sessionIdTime + SessionIdSeq’ successfully exported

Session ‘sessionIdTime + SessionIdSeq’ successfully exported

……

Session export successful. ### total sessions exported.

Troubleshooting Exporting Issues

Causes for errors include the following:

The user does not have read/write permission to the archiving database.

The user does not have read/write permission to the archiving file store.

There is a disk-write error.

The disk is full.

Purge is not allowed because the parameter ExcludeWebConfArchive is present.

The start time is not valid.

The end time is not valid.

The user URI is not valid.

PART 2: MONITORING

New Monitoring Features

Lync Server introduces several new features that enhance Monitoring Server, including ready-made and customizable call detail recording (CDR) and Quality of Experience (QoE) reports. In Lync Server Monitoring Server, you will find:

Rich reporting: Monitoring Server takes advantage of SQL Server Reporting Services to provide ready-made reporting on system usage, call reliability, and media quality diagnostics. A custom dashboard presents an aggregation of these reports from a single location.

New management features: Monitoring Server now uses Lync Server Management Shell and Lync Server Control Panel for all administration and management tasks.

Improved server diagnostics: Client and server components consistently report diagnostics in failure responses to SIP INVITE messages—to explain why a call could not be established—and in BYE messages to INVITE dialogs—to explain why a call was terminated. This makes troubleshooting significantly easier.

Optimized infrastructure: The Monitoring Server infrastructure has been optimized to improve reliability and maintainability.

QoE improvements: A significant amount of time was invested in determining thresholds at which different metrics begin to cause noticeable call quality degradation—metrics including jitter, packet loss, round-trip time, and loss concealment.

Monitoring Server report readability: Metrics that fail to meet acceptable thresholds are highlighted in Monitoring Server reports as yellow or red. This removes the need for administrators to evaluate each metric. In addition, you can pause on the various metrics in the reports to open tooltips that provide details about the QoE metric.

User-facing diagnostics: If end users experience poor sound quality, new UI helps explain the cause. For example, the user’s device might not be working correctly, the volume might be too low, the network might be intermittent, or their computer might be slow because of heavy CPU consumption. These diagnostics are included with Monitoring Server reports and can help with troubleshooting.

CDR: All successful, failed, and dropped conference joins are now recorded, whereas previously, only successful conference joins were.

End-to-End Scenario and Health Monitoring

Lync Server introduces new concepts in the following areas:

End-to-end scenario monitoring

Call reliability monitoring

Media quality monitoring

Service health monitoring

End-to-End Scenario Monitoring

Synthetic transactions are Windows PowerShell cmdlets that allow end-to-end tests to be carried out. They provide a true end-to-end perspective on the environment and are an integral part of the Microsoft® System Center Operations Manager (formerly Microsoft® Operations Manager) management pack.

The following transactions are some of the most common:

Test-CsAddressBookService: Tests the ability of a user to access the Address Book Server. For example:

Test-CsAddressBookService -TargetFqdn atl-cs-001. -UserCredential contoso\bob -UserSipAddress "sip:bob@"

Test-CsAVConference: Tests the ability of a pair of users to take part in an audio/video (A/V) conference. For example:

Test-CsAVConference -TargetFqdn atl-cs-001.

Test-CsClientAuth: Determines whether a user can log on to Lync Server by using a certificate download from the certificate provisioning service. For example:

$cred1 = Get-Credential "litwareinc\kenmyer"

Test-CsClientAuth -TargetFqdn atl-cs-001. -UserSipAddress "sip:kenmyer@" -UserCredential $cred1

Test-CsComputer: Verifies the status of the Lync Server services running on the local computer. This cmdlet also verifies that the appropriate Lync Server Active Directory groups have been added to the corresponding local groups on the computer and that the necessary computer firewall ports have been opened. For example:

Test-CsComputer -Verbose

Test-CsDialInConferencing: Checks to see if a user can take part in a dial-in conferencing session. For example:

Test-CsDialInConferencing -TargetFqdn atl-cs-001.

Test-CsFederatedPartner: Verifies the ability to connect to a federated domain. For example:

Test-CsFederatedPartner -TargetFqdn accessproxy. -Domain

Test-CsGroupIM: Tests the ability of users to participate in a group IM conversation. For example:

Test-CsGroupIM -TargetFqdn atl-cs-001.

Test-CsIM: Tests the ability of two users to exchange instant messages. For example:

Test-CsIM -TargetFqdn atl-cs-001.

Test-CsP2PAV: Tests the ability of a pair of users to conduct a peer-to-peer A/V call. For example:

$cred1 = Get-Credential "litwareinc\pilar"

$cred2 = Get-Credential "litwareinc\kenmyer"

Test-CsP2PAV -TargetFqdn atl-cs-001. -SenderSipAddress "sip:pilar@" -SenderCredential $cred1 -ReceiverSipAddress "sip:kenmyer@" -ReceiverCredential $cred2

Test-CsPhoneBootstrap: Verifies that a user can log on to Lync Server by using a device running Microsoft® Lync™ 2010 Phone Edition. For example:

Test-CsPhoneBootstrap -PhoneOrExt "+14255550119" -Pin "0712"

Test-CsPresence: Tests the ability of a user to log on to Lync Server, publish his or her presence information, and then subscribe to another user’s presence information. For example:

Test-CsPresence -TargetFqdn atl-cs-001.

Test-CsRegistration: Tests the ability of a user to log on to Lync Server. For example:

Test-CsRegistration -TargetFqdn atl-cs-001.

Service-impacting events and performance counters are categorized as either Key Health Indicators (KHIs) or non-KHIs. KHIs result in medium-priority alerts in System Center Operations Manager and are auto-resolved if the component returns to health. Non-KHIs generate information alerts in Operations Manager and require manual resolution.

Call Reliability Monitoring

Call reliability data is stored as CDR data, and failures are classified as “expected” or “unexpected” based on the ms-diagnostics header.

The Operations Manager management pack specifies that alerts will be generated based on higher than expected failure rates.

Media Quality Monitoring

Media quality monitoring uses QoE data, and calls are classified as “good” or “poor” quality based on the following metrics:

• Network degradation

• Round-trip time

• Packet loss

• Jitter

• Healing

For details, see the section “Conversational MOS,” later in this chapter.

Operations Manager alerts are raised for server infrastructure components (such as Mediation Servers) and for network links and are generated based on higher than expected poor quality call rates.

Service Health Monitoring

Service health monitoring provides a holistic view of the end-to-end service, based on automatic synthetic transactions that ensure that each component is operating and reporting as it should.

New health monitoring features, such as automatic discovery, make discovery more robust. The Operations Manager management pack uses the Central Management Server to determine which computers and services in the topology need to be monitored automatically. This eliminates much of the manual configuration that administrators would otherwise have to do. The management pack provides numerous views of this data, including a global site-level view, a pool-level view, and a server-level view of service health.

Service-health alerts are now categorized into the following three alert types:

High: Notifies you that a service outage has been detected and a feature can no longer be used. High alerts are verified exclusively by synthetic transactions and should be resolved immediately.

Medium: Notifies you that either high availability is at risk but features are still available, a high volume of users are experiencing issues but not all users, or a component has detected that something is broken. Medium alerts are verified by component alerts, call reliability alerts, and media quality alerts and should be resolved by the next business day.

Other: Notifies you about an issue that impacts a relatively small set of users, for example, “Too many people are subscribing to see the presence of a CEO” or “User moves failed for a few users.” These alerts should be resolved after high-alert and medium-alert issues.

Call Quality

The factors that determine call quality are based on the following mean opinion score (MOS) values:

• Listening MOS

• Sending MOS

• Network MOS

• Conversational MOS

Listening MOS

Listening MOS is a prediction of the wideband MOS Listening Quality (MOS-LQ) of the audio stream that is played to the user. This value takes into consideration the audio fidelity, audio distortion, and speech and noise levels. From this data, it predicts how a large group of users would rate the quality of the audio.

Listening MOS varies depending on the following factors:

• The type of codec (wideband or narrowband)

• The characteristics of the microphone the speaker is using

• Any transcoding or mixing that occurs

• Packet loss and packet loss concealment

• The speech level of and background noise from the speaker

Due to the multiple factors that influence this value, it is most useful to view Listening MOS statistically from a sampling of calls rather than for a single call.

Sending MOS

Sending MOS is a prediction of the wideband MOS-LQ of the audio stream prior to being encoded and sent to the network. This value takes into consideration the speech and noise levels of the user along with any distortions. From this data, it predicts how a large group of users would rate the audio quality they hear.

Sending MOS varies depending on the following:

• The microphone that the speaker sending the audio is using

• The speech level of and background noise from the speaker

Due to there being more than one factor that influences this value, it is most useful to view a report showing trending of Sending MOS from a sampling of calls rather than for a single call.

Network MOS

Like Listening MOS, Network MOS is a prediction of the wideband MOS-LQ score for the audio stream that is played to the user. This value takes into consideration network factors such as the codec used, amount of packet loss, amount of packet reorder, amount of packet errors, and amount of jitter.

The difference between Network MOS and Listening MOS is that Network MOS considers only the impact of the network on the call quality, whereas Listening MOS also considers the payload (for example, speech level and noise level). This makes Network MOS useful for identifying network conditions that impact the audio quality.

For each codec, there is a maximum possible Network MOS that represents the best possible MOS-LQ for a call scenario. The following table shows the codec typically used for each scenario and the corresponding maximum Network MOS. By understanding the maximum Network MOS, you can interpret the impact of degradation. For example, a degradation of 0.5 will provide a significantly different experience when using RTAudio Wideband than when using RTAudio Narrowband. (With narrowband, it is likely the call is extremely difficult to understand. With wideband, there is likely to be a noticeable loss of quality, but the call is likely to be understandable).

Table 2. Typical Codecs and Maximum Network MOS for Call Scenarios

|Scenario |Codec |Maximum Network MOS |
|Microsoft® Lync™ 2010-to-Lync 2010 call |RTAudio Wideband |4.10 |
|Lync-to-Lync call |RTAudio Narrowband |2.95 |
|Lync conference call |G.722 |3.72 |
|Lync-to-public switched telephone network (PSTN) call |RTAudio Narrowband |2.95 |
|Lync-to-PSTN call or Lync-to-Lync call (with media bypass enabled) |G.711 |3.61* |

*The maximum MOS value for the PSTN part of a call is 2.95.
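To make the wideband versus narrowband comparison concrete, the following minimal sketch subtracts a given degradation from the maximum Network MOS values in Table 2. The function and variable names are illustrative assumptions and are not part of Lync Server; only the maximum values come from the table.

```python
# Maximum Network MOS per codec, copied from Table 2.
MAX_NETWORK_MOS = {
    "RTAudio Wideband": 4.10,
    "RTAudio Narrowband": 2.95,
    "G.722": 3.72,
    "G.711": 3.61,
}


def remaining_network_mos(codec: str, degradation: float) -> float:
    """Return the Network MOS left after a degradation from the codec maximum.

    Hypothetical helper for illustration only; it assumes the call would
    otherwise have reached the maximum value listed in Table 2.
    """
    return round(MAX_NETWORK_MOS[codec] - degradation, 2)


# The same 0.5 degradation leaves very different absolute scores.
for codec in ("RTAudio Wideband", "RTAudio Narrowband"):
    print(codec, remaining_network_mos(codec, 0.5))
```

A wideband call degraded by 0.5 still scores well above the best possible narrowband call, which is why the same degradation is far more disruptive on a narrowband codec.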

Due to the differences between codecs, it is not possible to do a straight comparison between Network MOS values, but it can be interesting to look at the average degradation of Network MOS during the call. This allows you to compare calls by showing the impact that the network, devices, or both had on the call quality. The average degradation can be broken down into how much degradation is due to network jitter and how much is due to packet loss. For very small degradations, the cause of the degradation might not be available.

Conversational MOS

Conversational MOS is a prediction of the narrowband MOS Conversational Quality (MOS-CQ) of the audio stream that is played to the listener.

Note. Conversational MOS is narrowband. Other MOS values are wideband.

This value takes into consideration the wider properties of a call, not just one aspect. It includes values such as the quality of the audio played and transmitted across the network, the speech and noise levels for both audio streams (caller to callee and callee to caller), and echoes. It predicts how a large group of people would rate the quality of the connection for holding a conversation, in contrast to Sending MOS or Listening MOS values, which are treated on a single-person basis.

Conversational MOS varies depending on the same factors as Listening MOS and on the following:

• Echo

• Network delay

• Delay due to jitter buffering

• Delay due to devices that carry out encoding and decoding (such as handsets)

Due to the multiple factors that influence this value, it is most useful to view Conversational MOS statistically from a sampling of calls rather than for a single call.

Interpreting MOS Values

When investigating individual reports of poor quality, it is important to understand that values reported as percentages vary widely with the length of the call. A call that starts with poor quality is more likely to cause the user to hang up and try again, and because such a call is short, a few seconds of issues account for a large share of it. For example, if a call experiences five seconds of issues in a 40-second call, this equates to 12.5% (5 seconds of poor quality / 40 seconds total * 100), and the user is likely to report this as a poor-quality call. However, a call that lasts 400 seconds and also has five seconds of issues has a 1.25% rate, and the user is less likely to remember (or even notice) the issue.

Note. Neither of these examples can be converted directly into MOS because MOS takes into account many more variables.
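The percentage arithmetic in the preceding example can be reproduced with a short sketch. The function name is a hypothetical helper, not a Monitoring Server API, and the result is a simple ratio, not a MOS value.

```python
def poor_quality_percent(issue_seconds: float, call_seconds: float) -> float:
    """Return the share of a call, as a percentage, that had quality issues."""
    if call_seconds <= 0:
        raise ValueError("call_seconds must be greater than zero")
    # Multiply before dividing so the two worked examples come out exact.
    return issue_seconds * 100 / call_seconds


print(poor_quality_percent(5, 40))   # 12.5
print(poor_quality_percent(5, 400))  # 1.25
```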

For this reason, extremely short calls (calls that are less than one minute) and extremely long calls tend not to be very revealing from a percentage perspective in terms of highlighting issues. Long duration calls can be useful for showing whether the issues occurred at separate times or during one long, problematic time.

For Network MOS degradation to be noticeable to the end user, the value typically has to be more than 0.5 (this is on a scale of 1 to 5, not percentages). For values less than 0.5, it is likely that the issue was a short transitory issue that users will, in most cases, not notice.
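As a rough illustration of applying this rule of thumb to a sample of calls, the following sketch flags calls whose average Network MOS degradation exceeds 0.5. The call identifiers and degradation values are invented for the example, and the threshold is the typical value from the text, not a hard limit.

```python
# Typical threshold from the text, on the 1-to-5 MOS scale (not a percentage).
NOTICEABLE_DEGRADATION = 0.5


def likely_noticeable(avg_degradation: float) -> bool:
    """Return True if the degradation is large enough that users typically notice."""
    return avg_degradation > NOTICEABLE_DEGRADATION


# Hypothetical sample: call ID -> average Network MOS degradation.
calls = {"call-a": 0.12, "call-b": 0.50, "call-c": 0.81}

flagged = [call_id for call_id, deg in calls.items() if likely_noticeable(deg)]
print(flagged)  # ['call-c']
```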

Troubleshooting MOS Issues

To use a report to troubleshoot MOS degradation, do the following:

• For Listening MOS degradation: Look at the values related to the device on the receiving end and its characteristics; for example, the transcoding value, the mixing value, and whether there is echo.

• For Sending MOS degradation: Look for causes such as background noise from the callee.

• For Network MOS degradation: Look at the values that vary based on the network conditions; for example, network issues, packet loss, round-trip delay, and wireless issues.

Note. The Conversational MOS value analyzes end-to-end calls and is typically used for trending rather than troubleshooting.

Monitoring Server Reports

Monitoring Server Reports is a useful measurement tool that provides the deployment team or help desk with current and historical trend data for the various Lync Server workloads. Monitoring Server Reports includes more than 40 reports, including a dashboard view that presents a recent trend report, either by week or by month.

Figure 5 is an example of a weekly report. Weekly reports show six weeks of system usage data, with more detailed diagnostics data from the past six days.

[pic]

Figure 5. Dashboard weekly report

Note. The Total failures value, in the Peer-to-peer group under Call Diagnostics, contains data only if a workload failed on this particular day.

In the dashboard view, poor-quality areas are highlighted in yellow and should be investigated. You can modify the thresholds that trigger this highlighting.

The report in figure 6 displays the total workloads across Lync Server during one week. A report like this can help you gauge the total amount of peer-to-peer traffic in your organization and the total number of audio and IM conversations, file transfers, and program sharing sessions. This is helpful if you want to know how active the users are in your Lync Server deployment.

[pic]

Figure 6. Report showing IM, A/V, program sharing, and file transfer use

The dashboard view includes ready-made reports ideally suited for trending. For reports that require customization (specific dates, sites, and so on), a higher-level view is provided. This view, called the Monitoring Server Reports home page, is shown in figure 7.

[pic]

Figure 7. Monitoring Server Reports home page

Two important reports included on the Monitoring Server Reports home page are the User Registration Report and the Call Reliability Summary Report.

User Registration Report

The User Registration Report is a system usage report that shows, by default, the last eight days and allows you to customize the date range and interval and to choose a specific pool, if required. This report is shown in figure 8.

[pic]

Figure 8. User Registration Report

Call Reliability Summary Report

From the home page, you can access reliability and quality style reports, including the Call Reliability Summary Report. This report provides a breakdown of the failure rates per modality and includes links that allow you to zoom in on individual causes.

Figure 9 is a Call Reliability Summary Report that shows that program (application) sharing has the highest failure rate in conferences but the second lowest failure rate in peer-to-peer conversations (sessions).

[pic]

Figure 9. Call Reliability Summary Report

The detailed report on the call from the preceding figure is shown in the following figure. The data has been broken out into separate sections—Call Information, Media Line, Caller Device and Signal Metrics, Callee Device and Signal Metrics, Caller Event, Callee Event, and Audio Stream—to allow a better breakdown of the important data.

The Call Information section, shown in figure 10, provides a detailed summary of the devices used at each endpoint of a call. One important value is the call duration—if the duration is too short or too long, it can skew the percentages used in the calculations, as explained previously.

[pic]

Figure 10. Call Information section

The Media Line section is shown in figure 11. It provides the location information associated with each endpoint involved, along with the network connection type and link speed. This section includes any policy bandwidth limits applied to the call. Important values include the connection type (wired or wireless), whether the user is inside the network, and whether the user is using a virtual private network (VPN).

[pic]

Figure 11. Media Line section

Figure 12 shows the Device and Signal Metric sections. These sections provide device characteristics for both the caller and callee devices, including detailed information such as the individual driver (or firmware) versions being used. It is useful to compare the send signal levels and the receive signal levels to determine how much noise is lost during the transmission.

[pic]

Figure 12. Device and Signal Metrics sections

The Client Event sections detail any error feedback raised by the device or the client. These events allow you to see the types of alerts that users saw in Lync during the call, information that can be useful for help desk or troubleshooting.

For example, help desk can use these sections to determine if the client computer was low on resources, such as CPU; if there was a device issue that caused an echo; or if there was a high amount of glitching or network issues that impacted call quality. An administrator might use these sections to determine which other QoE metrics might be of interest.

Figure 13 shows the Client Event section for a caller.

[pic]

Figure 13. Caller Client Event section

The Audio Stream sections provide details about the network. One section analyzes the audio stream from the caller, and another details the stream from the callee. Figure 14 shows a caller-to-callee audio stream.

The most important value in the Audio Stream sections is the average Network MOS degradation. This value quantifies the impact of network issues on the quality of the call; quality issues include packet loss or jitter. A value greater than 0 indicates some quality loss and requires further investigation to determine the nature of the issue.

Note. Due to rounding in the calculations, the sum of the values for Min. Network MOS and Max. Network MOS degradation might not equal the expected Network MOS value for the particular codec. (For details about the expected Network MOS values, see Table 2 in the section “Network MOS,” earlier in this chapter.) Their value might be off by 0.01, as shown in the following figure.

[pic]

Figure 14. Audio Stream section

Report Customization

Lync Server supports customized Monitoring Server reports. You can create reports in Microsoft® Excel® 2010 that link directly to the archiving, QoE, and CDR databases. You can also use other tools to generate custom reports, including SQL Server Report Builder, which can create additional monitoring reports by using SQL Server Reporting Services. For details about Report Builder, see “Getting Started with Report Builder 3.0” at .

To generate custom reports by using Excel

1. Open Excel 2010, and click the Data tab.

2. Click From Other Sources, and then click From SQL Server.

[pic]

3. The Data Connection Wizard opens. Click Next.

4. On the Connect to Database Server page, in the Server name box, type the FQDN\database_instance_name, and then provide the appropriate authentication credentials.

[pic]

5. Click Next, and then select the database you want to connect to:

• For CDR reporting, click LcsCDR.

• For QoE reporting, click QoEMetrics.

• For archive reporting, click LCSLog.

Note. There might be additional databases listed, such as master and tempdb, but only the databases listed in Step 5 are specific to Lync Server.

[pic]

6. On the Select Database and Table page, do one of the following:

• To connect to a single table, select the Connect to a specific table check box.

• To connect to multiple tables, clear the Connect to a specific table check box.

[pic]

7. (Optional) On the Save Data Connection File and Finish page, to ensure that future updates can be captured after the spreadsheet report is finished and saved (by clicking Refresh Data in Excel), select the Always attempt to use this file to refresh data check box.

8. Click Finish.

[pic]

9. In Select Table, select a table to retrieve data from.

[pic]

10. In Import Data, specify where to add the data from the table in the spreadsheet.

[pic]

Note. The Properties button provides more options for how often (if at all) data should be refreshed and how the connection data should be used.

11. Click OK to complete the wizard. The data from the database table is inserted into the spreadsheet at the location specified, with filters enabled for all the columns.

[pic]

Using the preceding procedure, you can add additional tables from the Data tab by clicking Existing Connections. Figure 15 shows you the options you will have.

[pic]

Figure 15. Existing Connections dialog box

From the Existing Connections dialog box, do one of the following:

• To reuse a connection that has already been defined (which essentially duplicates the table), click Connections in this Workbook.

• To reuse the connection file, which might point to the database (as indicated by the database icon in the preceding figure) instead of directly to a table, click Connection files on this computer. This lets you choose a new table to import.

By combining multiple tables and potentially linking to data exported from Active Directory, you can produce many complex custom reports.
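If you use a scripting tool instead of Excel, the same server, instance, and database choices apply. The following minimal sketch only assembles an ODBC-style connection string for the three databases named in the preceding procedure. The server name, instance name, and driver name are assumptions for illustration; substitute the values from your own deployment, and note that the sketch does not open a connection.

```python
# Database names from the procedure above, keyed by report type.
LYNC_MONITORING_DATABASES = {
    "cdr": "LcsCDR",
    "qoe": "QoEMetrics",
    "archive": "LCSLog",
}


def build_connection_string(server_fqdn: str, instance_name: str,
                            report_type: str,
                            driver: str = "ODBC Driver 17 for SQL Server") -> str:
    """Build an ODBC connection string that uses Windows authentication.

    The driver name is an assumption; use whichever SQL Server ODBC driver
    is installed on the computer that runs the report.
    """
    database = LYNC_MONITORING_DATABASES[report_type]
    server = server_fqdn + "\\" + instance_name  # FQDN\database_instance_name
    parts = [
        "DRIVER={" + driver + "}",
        "SERVER=" + server,
        "DATABASE=" + database,
        "Trusted_Connection=yes",
    ]
    return ";".join(parts) + ";"


# Hypothetical server and instance names.
print(build_connection_string("sql01.contoso.com", "MONITORING", "qoe"))
```

The resulting string can be passed to any ODBC client library; which library you use, and how you query the tables, is outside the scope of this sketch.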

High-Level Performance Counters

Performance counters are useful for capacity planning and troubleshooting performance issues. When you install Monitoring Server, the following Lync Server performance counters are added to Performance Monitor:

LS:CDR Service - 00 – DATABASECDR: Contains the counters relevant to the CDR database performance, queue depth and latency, and so on.

LS:CDR Service - 01 - Read: Contains the messages received by the CDR service, and provides information about the rate of receipt along with the number of lost messages from the message queue.

LS:CDR Service - 02 - Write: Contains details about the rate of message writing to the CDR database and about any failed writes.

LS:CDR Service - 03 - ReportError: Contains details about the numbers of errors that occur when trying to generate CDR reports.

LS:QMS - 00 - QoEMonitoringServer: Contains a summary of the QoE messages being received—the most important counters here are QMS-004 and QMS–005, which both relate to counts of failed messages.

LS:SipEps - 00 - Sip Dialogs: Contains information about actual SIP dialogs, specifically around broken and healed dialogs.

LS:SipEps - 01 - SipEps Transactions: Contains SIP transaction information, such as incoming, outgoing, active, and so on.

LS:SipEps - 02 - SipEps Connections: Contains generic connection information, such as bytes sent and received, and the number of successful and failed connections.

LS:SipEps - 03 - SipEps Incoming Messages: Contains the breakdown of incoming messages based on SIP type (NOTIFY, SUBSCRIBE, REGISTER, and so on), summaries of response codes, and individual counts of response codes (for example, response code 200).

LS:SipEps - 04 - SipEps Outgoing Messages: Contains the breakdown of outgoing messages based on SIP type, summaries of response codes, and individual counts of response codes.

Monitoring Database Sizing

When you deploy Monitoring Server, you associate it with one or more Front End pools. Monitoring Server then collects data from the pools that you have associated it with.

When you use the recommended hardware configuration and collocate the Monitoring Server and Monitoring database on the same computer, a single Monitoring Server can serve up to 250,000 users. If you have multiple pools that total less than 250,000 users, we recommend that you associate all of these pools with a single Monitoring Server to simplify administration.

We also recommend having an understanding of how large the database can grow. The following example provides a formula that you can use to estimate the database size. By using the counters listed in LS:CDR Service - 01 - Read (specifically, average message size) and LS:CDR Service - 02 - Write (specifically, messages per second), you can confirm whether our estimates are accurate and then make any required adjustments. 

Example: Based on a standard user profile description, the QoE database will grow at approximately 16.8 KB per user per day. To figure out how much storage your QoE database will need at a minimum, use the following formula:

database size = 16.8 KB * (number of users) * (retention period in days)
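The formula can be worked through in a short sketch. The user count and retention period below are hypothetical inputs, and the conversion assumes 1 GB = 1,024 * 1,024 KB. The 16.8 KB figure is the per-user daily growth estimate given above for a standard user profile, so treat the result as a minimum and adjust it based on your own counters.

```python
# Estimated QoE database growth per user per day, from the text.
QOE_KB_PER_USER_PER_DAY = 16.8


def estimate_qoe_db_size_kb(users: int, retention_days: int) -> float:
    """Return the minimum estimated QoE database size in KB."""
    return QOE_KB_PER_USER_PER_DAY * users * retention_days


def kb_to_gb(size_kb: float) -> float:
    """Convert KB to GB, assuming 1 GB = 1,024 * 1,024 KB."""
    return size_kb / (1024 * 1024)


# Hypothetical deployment: 10,000 users with a 60-day retention period.
size_kb = estimate_qoe_db_size_kb(users=10_000, retention_days=60)
print(f"{size_kb:,.0f} KB is about {kb_to_gb(size_kb):.2f} GB")
```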

Summary

The Archiving Server and Monitoring Server server roles support proactive troubleshooting by providing detailed data capturing and reporting capabilities. In this chapter, we described the ways that Lync Server improves these capabilities, providing you with the following information about archiving:

An overview of the archiving components and dependencies related to prerequisites for configuration

Scaling and performance calculations to determine sizing estimates

An explanation of IM and web conferencing integration

Details about archiving cmdlets

And information about the following for monitoring:

Synthetic transactions for testing and validating workloads

A discussion about using Listening MOS, Sending MOS, Network MOS, and Conversational MOS to measure call quality

An overview of Monitoring Server Reports

Additional Resources

For more details, see the following:

Planning for Archiving at

Microsoft Lync Server 2010 Archiving Deployment Guide at

Planning for Monitoring at

Microsoft Lync Server 2010 Monitoring Deployment Guide at 221271
