Troubleshooting with Azure Operations Management Suite

By Vaidehi Bijamwar

September 13, 2017

Operations Management Suite (OMS) is a collection of cloud-based management services hosted entirely in Microsoft Azure.

Log Analytics, part of OMS, monitors both on-premises and cloud environments to help maintain their availability and performance. It can be accessed through the Azure portal as well as the OMS portal, and it provides a centralized dashboard for monitoring application logs, metrics and other infrastructure-related information.

In this blog post, we focus on how OMS collects the data that teams need to investigate and resolve issues.

The diagram below shows a high-level view of the Operations Management Suite (OMS) infrastructure.

[Figure: OMS architecture]

Features of an OMS portal

An OMS portal is equipped with some basic features, including Log Search, Dashboard, Alerts, Export Data and more.

  • Log Search: Once the log data from different systems is in the OMS repository, users can query it with this feature.
  • Dashboard: Query data is presented in a dashboard as tables, columns or charts.
  • Alerts: Triggers can be set up based on query results. These alerts can be presented in the dashboard or configured to be sent as an email notification to administrators.
  • Export Data: Query data can be exported to analytical tools like Power BI or Microsoft Excel.

There are many scenarios where these OMS features can help in troubleshooting, including:

  • Performance monitoring of servers from Azure
  • Forecasting resource utilization trends
  • Security log collection
  • Malware detection and software update status
  • Monitoring of OS Resources (CPU, disk, memory, network)
  • Log search and analytics

Using OMS for real-time monitoring and troubleshooting of an integration project

Testing an integration project is more complex than testing a typical web application. The systems involved in the different layers of the integration project can have different architectures. Some common challenges include:

  • Testing is purely message based
  • Absence of GUI to view message flow
  • Various protocols that require testing
  • Logistical challenges in getting the services up and running

Message troubleshooting, for one, is quite demanding because developers and support engineers must manually review logs to identify the root cause. Repeating this manual process in every testing cycle reduces efficiency.

Operations Management Suite (OMS) can help with these problems because it offers several built-in solutions for collecting real-time logs from the applications in an integration project, whether they are hosted on-premises or in the cloud.

Different ways to sync data to OMS

In an integration project, the applications can be hosted in many different ways, so the main challenge is to connect all systems to OMS and sync real-time logs from them. The OMS portal provides some built-in solutions, such as the Application Insights Connector, Telemetry APIs and Diagnostic Settings, to collect data from different resources. Each application can use whichever of these solutions suits it best.

Application Insights Connector and Telemetry API

OMS provides some built-in connectors to collect diagnostic logs from connected resources. Application Insights is a powerful analytics service in Azure that can help diagnose issues. It can be used by applications on a wide variety of platforms, including .NET and J2EE. The OMS portal can be configured to collect logs from Application Insights. In addition, Application Insights comes with a core telemetry API that can be used to write custom events in the application code and sync them to OMS.

There are a few limitations to using the Application Insights connector. It syncs only some Application Insights event types to OMS through the connected application: Request, Exception, Availability, Page View, and Custom Event. For example, if an ASP.NET application uses System.Diagnostics.Trace for diagnostic tracing, those logs are captured through Event Tracing in the Application Insights instance but are not synced into OMS by this connector. These limitations can be addressed by writing custom events with the core telemetry API.
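
As a rough illustration of that last point, the sketch below forwards ordinary trace messages as custom events, which is one of the event types the connector does sync. It is written in Python rather than .NET to keep it short. `RecordingTelemetryClient` and its `track_event` method are hypothetical stand-ins for a real Application Insights telemetry client, and the event name `TraceMessage` is our own choice for this example.

```python
import logging
from typing import Dict, List, Tuple


class RecordingTelemetryClient:
    """Hypothetical stand-in for an Application Insights telemetry client."""

    def __init__(self) -> None:
        self.events: List[Tuple[str, Dict[str, str]]] = []

    def track_event(self, name: str, properties: Dict[str, str]) -> None:
        # A real client would queue the event for upload; this one only records it.
        self.events.append((name, dict(properties)))


class CustomEventHandler(logging.Handler):
    """Forwards each log record to the telemetry client as a custom event."""

    def __init__(self, client: RecordingTelemetryClient,
                 event_name: str = "TraceMessage") -> None:
        super().__init__()
        self.client = client
        self.event_name = event_name

    def emit(self, record: logging.LogRecord) -> None:
        self.client.track_event(
            self.event_name,
            {
                "message": record.getMessage(),
                "severity": record.levelname,
                "logger": record.name,
            },
        )


client = RecordingTelemetryClient()
logger = logging.getLogger("orders.integration")
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the example output quiet
logger.addHandler(CustomEventHandler(client))

# Existing trace-style calls stay unchanged; the handler turns them into events.
logger.warning("Order %s rejected by downstream system", "A-1001")
```

In a real application the recording client would be replaced by the SDK's telemetry client, so the same trace messages would reach Application Insights as custom events and become eligible for syncing.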

Diagnostic Settings for Logic Apps

Many organizations are adopting Azure Logic Apps as their integration solution. Logic Apps provides a way to implement scalable integrations and workflows in the cloud. Real-time monitoring of a Logic App is normally done through the Azure portal. Although the execution history is available there, it is tedious to find the details of one specific execution among many run instances without knowing the exact WorkflowRunID (the internal unique ID that identifies each execution instance of a Logic App). Logic Apps provides a diagnostic logging feature that can be enabled to send event logs and execution metrics to the Operations Management Suite repository.
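
To show why centralized diagnostic data helps here, the following minimal sketch looks up run IDs by a business identifier instead of by WorkflowRunID. The record shape and the field names (`workflow`, `run_id`, `status`, `order_id`) are assumptions made for illustration; they are not the actual column names that Logic Apps diagnostics writes to OMS.

```python
from typing import Dict, Iterable, List

# Hypothetical, simplified shape of diagnostic records exported from the log repository.
SAMPLE_RECORDS = [
    {"workflow": "order-sync", "run_id": "run-001", "status": "Succeeded", "order_id": "A-1000"},
    {"workflow": "order-sync", "run_id": "run-002", "status": "Failed", "order_id": "A-1001"},
    {"workflow": "invoice-sync", "run_id": "run-003", "status": "Failed", "order_id": "A-1001"},
]


def find_run_ids(records: Iterable[Dict[str, str]], workflow: str, order_id: str) -> List[str]:
    """Return the run IDs of the given workflow that processed the given order."""
    return [
        record["run_id"]
        for record in records
        if record.get("workflow") == workflow and record.get("order_id") == order_id
    ]


matching_runs = find_run_ids(SAMPLE_RECORDS, "order-sync", "A-1001")
print(matching_runs)
```

In OMS itself the same lookup would be expressed as a Log Search query; the point is only that a known business value is enough to locate the run.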

Using Operations Management Suite in Azure offers several benefits, including immediate access to log data, a centralized solution that can be connected to on-premises as well as cloud applications, and portal management through a mobile app.

What’s next?

For many integration projects, a single common solution is not feasible because the systems involved have varied architectures. Some systems may not be able to push data to OMS because of technical constraints or non-functional requirements.

In future blog posts, we will discuss a custom solution that we designed with Azure Logic Apps to pull log data from different systems. It uses a Logic App custom connector and the HTTP Data Collector API to push the log data directly to the OMS repository. We will also explain the polling consumer pattern behind this solution.
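
As a preview of the push side, here is a minimal Python sketch of how a request to the HTTP Data Collector API is typically signed and assembled, based on our reading of the public API documentation. It is not the Logic Apps solution itself. The workspace ID, shared key, log type `IntegrationLog` and record fields are placeholders, and the request is only built, not sent.

```python
import base64
import hashlib
import hmac
import json
import urllib.request
from email.utils import formatdate
from typing import Dict, List, Optional

API_VERSION = "2016-04-01"
RESOURCE = "/api/logs"
CONTENT_TYPE = "application/json"


def build_signature(workspace_id: str, shared_key: str,
                    rfc1123_date: str, content_length: int) -> str:
    """Build the SharedKey authorization header value for one request."""
    string_to_sign = (
        f"POST\n{content_length}\n{CONTENT_TYPE}\n"
        f"x-ms-date:{rfc1123_date}\n{RESOURCE}"
    )
    # The workspace key is base64 text; the HMAC uses its decoded bytes.
    digest = hmac.new(
        base64.b64decode(shared_key),
        string_to_sign.encode("utf-8"),
        hashlib.sha256,
    ).digest()
    return f"SharedKey {workspace_id}:{base64.b64encode(digest).decode('ascii')}"


def build_request(workspace_id: str, shared_key: str, log_type: str,
                  records: List[Dict[str, str]],
                  rfc1123_date: Optional[str] = None) -> urllib.request.Request:
    """Assemble (but do not send) a POST request carrying the records as JSON."""
    body = json.dumps(records).encode("utf-8")
    date = rfc1123_date or formatdate(usegmt=True)
    headers = {
        "Content-Type": CONTENT_TYPE,
        "Authorization": build_signature(workspace_id, shared_key, date, len(body)),
        "Log-Type": log_type,  # name of the custom log type in the workspace
        "x-ms-date": date,
    }
    url = f"https://{workspace_id}.ods.opinsights.azure.com{RESOURCE}?api-version={API_VERSION}"
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Placeholder values only; a real workspace ID and key come from the OMS workspace settings.
request = build_request(
    workspace_id="00000000-0000-0000-0000-000000000000",
    shared_key=base64.b64encode(b"not-a-real-key").decode("ascii"),
    log_type="IntegrationLog",
    records=[{"system": "erp", "message": "Order A-1001 rejected"}],
    rfc1123_date="Wed, 13 Sep 2017 10:00:00 GMT",
)
# urllib.request.urlopen(request) would send it; that call is left out of this sketch.
```

The signature covers the body length and the date header, so both must match what is actually sent.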
