Monitoring in an Era of High SLA

As more enterprises embrace mobility on the path to higher productivity, business agility and bottom-line impact, there is a lot to consider when selecting a mobility solution provider. While the Capriza platform makes it easy to create micro apps – one-minute workflows we call Zapps – there is much more to our service than the product itself. Given the mission-critical nature of many enterprise mobile applications, the SLA we provide is key.

For this reason, Capriza is committed to delivering the highest level of SLA. To do this, we have built a comprehensive approach to monitoring that addresses the following areas:

 

  • The Capriza Cloud – Our cloud comprises several globally distributed backend components, including application servers, databases, cache services and more. All components are monitored by our Network Operations Center, which is staffed 24/7.
  • External Services – All third-party services are included in our monitoring procedures so that we can remediate quickly if they experience an incident.
  • Customer Backend Application – Since Capriza's micro apps connect to our customers' backend systems, we need to monitor their applications as well. This is not a trivial undertaking. Every micro app is unique and the sheer volume of transactions is high. As a result, we need a monitoring solution that is familiar with every Zapp, the different APIs of the backend systems, and the response codes and errors those systems return.
  • Mobile User Experience – The success of any mobile app lies in delivering an excellent user experience. Performance is one aspect of the mobile experience. Rather than waiting for a customer to tell us that their users in China are seeing slow response times, we set out to be notified before anyone notices an issue.

  • Spot Problems Fast – When targeting problems, we don't assume they will happen during regular working hours. All monitoring is implemented as a 24/7 service with layers of redundancy in the response team.
  • Easy On-Boarding – On-boarding a new customer involves some manual setup steps, so we made sure to have the easiest possible on-boarding process in place while still ensuring the required SLA.

A comprehensive monitoring system to ensure the highest level SLA

To provide the necessary SLA levels, we have addressed each of the aspects mentioned above and created a comprehensive three-layer monitoring solution.

Layer 1 – External View: The first layer monitors external API response codes on traffic to and from the CDN. It lets us get alerts on changes in the systems. We have created our own tool and also rely on some third-party tools. Together they gather the information into central logs, which gives us the ability to get an alert on every new error.

Example of one of our system backend components (errors over time)
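The "alert on every new error" idea in Layer 1 can be illustrated with a minimal sketch. This is not Capriza's tool. The log line format, the `error_signature` helper and the `NewErrorDetector` class are all assumptions made for illustration. The sketch reduces each log line to a (component, status code) pair and reports a pair only the first time it appears, while still counting repeats for an errors-over-time view.

```python
from collections import Counter

# Assumed (hypothetical) log format: "<timestamp> <component> <status> <message>"
def error_signature(line):
    """Reduce a log line to (component, status), or None if it is not an error."""
    parts = line.split(maxsplit=3)
    if len(parts) < 3 or not parts[2].isdigit():
        return None  # malformed line, ignore
    status = int(parts[2])
    if status < 400:
        return None  # treat only 4xx/5xx response codes as errors
    return (parts[1], status)

class NewErrorDetector:
    """Counts error signatures and flags each one the first time it is seen."""

    def __init__(self):
        self.counts = Counter()

    def feed(self, line):
        """Return the signature if it is new, otherwise None."""
        sig = error_signature(line)
        if sig is None:
            return None
        self.counts[sig] += 1
        return sig if self.counts[sig] == 1 else None
```

In a real deployment the returned signature would be handed to whatever alerting channel the operations team uses. The sketch deliberately leaves that part out.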

Layer 2 – Low Level System Monitoring: We use Zabbix, an open-source tool, for basic server monitoring. Zabbix is designed for real-time monitoring of tens of thousands of servers, virtual machines, and network devices.

Example of a memory graph of one of our components

Layer 3 – The Heartbeat System: This is the heart of our service-level monitoring. We built a system, Heartbeat, that monitors our customers' Zapps. It combines detailed log monitoring with screenshots, so that any Network Operations Center (NOC) engineer can see whether and where there is a problem and what needs to be done to solve it.

We run this monitoring on every customer's Zapps from several different locations every 15 minutes and, for customers with the highest level of service agreement, every 5 minutes. So at least every 15 minutes we get a view of what our customers experience when they use a Capriza Zapp. This allows us to be proactive on behalf of our customers.
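The schedule described above can be sketched in a few lines. Only the 15-minute and 5-minute intervals come from the article. The `ZappCheck` class, the `due_checks` function, the Zapp identifiers and the location names are hypothetical, and the actual Heartbeat system may be organized quite differently.

```python
from dataclasses import dataclass

STANDARD_INTERVAL_S = 15 * 60  # from the article: every 15 minutes
PREMIUM_INTERVAL_S = 5 * 60    # from the article: every 5 minutes for the highest tier

@dataclass
class ZappCheck:
    zapp_id: str                        # hypothetical identifier
    premium: bool                       # True for the highest service agreement
    last_run_s: float = float("-inf")   # never run yet

    @property
    def interval_s(self):
        return PREMIUM_INTERVAL_S if self.premium else STANDARD_INTERVAL_S

def due_checks(checks, now_s, locations):
    """Return (zapp_id, location) pairs that are due at now_s and mark them as run."""
    due = []
    for check in checks:
        if now_s - check.last_run_s >= check.interval_s:
            # Each due Zapp is probed once from every monitoring location.
            due.extend((check.zapp_id, loc) for loc in locations)
            check.last_run_s = now_s
    return due
```

Called once a minute with the current time in seconds, `due_checks` would return each premium Zapp three times as often as a standard one, once per location on each run.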

The results are fantastic. In most cases (99% of the time) we notify the customer about a problem before the customer is even aware of it, and we immediately start to investigate and resolve the issue. In an era where business is moving faster than ever before and companies are relying on technology innovation for their competitive advantage, this is how we believe we need to deliver a product.

Shay Peretz is head of operations at Capriza. He is responsible for the ongoing security, performance, and stability of Capriza's SaaS platform.
