Monitor hundreds of services with just a single pair of eyes

In my previous post on monitoring I wrote about setting up customer activity monitoring supported by anomaly detection. Now I will show how to monitor hundreds of services without drowning in data. The solution I present processes raw monitoring metrics and turns that data into actionable alerts and useful insights.

Continue reading


Introduction to monitoring with anomaly detection

In this article I’ll describe how I implemented customer activity monitoring and anomaly detection. If you are a service provider that provides services to a group of large accounts, it’s vital to know that your customers can do their business.

Monitoring customer behavior is not only required for managing IT operations; it is also vital from a business point of view. Customers are getting smarter too. High-tech customers such as Netflix and Airbnb use data analytics to monitor the results they get from their payment providers. Based on this data they make real-time decisions to switch to another supplier when that better suits their needs.

Continue reading

The smart monitoring vision

ING Business continuity conference 2017

Last week I attended the ING Business Continuity conference in Amsterdam as a speaker. For two days, more than 140 participants from all over the world discussed how to improve the reliability of IT services, handle major incidents and recover from disasters. My presentation on Smart Monitoring offers a practical approach to improving the reliability of IT services.

Continue reading

End to end transaction analysis with Neo4j


End to end transaction analysis

Successful enterprises constantly seek new ways to improve the availability of their services and to avoid compliance breaches. This can be achieved by managing services on an end-to-end basis. By analyzing the topology of end-to-end business value chains you gain insight into the behavior of your systems in a way you have never seen before.

Continue reading

Avoiding time confusion


Changing the clock confuses humans and computers as well

This weekend, all over the world, hundreds of servers went down, millions of transactions failed and data was lost due to daylight saving time confusion. This happens every time we change the clock between winter time and summer time. In this article I explain the cause of the problem. I also provide some hints on how to prevent these kinds of problems.

Continue reading
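
One common way to avoid this kind of confusion is to store and compare timestamps in UTC and convert to local time only for display. The sketch below is an illustration under stated assumptions, not the approach taken in the full article: the fixed offsets, the example date and the helper name `to_utc` are all chosen here for the example.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets for Central European time, hard-coded for illustration only.
CEST = timezone(timedelta(hours=2), "CEST")  # summer time
CET = timezone(timedelta(hours=1), "CET")    # winter time


def to_utc(local: datetime) -> datetime:
    """Convert an offset-aware timestamp to UTC for storage and comparison."""
    if local.tzinfo is None:
        raise ValueError("naive timestamp: the UTC offset is unknown")
    return local.astimezone(timezone.utc)


# Two events one hour apart on a night the clock goes back (example date).
# Both read 02:30 on the wall clock.
first = datetime(2017, 10, 29, 2, 30, tzinfo=CEST)
second = datetime(2017, 10, 29, 2, 30, tzinfo=CET)

# Stripped of their offsets, the two timestamps are indistinguishable.
naive_gap = second.replace(tzinfo=None) - first.replace(tzinfo=None)

# In UTC the order and the one-hour gap are preserved.
utc_gap = to_utc(second) - to_utc(first)

print(naive_gap)  # 0:00:00
print(utc_gap)    # 1:00:00
```

The helper rejects naive timestamps on purpose: once the offset is lost, the repeated hour cannot be told apart afterwards.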

The basics of analyzing business transactions


Analyzing business transactions

While most companies manage their services at the physical component level, some are beginning to manage them at the transaction and business process level. They do this because they want to understand their customer journey or need a complete audit trail to meet compliance requirements. In this post I will describe the basics of analyzing business transactions.

Continue reading

Managing microservices beyond the hype curve

Microservices promise to deliver new business services faster and at lower cost, but they come at a price: increased operational complexity. I share my vision on managing microservices at the enterprise level:

  • Why you should manage your services
  • What challenges you will face
  • What benefits you can get
  • What you can do to get in control

This article is part of a series on service monitoring.

Continue reading

Getting started with Smart monitoring

If you want to offer your customers IT services and promise them 99.9% availability, then service monitoring is essential. Smart Monitoring is a philosophy and way of working to achieve that goal. It enables you to understand your complex IT and be in control. In this article I will give you some guidelines and a basic plan for getting started.
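
To make an availability promise concrete, it helps to translate the percentage into a downtime budget. The following is a minimal sketch; the function name `downtime_budget_minutes` and the 30-day and 365-day periods are assumptions made for this illustration.

```python
def downtime_budget_minutes(availability_pct: float, period_days: float) -> float:
    """Minutes of downtime allowed in the period at the given availability."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability must be a percentage between 0 and 100")
    period_minutes = period_days * 24 * 60
    return (100 - availability_pct) / 100 * period_minutes


# Compare a few availability targets side by side.
for target in (99.0, 99.9, 99.99):
    per_month = downtime_budget_minutes(target, 30)
    per_year = downtime_budget_minutes(target, 365)
    print(f"{target}%: {per_month:.1f} min per 30 days, {per_year / 60:.1f} h per year")
```

At 99.9% this works out to roughly 43 minutes per 30 days, or just under nine hours per year.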

This article is part of a series on monitoring.

Let’s aim for this:

The dashboard shows information on several services and how they conform to SLA-defined KPIs such as a response time target. You can see how these services perform and what results they return (success, functional and technical errors), and drill down by team.

Continue reading

How graph databases can help us to understand our microservice topology

This article is part of a series on monitoring.

In the old days we just had to manage a single mainframe and a private network with terminals and printers. Now technology has evolved: a service is service-oriented, REST API based and lightweight, runs in the cloud and communicates with tens of other services. Developing such a service takes a week and deployment is fully automated. So the Agile DevOps factory is spitting out new services every minute.

This results in a highly dynamic environment, and it becomes a real challenge to manage thousands of microservices. How do you get an overview of the current state of your system? That is what the next video explains.

Why do we need Smart Monitoring

Nowadays large companies and institutions have complex IT services to support business value chains that execute millions of business transactions around the clock. All this business activity needs to be monitored for problems and technical failures.

Question: how can we assure 99.7% availability and reduce cost at the same time?

In this article I will describe how a typical company monitors its IT services and how it can achieve situational awareness.

Continue reading

ING Xperience PowerIT 2016

The xPerience is an annual technology market at ING. This was an excellent opportunity for me to present Smart Monitoring, an innovative solution for end-to-end monitoring.

More than 500 engineers, product owners, chapter leads and managers visited the event, which was all about experiencing the capabilities and approaches that help us build the next-generation digital bank.

Continue reading

Smart Monitoring


Dealing consciously with IT services

Where do all those internet banking outages come from, and what can you do about them?

Many companies focus their attention on quickly developing and launching new services. This drive for rapid development is supported by the Agile way of working, in which scrum development teams continuously deliver new components in short sprints of a few weeks. For various reasons, the management of existing services slips into the background and the quality of the services delivered declines. In this article I want to introduce “Smart Monitoring”, a way of working for efficiently and effectively monitoring and improving the stability and availability of IT services.

Continue reading

Coaching a TIBCO ESB team for ProRail InfoPlus

The ProRail InfoPlus system provides round-the-clock information to travellers. The system processes thousands of messages per hour and delivers them over a TIBCO Enterprise Service Bus to external parties such as OV9292. Mr Ritter was asked to assist subcontractor Conclusion Future Infrastructure Technologies in delivering 24/7 Gold support. Conclusion will provide remote support on the TIBCO Enterprise Service Bus. Mr Ritter is responsible for preparing the remote support team for this important task.


Continue reading

How to Monitor TIBCO ESB


Monitoring and Reporting

TIBCO Enterprise Service Bus provides a complete set of ESB products, but these products need to be combined into a solution architecture. For my current customer I designed and implemented the following monitoring and reporting solution, which resulted in a significant reduction of business process exceptions.

The solution includes the following products:
  • OpsView (Enterprise IT Monitoring)
  • TIBCO Hawk (monitor infrastructure behavior, metrics and failures)
  • TIBCO Clever (monitor functional and technical errors)
  • TIBCO Spotfire (reporting)
  • Pentaho Data Integration (ETL)
  • Esper (Complex Event Processing)
  • Confluence (wiki-based knowledge base)

Continue reading