In my previous post on monitoring I wrote about setting up customer activity monitoring supported by anomaly detection. Now I will show how to monitor hundreds of services without drowning in data. The solution I present processes raw monitoring metrics and turns that data into actionable alerts and useful insights.
In this article I’ll describe how I implemented customer activity monitoring and anomaly detection. If you are a service provider that provides services to a group of large accounts, it's vital to know that your customers can do their business.
Monitoring customer behavior is not only required for managing IT operations, it's also vital from a business point of view. Nowadays customers get smart too. High-tech customers such as Netflix and Airbnb use data analytics to monitor the results they get from their payment providers. Based on this data they make real-time decisions to switch to another supplier when that better suits their needs.
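The excerpt does not say which anomaly detection method is used. As a purely illustrative sketch, a simple z-score check on a customer's transaction count against its recent history could look like this. The function name, the threshold of 3 and the sample numbers are all assumptions, not the author's implementation.

```python
from statistics import mean, pstdev


def is_anomalous(history, current, threshold=3.0):
    """Flag `current` when it deviates more than `threshold` standard
    deviations from the mean of `history` (hypothetical helper)."""
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        # A perfectly flat history: any change at all is a deviation.
        return current != mu
    return abs(current - mu) / sigma > threshold


# Made-up transaction counts per 5-minute interval for one customer.
recent_counts = [100, 98, 103, 101, 99, 102]

normal_interval = is_anomalous(recent_counts, 101)
quiet_interval = is_anomalous(recent_counts, 40)
```

A sudden drop to 40 transactions is flagged while 101 is not. A real setup would also have to deal with daily and weekly patterns in customer activity, which this sketch ignores.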
Last week I attended the ING Business Continuity conference in Amsterdam as a speaker. More than 140 participants from all over the world spent two days discussing how to improve the reliability of IT services, handle major incidents and recover from disasters. My presentation on Smart Monitoring offers a practical approach to improving the reliability of IT services.
Successful enterprises constantly seek new ways to improve availability of their services, and try to avoid compliance breaches. This can be achieved by managing services on an end-to-end basis. By analyzing the topology of end-to-end business value chains you gain insight into the behavior of your systems in a way you have never seen before.
This weekend, hundreds of servers all over the world went down, millions of transactions failed and data was lost due to daylight saving time confusion. This problem recurs every time we change the clock between winter and summer time. In this article I explain the cause of the problem. I also provide some hints on how to prevent these kinds of problems.
While most companies manage their services at the physical component level, some are beginning to manage them at the transaction and business process level. They do this because they want to understand their customer journey or need a complete audit trail to meet compliance requirements. In this post I will describe the basics of analyzing business transactions.
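The excerpt does not spell out the cause, so the following is only a sketch of one well-known failure mode: when the clock goes back, the same local wall-clock time occurs twice. The example assumes the Amsterdam change from CEST (UTC+2) to CET (UTC+1) on 31 October 2021 and uses fixed offsets so that it runs without a time zone database.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets for illustration; a real system would use a tz database.
CEST = timezone(timedelta(hours=2), "CEST")  # summer time
CET = timezone(timedelta(hours=1), "CET")    # winter time

# 02:30 local time happens twice in the night the clock goes back.
first_pass = datetime(2021, 10, 31, 2, 30, tzinfo=CEST)
second_pass = datetime(2021, 10, 31, 2, 30, tzinfo=CET)

# Stored without an offset, the two events are indistinguishable.
same_wall_clock = (
    first_pass.replace(tzinfo=None) == second_pass.replace(tzinfo=None)
)

# Converted to UTC, they are clearly one hour apart and correctly ordered.
first_utc = first_pass.astimezone(timezone.utc)
second_utc = second_pass.astimezone(timezone.utc)
gap = second_utc - first_utc

print(first_utc.isoformat(), second_utc.isoformat(), gap)
```

A common precaution is to store and compare timestamps in UTC and to convert to local time only for display.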
Microservices promise to deliver new business services faster and at lower cost, but this comes at a price: increased operational complexity. I share my vision on managing microservices at the enterprise level:
- Why you should manage your services
- What challenges you will face
- What benefits you can get
- What you can do to get in control
This article is part of a series on service monitoring.
If you want to offer your customers IT services and promise them 99.9% availability, then service monitoring is essential. Smart Monitoring is a philosophy and way of working to achieve that goal. It enables you to understand your complex IT and be in control. In this article I will give you some guidelines and a basic plan for getting started.
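To make an availability promise concrete, it helps to translate the percentage into the amount of downtime it allows. The sketch below does that arithmetic. The function name and the 30-day reporting period are assumptions chosen for illustration.

```python
def downtime_budget_minutes(availability_pct, period_days=30):
    """Minutes of downtime allowed in the period at the given availability
    percentage (hypothetical helper, simple proportional arithmetic)."""
    total_minutes = period_days * 24 * 60
    return (1 - availability_pct / 100) * total_minutes


for target in (99.0, 99.7, 99.9):
    budget = downtime_budget_minutes(target)
    print(f"{target}% over 30 days allows about {budget:.1f} minutes of downtime")
```

At 99.9% the budget over 30 days is roughly 43 minutes, so a single slow incident response can use up the whole month's allowance.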
This article is part of a series on monitoring.
The dashboard shows information on several services and how they conform to SLA-defined KPIs such as the response time target. You can see how these services perform, what results they return (success, functional and technical errors) and drill down by team.
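The kind of aggregation behind such a dashboard can be sketched as follows. This is not the dashboard's actual implementation: the `Call` record, the field names, the result labels and the 500 ms target are all hypothetical.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass
class Call:
    team: str
    service: str
    response_ms: float
    result: str  # "success", "functional_error" or "technical_error"


def summarize_by_team(calls, target_ms):
    """Per team: share of calls within the response time target and a
    count of each result type."""
    totals = Counter()
    within_target = Counter()
    results = defaultdict(Counter)
    for call in calls:
        totals[call.team] += 1
        if call.response_ms <= target_ms:
            within_target[call.team] += 1
        results[call.team][call.result] += 1
    return {
        team: {
            "within_target_pct": 100 * within_target[team] / totals[team],
            "results": dict(results[team]),
        }
        for team in totals
    }


# Made-up sample data for two teams.
calls = [
    Call("payments", "transfer", 120, "success"),
    Call("payments", "transfer", 900, "technical_error"),
    Call("payments", "balance", 80, "success"),
    Call("payments", "balance", 95, "functional_error"),
    Call("cards", "block-card", 200, "success"),
]

summary = summarize_by_team(calls, target_ms=500)
```

Grouping by team first keeps the drill-down cheap: the same records can be grouped by service instead by swapping the key.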
This article is part of a series on monitoring.
In the old days we just had to manage a single mainframe and a private network with terminals and printers. Now technology has evolved into lightweight, service-oriented, REST API based services that run in the cloud and communicate with tens of other services. Developing these services takes a week and deployment is fully automated. So the Agile DevOps factory is spitting out new services every minute.
This results in a highly dynamic environment, and it becomes a real challenge to manage thousands of microservices. How do you get an overview of the current state of your system? That is what the next video explains.
Nowadays large companies and institutions have complex IT services to support business value chains that execute millions of business transactions around the clock. All this business activity needs to be monitored for problems and technical failures.
Question: How can we assure 99.7% availability and reduce cost at the same time?
In this article I will describe how a typical company monitors their IT services and how it can achieve situational awareness.
The xPerience is an annual technology market at ING. This was an excellent opportunity for me to present Smart Monitoring, an innovative solution for end-to-end monitoring.
More than 500 engineers, product owners, chapter leads and managers visited the event, which was all about experiencing the capabilities and approaches that help us build the next generation digital bank.
Where do all those internet banking outages come from, and what can you do about them?
Many companies focus on rapidly developing and launching new services. This drive for fast development is supported by the Agile way of working, in which scrum development teams continuously deliver new components in short sprints of a few weeks. For various reasons, the operation of existing services fades into the background and the quality of the services delivered declines. In this article I want to introduce “Smart Monitoring”, a way of working for efficiently and effectively monitoring and improving the stability and availability of IT services.
- Having everything under control
- Handling incidents effectively
- Being able to sleep soundly
- Accountability to the business (SLA)
- Accountability to the customer (quality)
- Quality improvement (service delivery, resource usage)
- Feedback to the supplier (projects)
The ProRail InfoPlus system provides round-the-clock information to travellers. The system processes thousands of messages per hour and delivers them over a TIBCO Enterprise Service Bus to external parties such as OV9292. Mr Ritter has been asked to assist sub-contractor Conclusion Future Infrastructure Technologies in delivering 7*24 Gold support. Conclusion will provide remote support on the TIBCO Enterprise Service Bus. Mr Ritter is responsible for preparing the remote support team for this important task.
TIBCO Enterprise Service Bus provides a complete set of ESB products, which need to be combined into a solution architecture. For my current customer I designed and implemented the following monitoring and reporting solution, which resulted in a significant reduction of business process exceptions.
- OpsView (Enterprise IT Monitoring)
- TIBCO Hawk (monitor infrastructure behavior, metrics and failures)
- TIBCO Clever (monitor functional and technical errors)
- TIBCO Spotfire (reporting)
- Pentaho Data Integration (ETL)
- Esper (Complex Event Processing)
- Confluence (Wiki-based knowledge base)
Why do we need it?
Monitoring and reporting is not just an add-on that can be bolted on long after the design of the business functionality has finished. It is an essential element of the architecture and design. It provides you with the tools and information required to run business processes as expected.