The Importance of Distributed Transaction Monitoring

Eric J. Bruno
Monitoring in general may not always be viewed as a differentiator, but transaction monitoring and tracing can become a critical requirement. The world isn’t perfect, and even the best development practices may not be able to prevent all data integrity issues. Being able to reconcile with good monitoring and reporting can be the difference between staying in business and going out of business, or even between life and death.

In terms of software, transaction processing is the act of taking individual operations and executing them as part of a single unit. All of the composite operations must either succeed or fail as a whole, meaning if one component fails, the others must be rolled back. Partial completion is never acceptable.

Take, as an example, a purchase made on an e-commerce site on the web. In simplistic terms, the two important components of the transaction are the removal of the item from inventory, and the charging of your credit card for payment. If either one of those operations fails, then the other must not complete. For example, if the item is not in inventory, then your credit card should not be charged. Conversely, if the credit card payment fails, then inventory must not be decremented. The transaction only succeeds when all of the individual components succeed; otherwise, actions are rolled back to maintain data integrity.
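The e-commerce purchase above can be sketched in a few lines. This is a minimal illustration using Python's built-in sqlite3 module, with a hypothetical inventory table and payments log; the `with conn:` block commits only if every statement succeeds, and rolls everything back on any exception.

```python
import sqlite3

# In-memory database with a hypothetical inventory table and payments log.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inventory (item TEXT PRIMARY KEY, qty INTEGER)")
conn.execute("CREATE TABLE payments (item TEXT, amount REAL)")
conn.execute("INSERT INTO inventory VALUES ('widget', 1)")

def purchase(item, amount):
    """Decrement inventory and record payment as one atomic unit."""
    try:
        with conn:  # sqlite3 commits on success, rolls back on exception
            cur = conn.execute(
                "UPDATE inventory SET qty = qty - 1 "
                "WHERE item = ? AND qty > 0", (item,))
            if cur.rowcount == 0:
                raise RuntimeError("item out of stock")
            conn.execute("INSERT INTO payments VALUES (?, ?)", (item, amount))
        return True
    except RuntimeError:
        return False

ok1 = purchase("widget", 9.99)  # True: both steps commit together
ok2 = purchase("widget", 9.99)  # False: out of stock, card isn't charged
```

Either both rows change or neither does; the failed second purchase leaves the payments table untouched.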

Transaction processing gets more complicated in scenarios where multiple, distributed systems are involved, a scenario described as distributed transaction processing. For example, renting a movie on iTunes (or another movie rental service) requires that the movie be available, that proper payment credentials are received, and that enough space exists on the viewer’s device for download. If any one of those steps fails across the distributed components, then the entire transaction fails (meaning you don’t get charged, and the rental service doesn’t report a rental to the owner of the video content).
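One classic way to coordinate the outcome across independent systems is the two-phase commit protocol: a coordinator first asks every participant to prepare (vote), and only commits if the vote is unanimous. The sketch below is illustrative, with hypothetical participants standing in for the movie-rental components; real implementations must also handle coordinator failure and timeouts.

```python
class Participant:
    """A hypothetical resource: inventory, billing, or device storage."""
    def __init__(self, name, can_commit):
        self.name, self.can_commit, self.state = name, can_commit, "idle"

    def prepare(self):            # phase 1: vote yes or no
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):             # phase 2a: make tentative work durable
        self.state = "committed"

    def rollback(self):           # phase 2b: undo any tentative work
        self.state = "aborted"

def two_phase_commit(participants):
    # Phase 1: every participant must vote yes.
    if all(p.prepare() for p in participants):
        for p in participants:    # Phase 2a: unanimous yes -> commit all
            p.commit()
        return True
    for p in participants:        # Phase 2b: any no -> roll back all
        p.rollback()
    return False

# All participants agree: the rental goes through.
happy = two_phase_commit([Participant("catalog", True),
                          Participant("billing", True)])

# The device has no space: nobody commits, nobody gets charged.
sad_parts = [Participant("catalog", True), Participant("device", False)]
sad = two_phase_commit(sad_parts)
```

A single "no" vote anywhere aborts every participant, which is exactly the all-or-nothing behavior the article describes.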

Unfortunately, the complexities of distributed transaction processing extend to other systems as well, such as your monitoring and reporting implementations. Fortunately, there are strategies and tools to help.

Why Transaction Monitoring is Critical

As noted, monitoring in general may not always be viewed as a differentiator, but for transaction systems it is a critical requirement. Beyond the need to debug and test transaction software, proper monitoring is required to verify the following important qualities of production transactional systems:

  • Transaction success or failure
  • Proof of system integrity
  • Proper reporting of inventory usage to other systems (e.g., content owners)
  • Adherence to service-level agreements (SLAs), such as individual system performance
  • Uncompromised security

Transaction monitoring and reporting become critical when system integrity is compromised by improper transaction processing, whether due to implementation error, total sub-system failure, or human error. Being able to reconcile with good monitoring and reporting can be the difference between staying in business and going out of business, or even between life and death. Let’s look at the key features of a monitoring and tracing implementation.

Database Transaction Monitoring

Implementing useful distributed transaction tracing for web-based applications can be complex. First, since most transactions center on a database, your distributed monitoring solution needs to work with the database transaction and monitoring engine regardless of the database(s) used. This includes multi-vendor support (e.g., Oracle, DB2, SQL Server), different types of databases (e.g., relational and NoSQL), as well as the underlying storage systems.

Making matters more complicated, modern systems may also contain a mix of cloud-based and mainframe-based database processing, all of which needs to be monitored and traced.
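One vendor-neutral way to get this visibility is to instrument at the application's database access layer rather than inside any one engine. The sketch below, an assumption rather than a specific product's API, wraps any Python DB-API cursor to time and record every query; it is demonstrated with sqlite3 but the same wrapper works with any compliant driver.

```python
import sqlite3
import time

class MonitoredCursor:
    """Wrap any DB-API cursor to time and record queries, vendor-agnostic."""
    metrics = []  # (sql, elapsed_seconds) samples, shared across cursors

    def __init__(self, cursor):
        self._cursor = cursor

    def execute(self, sql, params=()):
        start = time.perf_counter()
        try:
            return self._cursor.execute(sql, params)
        finally:
            MonitoredCursor.metrics.append((sql, time.perf_counter() - start))

    def __getattr__(self, name):
        # Delegate fetchone, fetchall, close, etc. to the real cursor.
        return getattr(self._cursor, name)

# Demonstrated with sqlite3, but any DB-API 2.0 driver plugs in the same way.
cur = MonitoredCursor(sqlite3.connect(":memory:").cursor())
cur.execute("SELECT 1")
```

Because the instrumentation lives above the driver, the same metrics pipeline covers relational, NoSQL-with-a-DB-API-shim, cloud, or mainframe-backed databases.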

End-to-End Transaction Tracing

To ensure integrity, your monitoring solution must follow the performance of critical transactions across your entire application environment, whether it’s client-server based, service-oriented, in the public cloud, or all of the above. This includes application servers, web services, and integration with legacy systems (even mainframes).
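The mechanism that makes end-to-end tracing possible is propagating a single correlation (trace) ID across every hop of a request. The sketch below assumes services exchange a simple headers dictionary; the header name and the in-memory span list are illustrative placeholders for whatever transport and trace store you actually use.

```python
import uuid

SPANS = []  # (trace_id, service_name) records; a stand-in for a trace store

def start_trace(headers):
    """Reuse an incoming trace ID, or mint a new one at the system edge."""
    headers.setdefault("X-Trace-Id", uuid.uuid4().hex)
    return headers["X-Trace-Id"]

def call_service(name, headers):
    trace_id = start_trace(headers)
    SPANS.append((trace_id, name))   # record this hop against the trace
    return dict(headers)             # pass the same headers downstream

# One request flowing through three tiers keeps a single trace ID,
# so all three spans can later be stitched into one end-to-end view.
h = call_service("web-frontend", {})
h = call_service("order-service", h)
call_service("payment-service", h)
```

Whether a hop is an application server, a web service, or a legacy system, as long as it forwards the ID, the whole transaction can be reassembled.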

Key Product Quality Parameters and SLAs

Your monitoring solution requires you to identify the system’s most critical transactions, then measure their response times, system profiling data, and error rates to spot transactions that perform poorly. Minding your key product quality parameters (KPQPs) and SLAs is more than an exercise in acronyms. In many industries, it’s needed to meet compliance requirements and avoid penalties or worse.
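In practice, SLA checks are usually expressed as a percentile of response time measured against a threshold. A minimal sketch, with made-up sample latencies and a hypothetical 300 ms target:

```python
def sla_report(latencies_ms, sla_ms, percentile=0.95):
    """Check the Nth-percentile response time against an SLA threshold."""
    ordered = sorted(latencies_ms)
    idx = min(int(len(ordered) * percentile), len(ordered) - 1)
    p = ordered[idx]
    return {"p95_ms": p, "sla_ms": sla_ms, "within_sla": p <= sla_ms}

# Nine healthy requests and one slow one: the p95 blows the SLA,
# even though the average response time looks acceptable.
samples = [120, 95, 110, 130, 105, 480, 100, 115, 90, 125]
report = sla_report(samples, sla_ms=300)
```

Percentiles matter here because averages hide exactly the slow transactions an SLA exists to catch.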

Code-Level Tracing

You need the ability to drill down with real-time insight into specific component code execution, SOA service calls, and database queries. System profiling must get to the level of measuring the most critical method calls and queries to determine frequency of execution, along with the standard deviation of processing time. This level of data science across your entire code base can be hard to achieve, but the capability will pay you back quickly.
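The call frequency and timing variance described above can be captured with lightweight instrumentation. This sketch uses a Python decorator and an in-memory stats table, both assumptions for illustration; a production tracer would sample and export these figures instead of holding them in a dict.

```python
import functools
import statistics
import time

CALL_STATS = {}  # function name -> list of elapsed times, in seconds

def traced(fn):
    """Record call frequency and per-call timing for a critical method."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            CALL_STATS.setdefault(fn.__name__, []).append(
                time.perf_counter() - start)
    return wrapper

def summarize(name):
    """Frequency, mean, and standard deviation of processing time."""
    times = CALL_STATS[name]
    return {"calls": len(times),
            "mean_s": statistics.mean(times),
            "stdev_s": statistics.stdev(times) if len(times) > 1 else 0.0}

@traced
def lookup_order(order_id):   # a hypothetical critical method
    return {"id": order_id}

for i in range(5):
    lookup_order(i)
```

The standard deviation is the interesting number: a method with a stable mean but high variance is often the first hint of contention or intermittent slow paths.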

Distributed Transaction Processing Visualization

Monitoring your transaction processing is only the first step. Gathering data and then making sense of it is the second, and perhaps most important, step. Being able to quickly visualize performance metrics for distributed applications, their architecture, and external service status in one dashboard is critical to isolating issues as they occur, or even identifying potential problems before they impact users or customers. It also helps to identify precisely who needs to get involved (an internal group, external vendor, network provider, or other organization) to help resolve the issue and recover data in a timely manner.

Part of efficient transaction monitoring includes identifying the right people to be involved when questions arise. Alerting the wrong people, or the wrong vendor, can be costly not just in terms of wasted resources, time and effort, but also increased risk due to delays in recovery.

Understand and visualize performance spikes in your applications, even if the cause is only a single outlying request, by analyzing trends at the level of individual requests.

Identify Trends and Outliers

Application performance monitoring goes beyond servers, and includes the entire stack, from your multiple layers of code through to data storage. You can then cross-correlate information between multiple components and their sources. For example, identify and trace user actions that trigger code executing in your application that requests information from specific database tables, through to the database engine serving those requests.

However, transaction monitoring goes beyond the standard dashboard green or red light: systems may be up, but transactions may not always execute properly, or in a timely manner. Measure overall system performance, including each component and even network infrastructure, to perform root-cause analysis of issues affecting your users.

Starting from this macro level, you can isolate interesting code and dive into the lowest-level details of your application, including individual web pages and their constituents (see Figure below). This includes a page’s script code as it’s executed, host server activity, network latency, associated database queries, image downloads, and more, all down to the individual line of source.

Dive deep into the user experience, monitoring each transaction, looking for outliers, and reporting real user performance. Some monitoring solutions focus on specific parts of the stack, but to measure the entire transaction envelope, you must observe the entire stack and how it affects user experience.
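A simple way to surface the single outlying request mentioned above is a z-score test over per-request latencies. The sketch below uses made-up numbers; note that with small samples the achievable z-score is capped (for ten points the maximum is about 2.85), which is why the threshold here is set below 3.

```python
import statistics

def find_outliers(latencies_ms, z_threshold=2.5):
    """Flag individual requests whose latency deviates far from the trend.

    With n samples, a single outlier's z-score cannot exceed (n-1)/sqrt(n),
    so for small windows the threshold must be set accordingly.
    """
    mean = statistics.mean(latencies_ms)
    stdev = statistics.stdev(latencies_ms)
    return [x for x in latencies_ms
            if stdev and abs(x - mean) / stdev > z_threshold]

# A single slow request stands out against an otherwise steady trend.
latencies = [102, 98, 105, 101, 99, 103, 97, 100, 2500, 104]
outliers = find_outliers(latencies)
```

Real products use more robust techniques (medians, rolling windows, seasonality), but the principle is the same: judge each request against the surrounding trend, not a fixed limit.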

Reporting and Beyond

Transaction reporting should also be baselined and made available to you from anywhere in the world. This allows you to see how servers in a data center or cloud provider facility are affecting transactions for users in specific regions. This ability to tie end-user transaction performance to specific sets of servers, network infrastructure, or other services helps you resolve issues quickly, perhaps even before users notice.

Continuously monitor trends to alert you of impending trouble before users are impacted. Distributed transaction monitoring services can be deployed across your data centers, or the cloud globally, to track your users’ experience accurately, regardless of where they reside. As a result, you can trace transactions across hosts, measure against baselines and transaction acceptance criteria using the Latency data API, and generate automated real-time alerts when thresholds are crossed.
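Threshold-based alerting against a baseline can be reduced to a small check like the following. The baseline figure and 25% tolerance are illustrative assumptions; in practice the baseline itself is learned from historical data rather than hard-coded.

```python
def check_threshold(samples_ms, baseline_ms, tolerance=0.25):
    """Alert when recent average latency drifts above baseline + tolerance."""
    recent = sum(samples_ms) / len(samples_ms)
    limit = baseline_ms * (1 + tolerance)
    return {"recent_avg_ms": recent,
            "limit_ms": limit,
            "alert": recent > limit}

# Recent requests average 150 ms against a 100 ms baseline:
# that crosses the 125 ms limit and should raise an alert.
status = check_threshold([140, 150, 160], baseline_ms=100)
```

Evaluated continuously over a sliding window, a check like this turns a slow drift into an actionable alert before users feel it.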

Conclusion - Monitoring Grows as Your System Grows

Rolling your own system transaction monitoring solution may work for a specific release of your software. However, in the age of agile development and DevOps, iterative development results in systems that constantly change, which can impact your monitoring systems as well. As a result, ensuring your monitoring capabilities match your rate of release can be a challenge. Additionally, a custom monitoring solution for one application may not be easily applied to other systems, now or in the future.

Some organizations view monitoring as a necessary evil. But with distributed transaction systems, thorough monitoring can be a business differentiator: it extends to all of your components, helps you become proactive by detecting and fixing issues before your users experience them, and enables you to recover quickly in times of failure. I often like to draw a comparison between software and the game of golf: even the best golfers occasionally hit a bad shot. It’s what they do to recover that counts most. Effective transaction monitoring can do the same for your business, impressing your customers with your exceptional ability to recover from even the worst of problems. In business, it’s often that final impression that counts.