The importance of performance metrics

Last time, we talked about metrics that answer questions to help you increase sales.

Metrics aren’t solely about sales and marketing. Quality and performance metrics drive your business. Can you identify a few that would cause serious concern if they changed by as little as five percent?

If so, how are you monitoring them? Monitoring and access to that information are what I was referring to last time. The performance metrics I’ve been working on are a mix of uptime, event, and work completion information. In each case, a substantial change in the metric could indicate serious trouble.

Uptime metrics

An uptime metric shows how long something has been running without incident. In some fields, particularly technology, it’s not unusual to have seemingly crazy uptime expectations.

Anyone who has picked up a landline phone is familiar with the standard in uptime metrics – the dial tone. For years, it was the standard because it was quite rare to pick up the phone and not get a dial tone. Its presence became an expectation, much like today’s as-yet-unfulfilled expectation of always-available cell and internet service. The perceived success rate of the dial tone is probably a bit higher than reality thanks to our memory of the “good old days”.

Today’s replacement for the dial tone is internet-related service. Years ago a business would feel isolated and threatened when phone service went out. Today, it’s not unusual for a business to feel equally vulnerable when internet service is out. The seemingly crazy uptime expectations are there because more businesses function across time zones (if not globally) than they did a few decades ago. Expectations for web sites and online services these days are at “nine nines” or higher.

Nine nines of uptime, i.e. 99.9999999% uptime, is a frequently quoted standard in the technology business. Taken literally, this means less than one second of downtime in 20 years. There are systems that achieve this level of uptime because they aren’t dependent on one machine: it is the service that is available at that level, not any one device. By comparison, five nines (99.999%) of uptime allows for a little over five minutes of downtime per year. Redundancy is what makes these levels of service achievable.
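If you want to check that arithmetic for your own targets, here is a minimal Python sketch. The function name `allowed_downtime_seconds` and the 365.25-day average year are my assumptions for illustration, not part of any formal standard.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # assumption: average year length


def allowed_downtime_seconds(nines: int, years: float = 1.0) -> float:
    """Return the downtime budget, in seconds, for an uptime target
    written as a number of nines (5 means 99.999%)."""
    downtime_fraction = 10.0 ** -nines
    return downtime_fraction * SECONDS_PER_YEAR * years


if __name__ == "__main__":
    five = allowed_downtime_seconds(5) / 60
    nine = allowed_downtime_seconds(9, years=20)
    print(f"Five nines: about {five:.2f} minutes of downtime per year")
    print(f"Nine nines: about {nine:.2f} seconds of downtime in 20 years")
```

Under those assumptions, five nines works out to roughly 5.26 minutes per year and nine nines to roughly 0.63 seconds over 20 years.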

Events that idle equipment and people are expensive. What’s your uptime metric for services, systems, critical tools like your CNC machine, trucks on the road (vs. on the side of the road), etc.?

Event metrics

Event metrics are about how often something happens, or doesn’t happen – like “days since a lost-time injury” or “days since we had to pull a software release because of a serious, previously unseen bug”.

In those two examples, the metric keeps folks focused on safety and safety procedures in the first case, and on defensive programming and thorough testing in the second. You might also measure how many crashes your software reported in the last 24 hours and where they occurred. Presumably, this would help you focus on what to fix next.
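Both kinds of event metric are simple to compute once the events are recorded with a date or timestamp. The sketch below is one possible way to do it in Python; the function names `days_since` and `count_recent` and the sample data are made up for illustration.

```python
from datetime import date, datetime, timedelta


def days_since(last_event: date, today: date) -> int:
    """Whole days elapsed since the last recorded event."""
    return (today - last_event).days


def count_recent(timestamps: list[datetime], now: datetime, hours: int = 24) -> int:
    """Count events whose timestamp falls within the last `hours` hours."""
    cutoff = now - timedelta(hours=hours)
    return sum(1 for ts in timestamps if cutoff <= ts <= now)


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    now = datetime(2024, 3, 15, 12, 0)
    crashes = [
        datetime(2024, 3, 13, 9, 30),
        datetime(2024, 3, 14, 18, 45),
        datetime(2024, 3, 15, 8, 5),
    ]
    print("Days since last injury:", days_since(date(2024, 1, 10), now.date()))
    print("Crashes in the last 24 hours:", count_recent(crashes, now))
```

The hard part is rarely the calculation. It is making sure every event actually gets recorded somewhere a script like this can read it.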

What event metrics do you track?

Work completion metrics

Work completion metrics might be grouped with event metrics, but I prefer to keep them separate. Work completion is a performance and quality metric.

Work completion shows more than that something was finished: successful completion is also a quality indicator. Scrap rates and scrap reuse get a lot of attention from manufacturers in part because of the raw material costs associated with them. Increasing scrap rates can indicate performance and quality problems that need immediate attention.

One system I work with processes about 15,000 successful events a day – not much in the technology world when compared to Google or Microsoft, but critical for that small business. If the number dropped to only 14,900 per day (a drop of less than one percent), their phones and email would light up with client complaints, so there’s a lot of emphasis on making sure that work completes successfully. Catching and resolving problems quickly is critical, so redundant status checks happen every 15 seconds.
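To show how small a drop that is, and how a check against it might look, here is a rough Python sketch. It is not the system described above: the `CompletionCheck` class and the 0.5 percent alert threshold are hypothetical choices for illustration.

```python
from dataclasses import dataclass


@dataclass
class CompletionCheck:
    expected_per_day: int      # baseline from the example above: 15,000
    max_shortfall_pct: float   # hypothetical alert threshold, in percent

    def shortfall_pct(self, completed: int) -> float:
        """Percentage by which completed work falls short of the baseline."""
        gap = self.expected_per_day - completed
        return max(0.0, gap / self.expected_per_day * 100)

    def needs_attention(self, completed: int) -> bool:
        """True when the shortfall reaches the alert threshold."""
        return self.shortfall_pct(completed) >= self.max_shortfall_pct


if __name__ == "__main__":
    check = CompletionCheck(expected_per_day=15_000, max_shortfall_pct=0.5)
    for completed in (15_000, 14_900):
        print(
            f"{completed} completed: shortfall {check.shortfall_pct(completed):.2f}%, "
            f"needs attention: {check.needs_attention(completed)}"
        )
```

A drop from 15,000 to 14,900 is a shortfall of about 0.67 percent, so a threshold for a metric like this one has to be far tighter than five percent.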

Events of this nature are commonly logged, but reviewing logs is tedious, error-prone work. Logs add up quickly and can contain many thousands of lines of info per day – too much to monitor by eye.
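Even a few lines of code can turn thousands of log lines into a handful of numbers worth watching. The Python sketch below assumes a made-up log format of `<timestamp> <STATUS> <message>`; real logs will differ, so treat the parsing as a placeholder.

```python
from collections import Counter

# Hypothetical log format: "<timestamp> <STATUS> <message>"
SAMPLE_LOG = """\
2024-03-15T08:00:01 OK order 1001 processed
2024-03-15T08:00:02 OK order 1002 processed
2024-03-15T08:00:03 FAIL order 1003 timeout
2024-03-15T08:00:04 OK order 1004 processed
"""


def tally_statuses(log_text: str) -> Counter:
    """Count log lines by status (the second whitespace-separated field)."""
    counts = Counter()
    for line in log_text.splitlines():
        parts = line.split(maxsplit=2)
        if len(parts) >= 2:  # skip blank or malformed lines
            counts[parts[1]] += 1
    return counts


if __name__ == "__main__":
    counts = tally_statuses(SAMPLE_LOG)
    total = sum(counts.values())
    print(f"{counts['OK']} of {total} events succeeded, {counts['FAIL']} failed")
```

Numbers like these are what end up feeding a dashboard.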

These days, business dashboards are a much more consumable way of communicating this type of information quickly and keeping attention on critical numbers – such as this example from klipfolio.com:

[Image: business performance metrics dashboard]

What work completion metrics do you track?