Gathering metrics is something that every web-based business will do – and of course – there are lots of different ways to gather them. Metrics are invaluable for capacity planning and troubleshooting (including site downtime postmortems).
I would define a metric as anything that can be discretely measured at a certain point in time or over a particular standard period, such as a minute or an hour. And I’d break them down into two basic categories.
Performance Metrics
These are metrics used to determine how well your website hardware and software are performing. They generally translate well across different businesses, as they are often based on the technologies the website uses rather than the service it offers. Examples would be:
- Memory Usage
- Average Request Response Time
- Request Queue Length
- Request Queue Time
- Page Impressions / sec
- Disk I/O
- Transactions / sec
- Blocking and Locking
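To make one of these concrete, here’s a minimal sketch (in Python, with illustrative names – not any real product’s API) of how a requests-per-second figure can be derived from raw event timestamps over a sliding window:

```python
import time
from collections import deque

class RateMeter:
    """Sliding-window event counter reporting a per-second rate.
    A minimal sketch; a real system would use a monitoring agent
    rather than in-process counters like this."""

    def __init__(self, window_seconds=60):
        self.window = window_seconds
        self.events = deque()  # timestamps of recorded events

    def record(self, ts=None):
        self.events.append(ts if ts is not None else time.time())

    def rate(self, now=None):
        now = now if now is not None else time.time()
        # Discard events that have fallen out of the window.
        while self.events and self.events[0] <= now - self.window:
            self.events.popleft()
        return len(self.events) / self.window
```

The same shape works for Transactions / sec or Page Impressions / sec – only what you count changes.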
Business Metrics
You’ll find a definition here, but any business will want to measure whatever indicates its success or otherwise, so these metrics can vary from business to business. Often this will be website signups or transactions of some description. Depending on your business, this can also extend to some performance metrics – a portal site might treat Page Impressions / sec as a business metric, for example. In the past we’ve used:
- Signups / min
- Transactions / min
- Page Impressions / min
- Payments / min
- Transfers / min
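All of these share the same shape: count events per time bucket. A minimal sketch (illustrative names, and in production you’d more likely aggregate in the database) of bucketing raw event timestamps into per-minute counts:

```python
from collections import Counter
from datetime import datetime

def per_minute(timestamps):
    """Aggregate event timestamps (datetime objects) into per-minute
    counts -- the shape of metrics like signups/min or payments/min.
    A sketch; real pipelines would aggregate at the database level."""
    buckets = Counter()
    for ts in timestamps:
        # Truncate each timestamp to its containing minute.
        buckets[ts.replace(second=0, microsecond=0)] += 1
    return dict(buckets)
```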
Setting Up Metrics
Setting up all the above metrics from scratch in a way that can be queried and trended over time can be a very time-consuming undertaking. You’ll spend days and weeks:
- Setting up each individual metric so that it is persisted to file or database
- Creating a database schema to hold all this data
- Writing the agents to gather all these metrics and store or aggregate them every x minutes
- Writing the reports and graphs to visualise all this data over time
- Maintaining all the above when you get new hardware or install new software versions
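For instance, the schema step alone might look something like this – a deliberately simple, illustrative one-table design (real schemas grow well beyond it):

```python
import sqlite3

# Sketch of the kind of schema you would otherwise design by hand:
# one row per metric sample. Table and column names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE metric_samples (
        metric_name TEXT NOT NULL,
        recorded_at TEXT NOT NULL,   -- ISO-8601 timestamp
        value       REAL NOT NULL
    )""")
# Queries will almost always filter by metric and time range.
conn.execute(
    "CREATE INDEX idx_name_time ON metric_samples (metric_name, recorded_at)")
conn.execute(
    "INSERT INTO metric_samples VALUES "
    "('requests_per_sec', '2011-06-01T10:00:00', 420.0)")
row = conn.execute(
    "SELECT AVG(value) FROM metric_samples "
    "WHERE metric_name = 'requests_per_sec'"
).fetchone()
```

And that’s before you’ve written a single agent, report or graph.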
Give yourself a break!
More and more these days, we do not need to roll our own metrics – especially performance metrics. As I mentioned, these types of metrics often transfer across technologies. Requests / sec means the same thing to a .NET developer as it does to a Ruby on Rails developer. Likewise, if you’re an open source Postgres or MySQL developer or administrator, you’ll understand the ramifications of high-latency disk I/O metrics.
So, why not use a cloud-based service to monitor the performance of your Java or PHP application? The idea here is that you install an agent on each of the servers you wish to monitor. These agents submit a stash of standard web server metrics to the cloud-based mothership. No need for you to worry about any of the setup or maintenance of the data, or the reports and graphs. All for a monthly subscription, of course!
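In outline, such an agent just batches up samples and POSTs them to the vendor’s collector. The endpoint URL, header name and payload shape below are hypothetical – assumptions about what a hosted collector might accept, not any particular vendor’s API:

```python
import json
import urllib.request

def build_submission(endpoint, api_key, samples):
    """Prepare (but don't send) a metrics submission. Everything
    here -- URL shape, X-Api-Key header, JSON payload -- is an
    assumption, not a real vendor's API."""
    body = json.dumps({"samples": samples}).encode()
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"X-Api-Key": api_key, "Content-Type": "application/json"},
    )

# Sending would then be a single call:
#   urllib.request.urlopen(build_submission(url, key, samples))
```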
There are pros and cons to such a hosted metrics service.

Cons:
- Bandwidth incurred
- Restrictions on customisation of graphs/charts/reports
- A monthly subscription
- Firewall exceptions for outbound traffic in restricted DMZs
- That particular metric you have a bee in your bonnet about might not be one of those captured

Pros:
- Quick and simple
- Minimal maintenance for new hardware
- No need to worry about the space all these metrics are taking up
- No development overhead
I can really see how this could be attractive to many businesses – especially startups, who might have a real limit on development time (better spent actually doing business-related stuff!) but still need to show potential investors they care about their metrics.
The big player out there is New Relic. As an aside for any ColdFusion readers: they don’t support JRun – which is another reason why I think Adobe need to move to a more modern and better-supported Java server, such as Tomcat.
If you prefer to gather those metrics in-house, then use a metrics-gathering facility to help you store and manage this info, rather than doing it from scratch. One I came across recently – via the Code as Craft blog – was Graphite (Linux-only, sadly). It allows you to specify the metric to record, the range of data you’re likely to want to query in future, and the precision at which to store the data (e.g. one data point per minute or hour).
This type of framework can make the reporting and graphing work a lot simpler and quicker. No more worrying about graphing your data and, as Etsy found, you can mix in other things like your deployment points.
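Feeding Graphite is also pleasantly simple: its carbon daemon accepts samples over a plaintext protocol – one line per sample, in the form `<metric.path> <value> <unix-timestamp>` – on TCP port 2003 by default. A minimal sketch (the host and metric names are placeholders for your own install):

```python
import socket
import time

def graphite_line(path, value, timestamp=None):
    """Format one sample in Graphite's plaintext protocol:
    '<metric.path> <value> <unix-timestamp>' plus a newline."""
    ts = int(timestamp if timestamp is not None else time.time())
    return "%s %s %d\n" % (path, value, ts)

def send_to_graphite(lines, host="graphite.example.com", port=2003):
    # 2003 is carbon's default plaintext port; the hostname is
    # a placeholder for wherever you run Graphite.
    with socket.create_connection((host, port)) as sock:
        sock.sendall("".join(lines).encode())
```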
Great, let’s put all our metrics in the cloud!
Not so fast. Google Analytics (another hosted service) will help you gather, for free, a massive amount of client-side data for your site, such as popular pages. A lot can be gleaned about your business from impressions and click-throughs to certain pages, but GA can’t see inside your business. For example, it can’t tell you what the profit was on your transactions or how many of your customers have been active in the past 30 days.
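That sort of metric lives in your own database. A sketch of the “active in the past 30 days” query – table and column names are illustrative, and the reference date is fixed only to keep the example repeatable:

```python
import sqlite3

# Sketch of a metric only your own data can answer: distinct
# customers active in the past 30 days. Schema names are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logins (customer_id INTEGER, logged_in_at TEXT)")
conn.executemany(
    "INSERT INTO logins VALUES (?, ?)",
    [(1, "2011-06-01"), (1, "2011-06-10"), (2, "2011-04-01")],
)
active = conn.execute(
    "SELECT COUNT(DISTINCT customer_id) FROM logins "
    "WHERE logged_in_at >= date('2011-06-15', '-30 days')"
).fetchone()[0]
```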
As nice as it may seem for a developer to have one of these cloud-based services take care of all metrics, there is not a snowball’s chance that you’ll get your CIO or CEO to allow your critical business metrics to be gathered by a third party. So you’ll probably need to roll your own there.
In conclusion, if you are going to use metrics – think about using a cloud-based service for the performance ones – but you’ll still need to find a way to gather the business-based ones into the mix. Best of luck!