Well, read on for an example that teaches us the opposite: All is fair in marketing and sales, even the redefinition of completely obvious mathematical terminology.
Uptime and SLAs
Let's start with some background information.
Service providers tend to offer SLAs (Service Level Agreements), which spell out what level of availability you should be able to expect from the service. This is often expressed as a percentage of uptime over some amount of time. If the 'allowable' downtime that is specified in the SLA is exceeded, you might be able to get some sort of refund from the provider. If it is not exceeded, though, then you have no grounds to complain about it and just have to live with it. Keep in mind that the allowable downtime doesn't mean that the service will always be down for that much time every month. It only means that you have no 'guarantee' of uptime beyond what is spelled out.
For example, if someone offers "99% availability" - which sounds quite nice if you just read it like that - then some simple math shows us that this actually still allows for roughly 26,000 seconds of downtime per 30-day month, or more than seven hours! Or almost 15 minutes per day. As we can see, this is not very good after all for any serious application that needs to be accessible 24/7.
Modern cloud computing services, such as Google's AppEngine or Amazon's EC2/S3, typically have SLAs that mention "99.9% availability". That's around 43 minutes of downtime per month, or about a minute and a half per day. Still not great, but certainly better.
More serious offerings in other areas sometimes talk about 5 nines availability, which is another way to say 99.999% uptime. That is only 26 seconds per month, less than 1 second per day. Very nice, indeed.
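If you want to check these numbers yourself, here is a minimal sketch of the arithmetic. It assumes a "month" of exactly 30 days (my assumption, chosen to match the 30-day window that shows up in the SLA discussed below), and the function name is just something I made up for illustration.

```python
from fractions import Fraction

SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_MONTH = 30  # assumption: a "month" is a 30-day window


def allowed_downtime_seconds(availability_percent, days=DAYS_PER_MONTH):
    """Seconds of downtime that an availability percentage still permits.

    The percentage is passed as a string (e.g. "99.9") so that Fraction
    can parse it exactly instead of inheriting float rounding.
    """
    unavailable = 1 - Fraction(availability_percent) / 100
    return float(unavailable * days * SECONDS_PER_DAY)


for pct in ("99", "99.9", "99.999"):
    per_month = allowed_downtime_seconds(pct)
    per_day = allowed_downtime_seconds(pct, days=1)
    print(f"{pct}%: {per_month:,.2f} s per month, {per_day:,.2f} s per day")
```

For 99% this gives 25,920 seconds per month (7.2 hours), for 99.9% it gives 2,592 seconds (43.2 minutes), and for 99.999% it gives 25.92 seconds.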
The redefinition of "100%"
Now imagine my surprise when I read the bold claim by one particular service provider - who shall remain nameless - which loudly advertised 100% availability! Wow! I thought. They must have some really impressive highly redundant setup. Of course, no technical service will ever really provide 100% availability. Stuff fails, and you can only engineer a system to deal gracefully with failure; you can never eliminate it completely.
So, I assumed they were willing to back that claim with an SLA, knowing full well that under very rare circumstances they would have to pay out a penalty to their customers. I thought they probably figured that backing 100% availability with an SLA and paying the occasional fee would still be beneficial to them because it would attract more customers. There isn't anything wrong with that strategy, since the customers would get reimbursed accordingly in case of failure.
But imagine my surprise when I read the small-print of their SLA:
The [...] Infrastructure shall maintain 100% availability.
“100% availability” shall mean that the [...] Infrastructure shall not fail to respond to [...] queries for more than fifteen (15) consecutive seconds out of any thirty (30) day period.
Wow! This is rather unbelievable. In fact, I would call this false advertisement on many different levels.
Firstly, we see here a redefinition of the meaning of "100%". Usually, 100% means: The whole thing. All of it. Nothing left. In its entirety. But according to this service provider it really only means 99.999421296%. Last time I checked, 99.999421296 is NOT the same as 100. So, something is clearly off. They place "100% availability" in quotes, almost to make it appear like just some sort of marketing term. That is clearly misleading, because customers should not be expected to interpret it this way. "100% availability" has a very specific and well-established meaning. Something they simply hope to define away in this contract.
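Where does that percentage come from? A quick sketch, assuming the most generous reading of the clause: the 15 seconds are treated as the total tolerated downtime in a 30-day window. The constant names are mine, not the provider's.

```python
from fractions import Fraction

WINDOW_SECONDS = 30 * 24 * 60 * 60  # 2,592,000 seconds in a 30-day window
TOLERATED_SECONDS = 15              # downtime the SLA text still calls "100%"

# Share of the window during which the service must actually respond.
implied_percent = 100 * (1 - Fraction(TOLERATED_SECONDS, WINDOW_SECONDS))

print(f"{float(implied_percent):.9f}%")  # prints 99.999421296%
```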
Secondly, look closely and you will find that it is actually much worse than that. They only see their availability claim violated if the outage is for more than 15 consecutive seconds per 30 day period. What does this mean? Let's say they are offline for 14 seconds then come back for one second and go offline for another 14 seconds ... and continue this pattern for a whole month. Well, technically they would not have violated their SLA since no outage was for more than 15 consecutive seconds. You would have had a month of terrible service and NO recourse with the vendor at all.
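To see just how bad this loophole is, here is a small simulation of that exact pattern. It is only a sketch under my reading of the clause: the SLA counts as breached only when a single outage lasts longer than 15 consecutive seconds. The function and variable names are invented for this example.

```python
WINDOW_SECONDS = 30 * 24 * 60 * 60  # the 30-day period from the SLA
MAX_CONSECUTIVE_DOWN = 15           # breach only if one outage exceeds this


def simulate(down_seconds, up_seconds, window=WINDOW_SECONDS):
    """Return (total_downtime, longest_outage) for a repeating down/up cycle."""
    total_down = 0
    longest = 0
    elapsed = 0
    while elapsed < window:
        # The last outage is cut short if the window ends in the middle of it.
        outage = min(down_seconds, window - elapsed)
        total_down += outage
        longest = max(longest, outage)
        elapsed += outage + up_seconds
    return total_down, longest


total_down, longest = simulate(down_seconds=14, up_seconds=1)
availability = 100 * (1 - total_down / WINDOW_SECONDS)
breached = longest > MAX_CONSECUTIVE_DOWN

print(f"downtime: {total_down} s, availability: {availability:.2f}%, breached: {breached}")
# prints: downtime: 2419200 s, availability: 6.67%, breached: False
```

In other words, the service could be down for 28 of the 30 days, be reachable less than 7% of the time, and still count as "100% available" under this wording.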
So, as a conclusion and lesson to learn: Even if it says "100% availability" on the box, always read the fine print as well. And since vendors like to throw in weasel words like "consecutive" here and there, make doubly sure that the SLA you are getting is worth the paper it's printed on (or the pixels on your screen).
The one question that I am left with now: Should I, or should I not name that particular service provider? Could I get into some sort of legal trouble if I did? I didn't sign an NDA with them, so I should be fine, right?
Comment by Keith, on 29-Jul-2008 13:25
I think you should name them. After all, their guarantee and their SLA are publicly available on their site, correct?
Posting the company would also do many people a good service if they are in the market for a new provider in that industry.