Chronology of an 'Internet outage' ... and lessons learned

Posted: 12-Dec-2007 05:50

The last few days saw one of my worst nightmares come true. It is the kind of scenario that, whenever I read about it in the past, made me think to myself: “Phew! I'm glad that didn't hit us!” But over this weekend (and beyond) it DID hit us: An Internet outage!

Well, of course not the entire Internet, just our residential broadband connection. But it sure gets to you when you realise just how dependent you have become on this technology, how fickle it can be, and how much you depend on others.

I work for an overseas client from home. E-Mail and VoIP are key tools for me to get my work done. An outage of my residential broadband connection means that I simply cannot do my work. So, for me this was much more than just a case of not being able to catch up on news, my social network, my private e-mail or feeding my Internet addiction.

Below then, for your enjoyment, the chronological account of this outage. But first, the 'lessons learned'...

Lessons learned

If your livelihood depends on your Internet connection you need to scout out alternatives. This is especially true if you work from home on a residential plan. My guess is that residential customers are dealt with at a much lower priority. Just because YOU run your business from home does not mean that the ISP will treat your case with the same urgency as a business customer's. I hadn't prepared for the event of a broadband outage, so I paid the price. Dumb oversight on my part, really. Don't let it happen to you.

  • Always make sure you have a backup plan on how to get broadband (if at all possible). If you have a laptop with wireless then you have options: Know ahead of time where some public WiFi hotspots are, for example. Research where you could get broadband access without paying an arm and a leg (especially in New Zealand there don't seem to be any free WiFi hotspots anywhere and prices can differ dramatically). Is it quiet there? Can you make VoIP conference calls there if you had to? Do they have power outlets there?

  • Don't forget good old dial-up. In this age of ubiquitous broadband we often forget this basic technology, which (more or less) reliably got us online during the birth of the world-wide-web. Even if the broadband infrastructure fails, the phone service often is still operational. So, dial-in is a rather slow, but valid means of getting online in an emergency and also an option for those not wielding a laptop. Ask yourself: Do you have a dial-in number? Do you know how to use your modem software? If you use Linux this can actually be a bit complicated. Have you tried it lately, so that you know that it all works and you can go online? Remember, once a broadband outage strikes, you can't look up anything any more! You just need to know how to get online via dial-up.

  • Consider if your ISP offers business accounts you can afford and which come with a sort of service level guarantee. You might (!) get better responsiveness that way. Note, however, that your ISP may be entirely powerless if they depend on the infrastructure of another ISP (more about that later).

  • Quasi monopolies in the telco infrastructure lead to quasi customer service. In the end, it turned out to be very squarely Telecom's problem. They caused the issue with some botched equipment upgrade in the exchange and apparently the test suite they run after the work is entirely insufficient, or they didn't care to read their trouble tickets over the weekend, or they just don't bother much about residential customers of their ISP resellers. Or all of the above. They seem to be unresponsive at an extraordinary level. Dealing with Xnet (my ISP) was a bit of a mixed bag. Some things could have been improved, I thought, but overall I think they did what they could, and even a bit more at times. They actively monitor the Geekzone forums. This is a rather efficient way to raise attention to issues and in my case resulted in some personal follow-up and suggestions to get me online temporarily at least. That's much appreciated.
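
One more thing that would have helped me on the phone with support: exact times for when the connection dropped and when it came back. The following is only a minimal sketch of how such an outage log could be kept, not something I actually ran during the outage. The names is_reachable, OutageLog and monitor are made up for this illustration, and the host and port to probe are an assumption you would have to fill in yourself (for example, a server you know is normally reachable).

```python
import socket
import time
from datetime import datetime, timedelta


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class OutageLog:
    """Turns a stream of online/offline observations into (start, end) outages."""

    def __init__(self):
        self.down_since = None  # start time of the outage in progress, if any
        self.outages = []       # completed outages as (start, end) tuples

    def record(self, when, online):
        if not online and self.down_since is None:
            # Connection just went away: remember when.
            self.down_since = when
        elif online and self.down_since is not None:
            # Connection came back: close the outage.
            self.outages.append((self.down_since, when))
            self.down_since = None

    def total_downtime(self):
        """Sum of all completed outages (an ongoing one is not counted)."""
        return sum((end - start for start, end in self.outages), timedelta())


def monitor(host, port, interval_seconds=60):
    """Probe forever, printing each completed outage. Stop with Ctrl-C."""
    log = OutageLog()
    while True:
        seen = len(log.outages)
        log.record(datetime.now(), is_reachable(host, port))
        if len(log.outages) > seen:
            start, end = log.outages[-1]
            print(f"Outage from {start} to {end} ({end - start})")
        time.sleep(interval_seconds)
```

You would start it with something like monitor("example.com", 80), where the host is a placeholder. Note that a failed probe only tells you that this one host could not be reached from your machine. It does not tell you whether the fault is in your router, the exchange or the remote server, so treat the log as a memory aid for support calls rather than as proof.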


Day 1 (Friday):

7 am:

My ADSL connection disappears. 'Not connected', my router tells me. A temporary snafu? A glitch? Nothing to get worried about just yet, I figure. I have a document to review, so it doesn't stop me from working.

7:10 am:

Ah, around 10 minutes later the network comes back. No problem, smooth sailing.

10:15 am:

Middle of the day, middle of the work: E-Mail, researching stuff on the Internet, VoIP conference calls. My ADSL connection disappears again. This time it hurts more. I was just in the middle of things for which I needed a connection. Oh well, what can I do? Maybe it's similar to earlier this morning? I wait 10 minutes, 20 minutes, 45 minutes, still nothing.

11:00 am:

Still offline. That's enough! I call Xnet (my ISP).

“Oh, yeah, we can see that Telecom is changing their equipment in the Glenfield exchange. They migrate from their old equipment to the new one. So, once they are done, you can probably expect better service quality. That can take a few hours ... No, we didn't get any advance warning. Normally, the first we know about them doing this kind of work is when customers call and complain ... If you are still not online by 3 pm, definitely give us a call.”

Ugh. Now what? Fortunately, I have a few more documents to review anyway, so I start with that. I call my client in the US and let them know that I am going to be offline for a while.

I do have to wonder though: Why can't Telecom even be bothered to give their large ISP customers some advance notice? I could almost understand that informing all residential customers would be tricky. But come on! Shouldn't they be able to at least fire off an e-mail to their ISP resellers ahead of time?

Hours pass.

4:00 pm:

My router is still not connected. On the phone with Xnet again.

“No, Telecom is not working on their exchange any more, they are done ... Sometimes this can take a few days...” (What? Didn't you just say that they are not working on it any more? So, what exactly can take days? And besides, your colleague earlier told me that it normally only takes a few hours!) “We have a lot of customers with this kind of problem over the last couple of weeks.”

“Are any of your other customers in Glenfield experiencing the same problem?”, I ask.

“No... don't know, really.”

That doesn't leave me with a good feeling. If this is an exchange wide problem then the Xnet staff should definitely know about it. Maybe this only affects me for some reason? Maybe my 'case' will be damned to the pile of 'not quite so important', low-priority things for them? After all, this only affects a single residential customer, so how important can it be? But this is my livelihood we are talking about!

“We will be filing an issue with Telecom, that's all we can do,” the Xnet support person mumbles. And that's that. I get a strong, sinking feeling.

7:30 pm:

Still offline! For more than 9 hours now! I feel terribly disconnected, isolated, and helpless. Clearly, withdrawal symptoms are beginning to show. And now I am definitely behind on work. Discussions I was supposed to have, e-mails I was supposed to follow up on. It's all in shambles.

I call Xnet again. “Yeah, well, we filed that issue ticket with Telecom. Let's see if there is a response from them? ... No, nothing yet. Well, it should definitely be online by tomorrow again...” (How can he say that if he has no idea what the problem is?) “But if not, you can call us from 10 am onwards again.”

“Wait, 10 am you said?”, I ask. “I normally start working at 5 am, I need the network connection, I can't do my work without one!”

“Yes, 10 am. Our support centre opens at 10 am on Saturday.”

Oh no! It's the weekend, of course! Things always go wrong just before a weekend! The car fails, the tooth starts to hurt, the cat gets sick... the broadband connection fails. It's common wisdom! Things go wrong when nobody cares or is available any more. I should have known!

10:30 pm:

Still offline, for more than 12 hours now. I go to bed, not very optimistic about it all.

Day 2 (Saturday):

5 am:

Start of my workday? I don't think so. Still offline. And now I will have to wait another 5 hours before I can even begin to talk to anyone! This can't be happening to me! I might have to seek out some public WiFi spot somewhere, just to do my work!

9:30 am:

In sheer desperation, I drove to a Starbucks to use their WiFi. Well, in other countries maybe they have free WiFi at those establishments, but not here. $10 per hour! I feared something like that when the WiFi network that showed up was a Telecom hotspot. Plus some sort of maximum-simultaneous-connection limit, which effectively prevents me from browsing the web while my mail client connects to half a dozen mail servers to get my mail. Rip-off!

10:30 am:

I posted questions on Geekzone to find alternative hotspots that are better priced. Cafenet was suggested, and I think I will go and try that out.

10:40 am:

Just got off the phone with Xnet (after a 15 minute hold). Still no news. They are now (only now?) changing the ticket status to 'urgent'. I didn't know they could do that. Yesterday, I had explained the urgency of the situation to other Xnet staff, but none of them had suggested changing anything to 'urgent'. Well, now it supposedly is, and we will see what comes of it. I'm moving on to one of the Cafenet hotspots now.

1:00 pm:

Came to the Takapuna library, because they have Cafenet there. It's a quiet place for me to work, and the price is MUCH cheaper. $10 for 24 hours. Still expensive, but certainly an improvement. Thank you to the forum member who pointed out Cafenet to me. The library could do with a few more power outlets, though.

An Xnet staff member who monitors the Geekzone forums is responsive and tries to help resolve the situation. That is much appreciated.

2:30 pm:

Got a phone call from Xnet, still no news. They say: “There are troubles right now in the Glenfield area, but Telecom's helpdesk is not responding, either because it's weekend or because there are so many problems, so there's nothing more we can do at this point.” Note this: Telecom's helpdesk is not responding? Not even to one of their ISP resellers? Because it's weekend? What? You'd think that ISP resellers have a special number they can contact? My goodness! How screwed up is that!

11:00 pm:

Sigh... still nothing.

Day 3 (Sunday):

All day:

Checking throughout the day, but still disconnected...

Day 4 (Monday):

8:00 am:

Check the network status. Why am I not surprised? Still down... Need to run some errands first before calling Xnet again.

1:45 pm:

Called Xnet again. Had to tell the same story to a different CSR ('Is the ADSL light on? Can you reset your router...' We did all of that so many times already, you'd think they have it in their file). He will call Telecom and will see if they can send a technician out, but it might take a few hours before they will call us back. I thought they had advanced our ticket to 'urgent'? Why didn't they follow up on their own? Why didn't they ask Telecom to send a technician out on their own, first thing in the morning? I would have thought they could have followed up on my behalf?

Later that day we are informed by Xnet (after we call them) that a Telecom technician will look at the case on Tuesday afternoon, at 3 pm. Oh no, another day without...

Day 5 (Tuesday):

6:00 am:

Left for the Takapuna library. Outside the library and the cafe you can still get coverage for the Cafenet WiFi hotspot. Not great quality, but at least I'm connected and can do some work, sitting there in my car. And before 8 am the parking is still free, at least. The library opens at 9:30 am, so at around 8:30 am I move to the cafe across the street to continue work and also recharge my laptop batteries. At around 11:50 am I move over to the library.

2:00 pm:

Finally! My wife calls to tell me that the Telecom technician has arrived and starts to take a look at the issue ... but he's in our house! Somehow, I think the problem is not with our house, but with the exchange somewhere? Let's see what he finds.

3:30 pm:

I get home. The Telecom technician left a while ago already. Nothing was wrong with the wiring in our house, so now he's off to the exchange.

3:45 pm:

We get a phone call from the technician to check the ADSL light on the router. Gasp! It stopped blinking! I frantically get on the computer: Yes, indeed! We are online again! Back in business! The e-mails (mostly spam) start to stream again!

The Telecom technician was very nice and helpful, but I have to ask myself: Was that necessary? After starting to look at it he was able to resolve the issue within an hour. We gave notice on Friday morning. Now it's Tuesday afternoon.

Was that necessary?

Comment by KiwiOverseas66, on 12-Dec-2007 07:22

wow...what a saga! And you're right - you don't realise how much you depend on it until it's gone (especially if it's your livelihood). I guess a 3G card will be high on the present list for xmas?

We suffered the same thing here recently - and our connection is not just for work and recreation but for correspondence school as well. Nothing like having a bored child sitting around during the day.

One comment if I may - I suspect the line you were given that "the telecom helpdesk is not responding" is ever so slightly bogus - especially when you're talking about one of the biggest call centres in the country. I've had to get exchange faults investigated in the past for customers (on the weekends as well) and those guys are reachable 24x7. The fault would have been assigned a ticket number (which would have been known by Xnet). Did Xnet give you a fault number? From there you can request half hour or hourly updates on fault work in progress, you can request escalation if initial fault repair work fails, you can request a site visit, etc - basically as a corporate or wholesale customer there is a bunch of stuff you can ask for - but you do have to ask for it.

Sounds like a bit of miscommunication may have occurred. As for scheduling exchange or network upgrades without telling customers - oh yeah! Definitely been on the end of one of those! Even saw an incident once where one part of Telecom did an upgrade without telling the rest of Telecom!

Comment by cokemaster, on 12-Dec-2007 07:23

Perhaps the wifi hotspot was busy?... I know that in the past that I've checked my mail and did some heavy downloading (even used bittorrent with mixed success) from the hotspots and it was working fine (I use them on a daily basis).

It's also a little bit strange that a tech wasn't booked earlier. I've had issues when I first got connected and had a tech booked out the next day (booked there and then over the phone).

Also the wholesale providers do have an 0800 number from memory as some enterprising people have posted it on geekzone before...

Comment by maverick, on 12-Dec-2007 07:47

I agree with some of your points, and having a backup plan is always good for customers that run a business through their residential link.

We do hate for customers to be offline and we do accept the responsibility of getting the customer back online. Even though Telecom may have caused the issue, we are responsible to the customer for getting it resolved.

The issue stemmed from the upgrade that Telecom are doing on all exchanges, basically converting everyone to the new ISAM devices. These were all on our Network Status Page btw and had been there for a while, so Telecom did inform us that there would be ongoing work there and we did post it:

Network Status Report....

This area details any issues that are currently present on the network:

Type: scheduled

Subject: Telecom Outage - Glenfield

Begin: 2007-11-26 07:00:00

End: 2007-12-14 18:00:00


Telecom have advised that Xnet HSI customers on the Glenfield exchange may experience a loss of service for up to 10 minutes per customer between 07:00 26/11 - 18:00 03/12. This outage is required as part of the ASAM to ISAM migration project. Ref#70974.


Update - This has been extended until 14/12/07

Last Updated: 2007-12-07 15:16:00

Bottom line is though you were off for several days, which is not acceptable when it was probably a simple fault for Telecom to rectify. I do believe that we did try and several attempts were made to push it along, but there were delays in Telecom getting the contractor on site.

Here's the thread for the issue btw

Comment by freitasm, on 12-Dec-2007 08:26

I use a cable modem connection, and while it's been quite good in the last ten years I always keep thinking that I would lose connection if the cable modem were to stop working and I had to wait five days for a new one to be delivered.

I do have alternate access through cellular data cards (I have four cards here) but it would impact at least my mail server.

There's always the thought of actually having a DSL line in standby just in case. Perhaps on a very low plan (1 GB) being paid but only used if an emergency arises.

Author's note by foobar, on 12-Dec-2007 08:32

@maverick: I really appreciate how you jumped in and tried to get it escalated and resolved. Especially offering me the use of the WiFi in your board-room on Saturday was a pretty classy thing, I have to say, even though I couldn't take you up on the offer at that time.

You pointed out that the planned outage was a known thing and was listed on your fault page. The first CSR that I spoke to, though, didn't seem to be aware of that. He was the one who told me that normally it is when customers call in that you find out for the first time that such an upgrade occurs. He also told me that Telecom normally doesn't give advance notice of this kind. He then went and looked at something in the system and told me about the equipment upgrade in the exchange. Maybe he looked at your 'scheduled outages' page at that point?

A small disconnect, maybe.

One thing I noticed: The CSRs I spoke to never gave me the ticket number. I didn't know that this would be potentially useful, so I didn't ask for it. They certainly didn't volunteer it either. I also never needed to give it to them in subsequent calls, since my user-name was sufficient to get them to the file again. But what would the ticket number have given me? Would that have allowed me to call Telecom directly? What can I do with that ticket number once I have it? As a previous commentor to this blog post said (KiwiOverseas66) you can then request half-hourly updates on the status? With whom? Telecom?

Overall, though, I completely understand that the issue is Telecom's, and wasn't Xnet's fault as such. I appreciate that you claim the responsibility for following up, though. And I also understand that you were making calls and followed up on this.

Please note: What I wrote down in the post is not necessarily a reflection of what you did, but of my impressions of the whole process. This is how it was communicated and how it came across. For example, the issue with "Why didn't they call Telecom first thing in the morning, why did we have to call them to remind them to do so..." I wrote that because that is how it appeared to us, and nothing to the contrary was said to us. It's quite possible that you actually did do all these things. Maybe the communication can be improved through some unsolicited status updates on your part? I know that costs time and money, but for someone being offline, a call with a status update would feel wonderful.

I continue to be a happy Xnet customer, and would have no problem recommending your service. I believe you guys really tried to help. But even a good thing can still be improved.

Comment by PIERCD, on 12-Dec-2007 08:35

This happened to us a couple of weeks ago, when Telecom started spontaneously working on the Albany exchange. My office was without net from 10am till 4pm on the last day of the month, which led to huge issues for us. Our non-Telecom ISP was also in the dark about this outage, and also could only file a report with Telecom. We work with a few large datacentres, and they would never dream of an outage in the middle of a business day, especially for this length of time.

Upgrades need to happen, but not during peak business hours, that's just not professional.

Author's note by foobar, on 12-Dec-2007 08:37

@freitasm: Yes, absolutely. It was a dumb move on my part that I hadn't thought about an alternative means of access sooner. I heard good things about Telstra cable, but I don't think that's available where I live. So, no matter what alternative I choose, in the end I will depend on Telecom infrastructure at some point, which is rather scary considering the amount of ineptness they have demonstrated.

I do want to stress that the Telecom technician that showed up in our house and eventually went to the exchange to fix the issue was really nice and helpful. It's not his fault that he was sent out to us so late.

Anyway, it seems that dial-up is probably the only alternative for me that is technically sufficiently 'different' to be useful. Even though that is Telecom as well. Sigh...

Author's note by foobar, on 12-Dec-2007 08:43

@PIERCD: I fully agree! I was wondering why on earth they would schedule something like this in the middle of a work day. I figured they think that residential customers aren't all that important, and that it's no big deal to disconnect them for a couple of hours in the middle of the day.

But of course, if a business has an ADSL connection then they will be just as affected.

Upgrades like this should happen at 1 am Sunday morning, or some such thing. Even this would impact some who need the Internet at that time, but certainly fewer people than in the middle of a work-day.

Also, don't do major upgrades like this just before the weekend, and then take the weekend off, with issues left unresolved.

You are right, I think this was unprofessional on the part of Telecom.

Comment by maverick, on 12-Dec-2007 08:54

No, totally understand, and thanks for the comments. Didn't mean to hijack your blog, just wanted to add to it. We are pretty disappointed we couldn't get you online any faster (I take things like this personally). Yes, we are always looking to improve, and comments like this force us to take a look at our processes. I agree with you that we need to look at how the customer saw the issues and the resolution process, and in fact this is very important to us: we have to look at it from the customer's perspective, as they are the person being affected and we are here to support them. If holes were shown up then we can look to address them. We did some things right but could have been better at others.

Some things we will try and take out of this are:

Make sure the customer knows he has a dial-up account in these kinds of issues, i.e. loss of broadband (this would have helped you)

Ensure we give the customer the ticket number and he is aware of it

Make sure our teams are current with network outages and upgrades and how these will affect the customers

Comment by KiwiOverseas66, on 12-Dec-2007 17:37

"What can I do with that ticket number once I have it? As a previous commentor to this blog post said (KiwiOverseas66) you can then request half-hourly updates on the status? With whom? Telecom?"

Sorry for causing any confusion here - the telecom ticket number would be for Xnet to follow up with Telecom - the Xnet ticket number would be for you when dealing with a new Xnet CSR (and hopefully not having to explain the fault all over again, etc). The reason I mentioned the telecom ticket number is that this is how Xnet will be dealing with them - and they should be using that ticket number to beat them over the head and get a response until it's resolved.

Sorry for taking up space on your blog like this - and I know maverick and the team will be on to it - but chances are that once in a while you will strike a CSR who, as you say, isn't aware of planned outages, hasn't read the past history of the fault, etc. Also - as maverick pointed out - they were aware of the work being done by Telecom yet the CSR you spoke to said they weren't - and Telecom does this all the time??? Sounds a little bit like a reflex response - but again I have no doubt that Xnet will be on to it!

