Geekzone: technology news, blogs, forums
irpegg
  #3136100 28-Sep-2023 19:47

Any competent engineer can copy and paste a previous outage notice and change 3 words and a date, or have a template ready.  It's a 60-second job, tops, and shows you actually have empathy for your customers and frontline staff.

I've worked with a lot of people who blow a fuse when asked for an update because they can't handle pressure or multitask properly.  It's just a status update, not a full-blown incident report.
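
To illustrate the "have a template ready" point, here is a minimal sketch using Python's standard `string.Template`. The wording of the notice, the field names and the `render_status` helper are all hypothetical and not taken from any provider's actual process.

```python
from string import Template

# Hypothetical notice wording; a real one would follow the provider's own style.
STATUS_TEMPLATE = Template(
    "[$date] We are aware of an issue affecting $service in $area. "
    "Engineers are investigating. Next update by $next_update."
)


def render_status(date, service, area, next_update):
    """Fill in the handful of fields that change between outages."""
    return STATUS_TEMPLATE.substitute(
        date=date, service=service, area=area, next_update=next_update
    )
```

Calling `render_status("28-Sep-2023 19:30", "fibre broadband", "an example region", "20:00")` returns a one-line notice with only those four values changed, which is roughly the "change 3 words and a date" job described above.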




Linux
  #3136102 28-Sep-2023 19:52

irpegg: Any competent engineer can copy paste a previous outage and change 3 words and a date, or have a template ready.  It's a 60 second job tops and shows you actually have empathy for your customer and front line staff.

Worked with a lot of people who blow a fuse when asked for an update because they can't handle pressure or handle multitasking properly.  It's just a status update, not a full blown incident report.

@irpegg I doubt they would have access to edit the website to start with. It has zero to do with how easy it is to change!


SomeoneSomewhere
  #3136105 28-Sep-2023 19:56

That's why you have a dedicated status page that the engineers can post updates to at the click of a button.



Linux
  #3136108 28-Sep-2023 20:07

SomeoneSomewhere: That's why you have a dedicated status page that the engineers can post updates to at the click of a button.

 

@SomeoneSomewhere: It does not work like that. The service owner should be writing the comms to be posted, not the engineers working on the actual outage.

The engineers should be the people on the tools, working out how big the outage is, what the impact on customers is and how long it will take to restore.


RunningMan
  #3136112 28-Sep-2023 20:16

It doesn't matter who posts what on the website - it's just a matter of having a simple process that allows for that to easily happen. The engineer will know pretty quickly if it's a big problem, and if that's the case then the process can be activated: either by the engineer, if they have the authority in a small company, or by escalating to the appropriate person in a larger company.


michaelmurfy
  #3136137 28-Sep-2023 21:42

I do agree with all of you but I want you to see this from another point of view too.

Biggest bank in NZ. Engineers don’t have access to the bank’s Facebook / Twitter etc. to post in the first place, nor would the 1000s of engineers ever get access. Sure, there could be a dedicated team with on-call people (who talk to the incident managers), but the majority of incidents could be customer-impacting without anybody ever knowing they happened in the first place. There are almost daily incidents going on that customers will never know about (e.g. a microservice going down, cluster members going down causing degraded performance, network issues, loss of communication to downstream systems or even downstream system outages). If the call was to post about every single customer-impacting outage then there would be daily posts. This just would never fly, and so often the call is to keep the vast majority of incidents quiet.

Yes, that is an example from another point of view. I totally understand Quic are smaller and could likely post something quickly too, but I’ll tell you now: even if the process was there to do the post in the first place, this is never an engineer’s first step. The first step is always triage, and sometimes during the triage stage you’ll actually fix it (seriously, that would happen 95% of the time with me). No point posting until you actually know a wider issue exists in the first place. I’m sure they’ll likely fire up alerting at a later date, or something along those lines, to do it when monitoring notices anything.




Michael Murphy | https://murfy.nz
Opinions are my own and not the views of my employer.


 
 
 

tccki
  #3136178 29-Sep-2023 01:43

I moved from a frankly more reliable residential ISP. I’m set up to poll 24/7 whether I’m able to transact with LAN & WAN elements, and for one issue that I checked when it affected me, the phpweathermap on their website showed there was contention out to Chorus NI.

I’ve read every Quic thread and have seen them speaking to the issues. I had a read of every page on their website when I signed up as well. My takeaway is that Quic appears to be suited, at least right now, as an ISP for folks who, if they care to know why, will have the capacity to determine at which end a fault lies. My recommendation is that, if your job depends on it, maybe don’t have them as your only connectivity option right now. But Quic comes across as a small team with excellent kit and I feel they know what they’re doing.

If it’s an “everything is wiped, we’re travelling to the racks to sort” super-long outage, I do expect them to communicate. But knowing whether a drop is contention, auth DB, BNG or v4/v6 is mostly just interesting. An incident team would be expensive and typically still not provide an accurate ETA until the final leg. The purpose of that communication is to minimise tickets, which I don’t submit if I believe it’s getting dealt with.

I’m aiming to pay for transparent metrics and performance even if it means bumps along the way. My preferences are a bit of a zigzag but my understanding is that homelab folks generally prioritise in such a weird fashion.
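
For readers wondering what polling LAN and WAN elements to work out which end a fault lies on might look like, here is a rough sketch. It is not the poster's actual setup: the helper names and any target addresses are assumptions, and a real monitor would poll on a schedule and log the results.

```python
import socket


def tcp_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def locate_fault(lan_ok, wan_ok):
    """Classify one poll result (hypothetical labels)."""
    if not lan_ok:
        return "local"      # cannot even reach the home router
    if not wan_ok:
        return "upstream"   # LAN is fine, so the problem is beyond it
    return "ok"
```

A poll would then be something like `locate_fault(tcp_reachable(router_ip, 80), tcp_reachable(public_host, 443))`, where `router_ip` and `public_host` are whatever LAN and WAN targets you choose to trust.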





RunningMan
  #3136598 29-Sep-2023 15:22

michaelmurfy: [snip] If the call was to post about every single customer impacting outage then there will be daily posts. This just would never fly and so often the call is to keep the vast majority of incidents quiet.

 

RunningMan: [snip] The guidelines for what is significant and what impact it is having can be drawn up in advance, but generically it would be reasonable for total loss of service or loss of a key part of the service (e.g. DNS) for a suburb or bigger area to reach this threshold.

 

Agree @michaelmurfy, totally not needed for the bulk of incidents, just where there is significant customer impact - basically, if it's going to result in a bucketload of incoming queries to tier 1 support, then pre-empt them with a notice that the outage is already known about.
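
A threshold like the one described above could be written down in advance as a simple rule. The sketch below is only one way to express it: the area categories, the `Incident` fields and the `should_post_notice` name are made up for illustration, and DNS is listed as a key service only because it is the example given in the post.

```python
from dataclasses import dataclass

# Assumption: DNS is the only "key part of the service" named in the thread.
KEY_SERVICES = {"dns"}

# Hypothetical scale for how widely an incident is felt.
AREA_RANK = {"single_customer": 0, "street": 1, "suburb": 2, "city": 3, "national": 4}


@dataclass
class Incident:
    area: str                               # one of the AREA_RANK keys
    total_loss: bool                        # True if service is completely down
    services_down: frozenset = frozenset()  # e.g. frozenset({"dns"})


def should_post_notice(incident, min_area="suburb"):
    """Post only for total loss, or loss of a key service, over a wide enough area."""
    wide_enough = AREA_RANK[incident.area] >= AREA_RANK[min_area]
    significant = incident.total_loss or bool(KEY_SERVICES & set(incident.services_down))
    return wide_enough and significant
```

With this rule, a single microservice blip affecting one customer stays quiet, while a suburb-wide DNS failure triggers a notice before the tier 1 queue fills up.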

