Geekzone: technology news, blogs, forums
BarTender
3608 posts

Uber Geek

ID Verified
Trusted
Lifetime subscriber

  #182477 6-Dec-2008 14:10

IMHO I completely understand how things like this happen. You build a datacenter to hold, say, 20 full-height racks. You fill that up in a few years, and people keep wanting to put more and more servers into it. You take over another floor or two in the building to keep up with the growth, then run out of space again. You need a bigger UPS to keep all the power clean, and a bigger generator in case the power fails and you need to keep everything going. Big UPSs, generators and AC set you back a few mil each. What you want is for them all to run at 50% utilisation, so that if one or two of them fail or are out of action you still have enough backup supply to keep it all running. But because of budget constraints you can't go for the biggest; you can only go for what will just keep it all going, and they all run at 90%. Then the infrastructure manager for the centre says "you can't put any more in before you take some out". Then project manager "A", who needs to get their project over the line, complains up the food chain to senior mgmt, saying this project is vital and there isn't anywhere else we can house these servers. The senior manager comes down to the infrastructure manager and says "make it so". The infrastructure manager says there is a risk and you need to accept it, Mr senior manager, and the senior mgr says "I accept the risk".
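
To put rough numbers on the 50% versus 90% point, here is a minimal sketch. The unit sizes and loads are made up for illustration and are not figures from any real site; the function names are hypothetical too.

```python
def utilisation(unit_capacity_kw, units, load_kw):
    """Fraction of total installed capacity that the load is using."""
    return load_kw / (unit_capacity_kw * units)


def survives_failures(unit_capacity_kw, units, load_kw, failed=1):
    """Return True if the units still running can carry the whole load."""
    remaining = units - failed
    return remaining > 0 and remaining * unit_capacity_kw >= load_kw


# Hypothetical plant: two 500 kW UPS units sharing the load.
print(utilisation(500, 2, 500), survives_failures(500, 2, 500))  # 50% loaded
print(utilisation(500, 2, 900), survives_failures(500, 2, 900))  # 90% loaded
```

With two units at 50% the survivor can carry everything when the other one drops out. At 90% it cannot, which is the risk the senior manager is signing off on.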

Instead of just one project, have 200 projects and the problem just becomes unmanageable. All it takes is someone doing something silly, say putting a stack of new SAN disk in and powering it all on at the same time without checking the power consumption or the current rating of the circuit it is on. The overloaded circuit blows and you kiss goodbye to it all.
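
The check that gets skipped in that scenario is simple arithmetic. A rough sketch follows, assuming a circuit should only be loaded to about 80% of its breaker rating (a common rule of thumb, not something stated in this thread) and using invented amp figures. It looks at steady-state draw only; start-up inrush is higher and is not modelled.

```python
def circuit_headroom_amps(breaker_rating_a, existing_load_a, derate=0.8):
    """Amps still available while staying under the derated breaker limit."""
    return breaker_rating_a * derate - existing_load_a


def can_power_on(new_devices_a, breaker_rating_a, existing_load_a, derate=0.8):
    """True if all the new devices together fit in the remaining headroom."""
    headroom = circuit_headroom_amps(breaker_rating_a, existing_load_a, derate)
    return sum(new_devices_a) <= headroom


# Hypothetical 32 A circuit already carrying 18 A, new disk shelves at 2.5 A each.
six_shelves = [2.5] * 6
three_shelves = [2.5] * 3
print(can_power_on(six_shelves, 32, 18))
print(can_power_on(three_shelves, 32, 18))
```

Six shelves need 15 A against roughly 7.6 A of headroom, so they do not fit; three shelves at 7.5 A just do.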

Now I am not saying this is what happened at Telecom, but it is what happened to me at a govt department 5+ years ago, with an in-house managed datacenter where whoever screams in the right ear wins and the datacenter manager doesn't have a complete right of veto over any new hardware. If it were managed by a third party, they would know it would be their ... if it all blew up, so they wouldn't let things like that happen. That said, managed datacenters do cost you 2-3X more than doing it yourself the bigger the center gets.

It's even better in this day and age with blades, where a 7U blade enclosure sucks as much power and pumps out as much heat as a fully populated rack did in "the old days" holding 14x3U servers. You can squeeze 3 enclosures into a single rack with 14 blades per enclosure, and you figure it out...
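
Figuring it out, as a quick sketch: the 42U rack height is an assumption inferred from 14x3U, and "one enclosure draws about as much as one old full rack" is taken straight from the paragraph above rather than from any measured figure.

```python
def blade_rack_summary(rack_units=42, old_server_u=3, enclosure_u=7,
                       enclosures_per_rack=3, blades_per_enclosure=14):
    """Compare an old rack of 3U servers with a rack of blade enclosures."""
    return {
        # 42U / 3U per server = 14 servers in an old-style full rack
        "old_servers_per_rack": rack_units // old_server_u,
        # 3 enclosures x 14 blades each
        "blades_per_rack": enclosures_per_rack * blades_per_enclosure,
        # rack space the enclosures actually occupy
        "rack_units_used": enclosures_per_rack * enclosure_u,
        # each enclosure is taken to draw about one old full rack's power
        "power_vs_old_rack": enclosures_per_rack,
    }


print(blade_rack_summary())
```

So one rack goes from 14 servers to 42 blades in half the rack space, while drawing and dumping roughly three old racks' worth of power and heat.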

I would really love to know what actually happened, but I doubt anyone other than those in the know (who aren't talking) will ever know the full story.



FredDag
206 posts

Master Geek
Inactive user


  #182496 6-Dec-2008 17:23

BarTender  

Yes... The way to manage this is to cost per AU and per watt, and cover for any other knowns and... a few unknowns in current and future costs, i.e. add some fat.
Then when asked, as the Data Centre Manager, for the hosting of a rack of Cognos servers or... RADIUS whatevers, you give them a price. If it's inter-business you hope there is something like an interlock between divisions/departments, or you get paid. So... by the time the Data Centre is near full you have enough $$ for expansion *OR* you have data showing which departments within the company have chewed through all the resource, where requests for budget increases have been made, what risks are associated with that budget level, and how the risk increases at varying levels of funding below it.
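
A minimal sketch of that costing approach, under loud assumptions: the rates, the size of the "fat" margin, the department names and the function names are all invented for illustration, and "space unit" simply stands in for whatever unit of rack space you charge by.

```python
from collections import defaultdict


def hosting_charge(space_units, watts, rate_per_unit, rate_per_watt, fat=0.15):
    """Price a hosting request by space and power, plus a margin for unknowns."""
    base = space_units * rate_per_unit + watts * rate_per_watt
    return base * (1 + fat)


def usage_by_department(requests):
    """Total up space and power per department from (dept, units, watts) rows."""
    totals = defaultdict(lambda: {"space_units": 0, "watts": 0})
    for dept, units, watts in requests:
        totals[dept]["space_units"] += units
        totals[dept]["watts"] += watts
    return dict(totals)


# Hypothetical request: one full rack drawing 6 kW, with 25% fat added.
print(hosting_charge(42, 6000, rate_per_unit=100.0, rate_per_watt=2.0, fat=0.25))

# Hypothetical history showing who has chewed through the resource.
print(usage_by_department([
    ("Finance", 42, 6000),
    ("Billing", 21, 9000),
    ("Finance", 10, 1500),
]))
```

The first function gives the price you quote; the second is the record you want in hand when the centre is nearly full and someone asks where the capacity went.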

If the outage is due to not enough infrastructure, I do sooo hope for the Data Centre Manager that they have done the above, else it will be the end of their job. If it is a technical blowout, it will fall on the DC architect if it is a failure in the design. Else if it is a techie's screw-up, it will fall on them and their manager(s), and possibly on the change control manager.

If it was an act of God............. bugger.

When will Anon deliver the goodies on the outage?

Fred



NZFINEST
202 posts

Master Geek

Trusted

  #182534 6-Dec-2008 22:06

I work in MDR and there is major work that has been going on in there over the last month. From what I've heard, yes, the power was cut somehow (the main power supply), and some switch that controls the backup power failed... They are taking this very seriously.
In most cases when MDR loses power from the main power source, batteries kick in straight away, and after a few minutes, if the main power supply isn't back, 3 massive diesel generators start up, each one the size of a small shipping container.
Back in the late 90s, when there was that big power outage in Auckland, those generators ran 24/7, supplying the exchange and a few streets around it with power.
I think this was a freak one-off, and there was nothing to suggest this would happen. I'm sure extra measures are being taken so something like that won't happen again.



FredDag
206 posts

Master Geek
Inactive user


  #182552 7-Dec-2008 07:06

Thanks for that.

Fred

BarTender
3608 posts

Uber Geek

ID Verified
Trusted
Lifetime subscriber

  #182554 7-Dec-2008 07:57

NZFINEST: I think this was a freak one-off, and there was nothing to suggest this would happen. I'm sure extra measures are being taken so something like that won't happen again.

So what about 4 November 2007? And 14 May 2006? And lastly 14 January 2005? All were because of significant power loss in MDR, from what I heard... Shall we see if November 2009 is a better month?

w2krules
492 posts

Ultimate Geek


  #182559 7-Dec-2008 10:26

Knowing what I do about how Telecom has been run over the last decade, it's not hard to figure out why stuff like this happens.  Undoubtedly this has ended up on Paul Reynolds' desk, and it will be interesting to see if it happens again...




I was a geek before the word was invented!

insane
3242 posts

Uber Geek

ID Verified
Trusted

  #182591 7-Dec-2008 15:33


@ Bartender,

Much scoping of projects is done before servers are simply loaded into racks. Even when new servers are added or old ones upgraded, projects are re-scoped to make sure that power figures are gathered, and it all has to fit inside the power envelope and cooling capabilities of the DC. Not many places in NZ can handle multiple blade servers per rack without hot or cold aisle containment, so I doubt that such a relatively simple mistake was not guarded against. Heads would roll if such things were overlooked, and in a large organisation I'd expect there to be enough red tape that intentionally overloading the system to meet a deadline would be dealt with swiftly.

You would have to ask, though, why the power systems feeding the whole DC at MDR are not segregated and provided by different physical power feeds...

 
 
 

MattD
663 posts

Ultimate Geek
Inactive user


  #182906 9-Dec-2008 09:00

Gotta love the status on telecom.co.nz/help/servicealerts
No known issues, yet many South Island exchanges have been down since 5am.
At least the page loads! (Thanks to Vodafone 3G for the backup connection.)

TELECOM SERVICE STATUS

There are no known issues with our services at this time
  • Green
  • Green
  • Green
  • No Alerts
  • No Alerts
  • No Alerts

kingjj
1728 posts

Uber Geek

ID Verified
Trusted

  #182961 9-Dec-2008 12:17

Ah, so there was an issue in the South Island this morning? I thought it was just my connection that was down; a colleague's Telecom connection was working. Was it a RADIUS issue? I could get a DSL connection, just couldn't authenticate.
MattD: Gotta love the status on telecom.co.nz/help/servicealerts
No known issues, yet many South Island exchanges have been down since 5am.
At least the page loads! (Thanks to Vodafone 3G for the backup connection.)

TELECOM SERVICE STATUS

There are no known issues with our services at this time
  • Green
  • Green
  • Green
  • No Alerts
  • No Alerts
  • No Alerts






FredDag
206 posts

Master Geek
Inactive user


  #182963 9-Dec-2008 12:26

snap just had a 30 minute outage, back online around 12:15

Fred

insane
3242 posts

Uber Geek

ID Verified
Trusted

  #183094 9-Dec-2008 18:35

kingjj: Ah, so there was an issue in the South Island this morning? I thought it was just my connection that was down; a colleague's Telecom connection was working. Was it a RADIUS issue? I could get a DSL connection, just couldn't authenticate.


Not quite, the problem was at Riccarton RAN 21, which feeds all the local exchanges. In other news, the Telecom outage that this thread is about was in fact caused by overloading at MDR... It started when they plugged some new piece of equipment in and it rolled on from there. Guess they would have been running on the edge for a while then...

Zippity
683 posts

Ultimate Geek


  #183096 9-Dec-2008 18:38

So the BS from Telecom just keeps on keeping on, and the majority of you fools who use/pay for their crummy service accept it :(

scottjpalmer
5973 posts

Uber Geek

Moderator
ID Verified
Trusted
Lifetime subscriber

  #183101 9-Dec-2008 19:06

Alright, we have seen some on-topic reasons. Now we are going OT.

If any more comes to light, let a mod know and we can unlock, but otherwise this is locked before it deteriorates further.
