Geekzone: technology news, blogs, forums
BarTender
3598 posts

Uber Geek

ID Verified
Trusted
Lifetime subscriber

  #182477 6-Dec-2008 14:10

IMHO I completely understand how things like this happen. You build a datacenter to hold, say, 20 full-height racks. You fill that up in a few years, and people keep wanting to put more and more servers into it, so you take over another floor or two in the building to keep up with the growth. Then you run out of space again, and need a bigger UPS to keep all the power clean and a bigger generator in case the power fails and you need to keep everything going. Big UPSs, generators and AC set you back a few mil each. Ideally they would all run at 50% utilisation, so that if one or two of them fail or are out of action you still have enough backup supply to keep it all running. But because of budget constraints you can't go for the biggest; you can only go for what will just keep it all going, and all of them run at 90%. Then the infrastructure manager for the centre says "you can't put any more in before you take some out". Then project manager "A", who needs to get their project over the line, complains up the food chain to senior mgmt, saying this project is vital and there isn't anywhere else to house these servers. The senior manager comes down to the infrastructure manager saying "make it so". The infrastructure manager says there is a risk and you need to accept it, Mr Senior Manager, and the senior mgr says "I accept the risk".
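The 50% versus 90% point can be checked with a little arithmetic. This is only a sketch, assuming two identical UPS units of 500 kW each; the figures and function names are hypothetical and are not Telecom's.

```python
def utilisation(unit_capacity_kw: float, units: int, load_kw: float) -> float:
    """Fraction of the total installed capacity that the load consumes."""
    return load_kw / (unit_capacity_kw * units)


def survives_failure(unit_capacity_kw: float, units: int, load_kw: float,
                     failed: int = 1) -> bool:
    """True if the units still running can carry the whole load."""
    remaining = units - failed
    return remaining > 0 and load_kw <= remaining * unit_capacity_kw


# Hypothetical plant: two 500 kW units (assumed figures, for illustration only).
UNIT_KW = 500.0
UNITS = 2

for load_kw in (500.0, 900.0):
    pct = utilisation(UNIT_KW, UNITS, load_kw) * 100
    ok = survives_failure(UNIT_KW, UNITS, load_kw)
    print(f"load {load_kw:.0f} kW: {pct:.0f}% utilised, survives one failure: {ok}")
```

At 50% utilisation the surviving unit can carry everything; at 90% it cannot, which is the risk the senior manager is signing off on.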

Instead of just one project, have 200 projects and the problem just becomes unmanageable. All you need is someone to do something silly, say put a stack of new SAN disk in and power it all on at the same time without checking the power consumption or the rating of the circuit it is on, blow the circuit because it's overloaded, and kiss goodbye to it all.

Now I am not saying this is what happened at Telecom, but it is what happened to me at a govt department 5+ years ago, where there was an in-house managed datacenter, whoever screamed in the right ear won, and the datacenter manager didn't have complete right of veto over any new hardware. If it were managed by a third party, they know it would be their ... if it all blew up, so they wouldn't let things like that happen. Not to say that managed datacenters don't cost you 2-3X more than doing it yourself, the bigger the center gets.

It's even better in this day and age with Blades, where you have a 7U blade enclosure which sucks as much power and pumps out as much heat as a fully populated rack in "the old days" holding 14x3U servers.  You can squeeze 3 enclosures into a single rack with 14 blades per enclosure and you figure it out.....
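Working the blade numbers through, as a rough sketch: the only assumption beyond the post is that "as much power as a full rack" is taken as exactly equal, and the variable names are made up for illustration.

```python
# Figures taken from the post above.
OLD_SERVERS_PER_RACK = 14
OLD_SERVER_HEIGHT_U = 3
ENCLOSURE_HEIGHT_U = 7
ENCLOSURES_PER_RACK = 3
BLADES_PER_ENCLOSURE = 14

# Rack space used in each layout.
old_rack_u = OLD_SERVERS_PER_RACK * OLD_SERVER_HEIGHT_U    # 14 x 3U
blade_rack_u = ENCLOSURES_PER_RACK * ENCLOSURE_HEIGHT_U    # 3 x 7U

# Servers per rack in the blade layout.
blades_per_rack = ENCLOSURES_PER_RACK * BLADES_PER_ENCLOSURE

# Assumption: one enclosure draws exactly what one old full rack drew,
# so a blade rack draws that many old racks' worth of power and heat.
power_multiple = ENCLOSURES_PER_RACK

print(f"old rack: {OLD_SERVERS_PER_RACK} servers in {old_rack_u}U")
print(f"blade rack: {blades_per_rack} blades in {blade_rack_u}U")
print(f"power and heat per rack: about {power_multiple}x an old full rack")
```

So one rack position ends up with three times the servers and roughly three times the power and cooling demand, in half the rack units.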

I would really love to know what actually happened, but I doubt anyone other than those in the know (who aren't talking) will ever know the full story.

FredDag
206 posts

Master Geek
Inactive user


  #182496 6-Dec-2008 17:23

BarTender  

Yes... The way to manage this is to cost per AU and per Watt, and cover for any other knowns and... a few unknowns in current and future costs, i.e. add some fat.
Then when asked, as the Data Centre Manager, to host a rack of Cognos servers or... RADIUS whatevers, you give them a price. If it's inter-business you hope there is something like an interlock between divisions/departments, or you get paid. So..... by the time the Data Centre is near full you have enough $$ for expansion *OR* you have data showing which departments within the company have chewed through all the resource, where requests for budget increases have been asked, what risks are associated with that budget level, and how the risk increases at varying levels of funding below that.
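A costing model like this could be sketched as below. Everything here is an assumption for illustration: the rates, the 20% "fat", the function names, and treating the post's "AU" as a unit of rack space.

```python
from collections import defaultdict


def hosting_price(rack_units: int, watts: float,
                  rate_per_unit: float, rate_per_watt: float,
                  fat: float = 0.2) -> float:
    """Monthly price: space charge plus power charge, plus a contingency margin."""
    base = rack_units * rate_per_unit + watts * rate_per_watt
    return base * (1 + fat)


def watts_by_department(requests):
    """Total requested watts per department, from (dept, rack_units, watts) tuples."""
    totals = defaultdict(float)
    for dept, _rack_units, watts in requests:
        totals[dept] += watts
    return dict(totals)


# Hypothetical hosting requests and rates.
requests = [
    ("finance", 10, 2000.0),
    ("billing", 4, 500.0),
    ("finance", 2, 300.0),
]
for dept, rack_units, watts in requests:
    price = hosting_price(rack_units, watts, rate_per_unit=50.0, rate_per_watt=0.25)
    print(f"{dept}: {rack_units} units, {watts:.0f} W, ${price:.2f}/month")

print(watts_by_department(requests))
```

The per-department totals are the "data showing which departments have chewed through all the resource" that gets put in front of whoever approves the budget.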

If the outage is due to not enough infrastructure, I do sooo hope for the Data Centre Manager that they have done the above, else it will be the end of their job. If it is a technical blow-out, then it will fall on the DC architect if it is a failure in the design. Else if it is a techie's screw-up, then it will fall on them and their manager(s), and possibly on the change control manager.

If it was an act of God............. bugger.

When will Anon deliver the goodies on the outage?

Fred



NZFINEST
202 posts

Master Geek

Trusted

  #182534 6-Dec-2008 22:06

I work in MDR and there is major work that has been going on in there over the last month. From what I've heard, yes, the power was cut somehow (main power supply), and some switch that controls the backup power failed... They are taking this very seriously.
In most cases when MDR loses power from the main power source, batteries kick in straight away, and after a few minutes, if the main power supply isn't back, 3 massive diesel generators start up, each one the size of a small shipping container.
Back in the late 90's, when there was that big power outage in Auckland, those generators ran 24/7, supplying the exchange and a few streets around it with power.
I think this was a freak one-off, and there was no way to see this coming. I'm sure extra measures are being taken so something like that won't happen again.
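The normal failover sequence described in this post (mains, then batteries, then generators after a few minutes) can be sketched as a simple selection function. The 180 second generator delay and the names are assumptions for illustration; the post only says "a few minutes".

```python
def active_source(mains_ok: bool, seconds_since_loss: float,
                  generator_delay_s: float = 180.0) -> str:
    """Which supply carries the load, given the state of the mains feed."""
    if mains_ok:
        return "mains"
    if seconds_since_loss < generator_delay_s:
        # Batteries bridge the gap until the generators are started.
        return "battery"
    return "generator"


for mains_ok, elapsed in ((True, 0.0), (False, 10.0), (False, 600.0)):
    print(mains_ok, elapsed, active_source(mains_ok, elapsed))
```

This models only the happy path; it does not model a fault in the switch that hands over to backup power, which is what reportedly failed here.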



FredDag
206 posts

Master Geek
Inactive user


  #182552 7-Dec-2008 07:06

Thanks for that.

Fred

BarTender
3598 posts

Uber Geek

ID Verified
Trusted
Lifetime subscriber

  #182554 7-Dec-2008 07:57

NZFINEST: I think this was a freak one-off, and there was no way to see this coming. I'm sure extra measures are being taken so something like that won't happen again.

So what about 4 November 2007?? And 14 May 2006?? And lastly 14 January 2005?? All were because of significant power loss in MDR from what I heard... Shall we see if November 2009 is a better month?

w2krules
488 posts

Ultimate Geek


  #182559 7-Dec-2008 10:26

Knowing what I do about how Telecom has been run over the last decade, it's not hard to figure out why stuff like this happens.  Undoubtedly this has ended up on Paul Reynolds' desk, and it will be interesting to see if it happens again...




I was a geek before the word was invented!

insane
3223 posts

Uber Geek

ID Verified
Trusted

  #182591 7-Dec-2008 15:33


@ Bartender,

Much scoping of projects is done before servers are simply loaded into racks. Even when new servers are added or old ones upgraded, projects are re-scoped to make sure that power figures are gathered, and the result has to fit inside the power envelope and cooling capabilities of the DC. Not many places in NZ can handle multiple blade servers per rack without hot or cold aisle containment, so I doubt that such a relatively simple mistake was not guarded against. Heads would roll if such things were overlooked, and in a large organisation I'd expect there to be enough red tape that intentionally overloading the system to meet a deadline would be dealt with swiftly.
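The kind of check a re-scoping step performs might look like the sketch below. The class, the figures and the headroom parameter are hypothetical, and it assumes the simplification that all electrical power drawn by the equipment ends up as heat the cooling plant must remove.

```python
from dataclasses import dataclass


@dataclass
class Envelope:
    power_kw: float    # electrical capacity available to the room
    cooling_kw: float  # heat the cooling plant can remove


def fits(envelope: Envelope, installed_kw: float, new_kw: float,
         headroom: float = 0.0) -> bool:
    """True if the projected load stays inside both power and cooling limits.

    headroom is the fraction of each limit to keep in reserve.
    """
    projected_kw = installed_kw + new_kw
    usable = 1.0 - headroom
    return (projected_kw <= envelope.power_kw * usable
            and projected_kw <= envelope.cooling_kw * usable)


# Hypothetical room: 100 kW of power, 90 kW of cooling, 80 kW already installed.
room = Envelope(power_kw=100.0, cooling_kw=90.0)
print(fits(room, installed_kw=80.0, new_kw=5.0))   # within both limits
print(fits(room, installed_kw=80.0, new_kw=15.0))  # exceeds the cooling limit
```

Note that the second request passes the power check and fails only on cooling, which is why both figures need to be gathered.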

You would have to ask, though, why the power systems feeding the whole DC at MDR are not segregated and provided by different physical power feeds.....



MattD
663 posts

Ultimate Geek
Inactive user


  #182906 9-Dec-2008 09:00

Gotta love the status on telecom.co.nz/help/servicealerts
No known issues, yet many South Island exchanges have been down since 5am.
At least the page loads! (thanks to Vodafone 3G for backup connection)

TELECOM SERVICE STATUS

There are no known issues with our services at this time
  • Green
  • Green
  • Green
  • No Alerts
  • No Alerts
  • No Alerts

kingjj
1728 posts

Uber Geek

ID Verified
Trusted

  #182961 9-Dec-2008 12:17

Ah, so there was an issue in the South Island this morning? I thought it was just my connection that was down; a colleague's Telecom connection was working. Was it a RADIUS issue? I could get a DSL connection, just couldn't authenticate.
MattD: Gotta love the status on telecom.co.nz/help/servicealerts
No known issues, yet many South Island exchanges have been down since 5am.
At least the page loads! (thanks to Vodafone 3G for backup connection)

TELECOM SERVICE STATUS

There are no known issues with our services at this time
  • Green
  • Green
  • Green
  • No Alerts
  • No Alerts
  • No Alerts






FredDag
206 posts

Master Geek
Inactive user


  #182963 9-Dec-2008 12:26

Snap, just had a 30 minute outage, back online around 12:15.

Fred

insane
3223 posts

Uber Geek

ID Verified
Trusted

  #183094 9-Dec-2008 18:35

kingjj: Ah, so there was an issue in the South Island this morning? I thought it was just my connection that was down; a colleague's Telecom connection was working. Was it a RADIUS issue? I could get a DSL connection, just couldn't authenticate.


Not quite, problem at Riccarton RAN 21, which feeds all the local exchanges. In other news, the Telecom outage that this thread is about was in fact caused by overloading at MDR... It started when they plugged some new piece of equipment in, and it rolled on from there. Guess they would have been running on the edge for a while then...

Zippity
683 posts

Ultimate Geek


  #183096 9-Dec-2008 18:38

So the BS from Telecom just keeps on keeping on, and the majority of you fools who use/pay for their crummy service accept it :(

scottjpalmer
5971 posts

Uber Geek

Moderator
ID Verified
Trusted
Lifetime subscriber

  #183101 9-Dec-2008 19:06

Alright, we have seen some on-topic reasons. Now we are going OT.

If any more comes to light let a mod know and we can unlock but otherwise this is locked before it deteriorates further.
