Geekzone: technology news, blogs, forums
timmmay

20859 posts

Uber Geek
+1 received by user: 5350

Trusted
Lifetime subscriber

#208826 1-Mar-2017 09:12

AWS Simple Storage Service (S3) in US-East-1 is effectively down (up-to-date status here). They say "increased error rates", but it's down. US-East-1 is their primary region, where all billing is done, and I suspect there may be the odd service that only runs there. You couldn't even guess at the amount of storage and traffic that goes to S3. It probably stores a significant fraction of the resources on the internet, though not all of them will be in that region.

 

S3 is core to many different AWS services, so in effect the whole region is having big problems. Existing resources keep working, so long as they don't rely on S3.

 

Many, many websites that rely on S3 and haven't been architected for reliability are down. AWS at its core just runs data centers, and data centers fail from time to time, so organisations that need reliability have to account for this when they architect their solutions. AWS provides plenty of ways to do this, including S3 cross-region replication, which could have mitigated these problems had it been built into the system architecture.

 

AWS services and regions go down fairly rarely. It'll be interesting to see how this evolves, and the root cause.
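
To make the cross-region replication point above a bit more concrete, here is a minimal sketch of the kind of rule involved. It only builds the configuration dictionary; the bucket name, account ID and role name are made-up placeholders, and the dictionary shape is my understanding of what boto3's `put_bucket_replication` accepts, so check it against the current documentation before relying on it. Versioning has to be enabled on both the source and destination buckets for replication to work.

```python
def build_replication_config(role_arn: str, destination_bucket: str,
                             rule_id: str = "replicate-everything") -> dict:
    """Build a replication configuration for a source bucket.

    All names passed in here are hypothetical examples, not real resources.
    """
    return {
        # IAM role that S3 assumes to copy objects on your behalf
        "Role": role_arn,
        "Rules": [
            {
                "ID": rule_id,
                "Prefix": "",          # empty prefix: the rule covers every object
                "Status": "Enabled",
                "Destination": {
                    # the destination bucket lives in a different region
                    "Bucket": f"arn:aws:s3:::{destination_bucket}",
                },
            }
        ],
    }


config = build_replication_config(
    role_arn="arn:aws:iam::123456789012:role/example-replication-role",
    destination_bucket="example-assets-us-west-2",
)
```

With boto3 installed and credentials configured, the result would be passed as `ReplicationConfiguration=config` to `put_bucket_replication` on an S3 client, alongside the source bucket name.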



gjm
810 posts

Ultimate Geek
+1 received by user: 122


  #1728130 1-Mar-2017 09:18

Biggest pain for me so far is that Trello is down; we use it a lot. Also imgur... RIP cat gifs :(





Do surveys for Beer money (referral link) - Octopus Group 

 

Link for buying beer (not affiliated, just like beer) - Good George




gcorgnet
1096 posts

Uber Geek
+1 received by user: 273

ID Verified

  #1728150 1-Mar-2017 09:50

Yeah, a day like this you realise how much we have come to rely on services like S3...

 

Trying to use InVision to look at a design I need to implement. Seems like InVision uses S3 as well :-(


Inphinity
2780 posts

Uber Geek
+1 received by user: 1184


  #1728151 1-Mar-2017 09:51

SES is also down in that region, which initially surprised me. I hadn't considered that it would be S3-backed, but presumably it is.




ResponseMediaNZ
518 posts

Ultimate Geek
+1 received by user: 196

ID Verified
Trusted

  #1728157 1-Mar-2017 09:54

Xero is also impacted by the AWS S3 issues...


SumnerBoy
2079 posts

Uber Geek
+1 received by user: 306

ID Verified
Lifetime subscriber

  #1728161 1-Mar-2017 10:03

ResponseMediaNZ: Xero is also impacted by the AWS S3 issues...

And this is why I try to use as few cloud based services as possible for my home automation. The openHAB mantra is "the intranet of things" and makes a lot of sense at times like this!


timmmay

20859 posts

Uber Geek
+1 received by user: 5350

Trusted
Lifetime subscriber

  #1728176 1-Mar-2017 10:18

It's interesting to see how many services are poorly architected. Yes, S3 stores data across multiple AZs (availability zones) to increase durability and availability, but it seems something must be shared between them, or AWS deployed a change to all AZs in the same time period. Cross-region replication is recommended for high reliability.

 

TLDR: many places put all their eggs in one basket.
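
A rough back-of-the-envelope sketch of the eggs-in-one-basket point: if two regions failed independently, the chance of both being down at once would be the product of their individual failure probabilities. The 99.9% figure below is purely illustrative and is not an AWS number, and the independence assumption is exactly what can break when something is shared, as seems to have happened across AZs today.

```python
HOURS_PER_YEAR = 24 * 365


def combined_availability(per_region: float, regions: int) -> float:
    """Probability that at least one of `regions` regions is up,
    assuming every region fails independently of the others."""
    return 1.0 - (1.0 - per_region) ** regions


def expected_downtime_hours(availability: float) -> float:
    """Expected hours per year during which nothing is available."""
    return (1.0 - availability) * HOURS_PER_YEAR


# 0.999 is an illustrative figure only, not a published AWS number
single = combined_availability(0.999, regions=1)
dual = combined_availability(0.999, regions=2)

print(f"one region:  {expected_downtime_hours(single):.2f} hours/year")
print(f"two regions: {expected_downtime_hours(dual) * 3600:.0f} seconds/year")
```

Under those assumptions one region works out to roughly 8.76 hours of downtime a year, and two independent regions to around half a minute. Real figures will be worse than that whenever the regions share a dependency or the failover itself takes time.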

 

It's coming back up now, but may be an hour before it's fully working.

 

--

 

Update at 12:52 PM PST: We are seeing recovery for S3 object retrievals, listing and deletions. We continue to work on recovery for adding new objects to S3 and expect to start seeing improved error rates within the hour.


 
 
 
 

itxtme
2102 posts

Uber Geek
+1 received by user: 557


  #1728179 1-Mar-2017 10:27

SES is still down. The failures I'm seeing via the API are 401 errors reporting too many concurrent connections, which is not true in my case.


maoriboy
1034 posts

Uber Geek
+1 received by user: 562

Trusted

  #1728193 1-Mar-2017 10:45

Xero is down by the looks of things, and so is Zendesk, two things I use a lot at work... Oh well, at least AliExpress is still running :)






timmmay

20859 posts

Uber Geek
+1 received by user: 5350

Trusted
Lifetime subscriber

  #1728204 1-Mar-2017 11:09

Xero is up for me, I just logged in, but they were down. Xero use AWS extensively. Apparently they haven't architected their system to automatically route away from problematic regions. It might be that they don't have a multi-region capability, or that it requires switching over manually.


Behodar
11101 posts

Uber Geek
+1 received by user: 6089

Trusted
Lifetime subscriber

  #1728225 1-Mar-2017 11:57

SumnerBoy: And this is why I try to use as few cloud based services as possible for my home automation.

 

Indeed. Having a single point of failure in a third party's system doesn't seem like the greatest design plan...


timmmay

20859 posts

Uber Geek
+1 received by user: 5350

Trusted
Lifetime subscriber

  #1728233 1-Mar-2017 12:09

Behodar:

 

SumnerBoy: And this is why I try to use as few cloud based services as possible for my home automation.

 

Indeed. Having a single point of failure in a third party's system doesn't seem like the greatest design plan...

 

 

It's not. The great thing about AWS is its availability zones (2-4 data centers quite close together) and regions (very widely separated groupings of availability zones; most countries that have one at all only have a single region). Today a region failed, but a good architecture would account for this and have resources available in other regions. This costs more, but some architectures reduce the cost, for example by running very little in the second region and ramping it up, either manually or automatically, if the primary region fails. AWS provides all the facilities required to do this, using Route 53 DNS and auto scaling.

 

Because S3 and AWS are typically very reliable, I suspect many users haven't bothered to architect for failure.
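
The routing decision behind the multi-region setup described above can be sketched in a few lines. This is not the Route 53 API, just the logic that a DNS failover policy with health checks applies: prefer the primary region and fall back to the next one that passes its health check. The endpoint URLs and the `is_healthy` callback are hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class RegionEndpoint:
    region: str
    url: str  # hypothetical example URL, not a real service


def pick_endpoint(
    endpoints: Sequence[RegionEndpoint],
    is_healthy: Callable[[RegionEndpoint], bool],
) -> Optional[RegionEndpoint]:
    """Return the first healthy endpoint in priority order, or None if all are down."""
    for endpoint in endpoints:
        if is_healthy(endpoint):
            return endpoint
    return None


# Listed in priority order: primary first, standby second
ENDPOINTS = [
    RegionEndpoint("us-east-1", "https://app-use1.example.com"),
    RegionEndpoint("us-west-2", "https://app-usw2.example.com"),
]

# Simulate an outage of the primary region
regions_down = {"us-east-1"}
chosen = pick_endpoint(ENDPOINTS, lambda e: e.region not in regions_down)
print(chosen.region if chosen else "no healthy region")
```

In the cheaper variant, the standby region would run at minimal capacity, and auto scaling would ramp it up once traffic starts arriving, so the failover takes longer but costs less while idle.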


Groucho
542 posts

Ultimate Geek
+1 received by user: 216


  #1728262 1-Mar-2017 12:47

timmmay: Xero is up for me, I just logged in, but they were down. Xero use AWS extensively. Apparently they haven't architected their system to automatically route away from problematic regions. It might be that they don't have a multi-region capability, or that it requires switching over manually.

I don't understand why Xero doesn't geo-cache data or at least sync it as a failover. It seems crazy that AWS has a region in Australia but my NZ-based Xero data is actually stored and processed in the US?


timmmay

20859 posts

Uber Geek
+1 received by user: 5350

Trusted
Lifetime subscriber

  #1728291 1-Mar-2017 13:29

Groucho: I don't understand why Xero doesn't geo-cache data or at least sync it as a failover. It seems crazy that AWS has a region in Australia but my NZ-based Xero data is actually stored and processed in the US?

I've seen a diagram of the Xero system at an AWS conference, and it looked quite complex. Multi-region architecture is more difficult, and I assume Xero has accepted a longer RTO (recovery time objective) and RPO (recovery point objective) in exchange for reduced costs. It's generally cheaper to do backups than full disaster recovery.
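
To illustrate the RTO/RPO trade-off with made-up numbers: the data lost in an outage is the time since the last good backup, and the downtime is the time from failure to restored service. The timestamps and targets below are hypothetical and are not Xero's or anyone else's actual figures.

```python
from datetime import datetime, timedelta


def outage_report(last_backup: datetime, failure: datetime, restored: datetime) -> dict:
    """Summarise an outage as a data-loss window and a downtime window."""
    return {
        "data_loss": failure - last_backup,  # compared against the RPO
        "downtime": restored - failure,      # compared against the RTO
    }


def meets_objectives(report: dict, rpo: timedelta, rto: timedelta) -> bool:
    """True if both windows are within the stated objectives."""
    return report["data_loss"] <= rpo and report["downtime"] <= rto


# Hypothetical scenario: nightly backup, morning failure, restored after lunch
report = outage_report(
    last_backup=datetime(2017, 3, 1, 0, 0),
    failure=datetime(2017, 3, 1, 9, 0),
    restored=datetime(2017, 3, 1, 13, 0),
)

# A backup-only strategy can satisfy relaxed targets...
relaxed_ok = meets_objectives(report, rpo=timedelta(hours=24), rto=timedelta(hours=8))
# ...but not the tight targets that a multi-region setup is meant to deliver
strict_ok = meets_objectives(report, rpo=timedelta(minutes=15), rto=timedelta(hours=1))
```

In this scenario nine hours of data and four hours of service are lost, which passes the relaxed targets and fails the strict ones. Tightening either objective is what pushes a design from plain backups towards replicated, multi-region disaster recovery.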


Groucho
542 posts

Ultimate Geek
+1 received by user: 216


  #1728301 1-Mar-2017 13:41

timmmay: I've seen a diagram of the Xero system at an AWS conference, and it looked quite complex. Multi-region architecture is more difficult, and I assume Xero has accepted a longer RTO (recovery time objective) and RPO (recovery point objective) in exchange for reduced costs. It's generally cheaper to do backups than full disaster recovery.

I think I've seen something similar online, and it was way above my pay grade, but I agree with you. Xero, with their 862,000 subscribers, probably can't justify the expense, unlike another AWS client called Netflix, whose "NZ content" is streamed from a lot closer to home.


Brumfondl
1198 posts

Uber Geek
+1 received by user: 524

Trusted
Subscriber

  #1728317 1-Mar-2017 13:49

Think AWS is back up :)





