AWS Simple Storage Service (S3) Outage

Introduction to the Amazon S3 Outage

Amazon Simple Storage Service (S3) is a cloud storage product that can store and retrieve data from any source, from websites and mobile apps to enterprise applications and IoT devices. With Amazon S3, any developer gets access to storage that is highly scalable, reliable, and fast.

Now, Amazon is a giant that provides cloud computing solutions to many enterprises across the globe. Anything it does, big or small, is big news. One such recent story is the AWS S3 outage.

What happened? Why is it in the news?

On the morning of February 28th, 2017, in the Northern Virginia region, the AWS S3 team was debugging an issue with the S3 billing system. While doing so, one of the team members entered a command incorrectly.

So? What's the big deal? It's just a command. They can wipe it out with the Backspace key. No. It didn't go as well as you might think.

That wrong command removed a large set of servers supporting two S3 subsystems. Oh no!

Yes! Isn't it a big deal?

Removing servers means losing capacity, and these two subsystems lost so much of it that they could no longer do their job.

Well! AWS offers data discovery and data recovery features. But this time, none of them could help.

This is because the first subsystem, the index subsystem, which manages the metadata and location information of the stored objects, was knocked out, and so was the second one, the placement subsystem, which manages the allocation of storage for new objects.

The worst part is that other AWS services in the region that rely on S3 were also impacted, because S3 could not respond to service requests while the subsystems were being restarted. None of the associated S3 APIs were available.

The AWS services that were severely affected include CloudWatch, WorkSpaces, Simple Email Service, Cognito, and DynamoDB. Some of them suffered complete disruption.

OMG! Was it solved? What did Amazon do?

Yes. It's Amazon, after all, who made the software. They solved the issue. Within 5 hours, the index subsystem was back up and all the functions and APIs were working as usual.

How can such a big mistake happen?

Yes. It is just a cloud. It acts as per the instructions given by you. With every step of digitalization comes an equal scope for wide-ranging effects and defects. As long as everything from websites to IoT to whole companies relies on such cloud services for every business operation, the people operating them must think their actions through clearly before they take them. A cloud computing solution is just a technology made up of several servers, switches, and code prepared by someone as per their analysis and planning. It is not invulnerable to failures.

What are the precautions against such IT outages?

1. Concentrate. It's a machine, after all. Concentrate on every single input you give or command you write. A single wrongly entered input can cause huge damage. Follow the syntax properly and correctly.

2. No hurry. Old wine is expensive, right? Do not rush to complete the task. Humans make mistakes in a hurry. Our brains cannot bear much tension.

3. Do not neglect. Negligence is injurious to work.

4. Do not panic. Humans are made to solve problems. After all, we are the only living beings born with intelligence.

5. Backup. A handwritten love letter is much more effective than an SMS, you know!

6. Monitor. Maintain. And measure.
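
The first precaution, checking every input before it takes effect, can also be built into the tooling itself. Below is a minimal Python sketch of such a guard. It is not AWS's actual tooling: the function `plan_removal`, the `min_remaining` and `max_batch` limits, and the server names are all hypothetical, chosen only to illustrate the idea of refusing a removal request that is too large.

```python
from dataclasses import dataclass


class UnsafeRemovalError(Exception):
    """Raised when a removal request looks too dangerous to run."""


@dataclass(frozen=True)
class RemovalPlan:
    servers: tuple   # servers that would be removed, in request order
    remaining: int   # servers left in the fleet afterwards


def plan_removal(fleet, requested, min_remaining, max_batch):
    """Validate a removal request before any server is touched.

    All names and limits here are hypothetical, for illustration only.
    """
    # Drop duplicates but keep the order the operator typed.
    unique = list(dict.fromkeys(requested))

    # A mistyped name should stop the command, not be silently ignored.
    unknown = [name for name in unique if name not in fleet]
    if unknown:
        raise UnsafeRemovalError(f"unknown servers: {unknown}")

    # Remove capacity in small steps only.
    if len(unique) > max_batch:
        raise UnsafeRemovalError(
            f"{len(unique)} servers requested, batch limit is {max_batch}"
        )

    # Never go below the capacity the subsystem needs to keep running.
    remaining = len(fleet) - len(unique)
    if remaining < min_remaining:
        raise UnsafeRemovalError(
            f"only {remaining} servers would remain, minimum is {min_remaining}"
        )

    return RemovalPlan(servers=tuple(unique), remaining=remaining)


if __name__ == "__main__":
    fleet = {f"srv-{i}" for i in range(10)}
    print(plan_removal(fleet, ["srv-1", "srv-2"], min_remaining=6, max_batch=3))
    try:
        # A too-large request is rejected instead of executed.
        plan_removal(fleet, [f"srv-{i}" for i in range(8)], 6, 3)
    except UnsafeRemovalError as err:
        print("refused:", err)
```

The point of the design is that the check runs before anything is removed, so a wrong input produces an error message instead of an outage.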
