Thursday, February 21, 2013

Post #22: AWS Outage, My Domain's First

Nothing is perfect. All promises are hollow, and with a free service you can tell. Yesterday I faced my first domain outage, and it was intermittent. When I got to my AWS console, I saw that my instance state was in error. I had no clue what to do. I tried again after some time and the site worked, but after navigating a couple of pages it stopped again.

I ended up in Elastic Beanstalk and decided to restart the instance, but there was no such option. There was a Reboot option, but it did not let me reboot. There were two other options, Stop and Start, so I tried stopping the instance. That gave an error saying the instance was in no shape to accept my command. I tried accessing the site again and the intermittent failure continued.

After a while, when I got back to the EC2 screen, the old instance was shutting down and a brand new instance was available. At this point I did not know what to do. Would I have to reconfigure the entire domain again? I had forgotten how, because I had done it in steps over several days. That sent shudders up my spine. I tried to SSH in using PuTTY, but that did not go through either.

It was getting late, so I decided to go to sleep. The next morning, I found that the old instance had completely disappeared. The load balancer and other pieces had been reconfigured and the site was working, but the intermittent issue continued.

Also, all my logs and data were missing on the new server. I created a new directory for temp statements. The intermittent nature of the problem, however, continued.

I checked everywhere: the EC2 instance, Route 53, Elastic Beanstalk, and so on, but could not identify the problem. Later I saw an error on the load balancer. I think the load balancer balances the load between 4 instances in the same region, and 3 of the 4 were reported as not available. I removed those three from the load balancer and the site worked.
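I did all of this by clicking around the console, but the same check could probably be scripted. Below is a rough sketch, not something I ran during the outage. It assumes the present-day boto3 library, a Classic load balancer, and AWS credentials already configured. The load balancer name and the instance IDs in the sample are made up for illustration.

```python
from typing import Any


def unhealthy_instances(health: dict[str, Any]) -> list[dict[str, str]]:
    """Pick out the instances a load balancer reports as not in service.

    `health` is expected to have the shape returned by the Classic ELB
    describe_instance_health call: {"InstanceStates": [{...}, ...]}.
    """
    return [
        {
            "InstanceId": state["InstanceId"],
            "State": state["State"],
            "Description": state.get("Description", ""),
        }
        for state in health.get("InstanceStates", [])
        if state.get("State") != "InService"
    ]


def check_load_balancer(name: str) -> list[dict[str, str]]:
    """Ask AWS for instance health and return only the unhealthy entries."""
    # Imported here so the filter above can be tried without boto3 installed.
    import boto3

    elb = boto3.client("elb")  # Classic ELB client (assumption)
    health = elb.describe_instance_health(LoadBalancerName=name)
    return unhealthy_instances(health)


if __name__ == "__main__":
    # Hypothetical sample data: 3 of 4 instances are out of service.
    sample = {
        "InstanceStates": [
            {"InstanceId": "i-00000001", "State": "InService"},
            {"InstanceId": "i-00000002", "State": "OutOfService",
             "Description": "Instance has failed health checks."},
            {"InstanceId": "i-00000003", "State": "OutOfService"},
            {"InstanceId": "i-00000004", "State": "OutOfService"},
        ]
    }
    for entry in unhealthy_instances(sample):
        print(entry["InstanceId"], entry["State"], entry["Description"])
```

Against a real account, `check_load_balancer("my-load-balancer")` (a placeholder name) would list whichever instances the load balancer considers out of service, which is the same information I eventually found in the console.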

I think I need to document how to manage the site on AWS. Also, there are no great forums that help with troubleshooting and tell you what to check. I tried logging a ticket, but I was not given that option; it said I would need to buy it.

The question is how you support potential customers. The fact that I created a site and hosted it there should be reason enough for a strategy to retain me, even during the free trial period, so that the customer can feel comfortable. Currently, I do not feel confident about my long-term plans to host it on AWS. The support sucks...
