I have a couple of micro instances that have been working fine for weeks. Both are running WordPress blogs. In the last 24 hours one of them has stopped. I can't ssh in even after a reboot. The other instance is working fine.
ssh: connect to host ec2-xxx-xxx-xxx-xxx.ap-southeast-1.compute.amazonaws.com port 22: Operation timed out
There is nothing obvious in the log that looks like a problem. The last few lines are:
cloud-init: runcmd[ OK ]
Mounting other filesystems: [ OK ]
Retrigger failed udev events[ OK ]
Generating SSH1 RSA host key: [ OK ]
Starting sshd: [ OK ]
Starting ntpd: [ OK ]
Starting sendmail: [ OK ]
Starting sm-client: [ OK ]
Starting crond: [ OK ]
[ OK ]
Starting atd: [ OK ]
Starting yum-updatesd: [ OK ]
Running cloud-init user-scripts (none found)[ OK ]
Amazon Linux AMI release 2011.02.1.1 (beta)
Kernel 2.6.35.11-83.9.amzn1.i686 on an i686
ip-xx-xxx-xx-xx login:
The management console states that everything is running and normal.
I use the same security group and .pem file for both instances.
I suspect that this instance has been getting more traffic than the other one. Is there any way that the micro instance could run out of memory and just stop responding? What could be going wrong?
Here is a screenshot of the Monitoring panel:
Thanks
An EC2 instance can become unresponsive when its load stays pegged well above 100% for a long time, usually because something abnormal is happening in the app, database, or web services deployed on it. While it is in that state you cannot connect to it over SSH.
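If you want to confirm that sustained load preceded the hang, the instance's basic CloudWatch metrics keep recording even while SSH is down. Below is a minimal sketch of pulling the last 24 hours of CPUUtilization with boto3 (a modern SDK used purely as an illustration; the instance ID and region are placeholders, not values from this question):

```python
# Illustrative only: fetch average CPUUtilization for the stalled instance.
import datetime

import boto3

INSTANCE_ID = "i-0123456789abcdef0"  # placeholder; substitute your instance ID
REGION = "ap-southeast-1"

cloudwatch = boto3.client("cloudwatch", region_name=REGION)

end = datetime.datetime.utcnow()
start = end - datetime.timedelta(hours=24)

# Average CPUUtilization in 5-minute buckets over the last 24 hours.
stats = cloudwatch.get_metric_statistics(
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "InstanceId", "Value": INSTANCE_ID}],
    StartTime=start,
    EndTime=end,
    Period=300,
    Statistics=["Average"],
)

for point in sorted(stats["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], round(point["Average"], 1))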
There are also several possible causes of a slow or unresponsive instance even when CPU and memory aren't fully used, including problems with an external service the instance relies on, disk thrashing, and network connectivity issues.
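On the specific question of running out of memory: the kernel writes OOM-killer messages to the console, and that console log can be read from outside the instance (the console's "Get System Log" action, or the equivalent API call). A rough sketch, again using boto3 with a placeholder instance ID:

```python
# Illustrative only: scan the instance console output for OOM-killer lines
# without needing SSH access.
import boto3

INSTANCE_ID = "i-0123456789abcdef0"  # placeholder
REGION = "ap-southeast-1"

ec2 = boto3.client("ec2", region_name=REGION)

# get_console_output returns the same text as "Get System Log" in the console;
# note that it lags real time by several minutes.
output = ec2.get_console_output(InstanceId=INSTANCE_ID).get("Output", "")

for line in output.splitlines():
    if "Out of memory" in line or "oom-killer" in line:
        print(line)
```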
I've seen micro instances lock up for several minutes due to the CPU "stealing" that occurs when you use too much CPU. This is unique to the micro instance. I blogged an example of this (including video) at http://gregsramblings.com/2011/02/07/amazon-ec2-micro-instance-cpu-steal/.
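If you can still get a shell on a micro instance, the steal time the hypervisor imposes shows up as the st column in top/vmstat, or as the eighth CPU counter in /proc/stat. A small illustrative snippet (not from the linked post) that samples it over a five-second window:

```python
# Illustrative only: measure the "steal" share of CPU time from /proc/stat.
import time

FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")

def read_cpu():
    """Return the aggregate cpu counters from the first line of /proc/stat."""
    with open("/proc/stat") as f:
        values = [int(v) for v in f.readline().split()[1:1 + len(FIELDS)]]
    return dict(zip(FIELDS, values))

before = read_cpu()
time.sleep(5)
after = read_cpu()

deltas = {k: after[k] - before[k] for k in FIELDS}
total = sum(deltas.values()) or 1
print("steal: {:.1f}%".format(100.0 * deltas["steal"] / total))
```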
You can move your instance to new resources simply by doing a full STOP and then a START. This will place it on new hardware and assign it a new public IP address (don't forget to re-associate your Elastic IP!). A reboot will not accomplish this; the instance needs to be stopped via the EC2 console. Terminating it is not necessary.
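For reference, here is the same stop/start cycle scripted with boto3 (which post-dates this question, so treat it as a sketch); the instance and Elastic IP allocation IDs are placeholders, and the association call assumes a VPC-style address:

```python
# Illustrative only: full stop/start cycle plus Elastic IP re-association.
import boto3

INSTANCE_ID = "i-0123456789abcdef0"            # placeholder
ALLOCATION_ID = "eipalloc-0123456789abcdef0"   # placeholder Elastic IP allocation
REGION = "ap-southeast-1"

ec2 = boto3.client("ec2", region_name=REGION)

# A full stop (not a reboot) releases the instance from its current host.
ec2.stop_instances(InstanceIds=[INSTANCE_ID])
ec2.get_waiter("instance_stopped").wait(InstanceIds=[INSTANCE_ID])

# Starting again usually schedules the instance onto different hardware
# and hands out a new public IP address.
ec2.start_instances(InstanceIds=[INSTANCE_ID])
ec2.get_waiter("instance_running").wait(InstanceIds=[INSTANCE_ID])

# Re-attach the Elastic IP if the stop disassociated it.
ec2.associate_address(InstanceId=INSTANCE_ID, AllocationId=ALLOCATION_ID)
```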
There are several possibilities, but the two most likely are:
High load on the host that your Micro instance is running on - Micro instances get a small slice of resources anyway, and get scaled back quite harshly when the host is under load.
A fault has occurred on the host which is impacting VM responsiveness - this is actually relatively common, and can exhibit the type of behaviour you're seeing.
In either case, the quickest solution is to nuke the instance and restart it - you'll likely get a new instance on a different host, which may be less stressed or less broken. ;)