Question

Today we had an issue with Azure VMs where one VM in an availability set of 2 just stopped responding. After a few minutes we noticed that the machine had been shut down and the other VM in the set hadn't been turned on (which should be OK, as this isn't a failover). We took a look at the VM monitoring and there wasn't a single log entry indicating any downtime. The only thing we found was 2 strange entries in Management Services - Operation Logs.

  • 11/12/2013 10:12:02 PM AutoscaleAction Succeeded VirtualMachinesAvailabilitySet:xyz Autoscale
  • 11/12/2013 9:36:56 PM AutoscaleAction Succeeded VirtualMachinesAvailabilitySet:xyz Autoscale

The first one had the following details:

Description: The autoscale engine attempting to scale resource 'xyz' from 0 instances count to 1 instances count.

LastScaleActionTime: 20131106T173020Z

NewInstancesCount: 1

OldInstancesCount: 0

Second one:

The autoscale engine attempting to scale resource 'xyz' from 2 instances count to 1 instances count.

LastScaleActionTime: 20131112T203656Z

NewInstancesCount: 1

OldInstancesCount: 2

Does anyone know what may have happened?

UPDATE

Azure Support got back to me and explained that the machines were down due to a host update.

Regards


Solution

Both of my machines were down because of the host update, and autoscaling was configured to run between 1 and 2 machines based on CPU usage. What I found out is that autoscaling won't turn the second machine on during a host update (which would have been pretty helpful and kept my apps online).

I think that explains the scale from 0 to 1 instances, so don't rely on autoscaling with the above setup to get HA.

Regards

Other tips

Whenever you use autoscale, you set an instance range that tells Azure the minimum and maximum number of VMs you want running at any given point in time. In this case, it looks like you've set the minimum to 1. That would explain why, when both VMs were stopped, it turned one of them on.

In addition, the scale from 2 to 1 was likely because load was low on your VMs (assuming you're scaling by CPU). If the average CPU stays below the target you've established (60% by default), it will scale down until it hits the minimum (in this case, 1).
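
To make the two log entries easier to follow, here is a rough Python sketch of the decision described above. It is only a simplified model, not Azure's actual autoscale engine: the function name `next_instance_count` and its parameters are made up for illustration, the range of 1 to 2 is taken from the question, and the 60% target is the default mentioned above.

```python
def next_instance_count(current, avg_cpu, minimum=1, maximum=2, target_cpu=60.0):
    """Return the instance count a simple CPU-based autoscaler would move to.

    Hypothetical model: it changes the count by at most one step per call
    and always keeps it inside [minimum, maximum].
    """
    # Below the configured range (e.g. both VMs stopped): go up to the minimum.
    if current < minimum:
        return minimum
    # Above the configured range: come back down to the maximum.
    if current > maximum:
        return maximum
    # Load above target and room to grow: add one instance.
    if avg_cpu > target_cpu and current < maximum:
        return current + 1
    # Load below target and room to shrink: remove one instance.
    if avg_cpu < target_cpu and current > minimum:
        return current - 1
    return current


# Both VMs were down, so the count is raised to the minimum (0 -> 1).
print(next_instance_count(current=0, avg_cpu=0.0))
# Two VMs running with low CPU, so one is switched off (2 -> 1).
print(next_instance_count(current=2, avg_cpu=20.0))
```

Under these assumptions the first call matches the "0 to 1" log entry and the second matches the "2 to 1" entry.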

Licensed under: CC-BY-SA with attribution