Question

I am trying to monitor my EC2 load balancer through Nagios using a bash script. Below is the script I am trying to use with Nagios.

#!/bin/sh

ST_OK=0
ST_WR=1
ST_CR=2
ST_UK=3


LB_NAME="xxx"
AWS_REGION="us-west-2"
PROFILE="default"


CMD=$(/usr/bin/aws elb describe-instance-health --region ${AWS_REGION} --load-balancer-name ${LB_NAME} --profile ${PROFILE})

if [ $? -eq 0 ]; then

    # Count the instances reported as InService, then all registered instances.
    IN_SERVICE_COUNT=$(echo "${CMD}" | jq -c '.InstanceStates[].State' | grep -c InService)
    TOTAL_COUNT=$(echo "${CMD}" | jq -c '.InstanceStates[].State' | wc -l)

    if [ ${IN_SERVICE_COUNT} -eq 0 ]; then
        NAGIOS_STATE=CRITICAL
        EXIT_CODE=$ST_CR
    elif [ ${TOTAL_COUNT} -eq ${IN_SERVICE_COUNT} ]; then
        NAGIOS_STATE=OK
        EXIT_CODE=$ST_OK
    elif [ ${IN_SERVICE_COUNT} -lt ${TOTAL_COUNT} ]; then
        NAGIOS_STATE=WARNING
        EXIT_CODE=$ST_WR
    fi
    echo "${NAGIOS_STATE}: ELB:${LB_NAME} is running fine. Total #instances:${TOTAL_COUNT} Healthy instances:${IN_SERVICE_COUNT}"
else
    echo "Failed to retrieve ELB Instances health from AWS"
    EXIT_CODE=$ST_UK
fi
exit ${EXIT_CODE}

The script works fine when I run it manually. I have also run it as the nagios user, and I get output like this:

OK: ELB:xxx is running fine Total:18 Healthy:18

So I don't think it is a permission issue; I have configured AWS credentials for the nagios user. But in the Nagios interface the service status is always "UNKNOWN".

Below is the command definition in command.cfg:

define command {
    command_name    check_elb_status
    command_line    /usr/local/nagios/libexec/check_elb_status.sh
}

Below is the service definition in the host file:

define service{
    use generic-service
    host_name Prod-ELB
    service_description Prod ELB Status
    check_command check_elb_status
}

I have also used the same script through NRPE from a different host, and there I do get the result:

Entry in nrpe.cfg:

command[check_elb_sts]=/usr/local/nagios/libexec/check_elb_status.sh

Service definition in the host file:

define service{
   use generic-service
   host_name xxx
   service_description Prod ELB Status
   check_command check_nrpe!check_elb_sts
}

I don't know why the script fails to return a result when it is run directly on the Nagios host. Please help me resolve the issue.


Solution

For some reason the script could not locate the AWS CLI config file when Nagios ran it. I have now set the AWS_CONFIG_FILE location inside the script, and it works fine.
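
For anyone hitting the same thing, here is a minimal sketch of what that change might look like near the top of the script. The /home/nagios path and the NAGIOS_HOME variable are assumptions for illustration only; use whichever directory `aws configure` wrote to for the nagios user. AWS_CONFIG_FILE and AWS_SHARED_CREDENTIALS_FILE are the environment variables the AWS CLI reads for the config and credentials file locations. A likely explanation is that the Nagios daemon runs plugins with a different environment (for example, no HOME) than an interactive shell, so the CLI does not find ~/.aws on its own.

```sh
#!/bin/sh
# Assumption: the nagios user's AWS CLI files live under this directory.
# NAGIOS_HOME is a hypothetical helper variable, not something Nagios sets.
NAGIOS_HOME="${NAGIOS_HOME:-/home/nagios}"

# Point the AWS CLI at the files explicitly instead of relying on HOME,
# which may be unset or different when the Nagios daemon runs the plugin.
AWS_CONFIG_FILE="${NAGIOS_HOME}/.aws/config"
AWS_SHARED_CREDENTIALS_FILE="${NAGIOS_HOME}/.aws/credentials"
export AWS_CONFIG_FILE AWS_SHARED_CREDENTIALS_FILE
```

One way to check this from a shell, assuming sudo is available, is to run the plugin as the nagios user with an empty environment, e.g. `sudo -u nagios env -i /usr/local/nagios/libexec/check_elb_status.sh`, which should behave more like the Nagios-initiated check than a normal login session does.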

Licensed under: CC-BY-SA with attribution