Question

I'm having an issue porting my bash script to Nagios. The script works fine when I run it from the console, but when Nagios runs it I get the message "(null)". In the Nagios debug log I can see that it parses the script correctly, but it still returns the error message.

I'm not very good at scripting, so I guess I'll need some help.

The objective of the script is to check the versions of the *.ear files on some servers: md5 them and compare the output to see whether the versions match. To do that, I have a JSON file on each of these servers that lists the name of each *.ear and its md5.

The first part of the script gets that info from the JSON with curl and stores just the md5 in a temp file, one per server. It then compares both temp files, and if they match I get the $STATE_OK message. If they don't, it creates a .datetmp file containing the current date (the objective is to print a different message after 48 hours of inconsistency). Then I compare the age of the timestamp in the .datetmp file against the limit I want to check: if the result is less than 48 hours it prints $STATE_WARNING, and if it is more than 48 hours it prints $STATE_CRITICAL.

The syntax of the script is "$ sh script.sh nameoftheear.ear server1 server2".

Thanks in advance

#/bin/bash

#Variables For Nagios
cont=$1
bas1=$2
bas2=$3

## Here you set the servers hostname
svr1= curl -s "http://$bas1.domain.com:7877/apps.json" | grep -Po '"EAR File":.*? [^\\]",' | grep $cont | awk '{ print $5 }' > .$cont-tmpsvr1
svr2= curl -s "http://$bas2.domain.com:7877/apps.json" | grep -Po '"EAR File":.*? [^\\]",' | grep $cont | awk '{ print $5 }' > .$cont-tmpsvr2
file1=.$cont-tmpsvr1
file2=.$cont-tmpsvr2
md51=$(head -n 1 .$cont-tmpsvr1)
md52=$(head -n 1 .$cont-tmpsvr2)
datenow=$(date +%s)

#Error Msg
ERR_WAR="Not updated $bas1: $cont $md51 --- $bas2: $cont $md52 "
ERR_CRI="48 hs un-updated $bas1: $cont $md51 --- $bas2: $cont $md52 "
OK_MSG="Is up to date $bas1: $cont $md51 --- $bas2: $cont $md52 "

STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2


##Matching md5 Files
if cmp -s "$file1" "$file2"
then
    echo $STATE_OK
    echo $OK_MSG
# I do the rm to delete the date tmp file so i can get the $STATE_OK or $STATE_WARNING
    rm .$cont-datetmp
    exit 0

elif
    echo $datenow >> .$cont-datetmp

#Vars to set modification date

datetmp=$(head -n 1 .$cont-datetmp)
diffdate=$(( ($datenow - $datetmp) /60 ))

#This var is to set the time of the critical ERR

days=$((48*60))

[ $diffdate -lt $days ]

then
    echo $STATE_WARNING
    echo $ERR_WAR
    exit 1
 else
    echo $STATE_CRITICAL
    echo $ERR_CRI
    exit 2
 fi

Solution

I am guessing this is some kind of permission problem. More specifically, I don't think the nagios user can write to its own home directory. You can either fix those permissions or write to a file in /tmp (and consider using mktemp). Ideally, though, you'd skip writing all those files: as far as I can see, all of those comparisons could be kept in memory.
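
As a rough sketch of what I mean (not a drop-in replacement): the two checksums can live in variables, and only the timestamp that has to survive between runs needs a file. The function names compare_md5, state_file_for and make_scratch, and the choice of ${TMPDIR:-/tmp} as the state directory, are my own assumptions for illustration; adapt them to your setup.

```shell
#!/bin/bash
# Sketch: compare checksums in memory and keep state files out of the
# nagios user's home directory.

# Assumed state directory; /tmp is normally writable by every user.
state_dir="${TMPDIR:-/tmp}"

# Print "match" when both checksums are non-empty and identical,
# "differ" otherwise. No temp files are involved.
compare_md5() {
    if [ -n "$1" ] && [ "$1" = "$2" ]; then
        echo "match"
    else
        echo "differ"
    fi
}

# Path of the timestamp file for a given EAR name (hypothetical naming).
state_file_for() {
    printf '%s/%s-datetmp\n' "$state_dir" "$1"
}

# If a scratch file really is needed, let mktemp pick a unique name.
make_scratch() {
    mktemp "$state_dir/check_ear.XXXXXX"
}
```

In your script, md51 and md52 would then be filled directly from the curl pipeline with command substitution, for example md51=$(curl -s ... | awk '{ print $5 }' | head -n 1), and passed to compare_md5 instead of running cmp on two files.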

UPDATE

Looked at your script again - I see some obvious errors you can look into:

  • You are printing out the exit value before you print the message.
  • You print the exit value rather than exit with the exit value.

...so this:

echo $STATE_WARNING
echo $ERR_WAR
exit 1

Should rather be:

echo $ERR_WAR
exit $STATE_WARNING

Also, I am wondering whether this is really the script or whether you missed something when pasting. There seems to be an 'if' missing, and also a superfluous line break, in your last piece of code. It should rather be:

if [ $diffdate -lt $days ]
then
  ...
else
  ...
fi
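
Putting both points together, the warning/critical branch could look roughly like the sketch below. The function name check_mismatch and its arguments are hypothetical; the 48-hour limit and the message texts are taken from your script. The function prints the message first and hands the state back as its return code, so the caller can exit with it.

```shell
#!/bin/bash
# Nagios plugin exit codes
STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2

# Limit in minutes before a mismatch becomes critical (48 hours).
limit_min=$((48 * 60))

# $1 = epoch seconds when the mismatch was first seen
# $2 = current epoch seconds
# Prints one status line and returns the matching Nagios state.
check_mismatch() {
    age_min=$(( ($2 - $1) / 60 ))
    if [ "$age_min" -lt "$limit_min" ]; then
        echo "Not updated for $age_min minutes"
        return "$STATE_WARNING"
    else
        echo "48 hs un-updated ($age_min minutes)"
        return "$STATE_CRITICAL"
    fi
}
```

The main script would then end the mismatch branch with something like check_mismatch "$datetmp" "$datenow"; exit $? so that the message is printed and the exit code carries the state.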
Licensed under: CC-BY-SA with attribution