Question

I have a URL that returns the JSON string below when I open it in a browser.

Here is the URL, let's call it URL-A. I have about three URLs like this:

http://hostnameA:1234/Service/statistics?%24format=json

And below is the JSON string I get back from the URL:

{
 "description": "",
 "statistics": {
  "dataCount": 0
 }
}

I have written a Python script that scans all three URLs and parses the JSON string to extract the value of dataCount. It should keep running, scanning and parsing each URL every few seconds.

Below are my URLs:

hostnameA       http://hostnameA:1234/Service/statistics?%24format=json
hostnameB       http://hostnameB:1234/Service/statistics?%24format=json
hostnameC       http://hostnameC:1234/Service/statistics?%24format=json

And this is what I see on the console after running my Python script:

hostnameA - dataCount
hostnameB - dataCount
hostnameC - dataCount

Below is my Python script, which works fine:

import json
import smtplib
from time import sleep

import requests

def get_data_count(url):
    try:
        req = requests.get(url)
    except requests.ConnectionError:
        return 'could not get page'

    try:
        data = json.loads(req.content)
        return int(data['statistics']['dataCount']) 
    except (KeyError, TypeError):
        return 'field not found'
    except ValueError:
        return 'not an integer'

def send_mail(data):
    sender = 'user@host.com'
    receivers = ['some_name@host.com']

    message = """\
From: user@host.com
To: some_name@host.com
Subject: Testing Script
"""    
    body = '\n\n'
    for item in data:
        body = body + '{name} - {res}\n'.format(name=item['name'], res=item['res'])

    message = message + body

    try:
        smtpObj = smtplib.SMTP('some_server_name')
        smtpObj.sendmail(sender, receivers, message)
        print("Mail sent")
    except smtplib.SMTPException:
        print("Mail sending failed!")

def main():
    urls = [
        ('hostnameA', 'http://hostnameA:1234/Service/statistics?%24format=json'),
        ('hostnameB', 'http://hostnameB:1234/Service/statistics?%24format=json'),
        ('hostnameC', 'http://hostnameC:1234/Service/statistics?%24format=json')
    ]

    count = 0
    while True:
        data = []
        print('')

        for name, url in urls:
            res = get_data_count(url)
            print('{name} - {res}'.format(name=name, res=res))
            data.append({'name':name, 'res':res})

        if len([item['res'] for item in data if item['res'] >= 20]) >= 1: count = count+1
        else: count = 0

        if count == 2: 
            send_mail(data)
            count = 0
        sleep(10.)

if __name__=="__main__":
    main()

The script also does one more thing: if any machine's dataCount value is greater than or equal to 20 twice in a row, it sends out an email. That works fine too.

One issue I am noticing: suppose hostnameB is down for whatever reason. Then the first time it prints this:

hostnameA - 1
hostnameB - could not get page
hostnameC - 10

And the second time it prints something like this:

hostnameA - 5
hostnameB - could not get page
hostnameC - 7

So my script sends out an email in this case as well, because "could not get page" appeared twice in a row, even though hostnameB's dataCount value was never greater than or equal to 20. So there is a bug in my script, and I am not sure how to solve it.

I just need to send out an email if any hostname's dataCount value is greater than or equal to 20 twice in a row. If a machine is down for whatever reason, I want to skip that case, but my script should keep running.


Solution

Without changing the get_data_count function:

I took the liberty of making data a dictionary keyed by server name, which makes looking up a value easier.

I remember whether the previous round had a hit (last) and compare the current values against 20. In Python 2, any string compares greater than any integer, so 'could not get page' >= 20 is true. That is why I build an int from the result first: it raises a ValueError when the result is a string, which I catch so that servers that are down are not counted.

last = False

while True:
    data = {}
    hit = False
    print('')
    
    for name, url in urls:
        res = get_data_count(url)
        print('{name} - {res}'.format(name=name, res=res))
        data[name] = res
        try:
            if int(res) > 19:
                hit = True
        except ValueError:
            continue

    if hit and last:
        send_mail(data)
    last = hit
    sleep(10.)
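
If the requirement is that the same host must be at or above 20 on two consecutive polls, a per-host streak counter is one way to do it. This is only a sketch: update_streaks, THRESHOLD and REQUIRED_STREAK are names made up for the example, and it assumes get_data_count still returns either an int or an error string as in the question. Treating an unreadable host as breaking the streak is a design choice, not something the question spells out.

```python
THRESHOLD = 20        # alert level from the question
REQUIRED_STREAK = 2   # consecutive polls at or above THRESHOLD


def update_streaks(streaks, results):
    """Update per-host streak counts in place; return hosts that should alert.

    `results` maps host name -> int count, or anything else (for example an
    error string) when the host could not be read.
    """
    alerts = []
    for name, res in results.items():
        # bool is a subclass of int, so exclude it explicitly.
        is_count = isinstance(res, int) and not isinstance(res, bool)
        if is_count and res >= THRESHOLD:
            streaks[name] = streaks.get(name, 0) + 1
        else:
            # Low count or unreadable host: the streak is broken.
            streaks[name] = 0
        if streaks[name] >= REQUIRED_STREAK:
            alerts.append(name)
            streaks[name] = 0  # start over after alerting
    return sorted(alerts)


# Two simulated polling rounds (hypothetical values).
streaks = {}
round1 = {'hostnameA': 1, 'hostnameB': 'could not get page', 'hostnameC': 25}
round2 = {'hostnameA': 5, 'hostnameB': 'could not get page', 'hostnameC': 30}
first = update_streaks(streaks, round1)
second = update_streaks(streaks, round2)
```

In the polling loop you would call update_streaks(streaks, data) once per round and send the mail only when the returned list is non-empty.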

Pong Wizard is right: you should not handle errors like that. Either return False or None and check for that value later, or just raise an exception.
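
As a rough sketch of the "return None" option: the parsing step below returns None whenever the count cannot be read, so the caller can test for that explicitly instead of comparing a string with 20. extract_data_count and is_hit are hypothetical helpers; extract_data_count takes the response body as text (req.content in the question), and the network call is left out so the snippet runs on its own.

```python
import json


def extract_data_count(body):
    """Return statistics.dataCount as an int, or None if it cannot be read."""
    try:
        data = json.loads(body)
        return int(data['statistics']['dataCount'])
    except (ValueError, KeyError, TypeError):
        # ValueError: invalid JSON or a non-numeric value
        # KeyError: a field is missing
        # TypeError: the JSON is not shaped like a nested object
        return None


def is_hit(count, threshold=20):
    """True only for a real count at or above the threshold."""
    return count is not None and count >= threshold


good = extract_data_count('{"statistics": {"dataCount": 25}}')
missing = extract_data_count('{"description": ""}')
broken = extract_data_count('not json')
```

With this shape, a host that is down or returns garbage yields None, and is_hit(None) is simply False.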

OTHER TIPS

You should return False for a failed request instead of the string "could not get page". This is cleaner, and a False value also doubles as 0 when it is treated as an int:

>>> True + False
1

Summing two or more False values therefore gives 0.
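
To illustrate what that buys you and where it can bite, here is a small sketch with made-up values: False compares as 0, so a failed request never reaches the threshold, but it also looks identical to a genuine count of 0 unless you check with "is False".

```python
results = [False, 0, 25]   # failed request, real zero, real high count

# False behaves like 0 in comparisons, so it never counts as a hit.
hits = [r for r in results if r >= 20]

# A failed request and a genuine zero are equal under ==,
# so use an identity check to tell them apart.
failed = [r for r in results if r is False]
looks_equal = (False == 0)
```

If you need to report which hosts were down, the identity check (or returning None instead) keeps that information from being lost.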

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow