Question

I have a Python 'worker' using Elastic Beanstalk which reads messages from SQS. This works fine, although the scaling is clunky as it is based on CPU. Hence I'm trying to convert it to use AWS's new "Worker Tier Environment".

I'm in at the deep end with Flask, but I have Flask running on an EB Worker Tier. At the moment it is set to simply log the message information that it receives; this is to make sure I can read the information before I move everything else over. Unfortunately I cannot see any sign of the message body.

Here is my Flask test code:

import logging
import logging.handlers

from flask import Flask, request

logfile = "/opt/python/log/scan.log"
mylog = logging.getLogger('helloworld')
# (for brevity, log format/config code removed)

application = Flask(__name__)
app = application
app.Debug=True

@app.route('/', methods=['POST'])
def hello():
    global mylog
    err = "Unrecognized method"
    mylog.warning("Hello called")

    request_detail = """
# Before Request #
request.endpoint: {request.endpoint}
request.method: {request.method}
request.view_args: {request.view_args}
request.args: {request.args}
request.form: {request.form}
request.user_agent: {request.user_agent}
request.files: {request.files}
request.is_xhr: {request.is_xhr}
## request.headers ##
{request.headers}
    """.format(request=request).strip()
    mylog.warning(request_detail)

    mylog.warning("Moreinfo:")

    mylog.warning("Args:")
    for k in request.args.keys():
        mylog.warning(k + ": "+request.args[k])
    mylog.warning("Form:")
    for k in request.form.keys():
        mylog.warning(k + ": "+request.form[k])
    mylog.warning("Files:"+len(request.files))
    for k in request.files.keys():
        mylog.warning(k + ": "+request.files[k])

    try:
        myJSON = request.get_json(force=True)
        if myJSON is None:
            mylog.warning("JSON could not be forced")
        else:
            mylog.warning("MyJSON size: " + len(myJSON))
            mylog.warning( "MyJSON: {myJSON}".format(myJSON=myJSON))
        if request.json is None:
            mylog.warning("NO JSON")
    except Exception as e:
        mylog.warning("Exception: " + e)

    # the code below is executed if the request method
    # was GET or the credentials were invalid
    mylog.warning("failure 404")
    return 'Failure: '+err , 404, {'Content-Type': 'text/plain'}


if __name__ == '__main__':
    app.run(host='0.0.0.0', debug=True)

Yes that big long format statement was borrowed from a book :-)

Here is a typical log output for a message:

WARNING:2014-02-20 15:34:37,418: Hello called

WARNING:2014-02-20 15:34:37,419: 
# Before Request # 
request.endpoint: hello 
request.method: POST 
request.view_args: {} 
request.args: ImmutableMultiDict([])
request.form: ImmutableMultiDict([])
request.user_agent: aws-sqsd 
request.files: ImmutableMultiDict([])
request.is_xhr: False
## request.headers ## 
X-Aws-Sqsd-Msgid: 232eea42-5485-478c-a57f-4afddbf77ba9 
X-Aws-Sqsd-Receive-Count: 199 
X-Aws-Sqsd-Queue: #<AWS::SQS::Queue:0xb9255e90> 
Content-Length: 59 
User-Agent: aws-sqsd 
X-Aws-Sqsd-First-Received-At: 2014-02-20T13:55:34Z 
Host: localhost 
Content-Type: application/json

WARNING:2014-02-20 15:34:37,419: Moreinfo:

WARNING:2014-02-20 15:34:37,419: Args:

WARNING:2014-02-20 15:34:37,420: Form:

Note that none of the ImmutableMultiDict structures appear to have any keys. Also, none of the JSON methods/properties are returning anything.

The Content-Length field does vary between log entries, so it does look like the information is there. But how do I read it?

My JSON messages are written to SQS using Boto, e.g.:

  my_queue = conn.get_queue('my_queue_name')
  m = Message()
  m.set_body( json.dumps( my_structure ) )
  my_queue.write(m)

I also tried entering a raw JSON message by hand using the SQS web interface. This does not work either. I was speculating that we might have a character encoding issue.


Solution

It is not clear why your message logging is being cut off; perhaps you have an indentation error and part of your logging code is not considered part of the view method.
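
One candidate for the cut-off, judging only from the posted view: the line right after the Form loop concatenates a string with the int returned by len(), which raises a TypeError in Python. The log stops immediately after "Form:", which would fit. A minimal sketch, with a plain dict standing in for request.files (an assumption made so the snippet runs without Flask):

```python
files = {}  # stand-in for request.files; assumed empty, as in the posted log

try:
    # Same shape as the posted line: str + int is not allowed
    line = "Files:" + len(files)
    error = None
except TypeError as exc:
    line = None
    error = exc

# Formatting the number avoids the problem
fixed = "Files: %d" % len(files)
```

If this is what happens, the view fails before it ever reaches get_json(), and the request ends in an error response. That would also be consistent with the X-Aws-Sqsd-Receive-Count of 199 in the posted headers, although this is an inference from the listing and not something verified against a running environment. The later "MyJSON size: " + len(myJSON) and "Exception: " + e lines have the same shape.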

However, if I understand you correctly, Boto SQS messages are not only encoded as JSON; the JSON message itself is also base64 encoded. That would mean the request.get_json() method won't decode it for you.
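
To illustrate why a JSON parser alone would not be enough in that case, here is a small sketch using only the standard library. The payload and its scan_id key are hypothetical, and the base64 step merely imitates what the sender is assumed to do; it does not call Boto.

```python
import base64
import json

structure = {"scan_id": 42}  # hypothetical payload
plain = json.dumps(structure)

# What the body would look like if the sender base64-encodes it
wire = base64.b64encode(plain.encode("utf-8"))

# Parsing the encoded text directly as JSON fails
try:
    json.loads(wire.decode("ascii"))
    parsed_directly = True
except ValueError:
    parsed_directly = False

# Decoding base64 first recovers the original structure
recovered = json.loads(base64.b64decode(wire).decode("utf-8"))
```

The encoded body is plain ASCII text but not valid JSON, so any JSON helper would reject it until the base64 layer has been removed.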

Instead, use the request.get_data() method to access the raw POST data and do the decoding yourself; do first verify that the request.content_length value is within a tolerable size to prevent an attacker from sending you an overlarge message:

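import base64

from flask import json

# Inside the view. content_length is None when the header is missing.
size = request.content_length
if request.mimetype == 'application/json' and size is not None and size <= 1024**2:
    # message is JSON and no larger than 1 megabyte
    try:
        decoded = base64.b64decode(request.get_data())
        data = json.loads(decoded)
    except (ValueError, TypeError):
        # covers invalid base64 as well as invalid JSON
        mylog.exception('Failed to decode JSON')
    else:
        # do something with the decoded data
        mylog.warning("Decoded message: %r", data)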
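
Since you mention that a hand-entered raw JSON message did not work either, it may be worth accepting both forms while you are debugging. The following is a sketch only, written for Python 3; decode_sqs_body and MAX_BODY_BYTES are hypothetical names, and the function takes the bytes you would get from request.get_data().

```python
import base64
import json

MAX_BODY_BYTES = 1024 ** 2  # hypothetical limit: 1 megabyte


def decode_sqs_body(raw):
    """Return the parsed JSON from raw bytes, plain or base64-wrapped.

    Raises ValueError if the body is too large or cannot be decoded.
    """
    if len(raw) > MAX_BODY_BYTES:
        raise ValueError("message body too large")
    text = raw.decode("utf-8")
    try:
        # First attempt: the body is already plain JSON
        return json.loads(text)
    except ValueError:
        pass
    # Second attempt: strict base64, then JSON
    decoded = base64.b64decode(text, validate=True)
    return json.loads(decoded.decode("utf-8"))


plain_result = decode_sqs_body(b'{"a": 1}')
wrapped_result = decode_sqs_body(base64.b64encode(b'{"a": 1}'))
```

In the view you would call decode_sqs_body(request.get_data()) inside a try/except ValueError block. Note also that the posted view always returns 404; as far as I know the worker daemon only treats a 200 response as success, so you will probably want to return 200 once the message has been handled.
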
Licensed under: CC-BY-SA with attribution