Question

I'm developing a simple HTTPS proxy (written in Python) which receives POST/GET requests/responses, applies some transformation and finally forwards the result to the recipient. I need to handle chunked-encoded requests/responses in a "streaming" fashion, meaning that as soon as a chunk is received the proxy transforms it and forwards it to the recipient.

Before deciding to support chunked-encoded requests, I had been using mitmproxy (http://mitmproxy.org/) and it worked perfectly. Unfortunately, I noticed that it waits until the entire body is received before letting me handle the response/request.

How can I implement a proxy supporting chunked-encoded requests/responses? Has anyone here ever done something like this?

Thanks

EDIT: MORE INFO ON MY USE CASE

I need to handle POST requests and GET responses.

In the POST request I receive a JSON object and I have to encrypt some of its values.

In the GET response I receive a JSON object and I have to decrypt some of its values.

Till now, the following code has worked perfectly:

 def handle_request(self, r):
     if r.method == 'POST':
         # encrypt the values in r.get_form_urlencoded()
         pass

 def handle_response(self, r):
     if r.request.method == 'GET':
         # decrypt the values in r.content
         pass

How can I do the same thing with single chunks?

EDIT: UPDATES

After evaluating different solutions, I decided to go for Squid (proxy) + ICAP (content adaptation).

I've successfully configured Squid and the performance is great. Unfortunately, I can't find a suitable ICAP server (in Python, if possible) for doing content adaptation (modification). I thought this one https://github.com/netom/pyicap could do the job, but it looks like it doesn't read the body of my POST requests.

Do you guys know a Python ICAP server that I can use together with Squid?

Thanks


Solution

The answer below is outdated. You can now pass --stream to mitmproxy, whose behaviour is explained in the mitmproxy documentation.

mitmproxy developer here. This is definitely a feature we want for mitmproxy as well, but it's not that trivial and probably not coming very soon. If you really want to implement that yourself, I can recommend two things:

  1. If you have a very specific use case, you can employ libmproxy.protocol.http.HTTPRequest.from_stream for parsing the header and do the body processing yourself.
  2. If you do not want to modify the request/response body, you may find it sufficient to modify mitmproxy itself. In a nutshell, you would need to read the request/response without content (see 1.), modify it to your needs, pass it to the server and then hand control over to libmproxy.protocol.tcp (see https://github.com/mitmproxy/mitmproxy/blob/master/libmproxy/proxy/server.py#L169).

If you have further questions, don't hesitate to ask here or on mitmproxy's IRC channel.


Re Comment #1:

You can't take too much out of mitmproxy, but at least you can delegate the header parsing & processing.

# ...accept request, socket.makefile() etc...
req = HTTPRequest.from_stream(client_conn.rfile, include_content=False)
# manually forward to the server (req._assemble_head())
# manually receive request body chunk by chunk and forward it to the server, see
# https://github.com/mitmproxy/netlib/blob/master/netlib/http.py#L98
resp = HTTPResponse.from_stream(server_conn.rfile, include_content=False)
# manually forward headers
# manually process body and forward

That being said, this is a fairly complex topic. You may be better off hacking that directly into libmproxy.protocol.http.HTTPHandler.

Another option, again depending on your use case: use mitmproxy, set the conntype to tcp so that traffic is forwarded as-is, and apply regex replacements to the content in libmproxy.protocol.tcp. This is probably the easiest way, but also the hackiest. If you can provide some context, I may be able to guide you further in the right direction.
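
To illustrate why the regex route is hacky, here is a minimal standalone sketch. It does not use libmproxy at all: rewrite_segments, PATTERN and the "password" field are hypothetical names chosen for the example. Each segment is rewritten on its own, so a match that straddles two segments is missed, and a replacement that changes the length would also invalidate any Content-Length or chunk-size value that is forwarded untouched.

```python
import re

# Hypothetical pattern: mask the value of a "password" field.
PATTERN = re.compile(rb'"password":\s*"[^"]*"')


def rewrite_segments(segments, pattern=PATTERN, repl=b'"password": "***"'):
    """Apply a regex replacement to each raw segment independently."""
    for data in segments:
        yield pattern.sub(repl, data)


# The field arrives in one piece: the replacement is applied.
whole = list(rewrite_segments([b'{"password": "hunter2"}']))

# The same bytes split across two segments: nothing matches.
split = list(rewrite_segments([b'{"passw', b'ord": "hunter2"}']))

print(whole)
print(split)
```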


Re Comment #2:

Before we get to the main part: JSON is a really bad choice for streaming/chunking unless you encrypt the complete JSON object and treat it as a single string. You should definitely consider something like tnetstrings if you only want to encrypt parts.
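
A small sketch of the problem, using only the standard json module: the sender is free to cut the body at any byte offset, so a single chunk is usually not a parseable JSON document. The payload and the split point below are made up for the example.

```python
import json

body = json.dumps({"user": "alice", "token": "s3cr3t-value"}).encode()

# Pretend the sender split the body into two chunks in the middle.
middle = len(body) // 2
chunks = [body[:middle], body[middle:]]


def parses_alone(chunk):
    """Return True if the chunk is a complete JSON document by itself."""
    try:
        json.loads(chunk)
        return True
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False


alone = [parses_alone(c) for c in chunks]
joined = json.loads(b"".join(chunks))

print(alone)
print(joined["token"])
```

Neither chunk parses in isolation; only the reassembled body does, which is exactly the buffering a streaming proxy is trying to avoid.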

Apart from that, hooking into read_chunk works, but first you need to get to the point where you can actually receive chunks over the wire. Then, it's as simple as reading the single chunks, encrypting them and forwarding them.
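
As a rough sketch of that read/transform/forward loop, independent of mitmproxy: the functions below (read_chunks, write_chunk, relay_chunked) are hypothetical helpers written for this example, not part of libmproxy or netlib. They assume file-like objects such as those returned by socket.makefile(), and they drop chunk extensions and trailer headers for brevity.

```python
import io


def read_chunks(rfile):
    """Yield the payload of each chunk of a chunked-encoded body."""
    while True:
        size_line = rfile.readline()
        if not size_line:
            raise ValueError("connection closed before the last chunk")
        # The size is hexadecimal; anything after ';' is a chunk extension.
        size = int(size_line.split(b";", 1)[0].strip(), 16)
        if size == 0:
            # Last chunk: skip optional trailer headers up to the blank line.
            while rfile.readline() not in (b"\r\n", b"\n", b""):
                pass
            return
        data = rfile.read(size)
        if len(data) != size:
            raise ValueError("truncated chunk")
        if rfile.read(2) != b"\r\n":
            raise ValueError("chunk not terminated by CRLF")
        yield data


def write_chunk(wfile, data):
    """Write one chunk: hex size, CRLF, payload, CRLF."""
    wfile.write(b"%x\r\n%s\r\n" % (len(data), data))


def relay_chunked(rfile, wfile, transform):
    """Forward a chunked body, transforming each chunk as it arrives."""
    for data in read_chunks(rfile):
        out = transform(data)
        if out:  # a zero-length chunk would signal the end of the body
            write_chunk(wfile, out)
            wfile.flush()
    wfile.write(b"0\r\n\r\n")  # terminating chunk, no trailers
    wfile.flush()


# Usage with in-memory streams; bytes.upper stands in for the encryption.
src = io.BytesIO(b"5\r\nhello\r\n6\r\n world\r\n0\r\n\r\n")
dst = io.BytesIO()
relay_chunked(src, dst, bytes.upper)
print(dst.getvalue())
```

Because each chunk is re-framed with its new length, the transformation is free to change the size of the data, which matters if encryption expands it.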

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow