I think you typically wouldn't use HTTPClientFactory directly, as it seems it's just a thing that does HTTP requests and not much more. It's pretty low-level.
If you just want to fire a request, there are functions (twisted.web.client.getPage and .downloadPage) that construct the factory for you, handling both HTTP and HTTPS.
Agent gives you a higher-level abstraction: it keeps a connection pool, chooses between HTTP and HTTPS based on the URL, handles proxies, etc. And right, this is the thing you usually want to use.
It seems they don't share much code: Agent is to HTTP11ClientProtocol (and HTTP11ClientFactory) as getPage is to the old HTTPClientFactory (and its protocol, HTTPPageGetter). So there's a twisted.web.client vs. ._newclient (with Agent as its public API) duality. Historical reasons and backward compatibility, I'd guess.
Anyway, this library won't mix nicely with Agent out of the box, because its API is broken. twisted-socks's SOCKSWrapper declares that it implements the IStreamClientEndpoint interface, but the interface demands that the .connect method return a Deferred that fires with an IProtocol provider (see docs), while SOCKSWrapper returns one that fires with the address (here's the line that does this).
It seems you can easily fix it by changing that line to:

    self.handshakeDone.callback(self.transport.protocol)

Once you do that, you should be able to use twisted-socks with Agent. Here's an example (using inlineCallbacks and the new react, but you could just as well use the standard .addCallback with Deferreds and reactor.run()):

    from twisted.internet.endpoints import TCP4ClientEndpoint
    from twisted.internet.defer import inlineCallbacks
    from twisted.internet.task import react
    from twisted.web.client import ProxyAgent, readBody

    from socksclient import SOCKSWrapper


    @react
    @inlineCallbacks
    def main(reactor):
        # The host we ultimately want to reach.
        target = TCP4ClientEndpoint(reactor, 'example.com', 80)
        # Wrap the target so the connection goes through the SOCKS proxy
        # listening on localhost:9050.
        proxy = SOCKSWrapper(reactor, 'localhost', 9050, target)
        agent = ProxyAgent(proxy)
        # Byte strings work on Python 2 and are required by Twisted on Python 3.
        response = yield agent.request(b'GET', b'http://example.com/')
        body = yield readBody(response)
        print(body)

Also, there's the txsocksx library, which seems nicer to use (and is pip-installable!). The API is pretty much the same, except that the roles are swapped: the plain endpoint you create points at the proxy, and the target host and port go to the SOCKS endpoint:

    from twisted.internet.endpoints import TCP4ClientEndpoint
    from twisted.internet.defer import inlineCallbacks
    from twisted.internet.task import react
    from twisted.web.client import ProxyAgent, readBody

    from txsocksx.client import SOCKS5ClientEndpoint


    @react
    @inlineCallbacks
    def main(reactor):
        # Plain TCP endpoint for the SOCKS proxy itself.
        proxy = TCP4ClientEndpoint(reactor, 'localhost', 9050)
        # The SOCKS endpoint asks the proxy to connect to the target for us.
        proxied_endpoint = SOCKS5ClientEndpoint('example.com', 80, proxy)
        agent = ProxyAgent(proxied_endpoint)
        # Byte strings work on Python 2 and are required by Twisted on Python 3.
        response = yield agent.request(b'GET', b'http://example.com/')
        body = yield readBody(response)
        print(body)