Вопрос

I need a way to manage connections to a hosted Elastic Search provider, to speed up search on my website. We are running Django on Heroku, using the Found ElasticSearch add-on, and pyes, which is an ElasticSearch Python library.

The standard way of setting up a connection to ElasticSearch with pyes is by passing the provider URL into an ES object, like so:

(1) connection = ES(my_elasticsearch_url)

Pyes uses the ES object behind the scenes to establish an open HTTP connection to my ElasticSearch provider, so I can run searches like this:

(2) results = connection.search(some_query, index_name)

Previously, I was doing both those steps in my Django view for search -- every time a user did a search, it opened a new HTTP connection then ran the search. Consequentially the search call was slow.

I sped up search by moving (1) into my app's __init__.py file -- now, I am setting up the connection only once, and importing it into the search view. But I'm worried it will choke that HTTP connection if lots of people are trying search at once.

I'm looking for ideas on how to set up a pool of connections, initiate them once on app start up, and then dole them out to my search view as needed. Ideally I'd like to be able to scale the size of the pool up and down easily with minimal changes to my code.

I can think of a few ways to approach it, but it seems like a common computing related problem, so I'm sure that a lot of you have ideas on good design and best practices for such a system. I'd love to hear them.

Thanks a lot!

Clay

Это было полезно?

Решение

If your running in a multi-threaded environment, it's merely a matter of extending Queue.Queue to create an instance that can fetch and instantiate connections on demand, from multiple threads in which your views are handling the request-response flow. You'll probably want to have a certain cap on how many connections your retaining by limiting the maximum size of the queue, although you can instantiate more connections beyond that and simply discard them if you can put them back into the queue.

The downside of using Queue.Queue is that it can create cross-cutting concerns if your views are responsible for retrieving connections from and returning them back into the queue. You can get a healthier design if you only queue the actual object from pyes.ES that holds the connection and create a wrapper for ES that, when performing a query, creates a new ES instance, fetches a connection from the queue, sets it on the instance, performs the query, returns the connection back into the queue, discards the ES instance and returns the query results.

Лицензировано под: CC-BY-SA с атрибуция
Не связан с StackOverflow
scroll top