Question

I'm trying to figure out why one of the queries in my rails app is under performing quite considerably. I'm using Postgres 9.3, and rails 4.0.3 with jRuby 1.7.10 so potentially it's an issue with the JDBC driver?

But basically, it's a VERY simple query:

SELECT * FROM table;

The table contains 851 rows so it's hardly a massive dataset, so I'm expecting a fast query. When I execute this query in pgAdmin 3, I get exactly what I expect: all rows returned in anywhere between 15 and 35ms. Nice and fast!

From rails however it's a different story. Running the query in a rails console, the fastest I've been able to achieve is 189ms whilst generally it is around the 200ms mark. This query is run by calling Table.all

My initial thought was simply that ActiveRecord is adding overhead in the instantiation of 851 objects, so that's obviously slowing it down. To test this, I ran

ActiveRecord::Base.connection.execute("SELECT * FROM table")

There was a slight speedup, but again, almost all queries were taking around 150ms, still a long way off the pgAdmin mark. As a last attempt I tried

ActiveRecord::Base.connection.exec_query_raw("SELECT * FROM table")

But this didn't improve the performance at all.

I'm really stumped now as to why this is so much slower, given that I am seeing a 10x performance decrease between pgAdmin and Rails. Having executed just the raw SQL in Rails, I know it's not ActiveRecord slowing things down, so I'm really confused now as to what is.

Does anyone know why this is so much slower than it should be?

UPDATE

I've done some more digging, and it seems to be in the way rails is handling Date fields. If I manually select all of the columns in the table, it is just as slow, but if I select them all except for updated_at and created_at, the query runs in around 2-4ms which is perfect!

My only issue now then, is how to get around this. Is there a way of fixing rails performance issue with dates, or having rails not parse them as dates and keep them as strings or similar?

UPDATE 2

So after doing some more digging and @stonehz pointing me to the bug raised from this post, I've upgraded to Jruby 1.7.12 and rails 4.1.0 and noticed a considerable speed up. It's it's not a lot closer to the performance of pgAdmin, but I think that without removing the date columns completely, I'm not going to get anything better. Below is the benchmark I'm now getting

SELECT *:                        4.080000   0.330000   4.410000 (  5.243000)
SELECT date_fields:              1.960000   0.020000   1.980000 (  2.032000)
SELECT * - date_fields:          3.070000   0.070000   3.140000 (  3.247000)
--------------------------------------------------------- total: 9.530000sec

                                     user     system      total        real
SELECT *:                        3.700000   0.060000   3.760000 (  4.663000)
SELECT date_fields:              1.790000   0.020000   1.810000 (  2.021000)
SELECT * - date_fields:          2.330000   0.060000   2.390000 (  3.180000)

This benchmark is querying 851 rows. The first test is a simple SELECT * statement. The second test only selects the date fields, and the final test selects all fields except for the date fields. Each query is being run 100 times to get the final result.

As this shows, the select * statement is now only taking ~4 seconds to run 100 times, so each query is taking only 40ms which is much closer to the pgAdmin time of ~30ms. MUCH BETTER!

Was it helpful?

Solution

They have addressed the issue and opened couple tickets around the bug you found:

https://github.com/jruby/jruby/issues/1662

https://github.com/jruby/activerecord-jdbc-adapter/issues/540

Using Jruby 1.7.12 will improve performance around 5x (as their benchmarks imply)

OTHER TIPS

I'm really stumped now as to why this is so much slower, given that I am seeing a 10x performance decrease between pgAdmin and Rails. Having executed just the raw SQL in Rails, I know it's not ActiveRecord slowing things down, so I'm really confused now as to what is.

It's not ActiveRecord as much as it is your query.

When you run this query in pg admin, it doesn't actually run it insofar as I'm aware. PgAdmin makes a few assumptions on your usage, namely that you'll frquently end up with a huge set. In the interest of performance, it's more interesting touse a cursor so as to allow you to navigate the huge set by fetching rows as needed rather than all at once. I'd gather that this is precisely what is going on.

When you run the same query in your app, in contrast, you're responsible for doing this kind of thing. Or for that matter, not running this kind of query to begin with. Needing to select all rows from a database table is usually a sign that something's wrong in your app's design.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top