Question

I have an open source Java application that uses Hibernate and HSQLDB for persistence. In all my toy tests, things run fast and everything is good. I have a client who has been running the software for several months continuously and their database has grown significantly over that time, and performance has dropped gradually. It finally occurred to me that the database could be the problem. As far as I can tell from log statements, all of the computation in the server happens quickly, so this is consistent with the hypothesis that the DB might be at fault.

I know how to do normal profiling of a program to figure out where hot spots are and what is taking up significant amounts of time. But all the profilers I know of monitor execution time within the program and don't give you any help about calls to external resources. What tools do people use to profile programs that are using external db calls to find out where to optimize performance?

A little blind searching around has already found a few hot spots--I noticed a call where I was enumerating all the objects of a particular class in order to find out whether there were any. A one line change to the criterion [.setMaxResults(1)] changed that call from a half-second to virtually instantaneous. I also see places where I ask the same question from the db many times within a single transaction. I haven't figured out how to cache the answer yet, but what I really want is a tool to help me look for these kinds of things more systematically.

Was it helpful?

Solution

Unfortunately, as far as I know, there is no tool for that.

But there are some things you might want to check:

  • Are you using eager loading instead of lazy loading? By the description of your problem, it really looks like you are not using lazy loading...
  • Have you turned on and properly configured your second-level caching? Including the Query Cache? Hibernate caching mechanism is extremely powerful and flexible.
  • Have you consider using Hibernate Search? Depending on your query, Hibernate Search Full Text index on top of Apache Lucene can speed up you queries (since it indexing system is so powerful)

OTHER TIPS

How much data are you storing in HSQLDB? I don't think it performs well when managing large sets of data, since it's just storing everything in files...

There was once a tool called IronGrid/IronEye/IronTrackSql that did exactly what you are looking for. Unfortunately, they went out of business. They did open source their product at the last minute, but I have not been able to find source or a binary for quite some time.

I have been using YourKit for profiling lately, partly because you can have it profile SQL time to find your most called statements and longest running statements. It is not as detailed as IronGrid was, but it does give you valuable information. In my latest database/hibernate tuning session, the problem turned out to be hibernate and how and when it was doing eager vs. lazy loading, and adding some judicious overrides of the default when selecting large numbers of items.

Lots to report on here. I have some results, and am still looking for good answers.

I've found a couple of tools that help:

VisualVM (with BTrace, or the built in Trace) claims to help with tracing, but I haven't been able to find any tool that shows timing on method calls.

YourKit is reputed to be useful; I've asked for an open source license.

The most useful thing I found is Hibernate's built in statistics. If you set hibernate.generate_statistics true in your properties, you can send sessionFactory.getStatistics(), and see detailed statistics on what objects have been stored and retrieved and what affects the caches are having. I found one of the answers I wanted in the qeuryStatistics, which reports for each compiled query, the cache hits and misses, the number of times the query has run, how many rows were returned, and the average, max and min execution times. These timings made it abundantly clear where the time was going.

I then did some reading on caching. Razenha's suggestion was right on. [I'll mark his answer as right for now.] I added hibernate.cache.use_query_cache true to my properties, and added query.setCacheable(true); to most of my queries. I also added <cache usage="read-write"/> to a few of my .hbm.xml files. Now most of my statistics are showing a vast predominance of cache hits, and the performance is vastly better.

I'd still like some tools to help me trace execution timing so I can attack the worst problems rather than the most obvious, but this is a big help. Maybe one of the tracing tools above will turn out to help.

In Terracotta 3.1, you can monitor all of those statistics in real-time using the Terracotta Developer Console. You can see historical graphs for cache statistics, and see the hibernate statistics or cache statistics cluster-wide or on an per-node basis.

Terracotta is open source. More details and download is at Terracotta for Hibernate.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top