Well, this is embarrassing... As it turns out, the relatively poor performance was due to running my test code as part of a simple web app (I know, that doesn't make sense in the context of this question), which I ran with lein ring server; I guess that is the same as running it from the REPL (I just didn't make that connection). When I compiled and packaged the app with lein uberjar and then executed that jar with java -jar, I got performance comparable to the Java app.
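For reference, that workflow looks something like this (the jar name is a placeholder; it depends on your project.clj settings):

```shell
# Build an ahead-of-time-compiled standalone jar instead of running
# the app through Leiningen's development server:
lein uberjar

# Run the packaged app directly on the JVM:
java -jar target/myapp-0.1.0-standalone.jar
```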
What is an efficient way to access large-ish datasets using JDBC with Clojure?

30-11-2021

Question
EDIT
N00b problem, as it turns out. I didn't realize that running lein ring server results in your app being run in interpreted mode, which is why it was much slower.
Can the following Clojure/JDBC fragment be optimized so that it runs (much) faster?
(defn test-sql []
  (sql/with-connection (db-connection)
    (sql/with-query-results results ["select * from users order by username asc"]
      (doseq [row results]
        (println "User" (row :first_name) (row :last_name))))))
I am considering using Clojure for an ETL project. The first test I wrote prints out data from a table with ~280K records in it. The implementations I have come up with so far have been quite slow: what takes ~12 seconds in Java (even using myBatis to populate objects rather than 'raw' access) takes ~9.5 minutes with my Clojure solution.
I tried map instead of doseq, and tried using a cursor as outlined here: http://asymmetrical-view.com/2010/10/14/clojure-and-large-result-sets.html, but I get about the same execution time for each.
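For reference, the cursor approach from that article amounts to passing an options map as the first element of the sql-params vector; a minimal sketch against the contrib-era clojure.java.jdbc API (the option values are illustrative, and db-connection is the connection map used above):

```clojure
(defn test-sql-cursor []
  (sql/with-connection (db-connection)
    ;; A transaction is required for some drivers (e.g. PostgreSQL)
    ;; to honor :fetch-size and stream rows instead of buffering
    ;; the whole result set in memory.
    (sql/transaction
      (sql/with-query-results results
        [{:fetch-size 1000            ; rows per round trip
          :concurrency :read-only
          :result-type :forward-only}
         "select * from users order by username asc"]
        (doseq [row results]
          (println "User" (row :first_name) (row :last_name)))))))
```

Given how the question was eventually resolved (the development server was the real culprit), this mainly matters for keeping memory bounded on large result sets rather than for raw speed.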
FWIW, I get the same result when printing via (.println java.lang.System/out ...) (not surprising), and when using with-query-results*:
(defn test-sql2 []
  (sql/with-connection (db-connection)
    (sql/with-query-results* ["select * from users order by username asc"]
      (fn [row] (println "User" (row :first_name) (row :last_name))))))
same, same.
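One caveat about that last variant, as an assumption worth double-checking against the clojure.java.jdbc source: with-query-results* appears to invoke the supplied function once with the whole lazy result seq, not once per row, so a per-row callback would be spelled more like:

```clojure
(defn test-sql3 []
  (sql/with-connection (db-connection)
    (sql/with-query-results* ["select * from users order by username asc"]
      (fn [results]
        ;; results is the full lazy seq of row maps
        (doseq [row results]
          (println "User" (row :first_name) (row :last_name)))))))
```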
Solution