Question

I usually write my inserts like this:

query = "
  INSERT INTO
    SomeTable (Column1, Column2)
  VALUES
    #{data.map { |row| "(#{row[0]},#{row[1]})" }.join(?,)}
"

But I shudder to think about the massive strings being created when my data numbers in the hundreds of thousands of rows. Is there a more proper way to do this? All of the string interpolation seems really smelly.

I'm using mysql2 version 0.3.15. Note that this is not Rails or Active Record related.


The solution

If you're loading a lot of data, you can create a CSV file, move it to the DB server and let the database manager load it directly. It'll be a LOT faster.

See "13.2.6 LOAD DATA INFILE Syntax" for more information.
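As a sketch of that flow (the `Mysql2::Client` settings, file path, and the `SomeTable` columns from the question are assumptions; the commented-out part needs a live connection with `local_infile` enabled):

```ruby
require 'csv'
# require 'mysql2'  # the gem from the question (v0.3.15)

data = [[1, 2], [3, 4], [5, 6]]

# Dump the rows to a CSV file that the server (or the client, with LOCAL) can load.
path = 'some_table.csv'
CSV.open(path, 'w') do |csv|
  data.each { |row| csv << row }
end

# Then hand the file to MySQL in one statement (hypothetical connection settings):
# client = Mysql2::Client.new(host: 'localhost', username: 'user', local_infile: true)
# client.query("LOAD DATA LOCAL INFILE '#{path}' INTO TABLE SomeTable " \
#              "FIELDS TERMINATED BY ',' (Column1, Column2)")
```

The database then parses the file itself, so Ruby never builds one giant SQL string at all.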

Other tips

I have created the following benchmark to compare which concatenation method is best in your scenario. It looks like you are already using the fastest one.

                           user     system      total        real
<<                     6.540000   0.000000   6.540000 (  6.541577)
+                      7.580000   0.000000   7.580000 (  7.590544)
#{one}#{two}           5.700000   0.000000   5.700000 (  5.703538)

require 'benchmark'

values = Array.new(1000) { |i| [i.to_s, (i + 1).to_s] }

Benchmark.bm(20) do |x|
  x.report('<<') do
    10_000.times do
      values.map { |row| "(" << row[0] << "," << row[1] << ")" }.join(?,)
    end
  end
  x.report('+') do
    10_000.times do
      values.map { |row| "(" + row[0] + "," + row[1] + ")" }.join(?,)
    end
  end
  x.report('#{one}#{two}') do
    10_000.times do
      values.map { |row| "(#{row[0]},#{row[1]})" }.join(?,)
    end
  end
end
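One further variant, not covered by the benchmark above, is to append everything into a single pre-created buffer with `<<`, which avoids the intermediate array of fragment strings that `map`/`join` builds:

```ruby
values = Array.new(1000) { |i| [i.to_s, (i + 1).to_s] }

# Build the whole VALUES clause into one mutable buffer instead of map + join.
buffer = String.new
values.each_with_index do |row, i|
  buffer << ',' unless i.zero?
  buffer << '(' << row[0] << ',' << row[1] << ')'
end
```

The result is identical to the interpolated version; whether it is actually faster would need its own benchmark run.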

Well, part of the problem is that queries are sent as strings, so no matter how you do it, you eventually need to build a string. For your specific case, I would suggest batching your queries: insert a portion of your data on each loop iteration to avoid constructing huge strings and overly stressing your database.

I don't know enough about Ruby string internals to tell you the best way to construct your string, but with a good batch size (maybe 100 records at a time) you should be fine with what you have.
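A minimal sketch of that batching idea, assuming `client` responds to `query` like a `Mysql2::Client` does (the table and column names are the hypothetical ones from the question):

```ruby
BATCH_SIZE = 100

# Send the rows in slices so no single INSERT statement grows unbounded.
def insert_in_batches(client, data, batch_size = BATCH_SIZE)
  data.each_slice(batch_size) do |batch|
    values = batch.map { |row| "(#{row[0]},#{row[1]})" }.join(',')
    client.query("INSERT INTO SomeTable (Column1, Column2) VALUES #{values}")
  end
end
```

For 250,000 rows this issues 2,500 moderate-sized statements instead of one enormous one, which also keeps each transaction and network packet to a manageable size.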

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow