Question

This question is similar to these:

I have to index certain features from the Wikipedia XML dump. The parsing is fast. However, inserting is slow.

Switching off indexing doubled the speed.

I batch insert like this:

    articles.grouped(5000)
      .foreach {
        batch: IterableView[(Article, List[Category], List[Link]), Iterable[_]] =>
          // Save each batch in one transaction
          database withTransaction {
            implicit session =>
              for (i <- batch) {
                articles += i._1
                categories ++= i._2
                links ++= i._3
              }
          }
      }

I read that journal_mode = MEMORY and synchronous = OFF increase the insert speed. How do I set these with Slick? I am using c3p0 as a connection pool and added PRAGMA journal_mode = MEMORY to preferredTestQuery, but I don't believe this is the right way to set these options.

Thanks for your help!


Solution

It seems you are reading the data once, locally. You could skip transactions entirely and use withSession instead. If you still need a pragma, you can set it via plain SQL. You probably want to reset the pragma afterwards so that it does not leave a side effect on the connection.

import scala.slick.jdbc.StaticQuery.interpolation

database withSession {
  implicit session =>
    // Relax durability for the bulk load; this pragma affects only the current connection
    sqlu"PRAGMA synchronous=OFF".execute
    articles.grouped(5000)
      .foreach {
        batch: IterableView[(Article, List[Category], List[Link]), Iterable[_]] =>
          // No explicit transaction here: the statements run in auto-commit mode
          for (i <- batch) {
            articles += i._1
            categories ++= i._2
            links ++= i._3
          }
      }
}

It is also worth knowing that session, not only database, has a withTransaction method. So you can call session.withTransaction inside a withSession block and reuse the same connection.
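
The two suggestions above (resetting the pragma and calling session.withTransaction inside withSession) could be combined roughly as follows. This is only a sketch against the Slick 2.x API used in this answer: the Titles table, the load function and the database path are hypothetical, and restoring synchronous to FULL assumes that FULL was the connection's setting before the load.

```scala
import scala.slick.driver.SQLiteDriver.simple._
import scala.slick.jdbc.StaticQuery.interpolation

object BatchedLoadSketch {
  // Hypothetical one-column table, used only for illustration.
  class Titles(tag: Tag) extends Table[String](tag, "titles") {
    def title = column[String]("title")
    def * = title
  }
  val titles = TableQuery[Titles]

  // Inserts `rows` in batches of 5000 and returns the resulting row count.
  def load(path: String, rows: Seq[String]): Int = {
    val database = Database.forURL(s"jdbc:sqlite:$path", driver = "org.sqlite.JDBC")
    database withSession { implicit session =>
      titles.ddl.create
      sqlu"PRAGMA synchronous=OFF".execute
      try {
        rows.grouped(5000).foreach { batch =>
          // One transaction per batch, all on the session's single connection.
          session.withTransaction {
            titles ++= batch
          }
        }
      } finally {
        // Assumption: the setting before the load was FULL.
        sqlu"PRAGMA synchronous=FULL".execute
      }
      titles.list.size
    }
  }
}
```

The pragma is changed outside of any transaction on purpose, since SQLite does not apply a change to synchronous while a transaction is open.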

Other tips

In Slick 3.2.0, the following lines seem to work:

import org.sqlite.SQLiteConfig
import slick.jdbc.JdbcBackend.Database
import slick.jdbc.SQLiteProfile.api._
...
val sqliConfig = new SQLiteConfig()
sqliConfig.setJournalMode(SQLiteConfig.JournalMode.MEMORY)
sqliConfig.setSynchronous(SQLiteConfig.SynchronousMode.OFF)
val emailsDB = Database.forURL(
  "jdbc:sqlite:/path/to/my/dbfile.sqlite",
  driver = "org.sqlite.JDBC",
  prop = sqliConfig.toProperties
)

where org.sqlite comes from:

libraryDependencies += "org.xerial" % "sqlite-jdbc" % "3.16.1"
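
For completeness, a database configured this way might be used for a batched load along these lines. It is a sketch only, assuming Slick 3.2 and the sqlite-jdbc dependency above; the Titles table, the load function and the path argument are made up for illustration. Each batch runs as one transactional action, and blocking with Await is just to keep the example short.

```scala
import org.sqlite.SQLiteConfig
import slick.jdbc.SQLiteProfile.api._
import scala.concurrent.Await
import scala.concurrent.duration.Duration

object PragmaBulkInsertSketch {
  // Hypothetical one-column table, used only for illustration.
  class Titles(tag: Tag) extends Table[String](tag, "titles") {
    def title = column[String]("title")
    def * = title
  }
  val titles = TableQuery[Titles]

  // Inserts `rows` in batches of 5000 and returns the resulting row count.
  def load(path: String, rows: Seq[String]): Int = {
    val config = new SQLiteConfig()
    config.setJournalMode(SQLiteConfig.JournalMode.MEMORY)
    config.setSynchronous(SQLiteConfig.SynchronousMode.OFF)
    val db = Database.forURL(
      s"jdbc:sqlite:$path",
      driver = "org.sqlite.JDBC",
      prop = config.toProperties
    )
    try {
      Await.result(db.run(titles.schema.create), Duration.Inf)
      rows.grouped(5000).foreach { batch =>
        // One transaction per batch.
        Await.result(db.run((titles ++= batch).transactionally), Duration.Inf)
      }
      Await.result(db.run(titles.length.result), Duration.Inf)
    } finally db.close()
  }
}
```

Because the pragmas travel with the connection properties, they should not need to be re-issued by hand for each new connection.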

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow