Question

sqldf and RMySQL are both R packages that allow access to a MySQL database (the former using the latter). They both allow statements like this:

RMySQL: "Run an arbitrary SQL statement and extract all its output (returns a data.frame):"

dbGetQuery(con, "select count(*) from a_table")
dbGetQuery(con, "select * from a_table") 

sqldf:

library(sqldf)
sqldf("select * from iris limit 5")
sqldf("select count(*) from iris")
sqldf("select Species, count(*) from iris group by Species")
# create a data frame
DF <- data.frame(a = 1:5, b = letters[1:5])

So what are the differences? What does sqldf offer that RMySQL doesn't?


Solution

sqldf lets you issue SQL statements that act on R data frames. In the sqldf examples above, iris is not a database table but a built-in data set:

> head(iris, n=3)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa

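sqldf is not primarily a tool for connecting to an existing database server. By default it copies the data frames named in the query into a temporary SQLite database, runs the statement there, and returns the result as a data frame.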
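As a minimal sketch of this point, the snippet below runs a query against the DF data frame from the question. The WHERE clause, the ORDER BY clause and the variable name res are illustrative choices, not part of the original post.

```r
library(sqldf)

# A plain data frame living in the R session; no database server is involved.
DF <- data.frame(a = 1:5, b = letters[1:5])

# sqldf looks up DF by name in the calling environment and runs the SQL on it.
# (res is just an illustrative variable name.)
res <- sqldf("select a, b from DF where a > 3 order by a")
print(res)
```

With RMySQL, the same select would only work if DF already existed as a table on the MySQL server behind con.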

OTHER TIPS

Besides Lundberg's observation that data frames are acceptable targets for SQL commands, sqldf can also run against any disk-resident table in SQLite (the default), H2, MySQL, or PostgreSQL: https://code.google.com/p/sqldf/
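A rough sketch of the disk-resident case, assuming the SQLite backend: the database file, the table name flowers and the variable names are all hypothetical and created here only for illustration. The file is built first with DBI and RSQLite so that it already exists when sqldf is pointed at it through its dbname argument.

```r
library(sqldf)

# Hypothetical throwaway SQLite file, created only for this illustration.
db_file <- tempfile(fileext = ".sqlite")
con <- DBI::dbConnect(RSQLite::SQLite(), db_file)
DBI::dbWriteTable(con, "flowers", iris)  # "flowers" is an assumed table name
DBI::dbDisconnect(con)

# Query the table stored on disk; flowers is not a data frame in the R session.
n_rows <- sqldf("select count(*) as n from flowers",
                dbname = db_file, drv = "SQLite")
print(n_rows)
```

Passing drv explicitly keeps sqldf on SQLite even if another database driver package happens to be loaded in the session.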

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow