I am using RMySQL to fetch rows from a table (it is too large to post here, but it is essentially all numbers: 10 columns and about 12,000 rows). When I run fetch(con, n=-1), I get the following warning: RS-DBI driver warning: (error while fetching rows), and only 1713 rows are returned.

If I drop some of the columns from the query, the fetch works fine. Does anyone know what could be causing this? I don't even know where to start debugging. Could it be a server-side setting? My R session has more than enough memory (20 GB).

Solution

Is each column a number, or a list of numbers? That is, how many bytes are in each column?

I've run into this problem before, and each time it was because I was trying to pull too much data too fast. In those cases, making multiple calls with smaller values of n can sometimes work. Then again, the rows in the databases I've had trouble with were huge.

Other tips

A better solution: instead of n=-1, try passing a very large number such as n=1000000. The error no longer appeared once I did this. In my case, I fetched about 1.13 million rows.

I have run into the same kind of problem:

  1. Fetch all rows at once:

    df = dbFetch(res, n = -1)

    => It returned only part of the result set and then stopped fetching rows.

  2. Using a loop to fetch by chunks:

    df = NULL  # start empty so rbind() has something valid to append to
    while (!dbHasCompleted(res)) {
        chunk = dbFetch(res, n = 1000)
        print(nrow(chunk))
        df = rbind(df, chunk)
    }

    => It returned chunks for a while and then ran into an infinite loop of zero-sized chunks (printing "[1] 0" forever), even though the result set had not been fully fetched: dbHasCompleted(res) == FALSE.
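One defensive way to write the chunked loop is to stop as soon as a chunk comes back empty, so the loop cannot spin forever. The following is only a minimal sketch under stated assumptions: `fetch_in_chunks` is a hypothetical helper (not part of DBI or RMySQL), the `measurements` table is made up, and an in-memory SQLite database (via the RSQLite package) stands in for the MySQL server so the example runs without one. It does not fix whatever makes the driver stop early; it only turns the endless loop into a warning.

```r
library(DBI)

# Assumption: in-memory SQLite replaces MySQL so the sketch is self-contained.
# With RMySQL the connection would come from dbConnect(RMySQL::MySQL(), ...).
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "measurements",
             data.frame(id = 1:2500, value = sqrt(1:2500)))

# Hypothetical helper: fetch a result set in chunks of `chunk_size` rows.
fetch_in_chunks <- function(res, chunk_size = 1000) {
  chunks <- list()
  while (!dbHasCompleted(res)) {
    chunk <- dbFetch(res, n = chunk_size)
    if (nrow(chunk) == 0) {
      # An empty chunk while the result is still not flagged as complete
      # is the situation described above: warn and stop instead of looping.
      if (!dbHasCompleted(res)) {
        warning("empty chunk before completion; result may be incomplete")
      }
      break
    }
    chunks[[length(chunks) + 1]] <- chunk
  }
  # Bind once at the end rather than growing a data frame inside the loop.
  do.call(rbind, chunks)
}

res <- dbSendQuery(con, "SELECT * FROM measurements")
all_rows <- fetch_in_chunks(res)
dbClearResult(res)
dbDisconnect(con)
```

Collecting the chunks in a list and binding them once also avoids the repeated copying that rbind() inside the loop incurs on large results.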

Then, I used this strategy:

First, run a query of the form "select count(1) from table where ..." to get the size of the result set. Then add 1 to that count [row_count = as.integer(dbFetch(res, n = 1)) + 1] and pass this "count + 1" as the n parameter when fetching the next query, so that all rows come back at once. This has worked so far, but then I learned about this form:

    my_df = dbGetQuery(con, my_query)

This is a much better method; I have found no bugs with it yet.
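For comparison, the two approaches from this answer might look roughly like the sketch below. The assumptions are mine, not the poster's: the `measurements` table and the `id > 100` filter are invented for illustration, an in-memory SQLite database (RSQLite) stands in for MySQL so the code runs without a server, and the count is read with dbGetQuery for brevity rather than with dbFetch as described above.

```r
library(DBI)

# Assumption: in-memory SQLite replaces MySQL so the sketch is self-contained.
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "measurements",
             data.frame(id = 1:500, value = (1:500) / 10))

# Approach 1: count the rows first, then request "count + 1" rows explicitly.
row_count <- as.integer(
  dbGetQuery(con, "SELECT COUNT(1) AS n FROM measurements WHERE id > 100")$n
)
res <- dbSendQuery(con, "SELECT * FROM measurements WHERE id > 100")
counted_df <- dbFetch(res, n = row_count + 1)
dbClearResult(res)

# Approach 2: dbGetQuery sends the query, fetches every row, and clears
# the result set in a single call.
my_df <- dbGetQuery(con, "SELECT * FROM measurements WHERE id > 100")

dbDisconnect(con)
```

Besides being shorter, the second form leaves no open result set to clear, which is one less thing to get wrong.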

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow