Question

I know that I can read a very large CSV file much faster with fread from the data.table package than with read.csv, which reads the file in as a data.frame. However, dplyr can only perform operations on a data.frame.

My questions are:

  1. Why was dplyr built to work with the slower of the two data structures?
  2. When working with big data, is it good practice to read the file in as a data.table and then convert it to a data.frame to perform dplyr operations?
  3. Is there another strategy I am missing?
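
For concreteness, the workflow described in question 2 might look roughly like the sketch below. The temporary file and the column names group and value are hypothetical, made up only so the example runs on its own; a real case would point fread at the large CSV instead.

```r
library(data.table)
library(dplyr)

# Hypothetical small CSV standing in for the very large file
tmp <- tempfile(fileext = ".csv")
write.csv(
  data.frame(group = c("a", "b", "a"), value = c(1, 2, 3)),
  tmp,
  row.names = FALSE
)

dt <- fread(tmp)         # fast read; returns a data.table
df <- as.data.frame(dt)  # convert to a plain data.frame

# dplyr operations on the converted data.frame
result <- df %>%
  group_by(group) %>%
  summarise(total = sum(value))

print(result)
```

The as.data.frame step is the conversion the question asks about; whether it is needed or advisable is exactly what is unclear to me.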

No correct solution

Licensed under: CC-BY-SA with attribution