Theoretical Question: Data.table vs Data.frame with Big Data
31-10-2019
Question
I know that I can read a very large csv file much faster with fread from the data.table library than with read.csv, which reads the file in as a data.frame. However, dplyr can only perform operations on a data.frame.
My questions are:
- Why was dplyr built to work with the slower of the two data structures?
- When working with big data, is it good practice to read the file in as a data.table and then convert it to a data.frame to perform dplyr operations?
- Is there another strategy I am missing?
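For concreteness, the workflow described in the second question might look like the sketch below. The sample file and its column names (id, group, value) are hypothetical and exist only to make the example self-contained. It is an illustration of the read-then-convert pattern, not a recommendation.

```r
library(data.table)
library(dplyr)

# Hypothetical sample file, written only so the sketch is self-contained.
csv_path <- tempfile(fileext = ".csv")
write.csv(
  data.frame(
    id    = 1:6,
    group = rep(c("a", "b"), 3),
    value = c(10, 20, 30, 40, 50, 60)
  ),
  csv_path,
  row.names = FALSE
)

# Fast read: fread() returns a data.table by default.
dt <- fread(csv_path)

# Convert to a plain data.frame (this makes a copy of the data).
df <- as.data.frame(dt)

# Ordinary dplyr verbs on the converted object.
result <- df %>%
  group_by(group) %>%
  summarise(mean_value = mean(value))

print(result)
```

Two details may be worth checking before settling on this pattern: fread() accepts a data.table = FALSE argument that returns a data.frame directly, and a data.table object also carries the data.frame class (inherits(dt, "data.frame") is TRUE), so the explicit conversion may not always be needed.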
No correct solution
Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange