Perhaps a better option than the alternatives so far is to use an array to manipulate your data into your desired structure. Since you are dealing with a single vector and you want to fill your data in by columns, you just need to assign the dims to your vector.
Here is a simplified example. We'll start with a vector of length 40.
mydata <- rep(1:8, each = 5)
mydata
# [1] 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 4 4 4 4 4
# [21] 5 5 5 5 5 6 6 6 6 6 7 7 7 7 7 8 8 8 8 8
Now, imagine we want to convert this into four columns where the first 20 values are grouped together and the second 20 values are grouped together. (In your data, it would be the first 24*18 values grouped together to represent 18 columns of records for one day.)
Here's how we would do that:
myarray <- array(mydata, dim = c(5, 4, 2),
                 dimnames = list(NULL, NULL,
                                 c("2012-01-01", "2012-01-02")))
myarray
# , , 2012-01-01
#
# [,1] [,2] [,3] [,4]
# [1,] 1 2 3 4
# [2,] 1 2 3 4
# [3,] 1 2 3 4
# [4,] 1 2 3 4
# [5,] 1 2 3 4
#
# , , 2012-01-02
#
# [,1] [,2] [,3] [,4]
# [1,] 5 6 7 8
# [2,] 5 6 7 8
# [3,] 5 6 7 8
# [4,] 5 6 7 8
# [5,] 5 6 7 8
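As a quick check of the "assign the dims" idea, here is a minimal sketch that sets dim directly on a copy of the vector and compares the result with the array() call. The name mydata2 is introduced only for this illustration.

```r
mydata <- rep(1:8, each = 5)

# Copy the vector and assign dimensions directly; R fills column by column
mydata2 <- mydata            # mydata2 is a hypothetical name for this sketch
dim(mydata2) <- c(5, 4, 2)

# Same layout as array(mydata, dim = c(5, 4, 2)), minus the dimnames
myarray <- array(mydata, dim = c(5, 4, 2))
all(mydata2 == myarray)

# First column of the second slice holds values 21 to 25 of the vector
mydata2[, 1, 2]
```

Going through array() is mostly a convenience here, since it lets you attach the dimnames in the same call.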
Perhaps you want to stop at this point. However, if you want to go all the way to a single data.frame, that's also easily possible.
Using @Jilber's sample data just for purposes of easy replication:
set.seed(1)
df <- data.frame(df=sample(1:999, 158112, TRUE))
# Make sure the length divides evenly by 24*18:
# array() silently recycles (or truncates) the data
# if the dims don't match its length.
Ndays <- nrow(df)/(24*18)
dfarray <- array(df$df,
                 dim = c(24, 18, Ndays),
                 # Add dimnames by creating a date sequence
                 dimnames = list(NULL, NULL, as.character(
                   seq(as.Date("2012-01-01"), by = "1 day",
                       length.out = Ndays))))
# Use `apply` to convert this to a `list` of `data.frame`s
temp <- apply(dfarray, 3, as.data.frame)
# Use `lapply` to create your intermediate `data.frame`s
out <- lapply(names(temp), function(x) {
  data.frame(date = as.Date(x), temp[[x]])
})
# Use `do.call(rbind, ...)` to get your final `data.frame`
final <- do.call(rbind, out)
The first few lines of the output look like this:
head(final)
# date V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15
# 1 2012-01-01 266 267 732 347 455 991 729 724 101 649 307 702 133 841 443
# 2 2012-01-01 372 386 693 334 410 496 453 338 927 953 578 165 222 720 157
# 3 2012-01-01 573 14 478 476 811 484 175 630 283 953 910 65 227 267 582
# 4 2012-01-01 908 383 861 892 605 174 746 840 590 340 143 754 132 495 970
# 5 2012-01-01 202 869 438 864 655 755 105 856 111 263 415 620 981 84 989
# 6 2012-01-01 898 341 245 390 353 454 864 391 840 166 211 170 327 354 177
# V16 V17 V18
# 1 109 232 12
# 2 333 241 940
# 3 837 797 993
# 4 277 831 358
# 5 587 114 747
# 6 836 963 793
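To guard against the silent recycling mentioned in the code comments, and to cross-check the stacked layout, here is a minimal sketch. It assumes 24 rows per day and 18 columns, as in the question. The names n_per_day and alt are introduced only for illustration, and aperm() is offered as an alternative route to the same values, not as a replacement for the steps above.

```r
set.seed(1)
df <- data.frame(df = sample(1:999, 158112, TRUE))

n_per_day <- 24 * 18  # hypothetical helper: number of values per day
# Stop early if the vector does not divide evenly into whole days
stopifnot(nrow(df) %% n_per_day == 0)
Ndays <- nrow(df) / n_per_day

dfarray <- array(df$df, dim = c(24, 18, Ndays))

# Move the day dimension next to the row dimension, then flatten:
# rows run over (row within day, day); the 18 columns stay as columns
alt <- matrix(aperm(dfarray, c(1, 3, 2)), ncol = 18)

dim(alt)
# Rows 25 to 48 should match the second day's slice
all(alt[25:48, ] == dfarray[, , 2])
```

This gives a plain matrix without the date column, so it is mainly useful as a sanity check on the values in "final".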
I still strongly suggest that you become familiar with the "xts" package if you're going to be doing a lot of work with time series data.
Conversion from the "final" data.frame above to an xts object is easy:
library(xts)
Final <- xts(final[-1], order.by=final[[1]])
And this will let you easily do fun things like this:
apply.quarterly(Final, mean)
# V1 V2 V3 V4 V5 V6
# 2012-03-31 490.5256 493.8338 507.4272 503.5421 495.0929 494.4025
# 2012-06-30 511.5792 508.1493 500.9043 500.2152 509.0614 499.9881
# 2012-09-30 496.2672 501.1399 496.3542 493.7423 504.8170 507.1671
# 2012-12-31 503.9583 502.5616 502.8936 509.2120 503.2387 502.4678
# V7 V8 V9 V10 V11 V12
# 2012-03-31 490.2477 492.2115 510.6525 499.8168 506.9510 494.3654
# 2012-06-30 494.0962 497.0357 506.9267 500.2198 501.4263 494.1117
# 2012-09-30 509.9561 487.0543 497.2206 485.4511 498.1191 494.5190
# 2012-12-31 503.0095 500.7903 494.7428 494.1409 502.0181 496.9764
# V13 V14 V15 V16 V17 V18
# 2012-03-31 504.4130 499.8581 503.0023 501.0137 499.1021 504.7711
# 2012-06-30 500.0504 501.2903 490.7582 502.7395 503.5737 496.4821
# 2012-09-30 493.4860 499.2088 500.7260 503.1907 491.9583 490.4293
# 2012-12-31 500.4348 507.9475 499.3637 486.4438 496.8220 492.8890
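If you want to cross-check the quarterly means without xts, a rough base R sketch with aggregate() might look like the following. It rebuilds "final" with the same steps as above so that it runs on its own; the names vals, days, and qmeans are introduced only for this illustration.

```r
set.seed(1)
vals <- sample(1:999, 158112, TRUE)
Ndays <- length(vals) / (24 * 18)
days <- as.character(seq(as.Date("2012-01-01"), by = "1 day",
                         length.out = Ndays))
dfarray <- array(vals, dim = c(24, 18, Ndays),
                 dimnames = list(NULL, NULL, days))

# Rebuild `final` with the same apply/lapply/rbind steps as above
temp <- apply(dfarray, 3, as.data.frame)
final <- do.call(rbind, lapply(names(temp), function(x) {
  data.frame(date = as.Date(x), temp[[x]])
}))

# Column means per calendar quarter; quarters() ignores the year,
# so this grouping only makes sense for data within a single year
qmeans <- aggregate(final[-1],
                    by = list(quarter = quarters(final$date)),
                    FUN = mean)
qmeans[, 1:4]
```

The result is an ordinary data.frame with one row per quarter, so you lose the date index that xts keeps for you.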