Long format data to wide format (attempted with a for loop): could I use sapply?

StackOverflow https://stackoverflow.com/questions/19328414

  •  30-06-2022

Question

I'm having trouble setting up an sapply call. I have a for loop that does the job I need, but it takes too long to complete.

variable names explained:

dat #raw data
df #empty data frame to preallocate memory
uniq.user #unique user id
uniq.item #unique item id

column names for df: user id, item id 1, item id 2, ..., item id n

I'm trying to create a binary table indicating which items each user owns.
Example:

USERID1111 1 0 0 0 1
USERID2222 0 1 0 1 1

The raw data looks like this:

USERID1111 ITEM ID 1
USERID1111 ITEM ID 5
USERID2222 ITEM ID 2
USERID2222 ITEM ID 4
USERID2222 ITEM ID 5

The for loop I have is:

# For each user, set to 1 the columns of the items that user owns
for (i in seq_along(uniq.user)) {
    df[i, which(uniq.item %in% dat[df[i, 1] == dat[, 1], 2]) + 1] <- 1
}
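
To make the loop reproducible on the sample data, a setup along these lines should work. This is only a sketch: the column names user and item, the full item list 1 to 5, and the way df is preallocated are assumptions, not necessarily how the real objects were built.

```r
# Raw long-format data (column names 'user' and 'item' are assumed)
dat <- data.frame(
  user = c("USERID1111", "USERID1111", "USERID2222", "USERID2222", "USERID2222"),
  item = c("ITEM ID 1", "ITEM ID 5", "ITEM ID 2", "ITEM ID 4", "ITEM ID 5"),
  stringsAsFactors = FALSE
)

uniq.user <- unique(dat$user)
# Assumed full item list, including items that no user owns
uniq.item <- paste("ITEM ID", 1:5)

# Preallocated result: first column is the user id, all item columns start at 0
df <- data.frame(
  user = uniq.user,
  matrix(0, nrow = length(uniq.user), ncol = length(uniq.item)),
  stringsAsFactors = FALSE
)
names(df)[-1] <- uniq.item

# Same loop: for each user, flag the columns of the items that user owns
for (i in seq_along(uniq.user)) {
  df[i, which(uniq.item %in% dat[df[i, 1] == dat[, 1], 2]) + 1] <- 1
}
df
```

Each iteration scans all of dat and assigns into a data frame row, which is likely why it gets slow as the number of users grows.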

How would I convert this using sapply? (or any other apply functions)

Thank you!

p.s. If there are better ways to perform this task, please let me know! I'm trying to learn more efficient ways to do things in R.


Solution

Maybe table() could be an alternative to the loop:

# some data
df <- data.frame(id = c(1, 1, 2, 2, 2), item = c(1, 5, 2, 4, 5))

# define possible levels of 'item', so that also levels with zero count appear in table
df$item <- factor(df$item, levels = 1:5)

# make table
with(df, table(id, item))
#     item
# id  1 2 3 4 5
#   1 1 0 0 0 1
#   2 0 1 0 1 1
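
The same idea can be applied to data shaped like the raw data in the question, and the sapply version that was asked about is also possible. The following is a rough sketch, not a definitive implementation: the column names user and item, the item list all.items, and the result names wide and wide2 are made up for illustration.

```r
# Long-format data as in the question (column names are assumed)
dat <- data.frame(
  user = c("USERID1111", "USERID1111", "USERID2222", "USERID2222", "USERID2222"),
  item = c("ITEM ID 1", "ITEM ID 5", "ITEM ID 2", "ITEM ID 4", "ITEM ID 5"),
  stringsAsFactors = FALSE
)

# Assumed full list of items that should appear as columns
all.items <- paste("ITEM ID", 1:5)

# table() approach: count user/item pairs, then cap at 1 so that
# duplicated rows in 'dat' still give a binary flag
counts <- table(dat$user, factor(dat$item, levels = all.items))
wide <- as.data.frame.matrix((counts > 0) * 1L)
wide

# sapply() approach: split the items by user, then test membership per user
uniq.user <- unique(dat$user)
owned <- split(dat$item, factor(dat$user, levels = uniq.user))
wide2 <- t(sapply(owned, function(items) as.integer(all.items %in% items)))
colnames(wide2) <- all.items
wide2
```

Both results have one row per user and one column per item. The table() version is usually the shorter and faster of the two, since sapply() still runs an R function once per user.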
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow