Question

I am trying to reshape the following data.table into something like a contingency table (not exactly, because I don't want frequencies as values; I just want 0 or 1):

Df:

ID          CC
990081899A  CC2
990081899A  CC115
990081899A  CC39
990081899A  CC39
990081899A  CC115
990002362D  CC2
990002362D  CC115
990002362D  CC115
990002362D  CC115
990002362D  CC6
990042716D  CC2

I tried two approaches, shown below, but both give the same result:

First:

Contingency<-with(Df, table(ID,CC))
Diag6<- cbind(ID = rownames(Contingency), apply(Contingency, 2 , as.character))

Second:

I added a Value column to the data, set to 1 for every row:

Df:

ID          CC  Value
990081899A  CC2 1
990081899A  CC115   1
990081899A  CC39    1
990081899A  CC39    1
990081899A  CC115   1
990002362D  CC2 1
990002362D  CC115   1
990002362D  CC115   1
990002362D  CC115   1
990002362D  CC6 1
990042716D  CC2 1

and tried:

Df<- data.table(dcast(Df,ID~CC,value.var="Value"),key="ID")

Both results are the same:

ID  CC115   CC2 CC39    CC6
990081899A  2   1   2   0
990002362D  3   1   0   1

I don't want the frequency here. I just want the value to be 1 if the combination is present and 0 otherwise:

ID       CC115  CC2 CC39    CC6
990081899A  1   1   1   0
990002362D  1   1   0   1

Any suggestions are highly appreciated.

Solution

Simply calling table(DF) gives you the layout you want, provided DF contains only the ID and CC columns.
You can then convert all positive counts to 1 using sign:

     sign(table(DF))

                CC
    ID           CC115 CC2 CC39 CC6
      990002362D     1   1    0   1
      990042716D     0   1    0   0
      990081899A     1   1    1   0
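
For a reproducible check, the sketch below rebuilds the question's sample data and applies the same call. It assumes DF is a plain two-column data.frame (the question uses a data.table); the names counts and presence are illustrative only.

```r
# Rebuild the sample data from the question.
# Assumption: DF holds only the ID and CC columns, because table()
# cross-tabulates every column it is given.
DF <- data.frame(
  ID = c(rep("990081899A", 5), rep("990002362D", 5), "990042716D"),
  CC = c("CC2", "CC115", "CC39", "CC39", "CC115",
         "CC2", "CC115", "CC115", "CC115", "CC6",
         "CC2"),
  stringsAsFactors = FALSE
)

counts   <- table(DF)     # frequency of each ID/CC pair
presence <- sign(counts)  # positive counts become 1, zeros stay 0

print(presence)
```

Comparing counts with presence shows where the frequencies (such as the 3 for 990002362D and CC115) were collapsed to 1.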

OTHER TIPS

I'm sure this has been answered somewhere before, but table should be able to do this:

with(unique(dat), table(ID,CC) )

#            CC
#ID           CC115 CC2 CC39 CC6
#  990002362D     1   1    0   1
#  990042716D     0   1    0   0
#  990081899A     1   1    1   0

You can wrap the above like:

as.data.frame.matrix(with(unique(dat), table(ID,CC) ))

...if you prefer that output.

#           CC115 CC2 CC39 CC6
#990002362D     1   1    0   1
#990042716D     0   1    0   0
#990081899A     1   1    1   0
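
If ID is needed as an ordinary column rather than as row names, one possible follow-up is sketched below. The sample data is retyped from the question, and the names wide and out are illustrative, not part of the original answer.

```r
# Sample data from the question, as a plain data.frame.
dat <- data.frame(
  ID = c(rep("990081899A", 5), rep("990002362D", 5), "990042716D"),
  CC = c("CC2", "CC115", "CC39", "CC39", "CC115",
         "CC2", "CC115", "CC115", "CC115", "CC6",
         "CC2"),
  stringsAsFactors = FALSE
)

# Drop duplicate ID/CC rows first, so each remaining pair is counted once.
wide <- as.data.frame.matrix(with(unique(dat), table(ID, CC)))

# Move the row names into a regular ID column.
out <- cbind(ID = rownames(wide), wide)
rownames(out) <- NULL

print(out)
```

Because unique() removes repeated pairs before tabulating, every cell in the result can only be 0 or 1.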

You can do this (with or without data.table) by passing your own aggregation function to dcast:

dcast(DF, ID ~ CC, fun.aggregate = function(x) as.integer(length(x) > 0))
# Using CC as value column: use value.var to override.
#           ID CC115 CC2 CC39 CC6
# 1 990002362D     1   1    0   1
# 2 990042716D     0   1    0   0
# 3 990081899A     1   1    1   0

Or by passing a reduced data.frame containing only the unique combinations:

dcast(unique(DF), ID ~ CC, fun.aggregate = length, value.var = 'CC')
Licensed under: CC-BY-SA with attribution