Question

I am a bit new to R and am trying to use a function to output a data frame. I have several data frames that need deduplication. Each record in a data frame has an index variable (RecID) and a patient ID (PatID). If a patient is listed multiple times in the data frame, I want to keep the record with the largest RecID.

I want to be able to change this data frame:

PatID   RecID
1       1
1       2
2       3
3       4
3       5
4       6

into this data frame:

PatID    RecID
1        2
2        3
3        5
4        6

I can use the following code to successfully deduplicate the dataframe.

df <- df[order(df$PatID, -df$RecID),]
df <- df[ !duplicated(df$PatID), ]

I created a function with this code so I can apply my deduplication scheme across multiple data frames easily.

dedupit <- function(x) {
    x <- x[order(x$PatID, -x$RecID),]
    x <- x[ !duplicated(x$PatID), ]
}

However, when I run dedupit(df), it does not create a new df data frame with deduplicated records. The function won't output the final data frame or any of the intermediate data frames. Is there a way to have functions output data frames?


Solution

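You need to put return(x) at the end of your function. As written, the last expression in the function is an assignment, and R returns the value of an assignment invisibly, so calling dedupit(df) on its own prints nothing. Note also that a function works on a copy of its argument and cannot change df in the calling environment, so you have to assign the result back: df <- dedupit(df).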
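As a minimal sketch of that fix, using the example data from the question (the sample data frame is rebuilt here by hand, and the names df2 and cleaned are hypothetical, added only to illustrate the "several data frames" case):

```r
# Example data from the question
df <- data.frame(PatID = c(1, 1, 2, 3, 3, 4),
                 RecID = c(1, 2, 3, 4, 5, 6))

dedupit <- function(x) {
  # Sort by patient, then by RecID descending, so the largest RecID comes first
  x <- x[order(x$PatID, -x$RecID), ]
  # Keep only the first row seen for each patient
  x <- x[!duplicated(x$PatID), ]
  # Return the deduplicated data frame visibly
  return(x)
}

# The function does not modify df in place, so assign the result back
df <- dedupit(df)
print(df)

# Hypothetical second data frame, to show applying the function to several at once
df2 <- data.frame(PatID = c(7, 7, 8),
                  RecID = c(10, 11, 12))
cleaned <- lapply(list(first = df, second = df2), dedupit)
```

Printing df after the call should show one row per patient (RecID 2, 3, 5 and 6). The lapply call returns a named list, so each cleaned data frame is available as cleaned$first and cleaned$second.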

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow