Question

I have a quick formatting question. I have a set of data in a data frame that looks like this:

Animal   Food   Food ID
 dog     steak   100
 dog     beef    200
 dog     poo     001
 cat     milk    020
 cat     steak   100
 cat     beef    200

which, for programming input purposes, I need to transform into a '.txt' file with a format like this:

<dog>
steak   100
beef    200
poo     001
</dog>

<cat>     
milk    020
steak   100
beef    200
</cat>

Obviously my real data has tens of thousands of entries or else I could do it by hand. Any suggestions would be great. Thanks.


Solution

Here's a way:

# create the string
text <- paste0(sapply(unique(dat$Animal), function(x) {
  subdat <- dat[dat$Animal == x, -1]
  subdat[[2]] <- sprintf("%03d", subdat[[2]])
  paste0("<", x, ">\n",
         paste(capture.output(write.table(subdat, sep = "\t",
                                          quote = FALSE, row.names = FALSE, 
                                          col.names = FALSE)), collapse = "\n"),
         "\n</", x, ">")
}), collapse = "\n\n")

# write it to a file
write(text, file = "filename.txt")

The resulting file:

<dog>
steak   100
beef    200
poo     001
</dog>

<cat>
milk    020
steak   100
beef    200
</cat>

The columns are tab-delimited.
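Since the answer does not define `dat`, here is a minimal self-contained sketch of the same idea that can be run as is. It assumes the ID column is named `FoodID` and is stored as character so the leading zeros survive. The answer's `sprintf("%03d", ...)` call instead expects a numeric column. The helper `make_block` and the temporary output path are hypothetical names introduced only for this illustration.

```r
# Sample data mirroring the question. FoodID is kept as character so that
# leading zeros ("001", "020") are preserved (an assumption about the data).
dat <- data.frame(
  Animal = c("dog", "dog", "dog", "cat", "cat", "cat"),
  Food   = c("steak", "beef", "poo", "milk", "steak", "beef"),
  FoodID = c("100", "200", "001", "020", "100", "200"),
  stringsAsFactors = FALSE
)

# Build the text block for one animal: opening tag, one tab-delimited
# line per row, closing tag.
make_block <- function(animal, rows) {
  body <- paste(rows$Food, rows$FoodID, sep = "\t")
  paste(c(paste0("<", animal, ">"), body, paste0("</", animal, ">")),
        collapse = "\n")
}

# unique() keeps the animals in order of first appearance (dog, then cat).
animals <- unique(dat$Animal)
blocks <- vapply(animals,
                 function(a) make_block(a, dat[dat$Animal == a, ]),
                 character(1))

# Hypothetical output path; blocks are separated by a blank line.
out_file <- tempfile(fileext = ".txt")
writeLines(paste(blocks, collapse = "\n\n"), out_file)
```

Building each line with a vectorized `paste()` avoids the round trip through `write.table` and `capture.output`, which may matter with tens of thousands of rows.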

Other tips

This approach uses plyr's d_ply function to split the data by animal before processing. Note that the default delimiter (a space) can be changed through the delimiter argument.

Will records ever need to be collapsed? For instance, if dog has two rows for steak, should they be combined in some way? If so, the plyr approach should be able to accommodate that, with a little modification.

# Append one animal's block to the file. `d` is the subset of rows for a
# single animal, as passed in by d_ply. FoodID is assumed to be stored as
# character so that leading zeros are preserved.
ProcessAnimal <- function( d, fileLocation, delimiter=" " ) {
  cat(paste0("<", d$Animal[1], ">\n"), file=fileLocation, append=TRUE, sep="")

  # One line per row: Food, delimiter, FoodID.
  cat(paste0(d$Food, delimiter, d$FoodID, "\n"), file=fileLocation, append=TRUE, sep="")

  cat(paste0("</", d$Animal[1], ">\n"), file=fileLocation, append=TRUE, sep="")
}

# Every call appends, so delete any existing PetFood.txt before re-running.
plyr::d_ply(.data=ds, .variables="Animal", .fun=ProcessAnimal, fileLocation="PetFood.txt")

The text file looks like:

<cat>
milk 020
steak 100
beef 200
</cat>
<dog>
steak 100
beef 200
poo 001
</dog>
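
On the question of collapsing repeated records: what "combined" should mean depends on the downstream program, so the following is only a sketch of two possible readings. It uses hypothetical sample data with a repeated row and assumes the same `Animal`, `Food`, `FoodID` column names. The first reading drops exact duplicates, the second counts how often each record occurred.

```r
# Hypothetical data in which dog/steak/100 appears twice.
ds <- data.frame(
  Animal = c("dog", "dog", "dog", "cat"),
  Food   = c("steak", "steak", "beef", "milk"),
  FoodID = c("100", "100", "200", "020"),
  stringsAsFactors = FALSE
)

# Reading 1: drop exact duplicates. duplicated() flags rows that repeat an
# earlier row across all columns, so the first occurrence is kept.
ds_unique <- ds[!duplicated(ds), ]

# Reading 2: keep one row per record plus a count `n` of its occurrences.
counts <- aggregate(n ~ Animal + Food + FoodID,
                    data = transform(ds, n = 1L), FUN = sum)
```

Either result can then be passed to d_ply in place of the original data frame.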
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow