Question

I have a table with names in one column. My R script reads this table and then uses write.table to write it to a CSV file for further processing. The script fails if it encounters a name containing an apostrophe (single quote), such as "O'Reilly", in the matrix:

library(RCurl)
library(RJSONIO)

dir <- "C:/Users/rob/Data"
setwd(dir)
filename <- "employees.csv"

url <- "https://obscured/employees.html"
html <- getURL(url, ssl.verifypeer = FALSE)
initdata <- gsub("^.*?emp.allEployeeData = (.*?);.*", "\\1", html)
initdata <- gsub("'", '"', initdata)

data <- fromJSON( initdata )

table <- list()
for(i in seq_along(data))
{
    job <- data[[i]][[1]]
    name <- data[[i]][[2]]
    age <- data[[i]][[6]]
    sex <- data[[i]][[7]]
    m <- matrix(nrow = 1, ncol = 4)
    colnames(m) <- c("job", "name", "age", "sex")
    m[1, ] <- c(job, name, age, sex)
    table[[i]] <- as.data.frame(m)
    write.table(table[[i]],file = filename,append = TRUE,sep = ",",col.names = FALSE,row.names = FALSE)
}

When I encounter O'Reilly, the error I am receiving is:

Error in m[1, ] <- c(job, name, age, sex) : 
  number of items to replace is not a multiple of replacement length

I end up with a CSV file that contains the data for every employee before O'Reilly. My searches only turned up people trying to add quotes to strings or to parse strings that already contain escape characters.

Is there a way to escape or remove single quotes inside my data?


Solution

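On line 11 of the script I was replacing every single quote with a double quote, which this data set doesn't need. So the apostrophe in the name was not the problem; the replacement was. It turned "O'Reilly" into "O"Reilly", leaving an unescaped double quote inside a string that was already double-quoted, so that record presumably no longer parsed into the four values the matrix row expects.

The fix was to remove this line: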

initdata <- gsub("'", '"', initdata)
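
To see the effect in isolation, here is a minimal sketch using base R only. The sample record is hypothetical (the real page content isn't shown in the question), and the data frame is built by hand rather than parsed from JSON. It shows what the blanket replacement does to the string, and that write.table copes with an apostrophe without any escaping.

```r
# Hypothetical sample record standing in for the scraped JSON text.
raw <- '[["Editor", "O\'Reilly", 42, "M"]]'

# The blanket replacement from the original script.
broken <- gsub("'", '"', raw)
cat(raw, "\n")     # the name is one intact double-quoted string
cat(broken, "\n")  # a stray double quote now ends the string after the O

# An apostrophe needs no special handling when writing CSV:
# write.table wraps character fields in double quotes by default.
df <- data.frame(job = "Editor", name = "O'Reilly", age = 42, sex = "M",
                 stringsAsFactors = FALSE)
out <- tempfile(fileext = ".csv")
write.table(df, file = out, sep = ",", col.names = FALSE, row.names = FALSE,
            qmethod = "double")  # doubles any embedded double quotes, CSV style
written <- readLines(out)
cat(written, "\n")
```

If the source really did use single quotes as its JSON string delimiters, a blanket gsub would still be unsafe for the same reason, so the replacement would have to target only the delimiters and leave apostrophes inside values alone.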