strsplit in R: How do I split one-column data separated by comma into multiple columns?

StackOverflow https://stackoverflow.com/questions/19038878

  •  29-06-2022

Question

I am reading data from a website: https://raw.github.com/johnmyleswhite/ML_for_Hackers/master/02-Exploration/data/01_heights_weights_genders.csv

(1) At first I attempted to read the data directly into R with the following code:

raw_data <- read.table("https://raw.github.com/johnmyleswhite/ML_for_Hackers/master/02-Exploration/data/01_heights_weights_genders.csv", stringsAsFactors=FALSE)

But I received the following error:

Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") : unsupported URL scheme

So I simply copied the data into a .csv file. I saved this file as "Raw_Data.csv" in a directory. The data is, however, all in one column.

(2) I then read this file into R with the following code:

raw_data <- read.csv("Raw_Data.csv", stringsAsFactors=FALSE)

What I would like to do is split this one column into three, with the column names as "Gender", "Height", "Weight". What I tried was this:

for(i in 1:nrow(raw_data)){
    raw_data$Gender[i] <- strsplit(raw_data$Gender[i], ",")[[1]][1]
    raw_data$Height[i] <- strsplit(raw_data$Height[i], ",")[[1]][2]
    raw_data$Weight[i] <- strsplit(raw_data$Weight[i], ",")[[1]][3]
}

However, I get this error:

Error in strsplit(raw_data$Gender[i], ",") : non-character argument

Thank you in advance for your help!


Solution

Maybe it was because of the quotes. Try:

raw_data <- read.csv("Raw_Data.csv", stringsAsFactors=FALSE, quote="\"")
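If the file really does load as a single column, a strsplit-based split along the lines the question attempts could look like the sketch below. The error in the question is probably because raw_data$Gender does not exist yet, so it is NULL rather than a character vector; the split has to be applied to the one column that does exist. The column name V1 and the three sample rows are made up for illustration and are not taken from the actual file.

```r
# Hypothetical stand-in for the one-column data frame from the question.
# The column name `V1` and the values are invented for this sketch.
raw_data <- data.frame(
  V1 = c("Male,73.8,241.9", "Male,68.8,162.3", "Female,58.9,102.1"),
  stringsAsFactors = FALSE
)

# strsplit() only accepts character vectors, so coerce explicitly.
# fixed = TRUE treats "," as a literal string rather than a regex.
parts <- strsplit(as.character(raw_data$V1), ",", fixed = TRUE)

# Each list element holds the three pieces of one row;
# bind them row-wise into a character matrix.
m <- do.call(rbind, parts)

# Build the three-column data frame, converting the numeric fields.
split_data <- data.frame(
  Gender = m[, 1],
  Height = as.numeric(m[, 2]),
  Weight = as.numeric(m[, 3]),
  stringsAsFactors = FALSE
)

str(split_data)
```

This avoids the explicit for loop, since strsplit is already vectorised over its input. If some rows might not contain exactly three pieces, it is worth checking lengths(parts) before binding, because rbind recycles shorter vectors with only a warning.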

OTHER TIPS

I was able to read the data into R with 3 columns just fine.

I'm not sure how you saved the data into a .csv file, but I copied the data right into Notepad++ (http://notepad-plus-plus.org/), saved it as a text file, and read it into R with read.csv("filename.txt").
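A quick way to check whether a saved copy parses into three columns is sketched below. It assumes the file has a quoted header line followed by comma-separated rows; the sample lines and the temporary file are made up so that the example runs on its own.

```r
# Made-up sample in the assumed shape of the file: header plus two rows.
sample_lines <- c(
  '"Gender","Height","Weight"',
  '"Male",73.8,241.9',
  '"Female",58.9,102.1'
)

# Write the sample to a temporary text file (stand-in for "filename.txt").
tmp <- tempfile(fileext = ".txt")
writeLines(sample_lines, tmp)

# read.csv() splits on commas by default, regardless of the file extension.
heights <- read.csv(tmp, stringsAsFactors = FALSE)

# Inspect the result: three columns are expected if the file is plain CSV.
str(heights)
ncol(heights)
```

If ncol() reports 1 for the real file, the program used to save it may have wrapped each whole line in quotes, and opening the file in a plain text editor should show whether that happened.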

Licensed under: CC-BY-SA with attribution