Your data:
dat <- read.table(text = "Samples Genotype Region
sample1 A Region1
sample1 B Region2
sample1 A Region3
sample2 A Region2
sample2 B Region3
sample3 B Region1
sample3 A Region3", header = TRUE)
You can use the reshape2 package to cast the data from long to wide format:
library(reshape2)
dat2 <- dcast(dat, Samples ~ Region, value.var = "Genotype")
In the result, missing values are indicated by NA:
# Samples Region1 Region2 Region3
# 1 sample1 A B A
# 2 sample2 <NA> A B
# 3 sample3 B <NA> A
NAs are appropriate for representing missing data, but you can replace them with Xs using the following command:
dat2[is.na(dat2)] <- "X"
# Samples Region1 Region2 Region3
# 1 sample1 A B A
# 2 sample2 X A B
# 3 sample3 B X A
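As a possible shortcut, dcast also accepts a fill argument, so the placeholder can be supplied during the cast itself rather than in a separate replacement step. The following is a minimal sketch under two assumptions that are not part of the answer above: the name dat3 is chosen only for illustration, and stringsAsFactors = FALSE is added so that Genotype is read as character regardless of R version.

```r
library(reshape2)

# Same example data as above. stringsAsFactors = FALSE (an addition for this
# sketch) keeps Genotype as character, so "X" can be filled in directly.
dat <- read.table(text = "Samples Genotype Region
sample1 A Region1
sample1 B Region2
sample1 A Region3
sample2 A Region2
sample2 B Region3
sample3 B Region1
sample3 A Region3", header = TRUE, stringsAsFactors = FALSE)

# fill gives the value used for Samples/Region combinations absent from dat
dat3 <- dcast(dat, Samples ~ Region, value.var = "Genotype", fill = "X")
dat3
```

This should produce the same table as the two-step version. Keeping the NAs and replacing them only for display may still be preferable if the data will be analysed further.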