Question

I have a data frame with donations and names of donors.

**donation**              **Donor**
 25.00               Steve Smith
 20.00               Jack Johnson
 50.00               Mary Jackson
  ...                   ...

I'm trying to do some clustering using the pvclust package. Unfortunately the package doesn't seem to take non-numerical data.

> rs1.pv1 <- parPvclust(cl, rs1, nboot=10)
Error in cor(x, method = "pearson", use = use.cor) : 'x' must be numeric

I have two questions.

1) Is there another package or method that would do this better?

2) Is there a way to "normalize" the donor names list? I.e., get a list of unique donor names, assign each an ID number, and then insert the ID number into the data frame in place of the character name.

Solution

For number 2:

# If donor is a factor, then

as.numeric(donor)

# will transform your factor to numeric.
# If it isn't, transform it to a factor and then to numeric:
as.numeric(as.factor(donor))

However, I'm not sure that transforming the donor list to numeric and then using cor makes sense at all: the resulting IDs are arbitrary labels, so their numeric order carries no meaning.
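As a minimal sketch of the ID assignment asked about in question 2, the following builds the codes from a factor and keeps a lookup table so the IDs can be mapped back to names. The sample rows and the names donor_factor, donor_id and donor_lookup are hypothetical, made up for illustration; only rs1, donation and Donor come from the question.

```r
# Hypothetical sample data mirroring the table in the question
rs1 <- data.frame(
  donation = c(25, 20, 50, 10),
  Donor = c("Steve Smith", "Jack Johnson", "Mary Jackson", "Steve Smith"),
  stringsAsFactors = FALSE
)

# factor() collects the unique names as levels (sorted);
# as.integer() returns the level index for each row
donor_factor <- factor(rs1$Donor)
donor_id <- as.integer(donor_factor)

# Lookup table (hypothetical name) to translate IDs back to donor names
donor_lookup <- data.frame(
  id = seq_along(levels(donor_factor)),
  Donor = levels(donor_factor),
  stringsAsFactors = FALSE
)

# Round trip: indexing the lookup by ID recovers the original names
recovered <- donor_lookup$Donor[donor_id]
```

Repeated donors receive the same ID, which is the "normalization" the question describes. The IDs depend on the sort order of the names, so they should be treated as labels rather than as quantities.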

HTH

Other suggestions

How about rs1 <- transform(rs1, Donor=as.numeric(factor(Donor))) ? (Warning: I haven't thought about what you're doing enough to know whether that makes sense, so I'm only answering question #2, not question #1.) In R versions before 4.0.0, Donor would typically already be a factor (this is what e.g. read.table or read.csv did by default), so the factor() part would be redundant. Since R 4.0.0, strings are read as character by default, so the factor() call is needed.
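A short sketch of that transform() call applied to a whole data frame, followed by a check that every column is now numeric, which is the condition the cor() error in the question complains about. The sample rows and the names rs1_numeric and all_numeric are hypothetical.

```r
# Hypothetical sample data; Donor is read in as character, not factor
rs1 <- data.frame(
  donation = c(25, 20, 50),
  Donor = c("Steve Smith", "Jack Johnson", "Mary Jackson"),
  stringsAsFactors = FALSE
)

# Replace the Donor column with numeric codes; factor() makes this work
# whether Donor starts out as character or as a factor
rs1_numeric <- transform(rs1, Donor = as.numeric(factor(Donor)))

# TRUE only if every column is numeric
all_numeric <- all(vapply(rs1_numeric, is.numeric, logical(1)))
```

Passing this check only means the type error goes away; whether clustering on such codes is meaningful is a separate question, as noted above.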

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow