Question

Imagine you have a data frame with two variables, Name and Age. Name is of class factor and Age is numeric. Now imagine there are thousands of people in this data frame. How do you:

  1. Produce a table with NAME | COUNT(NAME) for each unique name?

  2. Produce a histogram where you can change the minimum number of occurrences a name needs in order to show up in the histogram?

For part 2, I want to be able to test different minimum frequency values and see how the histogram comes out. Or is there a better way to determine programmatically the minimum count a name needs to enter the histogram?

Thanks!

Edit: Here is what the table would look like in an RDBMS:

NAME | COUNT(NAME)

John | 10
Bill | 24
Jane | 12
Tony | 50
Emanuel | 1
...

What I want to be able to do is create a function that draws a histogram, where I can change a value that sets the minimum frequency to be graphed. Does that make more sense?


Solution

> x <- read.table(textConnection('
+   Name Age Gender Presents Behaviour
+ 1 John   9   male       25   naughty
+ 2 Bill   5   male       20      nice
+ 3 Jane   4 female       30      nice
+ 4 Jane   4 female       20   naughty
+ 5 Tony   4   male       34   naughty'
+ ), header=TRUE)
> 
> table(x$Name)

Bill Jane John Tony 
   1    2    1    1   
> layout(matrix(1:4, ncol = 2))
> plot(table(x$Name), main = "plot method for class \"table\"")
> barplot(table(x$Name), main = "barplot")
> tab <- as.numeric(table(x$Name))
> names(tab) <- names(table(x$Name))
> dotchart(tab, main = "dotchart or dotplot")
> ## or just pass the table directly:
> ## dotchart(table(x$Name))
> ## and ignore the warning
> layout(1)  

[Figure: the three plots produced by the code above: plot method for class "table", barplot, and dotchart of the name counts]
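
For part 2, the transcript above does not yet filter by a minimum frequency. One possible way to do that is sketched below. The function names `count_names` and `plot_name_counts`, the `min_count` argument, and the `people` data frame are all made up for this illustration; the counts reuse the example table from the question. Strictly speaking, counts of a categorical variable are drawn as a bar chart rather than a histogram, so the sketch uses `barplot`.

```r
# Hypothetical helper: NAME | COUNT(NAME) as a data frame, keeping only
# names that occur at least `min_count` times, most frequent first.
count_names <- function(name_vec, min_count = 1) {
  counts <- table(name_vec)
  df <- data.frame(Name = names(counts),
                   Count = as.vector(counts),
                   stringsAsFactors = FALSE)
  df <- df[df$Count >= min_count, , drop = FALSE]
  df <- df[order(-df$Count, df$Name), , drop = FALSE]
  rownames(df) <- NULL
  df
}

# Hypothetical helper: bar chart of the names that pass the threshold.
plot_name_counts <- function(name_vec, min_count = 1) {
  df <- count_names(name_vec, min_count)
  if (nrow(df) == 0) {
    message("No name occurs at least ", min_count, " times.")
    return(invisible(df))
  }
  barplot(df$Count, names.arg = df$Name, las = 2,
          main = paste("Names occurring at least", min_count, "times"))
  invisible(df)
}

# Assumed example data, built from the counts listed in the question.
people <- data.frame(
  Name = factor(rep(c("John", "Bill", "Jane", "Tony", "Emanuel"),
                    times = c(10, 24, 12, 50, 1)))
)

count_names(people$Name)                       # full NAME | COUNT table
plot_name_counts(people$Name, min_count = 12)  # only Tony, Bill and Jane

# To get a feel for a sensible threshold, look at how the counts are spread:
summary(count_names(people$Name)$Count)
```

Calling `plot_name_counts` repeatedly with different `min_count` values lets you compare how the chart changes as the threshold moves.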

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow