Question

I am trying to plot the frequencies of different journals in a list of research papers I fetched. Each row in my data frame corresponds to a paper, for which I have the associated journal.

I did the following to plot the levels (bins) in a histogram:

journal <- main$Publication.Journal
tb <- table(journal)
barplot(tb[order(tb, decreasing = TRUE)])
axis(2, at = seq(0, 12, 1), labels = seq(0, 12, 1))

[Image: journal_bins]

The only problem is that I want to remove from the graph (or from the table itself) the journals with a frequency of 1, since I am only interested in the most frequent journals (hence the ordered barplot). Any insight on how I can do this?

Many thanks! Nathanael


Solution

Or very simply

tb <- tb[tb>1]

Table objects can be subset in the same ways as any other array object.
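
As a minimal sketch of the full pipeline, the filter can be applied before sorting and plotting. The `journal` vector below is a made-up stand-in for `main$Publication.Journal`, and the journal names are hypothetical:

```r
# Hypothetical stand-in for main$Publication.Journal
journal <- c("JAMA", "MAD", "JAMA", "Nature", "MAD", "JAMA", "Lancet")

tb <- table(journal)               # counts per journal
tb <- tb[tb > 1]                   # keep journals that appear more than once
tb <- sort(tb, decreasing = TRUE)  # most frequent first

barplot(tb, las = 2)               # las = 2 draws labels perpendicular to the axes
```

Filtering the table rather than the plot keeps the bar heights and the axis consistent with whatever is left in `tb`.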

Other tips

It's hard to answer your specific problem without the dataset in your example so here's one solution using a mock example:

x <- rpois(100,100)
xt <- table(x)
xtd <- as.data.frame(xt)
xtds <- subset(xtd, Freq>1)  # use subset, as noted by @baptiste
plot(Freq ~ x, xtd, type="h", ylim=c(0,10))
lines(Freq ~ x, xtds, type="h", col="red")


As far as I know, a data.frame cannot easily be coerced back to a table, so you may want a different solution. Also note that the result of a logical test on the table, xt > 1 for example, might be useful to you.
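
To make the logical test concrete, here is a small sketch on a fixed vector (the values are made up so the result does not depend on random draws). It assumes only base R:

```r
# Made-up values in place of the random rpois() draw
x <- c(98, 101, 101, 103, 98, 98, 110)
xt <- table(x)

keep <- xt > 1            # logical table: TRUE where a value occurs more than once
passing <- names(xt)[keep]  # the values that pass the test, as character strings
filtered <- xt[keep]        # the filtered counts, still a table object
n_passing <- sum(keep)      # number of distinct values that pass
```

Because `filtered` is still a table, it can be passed straight to `barplot()` or `plot()` without going through a data.frame.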

You can try something like this:

journal <- read.table(
  header=TRUE, text='Name  Article
JAMA    A
MAD B
Cigar_Afficianado   C
Bowling_Weekly  D
JAMA    E
MAD F
Cigar_Afficianado   G
JAMA    H
MAD I
Cigar_Afficianado   J
')  # create data set
library(plyr)
table(journal$Name)  # as in your example
journal <- ddply(journal, .(Name), transform, Article_count = length(Article))
journal  # shows the new Article_count column added by transform
journal <- journal[journal$Article_count > 1, ]  # removes the low counts
journal  # shows that the low counts are removed
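
If loading plyr is not desirable, a similar per-row count can be sketched in base R with `ave()`. This is an alternative technique rather than the approach above, and the `papers` data frame is a hypothetical, shortened version of the example data:

```r
# Hypothetical, shortened version of the example data
papers <- data.frame(
  Name = c("JAMA", "MAD", "Bowling_Weekly", "JAMA", "MAD"),
  Article = c("A", "B", "C", "D", "E"),
  stringsAsFactors = FALSE
)

# For each row, count how many rows share the same journal name
papers$Article_count <- ave(seq_along(papers$Name), papers$Name, FUN = length)

# Drop journals that appear only once
papers <- papers[papers$Article_count > 1, ]
```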
Licensed under: CC-BY-SA with attribution