Question

I am analyzing my Facebook page's posts to see which kinds of posts attract the most people, so I want to create a column containing the tags used in each post. Here's an example of what the data export looks like:

Post              Likes
Blah   #a          10
Blah Blah #b       12
Blah Bleh #a       10
Bleh   #b           9
Bleh Blah #a #b    15

I want to create this:

Post              Likes   tags
Blah   #a          10      #a
Blah Blah #b       12      #b
Blah Bleh #a       10      #a
Bleh   #b           9      #b
Bleh Blah #a #b    15      #a #b
Bleh #b Blah #a    14      #a #b

Is this possible? I thought of using grepl to check for posts containing "#", but I'm stuck on what to do next.


Solution

You can use gregexpr to find the desired pattern and regmatches to extract the matches:

txt = c('Bleh Blah #a #b','Blah Bleh #a')
regmatches(txt, gregexpr('#[a-z]', txt))   ## assumes a tag is '#' followed by a single lowercase letter
[[1]]
[1] "#a" "#b"

[[2]]
[1] "#a"

Using the data frame from alexis's example, you could write something like this:

DF$tag <- regmatches(DF$Post, gregexpr('#[a-z]', DF$Post))

Edit: in case a tag is something like #hi (more than one letter), add a + quantifier:

txt = c('Bleh Blah #hi allo #b','Blah Bleh #a')
regmatches(txt,gregexpr('#[a-z]+',txt))

[[1]]
[1] "#hi" "#b" 

[[2]]
[1] "#a"
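
The question asks for a single tags value per post (for example "#a #b"), whereas regmatches returns a list with one character vector per post. A minimal sketch of collapsing that list into one string per row follows. The data frame name DF and the column names Post, Likes and tags are assumptions for illustration, and the sort() call is an extra assumption so that "Bleh #b Blah #a" yields "#a #b" as in the desired output:

```r
# Example rows taken from the question; DF, Post, Likes and tags are assumed names
DF <- data.frame(
  Post  = c("Blah   #a", "Blah Blah #b", "Bleh Blah #a #b", "Bleh #b Blah #a"),
  Likes = c(10, 12, 15, 14),
  stringsAsFactors = FALSE
)

# Extract all tags per post: a list with one character vector per row
found <- regmatches(DF$Post, gregexpr("#[a-z]+", DF$Post))

# Sort each vector and collapse it into a single space-separated string
DF$tags <- vapply(found,
                  function(tags) paste(sort(tags), collapse = " "),
                  character(1))

print(DF)
```

A post without any tag ends up with an empty string in the tags column, because paste() on an empty vector with collapse returns "".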

Other tips

This seems to work:

#random data
DF <- data.frame(Post = c("asd wer #a", "dfg #b gg", 
                          "wer #c qwe qweeee #a #b", "asd asd, ioi #a #c"),
                 Likes = sample(1:50, 4), stringsAsFactors = FALSE)

#find tags
Tags <- lapply(DF$Post, function(x) { spl <- unlist(strsplit(x, " ")) ; 
                                      paste(spl[grep("#", spl)], collapse = ",") })

DF$Tags <- Tags

> DF
                     Post Likes     Tags
1              asd wer #a     9       #a
2               dfg #b gg    10       #b
3 wer #c qwe qweeee #a #b    46 #c,#a,#b
4      asd asd, ioi #a #c    31    #a,#c
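
One difference between the two approaches is worth checking against your own data: splitting on spaces keeps any punctuation attached to a tag, while a regex match keeps only the characters in the pattern. A small sketch, using a hypothetical post that is not part of the original data:

```r
# Hypothetical post in which a tag is followed directly by a comma
post <- "nice day #a, see you #b"

# Whitespace split: the matching token keeps the trailing comma
spl <- unlist(strsplit(post, " "))
by_split <- spl[grep("#", spl)]

# Regex extraction: only '#' followed by lowercase letters is kept
by_regex <- regmatches(post, gregexpr("#[a-z]+", post))[[1]]

print(by_split)
print(by_regex)
```

Here by_split contains "#a," with the comma, while by_regex contains "#a". Which behavior is preferable depends on how the tags are written in the export.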
License: CC-BY-SA with attribution
Not affiliated with StackOverflow