Question

I have a string of HTML from which I want to strip all the HTML tags. The problem is that the plain text of each node gets squished together, so I need to add some whitespace between nodes.

Nokogiri::HTML("<p>Hello</p><p>There</p>").text
Gives  => HelloThere
I want => Hello There

Can I tell Nokogiri to behave like this somehow?

Solution

You can select every text node with XPath and join the results with a space:

doc = Nokogiri::HTML("<p>Hello</p><p>There</p>")
doc.xpath('//text()').to_a.join(" ")

Other tips

Nokogiri::HTML("<p>Hello</p><p>There</p>").xpath("//*[not(child::*)]").map(&:text).join(' ')
# => "Hello There"

EDIT: I tried to do it on my own but ended up using a solution that looks a lot like Uri Agassi's :)

irb(main):040:0> Nokogiri::HTML("<p>Hello</p><p>There</p>").xpath("//text()").map(&:text).join(" ")
=> "Hello There"
License: CC-BY-SA with attribution
Not affiliated with StackOverflow