Extracting text values from an Atom feed with Ruby RSS

https://stackoverflow.com/questions/19188160

30-06-2022

Question

I'm trying to use Ruby's standard library RSS::Parser to parse an Atom feed, which sort of works.

When I access an extracted field such as .title, it returns <title>The title</title> rather than just The title. If you parse an RSS feed instead, .channel.title returns The title.

Is there any way to use the standard RSS::Parser for Atom feeds, or is this a bug?

I know there are alternatives like Feedzirra, but I would rather use the standard lib.

A quick test that reproduces the problem in Ruby 1.9.3 and 2.0:

require "rss"
feed = RSS::Parser.parse(open("http://casadelkrogh.dk/atom.xml").read)
feed.title.to_s #=> "<title>CasaDelKrogh</title>"

Solution

To get the content of the title, call .content on it:

require "rss"
feed = RSS::Parser.parse(open("http://casadelkrogh.dk/atom.xml").read)
feed.title.to_s
# => "<title>CasaDelKrogh</title>"
feed.title.content
# => "CasaDelKrogh"

Other tips

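It's not a bug.

For an Atom feed, feed.title is an RSS::Atom::Feed::Title object rather than a plain string, and its to_s method is close to an inspection of that object: it returns the element as XML, tags included.

Use feed.title.content if you want the title without the tags.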
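
The difference between the two feed types can be seen without a network request. The following is a minimal sketch, not a definitive recipe: the two inline feeds are made up for illustration (only the title CasaDelKrogh comes from the question), and validation is switched off through the parser's second argument so that the tiny sample documents are accepted.

```ruby
require "rss"

# Hypothetical sample feeds, kept as small as possible for illustration.
ATOM_XML = <<~XML
  <feed xmlns="http://www.w3.org/2005/Atom">
    <title>CasaDelKrogh</title>
    <id>urn:example:feed</id>
    <updated>2013-10-04T00:00:00Z</updated>
    <entry>
      <title>First post</title>
      <id>urn:example:entry-1</id>
      <updated>2013-10-04T00:00:00Z</updated>
    </entry>
  </feed>
XML

RSS_XML = <<~XML
  <rss version="2.0">
    <channel>
      <title>The title</title>
      <link>http://example.com/</link>
      <description>Sample channel</description>
    </channel>
  </rss>
XML

# The second argument (do_validate) is false so the minimal samples parse
# without having to satisfy every required element.
atom_feed = RSS::Parser.parse(ATOM_XML, false)
rss_feed  = RSS::Parser.parse(RSS_XML, false)

# Atom: title is an element object, so ask it for its text content.
atom_title   = atom_feed.title.content
entry_titles = atom_feed.entries.map { |entry| entry.title.content }

# RSS 2.0: channel.title is already a plain string.
rss_title = rss_feed.channel.title

puts atom_title
puts entry_titles.inspect
puts rss_title
```

The same .content call applies to the title of each entry, which is why the sketch maps over atom_feed.entries instead of calling to_s on each title.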

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow