Using feedzirra to parse XML product feeds
06-06-2021
Question
I am loading some 200 product feeds into a website, which is incredibly time-consuming. While looking for new ways to do this (outside of Nokogiri), I came across Feedzirra. I am using Feedzirra 0.3.1 at the moment; installation went without any problems.
I want to parse the following XML product feed (and many others like it):
feed = "http://adtraction.com/productfeed.htm?type=feed&format=XML&encoding=UTF8&epi=0&zip=0&cdelim=tab&tdelim=singlequote&sd=0&apid=52561763&asid=257117013"
feed_obj = Feedzirra::Feed.fetch_and_parse(feed)
But when I do, I only get a nil response. It seems that it is at least fetching the feed, since it takes a few seconds before the response comes back.
My questions:
- Is it possible to use Feedzirra for this, or can Feedzirra only be used for RSS feeds?
- Can I expect to read and parse them faster using Feedzirra or is this a dead end?
- Do you get the same response and/or can you see what the problem could be?
Edit: Changed the code; the earlier snippet was not really the one I used in my application.
Solution 3
After a closer look, it seems that Feedzirra is only meant for blog feeds and is not really applicable to my problem. I will have to look into other options.
OTHER TIPS
It looks like Feedzirra uses sax-machine for XML parsing, which is itself based on Nokogiri, so you are not likely to gain much performance by using it instead of a pure Nokogiri approach. Where it could be helpful is in dealing with the host itself: you can fairly easily check the response headers and avoid pulling the file over and over again if it has not changed.
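The header check mentioned above can also be done without Feedzirra. Below is a minimal sketch using only Ruby's standard Net::HTTP, assuming the feed host sends an ETag and/or Last-Modified header and honours conditional requests (not every feed server does). The helper names conditional_request and fetch_if_changed are hypothetical and not part of any library.

```ruby
require "net/http"
require "uri"

# Build a GET request carrying the validators saved from a previous fetch.
# (Hypothetical helper; etag and last_modified come from an earlier response.)
def conditional_request(url, etag: nil, last_modified: nil)
  request = Net::HTTP::Get.new(URI(url))
  request["If-None-Match"] = etag if etag
  request["If-Modified-Since"] = last_modified if last_modified
  request
end

# Returns nil when the server answers 304 Not Modified, otherwise a hash
# with the body and the validators to store for the next run.
# Redirects and error statuses are not handled in this sketch.
def fetch_if_changed(url, etag: nil, last_modified: nil)
  uri = URI(url)
  request = conditional_request(url, etag: etag, last_modified: last_modified)
  response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
    http.request(request)
  end
  return nil if response.is_a?(Net::HTTPNotModified)

  {
    body: response.body,
    etag: response["ETag"],
    last_modified: response["Last-Modified"]
  }
end
```

With some 200 feeds, you would store the returned etag and last_modified per feed and pass them back in on the next run; a nil result means the feed can be skipped without downloading or parsing it again.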
Most likely you are getting the empty response because it times out.
There are several flaws in your code-snippet:
- 1 is not a variable in Ruby; it is a Fixnum, and no value can be assigned to it.
- You have to put quotation marks around the URL in order to turn it into a string that can be assigned to a variable.
- Unless you defined it yourself, I am pretty sure that "using" is not a directive in Ruby.
Change your code to something like this and it should most probably work:
first = "http://adtraction.com/productfeed.htm?type=feed&format=XML&encoding=UTF8&epi=0&zip=0&cdelim=tab&tdelim=singlequote&sd=0&apid=52561763&asid=257117013"
feed = Feedzirra::Feed.fetch_and_parse(first)
feed should then be some kind of Feedzirra object that you can work with further.