Question

When reading my RSS feed with the Thunderbird feed reader, some entries are duplicated. Google Reader does not have the same problem.

Here is the faulty feed: http://plcoder.net/rss.php?rss=Blog

There is a problem, but where?

Regards, Cédric

Update: I added a guid, but the problem remains. Other feeds are not duplicated like mine, so I will rework this module and replace this good old code.

Conclusion: I completely reworked the RSS generator code, and it now works. I think I was using a very old version of RDF.


Solution

Try adding a <guid> tag to each item, giving it a permalink, e.g.:

<item rdf:about="http://plcoder.net/?doc=2134&amp;titre=mon-pc-se-la-pete">
  <link>http://plcoder.net/?doc=2134&amp;titre=mon-pc-se-la-pete</link>
  <guid>http://plcoder.net/?doc=2134&amp;titre=mon-pc-se-la-pete</guid>
  ...
</item>

Without a GUID, if any of the content in the post changes, your RSS aggregator might think that it is a new post. With the GUID, even if the content of that item changes, your RSS aggregator should just update the post, instead of treating it as a new item.
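As a rough illustration of the suggestion above, here is a minimal Python sketch that builds one item with a <guid> equal to its permalink. It assumes an RSS 2.0 style item (the feed in the question uses rdf:about, so the real generator may look different); the build_item helper and the title string are made up for the example.

```python
import xml.etree.ElementTree as ET


def build_item(title, permalink):
    """Build one RSS 2.0 style <item> whose <guid> is the post's permalink.

    build_item is a hypothetical helper, not part of any feed library.
    """
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "link").text = permalink
    # isPermaLink="true" tells readers the guid is also a URL to the post.
    guid = ET.SubElement(item, "guid", isPermaLink="true")
    guid.text = permalink
    return item


# The title is an assumption; the URL is the one from the answer above.
item = build_item(
    "Mon PC se la pete",
    "http://plcoder.net/?doc=2134&titre=mon-pc-se-la-pete",
)

# ElementTree escapes "&" exactly once when serializing.
xml_text = ET.tostring(item, encoding="unicode")
print(xml_text)
```

The important part is that the guid stays the same for the life of the post, even when the title or body is edited later.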

OTHER TIPS

At least with Thunderbird 2.0.0.21, the problem is that TBird doesn't seem to respect <guid> tags, but it does respect the channel's pubDate tag. Thus, if pubDate is more recent than at the last read, TBird will read all entries (it seems).

I don't know what would happen if the channel's pubDate tag is missing, though...

I have experienced this issue with some of my own feeds. What has happened is I start off with a list of entries like this:

Item A
Item B
Item C

The client downloads them and everything is fine. Then I add a new item, so the feed reads as:

Item D
Item A
Item B

D shows up in the reader.

But then I decide I don't want that item, so the list reverts to:

Item A
Item B
Item C

When Thunderbird reads this, it'll count C as a new item. I am using a GUID element, so I doubt that's the problem. I think it's got more to do with Thunderbird's parser not taking older elements into consideration.

The long-winded workaround is to "remember" which items you've already published that have since been pushed off the end of the list by new items. You'll basically need to keep a current list of items in the feed, and when you delete items from it, leave the list short until there are new items to fill it, rather than pulling older items back in.
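Here is a small Python sketch of that workaround on the publisher side. The FeedWindow class and its method names are invented for this example, and it only tracks item ids; a real generator would store full items and persist the state somewhere.

```python
class FeedWindow:
    """Tracks which item ids are currently in the feed (newest first)."""

    def __init__(self, size):
        self.size = size
        self.items = []        # ids currently published, newest first
        self.retired = set()   # ids already pushed off the end of the feed

    def publish(self, item_id):
        if item_id in self.retired:
            # Re-publishing an old id is what makes readers show a duplicate.
            raise ValueError(f"{item_id} was already pushed off the feed")
        self.items.insert(0, item_id)
        while len(self.items) > self.size:
            self.retired.add(self.items.pop())

    def delete(self, item_id):
        # Remove the item but do NOT refill the gap with retired items:
        # the feed stays short until something new is published.
        self.items.remove(item_id)

    def current(self):
        return list(self.items)


feed = FeedWindow(size=3)
for item_id in ["C", "B", "A"]:
    feed.publish(item_id)          # feed is now A, B, C

feed.publish("D")                  # C is pushed off the end
feed.delete("D")                   # feed shrinks instead of bringing C back
after_delete = feed.current()

feed.publish("E")                  # a new item fills the gap
after_new = feed.current()
print(after_delete, after_new)
```

With this, item C never reappears after being pushed off, so the reader has nothing to mistake for a new entry.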

This is explained in Thunderbird documentation (under "Troubleshooting FAQ"):

Q: Why are feed messages sometimes duplicated?

A: Feed messages with identical content but different unique ids are not detected as duplicates. See this post for many more details.

The linked post for reference:

  1. Atom feeds (mandatory) have a unique id; Rss feeds (not mandatory) usually have a unique guid. For Rss feeds without a guid, an attempt is made to create a unique id from mandatory parts of the feed item.
  2. All downloaded feed messages have a record with this id stored in feeditems.rdf and exist there as long as they exist in the publisher's file, with that id. If the publisher removes a message with the id from their file, after 24 hours the feeditems.rdf cache is also purged (on get messages biff).
  3. If a publisher reuses an id after it has been purged, you will get a dupe (if the content is identical). This is an abuse of the intent behind unique ids and the publisher's error.
  4. If a publisher reuses an id before it is purged, and the content is different, you will not see the new content, as it will be treated as a duplicate. Thunderbird does not use the tag currently and its misuse by publishers may make it difficult to implement.
  5. If you view the source (Ctrl-U) of two apparent dupes, you will note the Message-Id header. If two apparent dupes have different Message-Id values, then they are not dupes regardless of potential identical content. Tb does not distinguish duplicate content.
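To make points 2 and 3 a bit more concrete, here is a toy Python model of an id cache with a purge delay. It is only a simplified reading of the description above, not Thunderbird's actual code: the SeenIds class, the poll method, and the exact purge rule are assumptions, and only the 24-hour figure comes from the post.

```python
HOUR = 60 * 60
PURGE_AFTER = 24 * HOUR  # the 24-hour delay mentioned in the post


class SeenIds:
    """Toy model of a reader-side cache of item ids (hypothetical)."""

    def __init__(self):
        self.last_seen = {}  # id -> last time it appeared in the publisher's file

    def poll(self, feed_ids, now):
        """Return the ids that would be treated as new on this poll."""
        current = set(feed_ids)
        new = [i for i in feed_ids if i not in self.last_seen]
        for i in feed_ids:
            self.last_seen[i] = now
        # Forget ids that have been missing from the feed for too long.
        for i in list(self.last_seen):
            if i not in current and now - self.last_seen[i] > PURGE_AFTER:
                del self.last_seen[i]
        return new


cache = SeenIds()
first = cache.poll(["A", "B", "C"], now=0)          # everything is new
second = cache.poll(["D", "A", "B"], now=1 * HOUR)  # only D is new
third = cache.poll(["D", "A", "B"], now=30 * HOUR)  # nothing new, C gets purged
fourth = cache.poll(["A", "B", "C"], now=31 * HOUR) # C comes back after the purge
print(first, second, third, fourth)
```

In this model, the last poll reports C as new because its id was forgotten while it was absent from the feed, which is the duplicate described in point 3.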

If you want extreme debugging, change the Feeds.logging.console pref to debug or trace and restart, to see what happens during feed processing.

If you unsubscribe a feed url, this will clear the feeditems.rdf cache for that feed. If you subsequently resubscribe you will get dupes of all current items in the publisher's file that also exist in your feed folder.

Compaction has no effect on feed processing; it just removes items marked for deletion from the file. If you delete a folder or move it to trash, it is unsubscribed. Starting with Tb29, if you drag/drop a folder from one feed account to another feed account, the subscription is retained (but not feeditems). For very old profiles/feed accounts (pre Tb17), it can be a good idea to create a new feed account and drag folders there (Tb29 and up), as a fresh feeds.rdf database is created; the penalty is a one-time dupe possibility.

I had the same problem... I switched to Google's feed and it's now fixed. I never found out the exact cause, though.

http://feedproxy.google.com/juanformoso

Thunderbird has a few bugs with duplicating feed entries; perhaps it's just one of them showing up?

Licensed under: CC-BY-SA with attribution