RSS Aggregation Models

By Deane Barker • September 23, 2004 • 2 min read

It struck me last night that there are two models of RSS aggregation: “real-time” and “stored” (yes, I just made those two terms up…).

Real-time aggregators are ones like Mozilla’s Sage extension. This model fetches the feed in real time and displays it on demand. In a lot of ways, they’re not even aggregators. They’re just different ways of looking at the content on your site, much like rearranging and simplifying the HTML version.

On the other hand, we have the stored aggregators. When this model refreshes the feed, it creates autonomous “objects” for each post. These objects are persistent beyond connection to the RSS feed. Additionally, you can manipulate the objects. For instance, with aggregators built into mail clients (like NewsGator and the awesome new RSS functionality in Thunderbird), it’s very easy to just forward a post like an email.

(I thought about calling them “caching” instead of “stored,” but the term “cache” implies that it’s holding something for a set period of time for efficiency reasons. I don’t think that applies here.)
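To make the distinction concrete, here is a minimal sketch of the two models in Python. Everything in it is an assumption made for illustration: the `Entry`, `RealTimeViewer`, and `StoredAggregator` names are hypothetical, the "feed" is just an in-memory list standing in for a real RSS fetch, and the stored model here keeps the first copy it sees, which is only one of several possible behaviors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    guid: str
    title: str
    body: str


class RealTimeViewer:
    """Keeps nothing: every view re-reads the feed."""

    def __init__(self, fetch):
        self.fetch = fetch  # callable returning the feed's current entries

    def view(self):
        return list(self.fetch())


class StoredAggregator:
    """Copies each post into a local store when it refreshes."""

    def __init__(self, fetch):
        self.fetch = fetch
        self.store = {}  # guid -> Entry, persists between refreshes

    def refresh(self):
        for entry in self.fetch():
            # Keep the first copy seen; later edits in the feed are ignored.
            self.store.setdefault(entry.guid, entry)

    def view(self):
        return list(self.store.values())


# A stand-in for the publisher's live feed (hypothetical data).
feed = [Entry("post-1", "TV obsession", "first draft")]

live = RealTimeViewer(lambda: feed)
stored = StoredAggregator(lambda: feed)
stored.refresh()

# The author edits the post after it has already been published.
feed[0] = Entry("post-1", "TV obsession", "third revision")
stored.refresh()

live_body = live.view()[0].body      # reflects the edit
stored_body = stored.view()[0].body  # still the first copy
```

The stored copy outlives the feed: if `feed` were emptied, `stored.view()` would still return the post, while `live.view()` would return nothing.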

Now, this differentiation isn’t earth-shaking, except that the latter type of aggregator is essentially the same thing as someone visiting your site and hitting “Save page as…” and storing copies of your site on their machine.

This effectively circumvents an implicit advantage of a Web site: it’s supposed to always contain the most up-to-date information. Users with an aggregator that stores posts can very easily be looking at an old copy of some content from your site.

For instance, I was opining about my newest TV obsession over at my fledgling personal blog. I’m using WordPress over there (love it, incidentally), and it’s much easier (much, much easier) than Movable Type to keep working on something in draft. However, eventually I published the post and then kept right on editing – I made quite a few changes after it hit the site (and the feed) for the first time.

This morning, I get to work, and Thunderbird has a copy of the post that’s about three versions old. I look in Sage, and, of course, it’s the latest version.

It all comes down to how your aggregator handles modified posts. I know that NewsGator can be configured to download posts again every time they change. But this ends up being kind of goofy because if someone like me keeps editing, you get a new post for every edit that the aggregator detects.

Other aggregators will just highlight an edited post as “unread” (the convention is to make it bold). Others, like Thunderbird, will apparently not do anything; the first version is the one they keep.
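The three behaviors described above can be sketched as policies on a stored aggregator's refresh step. This is only an illustration, not how NewsGator or Thunderbird actually work internally: the policy names (`duplicate`, `mark_unread`, `keep_first`), the `StoredPost` type, and the idea of detecting an edit by hashing the post body are all assumptions of mine.

```python
import hashlib
from dataclasses import dataclass

POLICIES = ("duplicate", "mark_unread", "keep_first")  # hypothetical names


@dataclass
class StoredPost:
    guid: str
    body: str
    digest: str
    unread: bool = True


def digest_of(body):
    # Assumed change detection: a post counts as edited if its body hash differs.
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def refresh(store, incoming, policy):
    """Merge (guid, body) pairs from the feed into store, a list of StoredPost."""
    if policy not in POLICIES:
        raise ValueError(f"unknown policy: {policy}")
    for guid, body in incoming:
        digest = digest_of(body)
        versions = [post for post in store if post.guid == guid]
        if not versions:
            store.append(StoredPost(guid, body, digest))  # brand-new post
            continue
        latest = versions[-1]
        if latest.digest == digest:
            continue  # nothing changed since the last refresh
        if policy == "duplicate":
            # A new stored item for every detected edit.
            store.append(StoredPost(guid, body, digest))
        elif policy == "mark_unread":
            # Overwrite the stored copy and flag it as unread again.
            latest.body, latest.digest, latest.unread = body, digest, True
        # "keep_first": do nothing, so the first version stays.


def simulate(policy):
    """Publish a post, then edit it twice, refreshing after each version."""
    store = []
    for body in ["v1", "v2", "v3"]:
        for post in store:
            post.unread = False  # the reader has caught up before each refresh
        refresh(store, [("post-1", body)], policy)
    return store
```

Running `simulate` with each policy on the same three versions leaves three stored items under `duplicate`, one unread item holding the latest text under `mark_unread`, and one already-read item holding the original text under `keep_first`.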

Like I said, this isn’t anything earth-shaking, but it’s worth keeping in mind. We tend to think of the Web as being a “they see what I have up there right now” affair. With RSS, this may or may not be true.