Update - 11/02/2006:
There have been a couple of developments since I first published this story. I wanted to put in a quick update to let everyone know what's been happening.
First of all, I'm very pleased to report that this was, in fact, an oversight on the part of the Firefox developers, and they are currently working on a solution to prevent this behavior.
Secondly, it seems that the same issue occurs with Yahoo in addition to Google. This was reported by one of the visitors to this site (thanks, dj), and later confirmed by the German news site Heise Online (here's an English translation of the article).
I'd like to thank the Firefox developers for taking such quick action to fix this issue, as well as those of you who helped raise awareness of it.
----------
I've been testing out the latest release candidate for Firefox 2.0 (which has since been officially released). One of the new features in Firefox 2.0 is "Previewing and subscribing to Web feeds", which allows users to (according to the release notes):
...decide how to handle Web feeds, either subscribing to them via a Web service or in a standalone RSS reader, or adding them as Live Bookmarks. My Yahoo!, Bloglines and Google Reader come pre-loaded as Web service options, but users can add any Web service that handles RSS feeds.
The nice thing about this feature is the ability to preview feeds. When the user clicks on an RSS or Atom link, Firefox renders the feed in a human-readable format, rather than simply displaying the raw XML as it had in the past*. At the top of the feed page, the user has the choice of subscribing to the feed through different services, as described in the above quote. This is where our story gets interesting.
Before continuing, though, let me provide a brief bit of background on my browsing habits. I always browse the web with Firefox set to prompt me to accept any new cookies. I also use a custom installation package that presets my preferences, so the cookie settings are active on initial load of the browser. This is the reason I noticed this issue to begin with, as I'll explain later.
So, I have Firefox installed and configured, and I begin testing out the new RSS features. The first time I hit a feed (in this case, my own LegRoom.net feed) I was prompted to accept a cookie from fusion.google.com. I didn't think much of it at first and instinctively denied it, but then I noticed the same prompt after a reinstall, and then again on each other feed I visited. This was clearly being triggered by Firefox itself and not by the feed website.
I couldn't find any explanation for this behavior, so at this point I did what any good little geek would do: I fired up a copy of Wireshark (formerly Ethereal) and started sniffing network traffic. After a bit of analysis, I found that immediately after every feed page is loaded, Firefox makes a call to Google. Specifically, this is the HTTP data that is exchanged (from Wireshark):
No. Time Source Destination Protocol Info
32 3.456833 x.x.x.x 72.14.203.99 HTTP GET /favicon.ico HTTP/1.1
Hypertext Transfer Protocol
GET /favicon.ico HTTP/1.1\r\n
Request Method: GET
Request URI: /favicon.ico
Request Version: HTTP/1.1
Host: fusion.google.com\r\n
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1) Gecko/20061010 Firefox/2.0\r\n
Accept: image/png,*/*;q=0.5\r\n
Accept-Language: en-us,en;q=0.5\r\n
Accept-Encoding: gzip,deflate\r\n
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7\r\n
Keep-Alive: 300\r\n
Connection: keep-alive\r\n
Referer: http://www.legroom.net/backend.php\r\n
No. Time Source Destination Protocol Info
36 3.587770 72.14.203.99 x.x.x.x HTTP HTTP/1.1 302 Found (text/html)
Hypertext Transfer Protocol
HTTP/1.1 302 Found\r\n
Request Version: HTTP/1.1
Response Code: 302
Location: http://www.google.com/favicon.ico\r\n
Set-Cookie: PREF=ID=7e3b29ec472e1e47:TM=1161728606:LM=1161728606:S=4O7AZZG6VrvWd_V5; expires=Sun, 17-Jan-2038 19:14:07 GMT; path=/; domain=.google.com\r\n
Content-Type: text/html\r\n
Server: igfe\r\n
Transfer-Encoding: chunked\r\n
Content-Encoding: gzip\r\n
Date: Tue, 24 Oct 2006 22:23:26 GMT\r\n
Cache-Control: private, x-gzip-ok=""\r\n
\r\n
HTTP chunked response
Data chunk (197 octets)
Chunk size: 197 octets
Data (197 bytes)
Chunk boundary
Data chunk (10 octets)
Chunk size: 10 octets
Data (10 bytes)
Chunk boundary
Content-encoded entity body (gzip): 207 bytes -> 230 bytes
Line-based text data: text/html
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>302 Moved</TITLE></HEAD><BODY>
<H1>302 Moved</H1>
The document has moved
<A HREF="http://www.google.com/favicon.ico">here</A>.
</BODY></HTML>
Based on this information, it appears that Firefox is contacting Google in order to download the icon used in the Subscription menu select box on the feed preview page. In the process, Google sets a cookie.
Ok, on the surface this sounds innocent enough, but the more I thought about it the more it bothered me. For starters, why would Firefox have to download the favicon from Google to begin with? There are three other icons in the same Subscription menu: Live Bookmarks, Bloglines, and My Yahoo. Firefox is obviously able to load local favicons for those services, so why would it possibly need to download Google's favicon from Google's servers? From a technical perspective, it's both wasteful and inefficient.
Second, why is it necessary to include all of the extra data to grab the favicon? For example, why does Google have to be told which feed I'm browsing (the Referer header) or what my exact browser version is (the User-Agent header)? The favicon could just as easily be downloaded without providing this information.
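To make that last point concrete, here is a minimal sketch in Python of what a stripped-down favicon request could look like. The helper name build_minimal_request is hypothetical, and this is not how Firefox actually builds its requests. It only illustrates that HTTP/1.1 requires nothing beyond the request line and a Host header, so the Referer and User-Agent headers seen in the capture are optional.

```python
def build_minimal_request(host: str, path: str = "/favicon.ico") -> bytes:
    """Return a bare HTTP/1.1 GET request carrying only what is needed.

    Hypothetical helper for illustration; nothing is sent over the network.
    """
    lines = [
        f"GET {path} HTTP/1.1",   # request line
        f"Host: {host}",          # the only header HTTP/1.1 makes mandatory
        "Connection: close",      # optional; asks the server to close afterwards
        "",                       # blank line terminates the header block
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


# Same host and path as in the capture, minus Referer and User-Agent.
request = build_minimal_request("fusion.google.com")
print(request.decode("ascii"))
```

Even a request this small still reveals the client's IP address and the time of the request, but it no longer says which feed was being viewed or which browser build was used.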
Third, why does Google have to set a cookie when providing the favicon? Cookies are used for tracking and session management. In this case, all I'm doing is downloading a graphic. That's it. There is no session management to perform. Again, from a technical perspective this is unnecessary, wasteful, and inefficient.
So at this point, in the best case Google knows your browser version, the page you were visiting, the time you visited, and your IP address for correlation. Now let's examine the cookie. Notice that it's a root domain cookie (.google.com) and not something separate for fusion.google.com. Notice also that it doesn't expire until 2038. Assuming you accept the cookie, which almost everyone will (explained below), Google can also correlate your feed views with all other Google services through the .google.com cookie. I cannot think of any technical need for this, either.
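For anyone who wants to check the cookie details themselves, the following is a small sketch using Python's standard library to pick apart the Set-Cookie header from the capture above. The header string and the response date are copied from the Wireshark output. Reading the TM field as a Unix timestamp is my own assumption; I don't know how Google actually uses it.

```python
from datetime import datetime, timezone
from http.cookies import SimpleCookie

# Set-Cookie value copied from the captured 302 response.
SET_COOKIE = (
    "PREF=ID=7e3b29ec472e1e47:TM=1161728606:LM=1161728606:S=4O7AZZG6VrvWd_V5; "
    "expires=Sun, 17-Jan-2038 19:14:07 GMT; path=/; domain=.google.com"
)

cookie = SimpleCookie()
cookie.load(SET_COOKIE)
pref = cookie["PREF"]

# Expiry exactly as the server sent it, converted to an aware datetime.
expires = datetime.strptime(
    pref["expires"], "%a, %d-%b-%Y %H:%M:%S GMT"
).replace(tzinfo=timezone.utc)

# Date header of the same response (Tue, 24 Oct 2006 22:23:26 GMT).
issued = datetime(2006, 10, 24, 22, 23, 26, tzinfo=timezone.utc)
lifetime_years = (expires - issued).days / 365.25

# Assumption: the colon-separated value holds key=value fields, and TM is
# a Unix timestamp. Decoding it is only a plausibility check.
fields = dict(part.split("=", 1) for part in pref.value.split(":"))
tm = datetime.fromtimestamp(int(fields["TM"]), tz=timezone.utc)

print("domain: ", pref["domain"])
print("path:   ", pref["path"])
print("expires:", expires.isoformat())
print(f"lifetime: roughly {lifetime_years:.0f} years")
print("TM decodes to:", tm.isoformat())
```

The domain attribute is what matters most here: because it is .google.com rather than fusion.google.com, the browser will send this cookie back to any Google host, not just the one that served the favicon.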
With all of that said, let me stress that I'm not trying to sound any conspiracy theories here. It may very well be some technical limitation or a simple oversight. After all, Google already knows what you search for, what and who you e-mail, who you chat with and what you chat about, who you socialize with, what your social life looks like, what files are stored on your computer, what documents and spreadsheets you work on, what you blog about, what pictures you share, what you shop for, what newsgroups you read, what current events you keep up with, how you run your website, what stocks you monitor, what books you like to read, and, of course, what newsfeeds you read.
Considering all of the above, is there really much benefit to tracking your feed usage through Firefox? To be honest, I just don't know. However, given the fact that Google is very much in the business of collecting data from its users, and that Google has a very well-known relationship with the Mozilla Foundation/Corporation, I feel that this was done intentionally. Furthermore, this tracking is always enabled by default, and there's no way of stopping it.
As I mentioned a couple of times above, the only reason I noticed this issue is because I use a customized installer for Firefox that, among other things, begins blocking cookies on initial load. This is important because the default Firefox start page is a Firefox-branded Google search page. So, the first time Firefox is launched it will contact Google and accept Google's cookie (default behavior). Even if the first thing you do after launching Firefox for the first time is to adjust your cookie preferences to prompt you before accepting anything, the Google cookie has already been accepted. Again, I don't think this is part of some conspiracy, but it does compound the issue by making detection of the call to Google's servers that much more difficult.
My biggest question at this point is simply, "Why?". Does anyone know why Firefox does this? Has anyone seen any other reference to this, or acknowledgment of this behavior? I find it very odd that Firefox 2.0 loads the Google favicon, and only the Google favicon, remotely for the Feed Preview page, when the favicons for three other services displayed in the same page are loaded locally. I can think of no technical reason for this, and the only non-technical reason that comes to mind is that Google wants to track your newsfeed habits, and the Firefox developers made it happen.
What do you think? Am I being overly paranoid here (it's certainly been known to happen)? Feel free to leave your comments below, or just send me an e-mail. (Unfortunately, due to issues with comment spam, I had to disable anonymous comments. You will need to sign in to post a comment.)
-----
*A major issue that I have with RSS Preview is that Firefox will display this preview page even if the webmaster has already written an XSL transform to display the feed in human-readable form. I find this very frustrating, as I spent a lot of time styling the RSS feed for my site, making sure the look and feel matches that of the rest of the site, takes advantage of certain RSS elements available on my site that may not be available on others, etc. However, Firefox 2.0 ignores all of this and instead displays the feed using its own preview style. While this is a great feature for sites that only display raw XML, I strongly feel that Firefox should respect the webmaster's design if he's taken the time to create and specify a particular style/transform for the feed. At the very least there should be an option for users to enable the built-in preview style for all feeds rather than just raw feeds, with it set to only use the preview style for raw feeds by default.