Not Quite Caffeine – This is like Google Light

Notice anything strange about the search results lately?  Whatever is going on hasn’t had a big effect on any of my own sites yet, but I noticed an odd trend while I was searching for something else altogether.

It seems like Google isn’t indexing very many pages right now!  I’m not sure if they’ve really cranked up their link juice thresholds or their spam filtering – or what – but it seems like Google is only showing a fraction of the pages it used to.  And it’s not just a hunch: I found a post from August 11, 2009 that helps put some real numbers behind the observation:

The Search

“Mike site:www.facebook.com”

  • Google results (Aug 11, 09) “of about 248,000,000” for Mike. (0.32 seconds)
  • Caffeine results (Aug 11, 09) “of about 282,000,000” for Mike. (0.23 seconds)
  • Google-Lite (Feb 22, 10) “of about 7,460,000” from www.facebook.com for Mike. (0.24 seconds)
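
To put a percentage on that gap, here is a quick back-of-the-envelope sketch using the three counts quoted above. The variable and function names are my own, and it assumes Google’s “of about” estimates are at least roughly comparable across the three searches, which is far from guaranteed:

```python
# Result counts quoted in the list above. These are Google's rough
# "of about" estimates, not exact index sizes.
counts = {
    "google": 248_000_000,    # regular Google, Aug 2009
    "caffeine": 282_000_000,  # Caffeine sandbox, Aug 2009
    "lite": 7_460_000,        # current "Google-Lite" results
}


def percent_drop(before: int, after: int) -> float:
    """Return how much smaller `after` is than `before`, as a percentage."""
    return (before - after) / before * 100


drop_vs_google = percent_drop(counts["google"], counts["lite"])
drop_vs_caffeine = percent_drop(counts["caffeine"], counts["lite"])

print(f"Drop vs. regular Google: {drop_vs_google:.1f}%")
print(f"Drop vs. Caffeine:       {drop_vs_caffeine:.1f}%")
```

For this one query, that works out to a drop of roughly 97% against either of the August numbers. It is a single search on a single site, so treat it as a data point rather than a measurement of the whole index.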

The Caffeine Buzz with No Calories

So comparing the results, it would appear that Google has achieved the speeds of Caffeine, except they’ve done so by drastically reducing the number of pages included in the index instead of by algorithm or server changes that process more pages more quickly.

Now unfortunately, the Caffeine test server is no longer publicly available so there is no way to test this again unless you can find someone’s post data from back when Caffeine was still in open testing.

But the current search keyword results are quite different from the ones that were being shown in the Caffeine sandbox six months ago, so I wouldn’t go so far as to say that Caffeine is live.  It’s not – at least not in the full form it was originally introduced in.

Now Less Filling, Less Taste!

I’m not entirely sure what to think about this low-volume Google index other than the fact that I don’t like it.  A lot of popular pages are conspicuously missing from the results, especially ones that have spread virally through the various social networks.  It looks like a strong anti-SEO filter is being applied at the moment, but it’s one that has obviously gone too far and started knocking down legitimate pages.  Even searching for an article’s name with the domain name added as a keyword often can’t get you the specific page you’re looking for.

So I’m hoping this is temporary – maybe one extreme in a spectrum of strictness and permissiveness.  Whatever it is, it’s been mentioned already in a few places, and some of the social networks are starting to wonder if some kind of conspiracy is afoot to start filtering the results toward a more politically correct end.  I wouldn’t go that far either.

Hopefully, Caffeine shows up in some form resembling the beta testing from August ’09.  That was a pretty popular set of results, and it really did cover more of the web in less time.  There’s an upgrade we can all agree on – but I don’t think many users will be happy with trading 80% of the web’s content for a fraction of a second of search speed.  Remember Google, we’re supposed to be building the web for humans – not bots!

4 Comments

  1. hi John,
    I just checked your search “Mike site:www.facebook.com” here in California, and got 7,920,000. So if it used to yield 248 million results, it is definitely a lot smaller now. The question would be if it is temporary or permanent, and whether they’ve just gotten rid of excess bathwater, or if they’ve tossed a few babies as well?
    I have seen some “weird” results lately. Maybe a storm is a brewin’.
    Steve, tradeshow ninja

  2. Specifically, Reddit brought it to my attention that searching for __Epic beard guy dramatica__ doesn’t actually show the Epic Beard Guy article on Encyclopedia Dramatica. It will take you to hundreds of websites linking to the article in question, but it won’t show the specific page we’re obviously searching for.

    So I would say this tweak has gone way too far already and is interfering with the quality of searches. Somehow, I escaped the hit, so I’m not sure exactly what is getting slapped.

    Maybe, since they’re so focused on their new social searches, they’ve tried to separate the net into multiple indexes?

  3. Give it time. It’s still rolling out. Do a multi datacenter pagerank check to see if all DC’s have your site or other people you know updated at all centers. I’m thinking end of the month, full speed ahead.

  4. You can check here. Looks like more than a third are still not on. Interestingly, there were more DC’s that showed pagerank for my site a few days ago. Looks like Google is taking three steps forward and two steps back on the rollout.
