Catching Up: NYT on Google Algorithm

Filed in Google by Matt McGee on June 10, 2007

This is a catch-up post that I’m writing and publishing mainly for my own future reference. It’s about the New York Times article that ran last Sunday, “Google Keeps Tweaking Its Search Engine,” which was the hot topic at SMX. Due to my crazy schedule all week, I only got around to reading it today. If you’ve read it and digested all of the other blog/forum discussion about the article, feel free to move along; there’s nothing for you here. 🙂

Takeaways from the article:

  • “…the search-quality team makes about a half-dozen major and minor changes a week to the vast nest of mathematical formulas that power the search engine”
  • Google employees can use the “Buganizer” to report a search-related problem; “about 100 times a day they do”
  • “Recently, a search for ‘French Revolution’ returned too many sites about the recent French presidential election campaign — in which candidates opined on various policy revolutions — rather than the ouster of King Louis XVI. A search-engine tweak gave more weight to pages with phrases like ‘French Revolution’ rather than pages that simply had both words.” (A toy sketch of this phrase weighting appears after the list.)
  • Google has a program called “Debug” that “shows how its computers evaluate each query and each Web page”
  • fixing a poor-quality local search: “…Google’s formulas were not giving enough importance to links from other sites about Palo Alto.”
  • QDF = “query deserves freshness” — a solution to not seeing enough new content for certain queries; “The QDF solution revolves around determining whether a topic is ‘hot.’ If news sites or blog posts are actively writing about a topic, the model figures that it is one for which users are more likely to want current information. The model also examines Google’s own stream of billions of search queries, which Mr. Singhal believes is an even better monitor of global enthusiasm about a particular subject.” (See the QDF sketch after the list.)
  • Google “makes a copy of the entire Internet — every word on every page — that it stores in each of its huge customized data centers so it can comb through the information faster.”
  • Google’s ranking system “involves more than 200 types of information, or what Google calls ‘signals.’ PageRank is but one signal. Some signals are on Web pages — like words, links, images and so on. Some are drawn from the history of how pages have changed over time. Some signals are data patterns uncovered in the trillions of searches that Google has handled over the years.” (A simple sketch of combining signals follows the list.)
  • Queries are processed through “classifiers” which determine the type of search. “Classifiers can tell, for example, whether someone is searching for a product to buy, or for information about a place, a company or a person.”
  • Just as spellcheck can fix query typos, Google’s system can recognize abbreviations and synonyms when matching queries to results.
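
Since the phrase-weighting bullet above is the most concrete example in the article, here’s a toy Python sketch of the idea. To be clear, the scoring scheme, function names, and the boost value are my own assumptions for illustration; the article says nothing about Google’s actual formula.

```python
# Toy illustration only -- all names and weights here are assumed,
# not taken from the NYT article or from Google.

def score(document: str, query: str, phrase_boost: float = 2.0) -> float:
    """Score a page for a query, boosting exact-phrase matches."""
    doc = document.lower()
    terms = query.lower().split()

    # Base score: one point per query term found on the page.
    base = sum(1.0 for term in terms if term in doc)

    # Give extra weight to pages containing the contiguous phrase, so
    # "French Revolution" pages outrank pages that merely contain both
    # "French" and "revolution" separately.
    if query.lower() in doc:
        base *= phrase_boost
    return base

page_a = "The French Revolution ended with the ouster of King Louis XVI."
page_b = "The French election campaign debated a revolution in policy."
print(score(page_a, "French Revolution"))  # 4.0 -- phrase match, boosted
print(score(page_b, "French Revolution"))  # 2.0 -- both words, no phrase
```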
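
The QDF bullet also lends itself to a minimal sketch. Again, the inputs and the threshold below are assumptions on my part, not details from the article; the point is just that a topic gets flagged “hot” when news/blog mentions or query volume spike past their recent baselines.

```python
# Minimal sketch of "query deserves freshness" -- inputs and threshold
# are assumed for illustration, not details from the article.

def is_hot(recent_mentions: int, baseline_mentions: float,
           recent_queries: int, baseline_queries: float,
           threshold: float = 3.0) -> bool:
    """Treat a topic as 'hot' if activity spikes well past its baseline."""
    mention_ratio = recent_mentions / max(baseline_mentions, 1.0)
    query_ratio = recent_queries / max(baseline_queries, 1.0)
    # If either news/blog mentions or query volume spikes, favor freshness.
    return max(mention_ratio, query_ratio) >= threshold

# A topic being written about five times more than usual "deserves freshness":
print(is_hot(recent_mentions=500, baseline_mentions=100,
             recent_queries=2000, baseline_queries=1500))  # True
```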
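
And for the “more than 200 signals” bullet, here’s the simplest possible picture of blending signals into one score: a weighted sum. The signal names and weights are invented for illustration; the article doesn’t reveal how Google actually combines them.

```python
# Hedged sketch of combining ranking "signals" -- names and weights invented.

SIGNAL_WEIGHTS = {
    "pagerank": 0.30,       # link-based authority: one signal of many
    "phrase_match": 0.25,   # on-page signal: words/phrases on the page
    "freshness": 0.20,      # drawn from how the page changed over time
    "click_pattern": 0.25,  # patterns mined from past search behavior
}

def combined_score(signals):
    """Weighted sum of normalized signal values (each assumed in 0..1)."""
    return sum(SIGNAL_WEIGHTS[name] * signals.get(name, 0.0)
               for name in SIGNAL_WEIGHTS)

print(combined_score({"pagerank": 0.9, "phrase_match": 0.8,
                      "freshness": 0.2, "click_pattern": 0.6}))  # ~0.66
```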

This article generated a lot of discussion, but none of it was more off-target than what Robert Gorell wrote on GrokDotCom: “…aren’t most SEO firms just selling Search Engine Optimism?” Bill Slawski, Stoney deGeyter, and others do a fine job responding to that poorly worded question (hint: the word “most” is the problem), so I’m leaving it at that. Some of the other useful discussion of the article includes:

Comments (4)

Sites That Link to this Post

  1. blog.rightreading.com » New insights into the Google search algorithm | June 10, 2007
  1. Robert Gorell says:

    Matt,

    Did you really read the comments on my post? I admitted that “most” was a bit unfair (but ‘poorly worded’ is probably more accurate, thanks) and further explained my perspective.

    I’d be curious to know what you think of Mike Grehan’s ClickZ column. Grehan’s much more of an SEO expert than I care to be, and his words are considerably harsher. Is he off-target as well?

    (Nice blog, by the way… We mention it on our Blog Buzz podcast [WebmasterRadio.fm] on occasion.)

    -Robert

  2. Matt McGee says:

    I did read all the comments, Robert – and saw your follow-ups. And thanks for the kind words about SBS and mentions on the podcast. 🙂

    As for Mike G.’s column, his “big picture” point is not at all off-target. But I disagree with the idea that it “changes everything.” The good SEOs have been doing all this “new” stuff for some time now. SEWatch interviewed me about it here:

    What Does Universal Search Mean for SEM?

    And I also wrote about Universal Search here on SBS.

    I would still disagree with the idea that SEO should stand for Search Engine Optimism. That may apply to some of the hucksters, but not at all to the dozens of people I know via conferences, blogs, etc. 🙂

  3. Robert Gorell says:

    Matt, I’d argue that the dozens of people you know via blogs, conferences, etc., are the serious people in your industry – and that there are countless other SEO firms that will take anyone’s business, no matter how poorly planned their site/strategy is to begin with. THOSE are the people who add noise to the marketplace, and they’re the SEO firms I’m addressing. Those are the countless little-known firms our SMB clients complain of before we take them on. And, frankly, most of those folks don’t consider themselves to be hucksters (even if we might be quick to point them out).

    That’s why, in my experience, your colleagues remain the few. Even without the SEWatch piece, I don’t have to know about specific work you’ve done to draw that conclusion; the blog and your about page let me know you’re interested in putting strategy before tactics. And, at the SMB level, that remains sadly rare.

    Anyway, we’ve started facing the same phenomenon at Future Now recently, at some level. It turns out a lot of former SEO lemmings are moving to “Conversion Rate Marketing.” Oh, well… They’ve got a lot of catching up to do 😉

    Thanks for clarifying and for the link to that article… Interesting stuff.