I met Jon Glick in San Jose this past August at the Search Engine Strategies conference. He spoke on the “Search Engine Algorithm Research” panel, and I was fortunate to bump into him one afternoon on the way out of the convention center. We spoke for about 15 minutes about all kinds of search-related topics, and that led to two months’ worth of e-mails to put this interview together.
Jon has 15 years’ experience in the search industry and was a key member of the team that built Yahoo!’s own search engine. He’s now Senior Director of Product Search and Comparison Shopping for the shopping search engine Become.com. This is Part 3 of our 3-part interview.
(…continued from part two)
Matt: Let’s wrap this up with some Q&A on shopping search, since that’s what your focus is now with Become.com. First, though, what’s your opinion on why Froogle never took off?
Jon: Google never seemed all that committed to Froogle as a product, and in the end it wasn’t all that differentiated from other comparison shopping sites out there. Early on, they had the goal of using a web crawl to provide more comprehensive listings than other sites. However, this led to quite a bit of inaccurate data, so they eventually switched to merchant datafeeds. Ultimately, they had an undifferentiated product without a solid business or growth model.
The irony is Google only made money on Froogle when users ignored the Froogle listings and clicked on the Google AdSense ads. Also, because Froogle didn’t charge for listings, lots of affiliate spam was submitted, and merchants had little incentive to keep their listings current.
The lesson here is: In this competitive a market you have to create a great product. Just using the Google homepage and SERPs to drive traffic to Froogle wasn’t enough to make it successful.
You talked a bit earlier about the AIR™ algo you’re using at Become.com. What details can you share about that?
Become.com set out to create shopping-specific web search that significantly outperforms general search like Google’s. To do that we needed to create a completely different algo. Rather than just counting the number of in-links and the PR of the sites that give them, Become.com’s AIR™ technology looks at two additional factors: the contextual relevance of the site that provides the link, and the out-links on a given page. If you are a site about golf, you get more credit when other golf sites link to you than if you get links from, say, political sites. This makes it much harder to compile link farms, since trading random off-topic links has little impact on ranking. By looking at out-links, sites that connect to spam pages can be easily eliminated. If our systems detect even one page in a link farm, it impacts the entire link network.
When people read this, I think some will say, “But, relevant linking and outbound links are also important to other search engines.” I’m not sure how much you want to say, or can say, but is it a case of the AIR™ algo being more finely tuned to these specific factors, i.e., more strict than a traditional search algo?
The reality is that traditional search algorithms don’t tightly integrate context into link weighting. There are two reasons for this: As general engines they lack any specific sense of context (that’s why link farming and blog spam still work), and they compute connectivity independently from the evaluation of page content. On the second item, to combine content with connectivity our crawler can’t just go out and find pages; it has to evaluate the HTML content of the page in real time. Becomebot represents an evolution from just hoovering in pages to intelligent crawling.
From a theoretical perspective, Become.com models the web space as an elastomeric matrix of interconnected sites. I think of it as sites being interconnected by billions of bungee cords. The higher the contextual relevance between two linked sites, the stronger the bungee cord. If a site is connected to highly contextually relevant sites, there is a strong upward pull on its ranking. Spam site links would also create a strong pull, but down instead of up.
Our Engineering Team will be publishing some white papers on this revolutionary technology in ’07, now that our patents are filed.
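To make the bungee-cord picture a little more concrete, here is a minimal sketch of context-weighted link scoring. It is not Become.com's AIR™ algorithm, whose details aren't public; the function names, the topic vectors, the cosine-similarity measure and the fixed -1.0 penalty for spam links are all assumptions made for illustration. Each in-link pulls a site up in proportion to how topically close the linking site is, while a link from a spam site, or an out-link to one, pulls it down.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse topic vectors stored as dicts."""
    dot = sum(weight * b.get(topic, 0.0) for topic, weight in a.items())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def net_pull(site, links, topics, spam):
    """Sum the up/down 'pull' on a site from its links.

    links  -- list of (source, target) pairs
    topics -- dict mapping site name to a topic vector (hypothetical data)
    spam   -- set of site names already flagged as spam
    """
    pull = 0.0
    for source, target in links:
        if target == site:
            if source in spam:
                pull -= 1.0  # assumed penalty: spam in-links pull down
            else:
                # On-topic links pull harder than off-topic ones.
                pull += cosine(topics.get(source, {}), topics.get(site, {}))
        elif source == site and target in spam:
            pull -= 1.0  # assumed penalty: linking out to spam also pulls down
    return pull


topics = {
    "golf-shop": {"golf": 1.0},
    "golf-blog": {"golf": 0.9, "travel": 0.1},
    "politics-daily": {"politics": 1.0},
}
spam = {"link-farm"}

on_topic = net_pull("golf-shop", [("golf-blog", "golf-shop")], topics, spam)
off_topic = net_pull("golf-shop", [("politics-daily", "golf-shop")], topics, spam)
farmed = net_pull("golf-shop", [("link-farm", "golf-shop")], topics, spam)
print(on_topic, off_topic, farmed)
```

In this toy data the golf blog's link is worth far more than the political site's, and the link-farm link counts against the shop rather than for it.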
I’m curious to know… Generally speaking, how different is a shopping search algorithm from a traditional search algo?
Night and day. Shopping algos are based on keyword weighting (i.e., no PageRank component) and factors that web search doesn’t use (like CTR and bid price). I think of shopping algos as halfway between web search and paid listings, and you can see elements of both in the ranking criteria. From the engineering side, it’s a very different technical challenge. Web search is billions of pages of highly unstructured information from a crawl, while shopping search is tens of millions of listings of highly structured information from feeds.
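As a rough illustration of how those three ingredients could combine, here is a small sketch that multiplies a keyword score by CTR and bid price. The formula, the field names and the title/description weights are assumptions chosen for the example, not Become.com's actual ranking function.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    title: str
    description: str
    ctr: float  # historical click-through rate, between 0 and 1
    bid: float  # cost-per-click bid in dollars


def keyword_score(query, listing, title_weight=3.0, description_weight=1.0):
    """Count query terms found in the listing, weighting title hits more heavily."""
    title_words = set(listing.title.lower().split())
    description_words = set(listing.description.lower().split())
    score = 0.0
    for term in query.lower().split():
        if term in title_words:
            score += title_weight
        if term in description_words:
            score += description_weight
    return score


def rank(query, listings):
    """Order matching listings by keyword score x CTR x bid (assumed formula)."""
    scored = [(keyword_score(query, item) * item.ctr * item.bid, item) for item in listings]
    matching = [pair for pair in scored if pair[0] > 0]
    matching.sort(key=lambda pair: pair[0], reverse=True)
    return [item for _, item in matching]


listings = [
    Listing("Cordless Drill 18V", "Compact cordless drill with two batteries", ctr=0.02, bid=0.50),
    Listing("18V Cordless Drill Kit", "Cordless drill kit with case and charger", ctr=0.05, bid=0.40),
    Listing("Garden Hose 50ft", "Flexible hose for the garden", ctr=0.04, bid=0.30),
]

for item in rank("cordless drill", listings):
    print(item.title)
```

Note that there is no link-based term anywhere in the score: a listing with a lower bid can still come out ahead if shoppers click on it more often.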
Given the reliance on feeds, what chance does a small online retailer have in shopping search? There are so many sites out there selling products — but they don’t have feeds!
One of the advantages of Become.com for small merchants is that it allows them to actively compete. A big component of ranking in comparison shopping is CTR, so if you bid aggressively and have a product that people like, it doesn’t matter if you’re a mom & pop or a national chain. It’s not like web search where Bob’s Hardware is never going to outrank Home Depot because the disparity in PageRank is just too great.
Merchants of all sizes can easily participate. They can develop feeds using Excel, send them in and pay for clicks using a credit card. We have merchants with one product, and merchants with millions. Once a merchant has developed a feed for one shopping engine, it’s also very easy to send the same feed (or a slight variation) to other engines. We encourage merchants to use Become.com, but also to use Shopping.com, Froogle, Shopzilla, etc. As long as the merchant can get good ROI, the more distribution, the better it is for them.
I often see retailers using product descriptions that come straight from the manufacturer and are word-for-word duplicates of those on dozens of other sites. Is that one of the bigger mistakes you see online retailers make? What other SEO mistakes jump out at you as you’re looking at retail sites?
It depends on what type of products the retailer is selling. In comparison shopping we tend to think in terms of normalized and un-normalized products. The normalized products are items like digital cameras where a user is going to see a table of 20 or 30 merchants offering the same camera and one overall description that comes from the site’s own product database. Here’s a quick example: http://www.become.com/shop?q=digital+camera&pid=206094212. In this case, the most important data the merchant can supply are: the UPC or ISBN (so that the item gets mapped correctly), and tax and shipping (so that “total price” can be shown). The description, photo, etc. don’t matter that much because they probably won’t be used.
Un-normalized products are clothing, housewares, etc. These items are rarely mapped to an existing product database and the product is ranked on its own. This is where having a well-structured title and description and an appealing picture can really make a difference, both in ranking and CTR (which also impacts ranking). If your title and description are the same as your competitors’, you’re really missing out on an opportunity to outrank them with better keyword placement in the title, or by putting in additional descriptive keywords so you show up for other relevant user queries.
Retail sites and manufacturers are often very specific in their descriptions and miss out on more general user queries. A fashion retailer might give us a title like “Lacoste – Blue Pique Polo” and never use the word “shirt,” so when a user searches for “Lacoste shirt” the item doesn’t come up. Using these alternate and general terms in the description copy is a good idea. Having images is also key. Users don’t click on listings without images, so we downrank them (including products with generic “Sorry No Image” images).
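For merchants who build their feeds in a spreadsheet, a quick pre-submission check along these lines can catch the problems mentioned above. This is only a sketch: the column names (including `product_type` as the place to record a general term such as "shirt"), the sample rows and the prices are made up for illustration, and each shopping engine publishes its own feed specification.

```python
import csv
import io


def audit_feed(feed_text):
    """Return (title, total_price, warnings) for each row of a CSV feed."""
    results = []
    for row in csv.DictReader(io.StringIO(feed_text)):
        warnings = []
        # Listings without a photo tend to be skipped by shoppers.
        if not row["image_url"].strip():
            warnings.append("no image")
        # Check that the general term shoppers search for appears in the copy.
        copy = (row["title"] + " " + row["description"]).lower()
        general_term = row["product_type"].strip().lower()
        if general_term and general_term not in copy:
            warnings.append(f'general term "{general_term}" missing from copy')
        # Total price is what a shopper compares: item price plus tax plus shipping.
        total = round(float(row["price"]) + float(row["tax"]) + float(row["shipping"]), 2)
        results.append((row["title"], total, warnings))
    return results


# Hypothetical two-row feed; values are illustrative only.
FEED = """title,description,product_type,price,tax,shipping,image_url
Lacoste - Blue Pique Polo,Classic fit pique polo in blue,shirt,79.00,0.00,5.95,http://example.com/polo.jpg
Lacoste Blue Pique Polo Shirt,Classic fit polo shirt in blue,shirt,79.00,0.00,5.95,
"""

results = audit_feed(FEED)
for title, total, warnings in results:
    print(title, total, warnings)
```

The first row is flagged because "shirt" never appears in its title or description, and the second because it has no image.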
You know, I’d never given that any thought until you mentioned it, but it mirrors my shopping behavior exactly. Do you have any stats about the impact of images on CTR? Having seen the poor quality of product photos on some sites, I have a feeling a lot of smaller retailers don’t appreciate the importance of a good photo.
Photos are absolutely critical to a good user experience. Large, vibrant photos draw users in and give them confidence that they know the quality of what they are buying. I don’t have CTR numbers, but the one time our image server went offline, abandonment went through the roof! Users just go somewhere else. The analogy for merchants: who would shop at their offline store if they spray-painted all their boxes black, and all shoppers could see was a sticker with the product name and price? If they omit an image, they’re basically doing the online equivalent.
Not surprisingly, apparel merchants are on the cutting edge. They have found that some products sell best if photographed on a model, some if photographed on a mannequin, and others laid out against a neutral background.
One complaint I hear a lot from online retailers is that they can’t get quality links from relevant sites to their product pages because the most relevant sites are usually competitors. What kind of SEO tips can you share with retailers?
Fortunately for them, Yahoo!, Google and MSN are general search engines and give credit for off-topic links … that’s why link farming works. On-topic sources that I’ve always recommended are distributors and suppliers. Also, there are media sources that you advertise on, since many also have a website. The merchant has a financial relationship with these businesses, so it’s easy to contact them and they will often give you links as a way of enhancing the offline business relationship.
Great stuff, Jon. Thank you for spending so much time answering questions. We started this about two months ago, believe it or not!
Thanks for all your patience as we went through this! (roughly 15 pages in MS Word)