
SES Chicago: Bulk Submit 2.0

My notes from the Tuesday, 10:15 am session titled “Bulk Submit 2.0.”

Danny Sullivan, Moderator

- used to be able to email Infoseek a list of 500 URLs and they’d be added to their index almost right away
- this went away and kinda got replaced by paid inclusion
- now it’s making a comeback with something like Google Sitemaps
- at PubCon, all 3 SEs now cooperating on sitemap protocol

Amanda Camp, Google

- my first project at Google was Google Sitemaps
- Add URL page is not the recommended way to submit to Google
- recommended way is Google Sitemaps
- Sitemaps helps us be smarter about the way we crawl — tell us when pages have been updated

Four formats
1) Text file of URLs
2) RSS/Atom feeds
3) Sitemap protocol
4) OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting)
* An HTML sitemap is not the same as a Google Sitemap

Simple Rules
- always contain the full URL
- remove unnecessary parameters, like session IDs
- place sitemap in highest directory of URL you’re submitting
- domain needs to match path you’re submitting (www vs. non-www, etc.)
- name sitemap whatever you want
- URLs must use UTF-8 encoding
- sitemap max = 50,000 URLs or 10 MB; index files can list a max of 1,000 sitemaps
- use GZIP to compress sitemaps
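A few of these rules can be checked mechanically before a URL goes into a sitemap. Here is a minimal Python sketch, not anything shown in the session: the `clean_url` helper and the `SESSION_PARAMS` list are hypothetical, and the parameter names that count as "unnecessary" will differ from site to site.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical list of session-style parameters; adjust for your own site.
SESSION_PARAMS = {"sid", "sessionid", "phpsessid", "jsessionid"}

def clean_url(url, sitemap_host):
    """Return a cleaned URL, or None if it breaks one of the rules above."""
    parts = urlsplit(url)
    # Rule: always the full URL (scheme and host must be present).
    if parts.scheme not in ("http", "https") or not parts.netloc:
        return None
    # Rule: host must match the sitemap's host exactly (www vs. non-www).
    if parts.netloc.lower() != sitemap_host.lower():
        return None
    # Rule: drop unnecessary parameters such as session IDs.
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k.lower() not in SESSION_PARAMS]
    # The fragment is dropped as well, since it never reaches the server.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))
```

Relative paths and URLs on the wrong host come back as `None`, so they can be logged and fixed rather than silently submitted.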

Text File format
- one URL per line
- max of 50k URLs
- text file should contain nothing but list of URLs
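For a large site, the text format mostly comes down to splitting the URL list at the 50k limit and compressing each file. A rough sketch, assuming the URLs are already cleaned; the function name and file naming scheme are my own, and it only enforces the URL count, not the size limit.

```python
import gzip
import os

MAX_URLS = 50_000  # per-file limit from the notes

def write_text_sitemaps(urls, out_dir, prefix="sitemap"):
    """Write gzip-compressed text sitemaps, one URL per line; return the paths."""
    paths = []
    for start in range(0, len(urls), MAX_URLS):
        chunk = urls[start:start + MAX_URLS]
        path = os.path.join(out_dir, f"{prefix}{len(paths) + 1}.txt.gz")
        # UTF-8 text, nothing in the file but the URLs themselves.
        with gzip.open(path, "wt", encoding="utf-8", newline="\n") as fh:
            fh.write("\n".join(chunk) + "\n")
        paths.append(path)
    return paths
```

A list of 50,001 URLs would produce two files, `sitemap1.txt.gz` and `sitemap2.txt.gz`.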

Syndication feed
- Google accepts RSS 2.0 or Atom 0.3 feeds that use field
(too fast)

XML format
(gives examples of various XML tags and what they mean)
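I didn't capture the slides, so as a stand-in here is a small Python sketch that builds a sitemap using the tag names from the sitemaps.org protocol (`urlset`, `url`, `loc`, `lastmod`, `changefreq`, `priority`). The `build_sitemap` helper and the sample values are illustrative only, not the examples shown in the session.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(entries):
    """entries: iterable of dicts with 'loc' and optional lastmod/changefreq/priority."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        # Only 'loc' is required; the other tags are optional hints.
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in entry:
                ET.SubElement(url, tag).text = str(entry[tag])
    # Returns a str without the XML declaration; special characters are escaped.
    return ET.tostring(urlset, encoding="unicode")

xml_text = build_sitemap([
    {"loc": "http://www.example.com/", "lastmod": "2006-12-05",
     "changefreq": "daily", "priority": "0.8"},
    {"loc": "http://www.example.com/p?a=1&b=2"},
])
```

When writing the result to disk, prepend an XML declaration and save it as UTF-8.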

Sitemap Generators
- Google offers its own, and there are 3rd party generators, too (not endorsed by Google)

Submission
- use Google submission form
- Google tries to accept new sitemaps within half an hour (dashboard will say “okay,” or “errors” if problems are found)
- adding a Sitemap is optional; can still use Webmaster Central without having a sitemap

More info:
www.google.com/webmasters
www.sitemaps.org

Amit Kumar, Yahoo - manager, Site Explorer

Site Explorer: browse your pages and links; site authentication; bulk submission; new features coming soon

Site Explorer (SE) interface
(shows screenshots of submit URL page and dashboard)
- submit page doesn’t require login
- authentication process slightly different from Google: download file, upload to your site, click button in SE
- SE shows pages, inlinks, subdomains

Eric Papczun, Performics

* biggest challenge for large sites is getting a complete and accurate list of URLs
* use due diligence to make sure duplicate pages aren’t listed
* after submission, sitemaps picked up within 1-2 days; entire sitemap crawled within 3-14 days (avg. about a week); small sites and new sites will take longer to get sitemap crawled — submit new content regularly

Sitemap Mgmt. Tips
- have optimized native sitemap (HTML version)
- focus the crawler by excluding redundant content (e.g., print-friendly pages), disembodied content (e.g., Flash objects), spammy stuff
- use “preferred domain” to tell Google if you want www.domain.com or just domain.com to appear in SERPs
- use separate sitemaps (in Google) for news and mobile content

Impact / What to Expect
- sitemaps are tools, not solutions
- we’ve seen two effects: 1) number of pages indexed goes up (example: retailer with ugly URLs went from 61k to 133k URLs); 2) number of pages indexed goes down (eliminating dupe content URLs)

* select URLs for more frequent crawling with “priority” XML tag
* use it to spotlight frequently updated pages, new pages
* we’ve found Google is responsive to this tag
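One way to apply this is to derive the priority value from how recently a page changed. The sketch below is only an illustration: the `priority_for` helper and its 7-day and 30-day thresholds are my own assumptions, not numbers from the session. The protocol itself only says priority runs from 0.0 to 1.0, defaults to 0.5, and is relative to the other URLs on your own site.

```python
from datetime import date

def priority_for(last_modified, today, is_new=False):
    """Return a priority string between 0.0 and 1.0 (thresholds are illustrative)."""
    age_days = (today - last_modified).days
    if is_new or age_days <= 7:
        return "1.0"  # spotlight new or just-updated pages
    if age_days <= 30:
        return "0.8"
    return "0.5"  # the protocol's default priority
```

For example, a page last modified on `date(2006, 12, 1)` and evaluated on `date(2006, 12, 5)` would get `"1.0"`.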

Handling Errors
- 404 errors will probably be most frequent error
- might be from typos in URL, server issues, etc.

* if you don’t have a robots.txt file, Google assumes you’re okay with full crawl of your site
* on large, dynamic sites we typically see about 5% of site crawled per visit

Todd Friesen, Range Online Media

Bulk submit via feeds

Paid Inclusion Feeds
- Yahoo Search Submit
– publish, include, and refresh content without relying on spiders
– refresh natural results within 48 hours
– provide relevant and targeted copy
– quickly update listings to reflect sales/promotions
– detailed tracking

Comparison Shopping Engine Feeds
- comparison engines convert, esp. MSN Shopping — if you’re not using MSN Shopping, you’re missing out

- Google Base
– free, and it converts
– ranks by relevance over price
– limited user support
– lots of competition

- MSN Shopping
– reasonable CPCs
– best grouping algo in the Big 3
– lower volume than Base
– great conversion

- Yahoo Shopping
– highest volume, but most expensive
– sends the most traffic
– user reviews somewhat outdated
– poor product grouping
– high CPCs

(shares two case studies)

* paid inclusion can be used for immediate results or to pick up pages where control over on-site content is limited
* results in days, not months
* doesn’t tie up client IT resources
* makes A/B testing in natural SERPs possible

Q&A

Todd: always put fresh content on home page because home page is always the page that gets crawled the most; update and resubmit Sitemaps regularly

Eric: (question re: news story feeds) — also supplement sitemaps with PPC

Todd: (question re: aging delay and new sites) — yes, use sitemaps if you’re a new site; get trusted links like Yahoo Directory, BOTW, Business.com; buy an old site with trusted links and 301 the whole thing to your site to get credit for links


  1. 1 Comment(s)

  2. By IEmailer.com on Jun 11, 2007 | Reply

    Well, Thank you for the grate information you got here it really helps me getting indexed as my website was not indexed since a 12 days until i submitted my sitemap in xml file format and now am totally indexed 31 pages so thanks
