SES Chicago: Bulk Submit 2.0
By Matt McGee on Dec 5, 2006 in Education/Conferences, Google, Yahoo
My notes from the Tuesday, 10:15 am session titled “Bulk Submit 2.0.”
Danny Sullivan, Moderator
- used to be able to email Infoseek a list of 500 URLs and they’d be added to their index almost right away
- this went away and kinda got replaced by paid inclusion
- now it’s making a comeback with something like Google Sitemaps
- at PubCon, all 3 SEs now cooperating on sitemap protocol
Amanda Camp, Google
- my first project at Google was Google Sitemaps
- Add URL page is not the recommended way to submit to Google
- recommended way is Google Sitemaps
- Sitemaps helps us be smarter about the way we crawl — tell us when pages have been updated
Four formats
1) Text file of URLs
2) RSS/Atom feeds
3) Sitemap protocol
4) OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting)
* An HTML sitemap is not the same as a Google Sitemap
Simple Rules
- always contain the full URL
- remove unnecessary parameters, like session IDs
- place sitemap in highest directory of URL you’re submitting
- domain needs to match path you’re submitting (www vs. non-www, etc.)
- name sitemap whatever you want
- URLs must use UTF-8 encoding
- sitemaps max = 50,000 URLs or 10MB; index files max = 1,000 sitemaps
- use GZIP to compress sitemaps
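A rough Python sketch of how a few of these rules could be checked before building a sitemap. The session parameter names and the example.com URLs are made up for illustration, and this is my own reading of the rules, not a tool Google showed.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

MAX_URLS_PER_SITEMAP = 50_000  # limit quoted in the session
# Hypothetical session parameter names; adjust for your own site.
SESSION_PARAMS = {"sessionid", "sid", "phpsessid", "jsessionid"}


def clean_url(url):
    """Drop session-style query parameters and any fragment from a URL."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k.lower() not in SESSION_PARAMS]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def urls_for_sitemap(sitemap_url, urls):
    """Return cleaned, de-duplicated URLs that this sitemap may list."""
    site = urlsplit(sitemap_url)
    base_dir = site.path.rsplit("/", 1)[0] + "/"  # directory holding the sitemap
    seen, result = set(), []
    for url in urls:
        parts = urlsplit(url)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            continue  # not a full URL
        if (parts.scheme, parts.netloc.lower()) != (site.scheme, site.netloc.lower()):
            continue  # domain mismatch, e.g. www vs. non-www
        if not (parts.path or "/").startswith(base_dir):
            continue  # outside the directory the sitemap lives in
        cleaned = clean_url(url)
        if cleaned not in seen:
            seen.add(cleaned)
            result.append(cleaned)
    return result


def chunk(urls, size=MAX_URLS_PER_SITEMAP):
    """Split a URL list into pieces that each fit in one sitemap file."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]
```

Each chunk would become its own sitemap file, with an index file pointing at them once there is more than one.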
Text File format
- one URL per line
- max of 50k URLs
- text file should contain nothing but list of URLs
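For the text format, a minimal sketch of what producing the file might look like in Python. The function names are mine, and the gzip step simply applies the compression tip from the rules above using the standard library.

```python
import gzip

MAX_URLS = 50_000  # text-file limit quoted in the session


def build_text_sitemap(urls):
    """Return the bytes of a text sitemap: UTF-8, one URL per line, nothing else."""
    urls = [u.strip() for u in urls if u.strip()]
    if len(urls) > MAX_URLS:
        raise ValueError(f"{len(urls)} URLs exceeds the {MAX_URLS} limit")
    return ("\n".join(urls) + "\n").encode("utf-8")


def gzip_sitemap(data):
    """Compress sitemap bytes with gzip before uploading."""
    return gzip.compress(data)
```

Usage would be along the lines of `gzip_sitemap(build_text_sitemap(my_urls))`, with the result saved as something like `sitemap.txt.gz`.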
Syndication feed
- Google accepts RSS 2.0 or Atom 0.3 feeds that use the [...] field
(too fast)
XML format
(gives examples of various XML tags and what they mean)
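I didn't capture the slide examples, so here is a small Python sketch of the XML format instead. The tag names (`urlset`, `url`, `loc`, `lastmod`, `changefreq`, `priority`) and the namespace come from the public protocol at sitemaps.org, not from the slides; the function name and sample URLs are made up.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
OPTIONAL_TAGS = ("lastmod", "changefreq", "priority")


def build_xml_sitemap(entries):
    """Build sitemap XML bytes from dicts with 'loc' plus optional tags."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        # <loc> is the only required child: the full URL of the page.
        ET.SubElement(url, "loc").text = entry["loc"]
        for tag in OPTIONAL_TAGS:
            if tag in entry:
                ET.SubElement(url, tag).text = str(entry[tag])
    # ElementTree escapes characters such as & in the URLs for us.
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


sitemap_bytes = build_xml_sitemap([
    {"loc": "http://www.example.com/", "lastmod": "2006-12-05", "changefreq": "daily"},
    {"loc": "http://www.example.com/catalog?item=12&desc=vacation"},
])
```

Building the file with an XML library rather than string concatenation avoids the common mistake of leaving `&` unescaped in URLs.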
Sitemap Generators
- Google offers its own, and there are 3rd party generators, too (not endorsed by Google)
Submission
- use Google submission form
- Google tries to accept new sitemaps within half an hour (dashboard will say “okay,” or “errors” if problems are found)
- adding a Sitemap is optional; can still use Webmaster Central without having a sitemap
More info:
www.google.com/webmasters
www.sitemaps.org
Amit Kumar, Yahoo - manager, Site Explorer
Site Explorer: browse your pages and links; site authentication; bulk submission; new features coming soon
Site Explorer (SE) interface
(shows screenshots of submit URL page and dashboard)
- submit page doesn’t require login
- authentication process slightly different from Google: download file, upload to your site, click button in SE
- SE shows pages, inlinks, subdomains
Eric Papczun, Performics
* biggest challenge for large sites is getting a complete and accurate list of URLs
* use due-diligence to make sure duplicate pages aren’t listed
* after submission, sitemaps picked up within 1-2 days; entire sitemap crawled within 3-14 days (avg. about a week); small sites and new sites will take longer to get sitemap crawled — submit new content regularly
Sitemap Mgmt. Tips
- have optimized native sitemap (HTML version)
- focus the crawler by excluding redundant content (e.g., print-friendly pages), disembodied content (e.g., Flash objects), spammy stuff
- use “preferred domain” to tell Google if you want www.domain.com or just domain.com to appear in SERPs
- use separate sitemap (in Google) for news and mobile sitemaps
Impact / What to Expect
- sitemaps are tools, not solutions
- we’ve seen two effects: 1) number of pages indexed goes up (example: retailer with ugly URLs went from 61k to 133k URLs); 2) number of pages indexed goes down (eliminating dupe content URLs)
* select URLs for more frequent crawling with “priority” XML tag
* use it to spotlight frequently updated pages, new pages
* we’ve found Google is responsive to this tag
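One way to act on the priority advice, sketched in Python. The thresholds and values below are arbitrary assumptions of mine, not numbers Eric gave; the only fixed points are from the sitemaps.org protocol, where priority runs from 0.0 to 1.0 and defaults to 0.5.

```python
from datetime import date


def priority_for(last_modified, today, fresh_days=7, stale_days=90):
    """Pick a sitemap priority from how recently a page changed.

    fresh_days and stale_days are hypothetical cut-offs; tune them per site.
    """
    age = (today - last_modified).days
    if age <= fresh_days:
        return 1.0  # new or just-updated pages get the spotlight
    if age >= stale_days:
        return 0.3  # rarely changing pages rank below the default
    return 0.5  # the protocol's default priority


def priority_tag(value):
    """Format the value as the XML element used in a sitemap."""
    return f"<priority>{value:.1f}</priority>"
```

Priority is relative to other pages on the same site, so setting every page to 1.0 tells the crawler nothing.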
Handling Errors
- 404 errors will probably be most frequent error
- might be from typos in URL, server issues, etc.
* if you don’t have a robots.txt file, Google assumes you’re okay with full crawl of your site
* on large, dynamic sites we typically see about 5% of site crawled per visit
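The robots.txt point can be illustrated with Python's standard-library parser. This shows how that parser treats an empty rule set versus a Disallow rule; it is only an analogy for the behavior described, not Googlebot's actual code, and the URLs are made up.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_lines, url, agent="Googlebot"):
    """Check whether the given robots.txt lines let the agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(agent, url)


# No rules at all: everything is treated as crawlable.
no_rules = allowed([], "http://www.example.com/private/report.html")

# An explicit Disallow blocks matching paths for every crawler.
rules = ["User-agent: *", "Disallow: /private/"]
blocked = allowed(rules, "http://www.example.com/private/report.html")
open_page = allowed(rules, "http://www.example.com/products.html")
```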
Todd Friesen, Range Online Media
Bulk submit via feeds
Paid Inclusion Feeds
- Yahoo Search Submit
– publish, include, and refresh content without relying on spiders
– refresh natural results within 48 hours
– provide relevant and targeted copy
– quickly update listings to reflect sales/promotions
– detailed tracking
Comparison Shopping Engine Feeds
- comparison engines convert, esp. MSN Shopping — if you’re not using MSN Shopping, you’re missing out
- Google Base
– free, and it converts
– ranks by relevance over price
– limited user support
– lots of competition
- MSN Shopping
– reasonable CPCs
– best grouping algo in the Big 3
– lower volume than Base
– great conversion
- Yahoo Shopping
– highest volume, but most expensive
– sends the most traffic
– user reviews somewhat outdated
– poor product grouping
– high CPCs
(shares two case studies)
* paid inclusion can be used for immediate results or to pick up pages where control over on-site content is limited
* results in days, not months
* doesn’t tie up client IT resources
* makes A/B testing in natural SERPs possible
Q&A
Todd: always put fresh content on home page because home page is always the page that gets crawled the most; update and resubmit Sitemaps regularly
Eric: (question re: news story feeds) — also supplement sitemaps with PPC
Todd: (question re: aging delay and new sites) — yes, use sitemaps if you’re a new site; get trusted links like Yahoo Directory, BOTW, Business.com; buy an old site with trusted links and 301 the whole thing to your site to get credit for links
1 Comment(s)
By IEmailer.com on Jun 11, 2007
Well, thank you for the great information here. It really helped me get indexed: my website had not been indexed for 12 days until I submitted my sitemap in XML format, and now all 31 pages are indexed. So thanks.