Google Sitemaps for Large Scale Sites
October 17th, 2006 by Michael Gray in Ideas
If you’ve got any first-hand experience working with and submitting sitemaps for sites with over 50K pages, drop me an email. I’d like to talk with you.
October 17th, 2006 at 12:20 pm
I don’t, but Mike, I’ve gotta ask you… what’s better, using the Sitemap plugin for WordPress or using GSiteCrawler and manually submitting the sitemaps?
October 17th, 2006 at 1:08 pm
It’s a non-WordPress site that grows somewhat frequently. I’m really leaning toward creating a small bunch of mini-sitemaps just so I can keep it manageable.
October 17th, 2006 at 8:45 pm
I would be very interested in how people do this. I have a few big non-wordpress sites for which I would like to create a sitemap.
October 18th, 2006 at 9:25 am
I was going to email you off line but..
I crawl the site with Xenu. Sometimes I will crawl the actual site and other times I will crawl it on my local server.
I then export to a tab file. You can use Excel along with some clever find/replaces and cell split/merges to come up with a sitemap. Remember, you’re only interested in rows where Type=text/html.
I also have an Excel spreadsheet with a Macro to automate the entire process. I found it at DP.
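For anyone who would rather script the export step than do it in Excel, here is a minimal sketch. It assumes the tab file has a header row with columns named Address and Type; those names are an assumption about the export, so adjust them to whatever your file actually contains. The function names are made up for illustration.

```python
import csv
import io
from xml.sax.saxutils import escape


def html_urls_from_tab_export(tab_text, url_col="Address", type_col="Type"):
    """Return the URLs whose content type is text/html.

    url_col and type_col are assumed header names; change them to match
    the header row of your own export.
    """
    reader = csv.DictReader(io.StringIO(tab_text), delimiter="\t")
    urls = []
    for row in reader:
        content_type = (row.get(type_col) or "").strip().lower()
        # startswith also accepts values like "text/html; charset=utf-8"
        if content_type.startswith("text/html"):
            urls.append(row[url_col].strip())
    return urls


def build_sitemap(urls):
    """Build a bare-bones sitemap document containing only <loc> entries."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    for url in urls:
        # escape() turns & into &amp; so the XML stays valid
        lines.append("  <url><loc>%s</loc></url>" % escape(url))
    lines.append("</urlset>")
    return "\n".join(lines)
```

Read the exported file into a string, pass it to html_urls_from_tab_export, and write the result of build_sitemap to sitemap.xml.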
October 25th, 2006 at 3:56 pm
You just need to create a sitemap index that then ‘links’ to all your other sitemaps. The other sitemaps can then be broken down by id (e.g. sitemap_1.xml is ids 1 to 25000, sitemap_2.xml is 25001 to 50000, etc.) or by category (blue.xml, red.xml, etc.).
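A rough sketch of that split-and-index approach, in case it helps. The 25,000-per-file figure and the sitemap_N.xml naming come from the comment above; the function names and the base URL are placeholders. The sitemap protocol caps each file at 50,000 URLs, so any chunk size at or below that should work.

```python
from xml.sax.saxutils import escape


def split_into_sitemaps(urls, per_file=25000):
    """Split a URL list into consecutive chunks, one chunk per sitemap file."""
    return [urls[i:i + per_file] for i in range(0, len(urls), per_file)]


def build_sitemap_index(base_url, n_files):
    """Build a sitemap index pointing at sitemap_1.xml .. sitemap_<n_files>.xml.

    base_url is wherever the sitemap files are served from (a placeholder here).
    """
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    for n in range(1, n_files + 1):
        loc = "%s/sitemap_%d.xml" % (base_url.rstrip("/"), n)
        lines.append("  <sitemap><loc>%s</loc></sitemap>" % escape(loc))
    lines.append("</sitemapindex>")
    return "\n".join(lines)
```

With 60,000 URLs this gives three chunks (25,000, 25,000 and 10,000), and the index lists three files. You then submit only the index, and each chunk is written out as its own sitemap file.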
February 15th, 2007 at 4:41 am
I use the package from http://www.xml-sitemaps.com/ and it works fine. I produce sitemaps of 40Mb+ each week.
It would be larger, but all the permutations of pages on my site run into the millions, so I asked the software to stop at 4 links deep, i.e. http://www.site.com/1/2/3/4/
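If you already have a full URL list and want a similar cap, one approximation is to filter by the number of path segments. This is only a sketch: it treats "depth" as path depth, which is an assumption, since a crawler counts how many links it followed, and that only matches the path depth on sites laid out like the example URL above. The function names are illustrative.

```python
from urllib.parse import urlparse


def path_depth(url):
    """Count the non-empty path segments, so /1/2/3/4/ has depth 4."""
    return len([seg for seg in urlparse(url).path.split("/") if seg])


def within_depth(urls, max_depth=4):
    """Keep only the URLs whose path is at most max_depth segments deep."""
    return [u for u in urls if path_depth(u) <= max_depth]
```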
March 9th, 2007 at 11:11 am
How do you get a sitemap size of over 40mb? That’s crazy. Then you have only duplicates of the links.