
Google Sitemap Generator in PHP

I've updated the PHP class I built for NewsCloud.com that generates a Google Sitemap. I think my PHP sitemap generator is best suited to dynamic sites that you've coded yourself.

The new class supports Google's parent index map capability, so you can now generate an index map that links to a number of child sitemaps.

For large sites, this allows you to break up content in a more sensible manner. The source code below is set up to divide content by recency. This helps to minimize the number of times Google must load your larger site maps with all of your historical content to date.

For example, Google can check the hourly child map, which holds just the last hour of new content, frequently, while it only needs to check the very large monthly child map about every 30 days. You can organize your child maps differently if you prefer; the code is pretty flexible.
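To make the recency split concrete, here is a minimal sketch of how each child map name could be turned into a time window for selecting content. The function name `recencyCutoff` and the `$windows` table are hypothetical and are not part of the class below. It assumes your content rows carry a Unix timestamp you can compare against, and it treats "monthly" as 30 days.

```php
<?php
// Hypothetical helper, not part of the siteMap class below.
// Maps a child-map name to the oldest Unix timestamp that map should cover.
function recencyCutoff($mapName, $now)
{
    // window sizes in seconds; 'monthly' is approximated as 30 days
    $windows = array(
        'hourly'  => 3600,
        'daily'   => 86400,
        'weekly'  => 604800,
        'monthly' => 2592000,
    );
    if (!isset($windows[$mapName])) {
        return null; // unknown map name
    }
    return $now - $windows[$mapName];
}

// Example: content newer than this timestamp would belong in the daily map
$cutoff = recencyCutoff('daily', time());
echo gmdate('Y-m-d\TH:i:s\Z', $cutoff) . "\n";
```

The cutoff would typically feed the WHERE clause of whatever query selects the stories for that child map.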

The source below is pretty well documented. If you have any questions or suggestions, please post a comment below.

If this helps you, please use the tip jar on the left. Or try out NewsCloud.com and send invites to your friends.


<?php

class siteMap {
// class for building google site map file
// requires PEAR HTTP request library on your server
/* example usage
require_once($_SERVER['DOCUMENT_ROOT'].'/classes/siteMap.class.php');
// instantiate the class object
$siteMapObj=new siteMap();

// you must build the individual maps first

// I set up different maps according to the most frequent changing content of my site
// You can set up your maps any way that you want

// map to all the content from the last hour
$siteMapObj->buildMap('hourly');
// map to all the content from the last day
$siteMapObj->buildMap('daily');
// map to all the content from the last week, etc.
$siteMapObj->buildMap('weekly');
// you could have one called 'all' which builds link to every url on your site

// I divided mine this way so that google only checks a small list of urls hourly
// a slightly larger list daily, a larger list weekly and very large lists only occasionally

// call buildIndexMap after updating any individual child maps above
// just the time stamps from each individual map file are updated in the indexmap
// warning: if an individual map hasn't been built, the index map won't include a reference to it
$siteMapObj->buildIndexMap();
*/

function siteMap() {
// initialize
}

function buildIndexMap() {
// build the parent or Index map
// this will just list the child maps and their modification dates
$FILE = fopen ($_SERVER['DOCUMENT_ROOT'].'/sitemap.xml', "w");
$text='<?xml version="1.0" encoding="UTF-8"?>';
$text.='<sitemapindex xmlns="http://www.google.com/schemas/sitemap/0.84"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.google.com/schemas/sitemap/0.84
http://www.google.com/schemas/sitemap/0.84/siteindex.xsd">';
$text.=$this->addChildMap('http://www.yourdomain.com/map_hourly.xml',$_SERVER['DOCUMENT_ROOT'].'/map_hourly.xml');
$text.=$this->addChildMap('http://www.yourdomain.com/map_daily.xml',$_SERVER['DOCUMENT_ROOT'].'/map_daily.xml');
$text.=$this->addChildMap('http://www.yourdomain.com/map_weekly.xml',$_SERVER['DOCUMENT_ROOT'].'/map_weekly.xml');
$text.='</sitemapindex>';
fwrite ($FILE, $text);
fclose ($FILE);
// notify Google that we've updated the index map
// you don't have to notify them of the child map updates
$this->pingGoogle();
}

function addChildMap($loc,$filename) {
// build a child map xml entry for the Index map file
if (file_exists($filename)) {
$text='<sitemap>';
$text.='<loc>'.$loc.'</loc>';
// look up last modified time for the file and convert to W3C datetime
// gmdate() reports the time in UTC (Greenwich Mean Time), so the 'Z' suffix stays correct year-round and works under PHP 4
$text.='<lastmod>'.gmdate ("Y-m-d\TH:i:s\Z", filemtime($filename)).'</lastmod>';
$text.='</sitemap>';
return $text;
} else
return '';
}

function buildMap($mapName='core')
{
// build a child map based on $mapName
$text='<?xml version="1.0" encoding="UTF-8"?>';
$text.='<urlset
xmlns="http://www.google.com/schemas/sitemap/0.84"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.google.com/schemas/sitemap/0.84
http://www.google.com/schemas/sitemap/0.84/sitemap.xsd">';
// choose which child map to build
switch ($mapName) {
case 'hourly':
$filename='map_hourly.xml';
// customize the parts below to the content on your site
// e.g. static
$text.=$this->addURL('http://www.yourdomain.com/cover/','1.0','always');
$text.=$this->addURL('http://www.yourdomain.com/list/topstories/','0.9','always');
$text.=$this->addURL('http://www.yourdomain.com/list/topmedia/','0.9','always');
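// e.g. dynamic
// stories from the past hour
// assumes $this->db is your own database object with a query for the past hour already run; this class does not define it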
while ($data=$this->db->read()) {
$text.=$this->addURL('http://www.yourdomain.com/story/'.$data->contentid.'/','0.9','hourly');
}
break;
case 'daily':
// customize the parts below to the content on your site
// e.g. static
$filename='map_daily.xml';
$text.=$this->addURL('http://www.yourdomain.com/groups/Daily%20Show%20Fans/','0.8','hourly');
$text.=$this->addURL('http://www.yourdomain.com/register/','0.9','monthly');
$text.=$this->addURL('http://www.yourdomain.com/search/','0.9','monthly');
$text.=$this->addURL('http://www.yourdomain.com/learn/more/','0.3','weekly');
$text.=$this->addURL('http://www.yourdomain.com/learn/feedlist/','0.5','weekly');
$text.=$this->addURL('http://www.yourdomain.com/store/','0.5','weekly');
$text.=$this->addURL('http://www.yourdomain.com/faq/','0.3','weekly');
$text.=$this->addURL('http://www.yourdomain.com/learn/about/','0.3','weekly');
$text.=$this->addURL('http://www.yourdomain.com/submit/story/','0.5','monthly');
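// e.g. dynamic
// stories from the past day
// assumes $this->db is your own database object with a query for the past day already run; this class does not define it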
while ($data=$this->db->read()) {
$text.=$this->addURL('http://www.yourdomain.com/story/'.$data->contentid.'/','0.9','hourly');
}
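break;
default:
// no case is defined for this map name (e.g. 'weekly' or the default 'core')
// add your own case following the pattern above; return here because $filename would be undefined
return FALSE;
}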
$text.='</urlset>';
$FILE = fopen ($_SERVER['DOCUMENT_ROOT'].'/'.$filename, "w");
fwrite ($FILE, $text);
fclose ($FILE);
}

function addURL($url='http://www.yourdomain.com/',$priority='0.5',$freq='daily') {
// build a single url to add to the current child map file
$code='<url><loc>'.$url.'</loc><priority>'.$priority.'</priority><changefreq>'.$freq.'</changefreq></url>';
return $code;
}

function pingGoogle() {
// requires PEAR Libraries on your server
// pings google that we've updated our sitemap index file
require_once "HTTP/Request.php";
$callstr="http://www.google.com/webmasters/sitemaps/ping?sitemap=http%3A%2F%2Fwww.yourdomain.com%2Fsitemap.xml";
echo $callstr.'<br/>';
$req = new HTTP_Request($callstr);
$req->addHeader("User-Agent", "yourdomain");
$response = $req->sendRequest();
if (PEAR::isError($response)) {
echo 'Google SiteMap error';
return FALSE;
} else {
echo 'Google SiteMap Submitted<br/>';
// parseResponse() isn't defined in this class, so report success from the HTTP status code instead
return ($req->getResponseCode()==200);
}

}

}
?>
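One detail worth noting about addURL() above: it writes the URL into the map as-is, while the sitemap format expects characters such as & to be entity-escaped. The following is a minimal sketch of an escaped variant, assuming PHP 5.4 or later for the ENT_XML1 flag. The function name `buildUrlEntry` and the example search URL are hypothetical and not part of the class.

```php
<?php
// Hypothetical standalone variant of addURL() that entity-escapes the URL.
function buildUrlEntry($url, $priority = '0.5', $freq = 'daily')
{
    // htmlspecialchars() converts &, <, >, " and ' into XML-safe entities
    $loc = htmlspecialchars($url, ENT_QUOTES | ENT_XML1, 'UTF-8');
    return '<url><loc>' . $loc . '</loc>'
         . '<priority>' . $priority . '</priority>'
         . '<changefreq>' . $freq . '</changefreq></url>';
}

// A URL with a query string: the & must appear as &amp; in the map file
echo buildUrlEntry('http://www.yourdomain.com/search/?q=news&page=2', '0.5', 'weekly') . "\n";
```

If your URLs never contain these characters, the original addURL() produces the same output.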

Comments

Angela

This is a great source code. I belong to a community of php coders who will appreciate this.

HiLapelPins

It's cool and working
I use it in my website www.hilapelpins.com

Thanks

manish

Hi, I am unable to configure it with PHP. Please help.

DevCat

Nice Tutorial
But I don't understand how to use it so far.
Do I have to include the old sitemap class to use the update, or is this update a complete rewrite?

software gadgets

Uhm.. still a lil bit confusing here. Should this source code be copied and pasted, then saved as sitemap.xml or sitemap.php? Is that right? Tell me if I'm wrong.
