View Issue Details

ID: 0000064
Project: WackoWiki
Category: core
View Status: public
Last Update: 2009-08-19 09:38
Reporter: Tann San
Assigned To: Tann San
Priority: normal
Severity: feature
Reproducibility: N/A
Status: resolved
Resolution: fixed
Product Version: 4.2
Target Version: 4.3.rc
Fixed in Version: 4.3.rc
Summary: 0000064: Added XML Sitemap Support
Description: This is a simple modification: each time a page is updated, the wiki writes a fresh XML file to the xml folder. That file can then be submitted to Google for their sitemap program.

There are two additions to the classes/wacko.php file for this to work. The first is near the end of the SavePage() function where it currently says:

$this->WriteRecentChangesXML();

add this line after it that calls the new function:

$this->WriteSiteMapXML();

The second change is the function itself. I put it just after the WriteRecentChangesXML() function:

//------------------Start of code--------------------------

function WriteSiteMapXML()
{
    // Encoding as posted; note that the Sitemap protocol itself specifies UTF-8.
    $xml = "<?xml version=\"1.0\" encoding=\"windows-1251\"?>\n";
    $xml .= "<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">\n";

    // Number of URLs written so far.
    $count = 0;

    if ($pages = $this->LoadRecentlyChanged())
    {
        foreach ($pages as $i => $page)
        {
            // With hide_locked set, list only pages a guest can read;
            // otherwise every page is eligible.
            $access = true;
            if ($this->config["hide_locked"]) $access = $this->HasAccess("read", $page["tag"], "guest@wacko");

            if ($access && ($count < 100))
            {
                $count++;
                $xml .= "<url>\n";
                $xml .= "<loc>".$this->href("", $page["tag"])."</loc>\n";
                // Only the date part (YYYY-MM-DD) of the timestamp is used.
                $xml .= "<lastmod>".substr($page["time"], 0, 10)."</lastmod>\n";

                $daysSinceLastChanged = floor((time() - strtotime(substr($page["time"], 0, 10))) / 86400);

                if ($daysSinceLastChanged < 30)
                {
                    $xml .= "<changefreq>daily</changefreq>\n";
                }
                else if ($daysSinceLastChanged < 60)
                {
                    $xml .= "<changefreq>monthly</changefreq>\n";
                }
                else
                {
                    $xml .= "<changefreq>yearly</changefreq>\n";
                }

                // The only thing I'm not sure how to handle dynamically...
                $xml .= "<priority>0.8</priority>\n";
                $xml .= "</url>\n";
            }
        }
    }

    $xml .= "</urlset>\n";

    $filename = "xml/sitemap-wackowiki.xml";

    // Fail silently if the file cannot be opened for writing.
    $fp = @fopen($filename, "w");
    if ($fp)
    {
        fwrite($fp, $xml);
        fclose($fp);
    }
}

//------------------End of code--------------------------

It will output the last 100 pages to have been changed, although you can easily alter that to suit your needs. It works out the change frequency from the last time you edited a page: pages edited within the last 30 days get "daily", pages edited within the last 60 days get "monthly", and anything older gets "yearly". I couldn't work out a nice way to handle the "priority" option, so I just hardcoded that to 0.8.
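
The change-frequency rule described above can be pulled out into a small helper, which makes the thresholds easier to adjust and to check. This is only a sketch: the name sitemapChangeFreq() and its parameters are hypothetical and not part of WackoWiki. The 30- and 60-day cut-offs are the ones used in the posted code.

```php
<?php
// Hypothetical helper, not part of WackoWiki: maps a page's last-modified
// timestamp to a sitemap <changefreq> value using the hack's thresholds.
function sitemapChangeFreq($lastModified, $now = null)
{
    if ($now === null) {
        $now = time();
    }
    // Only the date part (YYYY-MM-DD) is used, as in the posted code.
    $changed = strtotime(substr($lastModified, 0, 10));
    $days = floor(($now - $changed) / 86400);

    if ($days < 30) {
        return "daily";
    }
    if ($days < 60) {
        return "monthly";
    }
    return "yearly";
}
```

Passing $now explicitly is optional; it is there so the mapping can be checked against fixed dates.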
Additional Information: http://www.sitemaps.org/protocol.php
https://www.google.com/webmasters/tools/docs/en/protocol.html
Tags: XML

Relationships

related to 0000191 (resolved, Tann San): XML Sitemap Support for the full tree of pages
related to 0000207 (resolved, Tann San): add the option to turn XML Sitemap on / off in the config

Activities

EoNy

2007-08-31 16:31

manager   ~0000061

Last edited: 2009-01-06 16:58

mb
13-05-2006 21:12

    Please also add your hack here: http://wackowiki.org/Dev/PatchesHacks/XMLSitemapSupport -> 3. Hacks

!/XMLSitemapSupport
 
Tann San II
13-05-2006 22:55

    Added to main site

Tann San

2007-09-01 07:45

manager   ~0000064

I'm still using this hack in its original form, i.e. what is posted here, although I recommend changing the path that the XML file is written to. Google recommends that the sitemap.xml file be stored in the site root directory.

I use a blog and a wiki on my site and they both output their own sitemap XML files. I then have a PHP file in my root which combines the two into one each time Google requests it.
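
A combining script of the kind described above might look roughly like the following. This is a guess at the approach, not the actual file from that site: the function name mergeSitemaps() and the paths in the usage comment are hypothetical, and it assumes both inputs are urlset documents in the Sitemap 0.9 namespace and that PHP's DOM extension is available.

```php
<?php
// Hypothetical sketch: merge the <url> entries of several sitemap documents
// into a single <urlset>. Inputs are XML strings; unreadable ones are skipped.
function mergeSitemaps(array $xmlStrings)
{
    $ns = "http://www.sitemaps.org/schemas/sitemap/0.9";

    $out = new DOMDocument("1.0", "UTF-8");
    $out->formatOutput = true;
    $urlset = $out->createElementNS($ns, "urlset");
    $out->appendChild($urlset);

    foreach ($xmlStrings as $xml) {
        // Skip missing files (false from file_get_contents) and empty input.
        if (!is_string($xml) || $xml === "") {
            continue;
        }
        $in = new DOMDocument();
        // Suppress parser warnings; a malformed input is simply ignored.
        if (!@$in->loadXML($xml)) {
            continue;
        }
        foreach ($in->getElementsByTagNameNS($ns, "url") as $url) {
            // importNode() with deep=true copies <loc>, <lastmod>, etc.
            $urlset->appendChild($out->importNode($url, true));
        }
    }

    return $out->saveXML();
}

// Example use in a root-level script (paths are placeholders):
// header("Content-Type: application/xml; charset=UTF-8");
// echo mergeSitemaps(array(
//     file_get_contents("blog/sitemap.xml"),
//     file_get_contents("wiki/sitemap.xml"),
// ));
```

Merging on each request keeps the two source files independent, at the cost of re-parsing them every time a crawler fetches the sitemap.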

Tann San

2007-09-19 23:03

manager   ~0000091

Outputs the Sitemap XML for Google to crawl. You should register it manually with Google so you can check your stats and check for sitemap page errors.

Added a blank sitemap.xml file to the site root so that it is already there when the site is first installed. If it can, the installer script will make the file writable for future use.
You have to give this file write permission manually if the installer can't do it.
Changed the number of pages from 100 to 50,000, which is the maximum number Google will process in one file.
Changed the path from /xml/sitemap.xml to /sitemap.xml, as Google recommends using the site root.

administrator

2008-05-06 01:55

administrator   ~0000324

Revision 354 - removed the slash

$filename = "/sitemap.xml";
to
$filename = "sitemap.xml";

administrator

2008-05-06 03:03

administrator   ~0000325

Revision 355 - changed xmlns value to Sitemap Protocol 0.9

Revision 356 - renamed function WriteGoogleSiteMapXML() to WriteSiteMapXML() since it's no longer Google-specific :)

administrator

2008-07-14 16:04

administrator   ~0000411

add the sitemap location in robots.txt -> http://www.sitemaps.org/protocol.php#submit_robots

Sitemap: http://www.example.com/sitemap.xml

idea:
we can write / set the value with the installer
additionally, we can then also put the sitemap in the /xml folder
-> http://www.example.com/xml/sitemap.xml
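
The installer idea above could be sketched along these lines. The installer's own API is not shown in this issue, so the sketch works on the robots.txt content as a plain string, and the function name addSitemapToRobots() is hypothetical.

```php
<?php
// Hypothetical helper: return robots.txt content with a "Sitemap:" line for
// $sitemapUrl, adding it only if that exact URL is not already listed.
function addSitemapToRobots($robotsTxt, $sitemapUrl)
{
    $lines = preg_split('/\r\n|\r|\n/', $robotsTxt);
    foreach ($lines as $line) {
        // The directive name is matched case-insensitively.
        if (preg_match('/^\s*sitemap\s*:\s*(\S+)/i', $line, $m) && $m[1] === $sitemapUrl) {
            return $robotsTxt;
        }
    }
    // Make sure the new directive starts on its own line.
    $separator = ($robotsTxt === "" || substr($robotsTxt, -1) === "\n") ? "" : "\n";
    return $robotsTxt . $separator . "Sitemap: " . $sitemapUrl . "\n";
}
```

Because an existing entry is left alone, the helper can be run on every install or upgrade, and calling it with a second URL simply adds another Sitemap line.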

Tann San

2008-07-16 15:26

manager   ~0000412

Google's guidelines state that the sitemap.xml file should be in the site root. We can add it to the robots file; that's an easy change.

administrator

2008-07-16 15:46

administrator   ~0000413

But since they now offer this sitemap option in robots.txt, that guideline is obsolete, especially as you can now add multiple sitemaps (and locations).

Tann San

2008-08-02 15:13

manager   ~0000455

For a search engine to find the sitemap.xml file in the xml directory, it first has to download the robots file to discover the path. That is then the only way to find it.

If the sitemap is in the root, a search engine can either try to access it there directly or discover it via the robots.txt file.

I get the impression you are not going to leave this one so I'll do as you ask. It's just a bit shit in my opinion.

I will also put a redirect in the rewrite code inside the htaccess file as well although this will only benefit people who have url rewriting enabled.

Issue History

Date Modified | Username | Field | Change
2007-08-31 16:30 | EoNy | New Issue |
2007-08-31 16:30 | EoNy | Legacy | => NEW
2007-08-31 16:31 | EoNy | Note Added: 0000061 |
2007-08-31 16:31 | EoNy | Severity | minor => feature
2007-08-31 20:51 | administrator | Assigned To | => Tann San
2007-08-31 20:51 | administrator | Status | new => assigned
2007-08-31 21:02 | administrator | Legacy | NEW => NPJ
2007-08-31 22:57 | administrator | Note Edited: 0000061 |
2007-08-31 23:08 | administrator | Category | "Web2.0"-Services => Core
2007-09-01 07:45 | Tann San | Note Added: 0000064 |
2007-09-19 23:03 | Tann San | Status | assigned => resolved
2007-09-19 23:03 | Tann San | Fixed in Version | => 5.0.0
2007-09-19 23:03 | Tann San | Resolution | open => fixed
2007-09-19 23:03 | Tann San | Note Added: 0000091 |
2007-09-20 12:49 | administrator | Target Version | => 5.0.0
2007-09-20 13:14 | administrator | Reporter | EoNy => Tann San
2008-04-10 19:10 | administrator | Note Edited: 0000061 |
2008-05-06 01:43 | administrator | Note Edited: 0000061 |
2008-05-06 01:52 | administrator | Additional Information Updated |
2008-05-06 01:55 | administrator | Note Added: 0000324 |
2008-05-06 02:45 | administrator | Summary | Added Google Sitemap Support => Added XML Sitemap Support
2008-05-06 02:45 | administrator | Description Updated |
2008-05-06 02:49 | administrator | Note Edited: 0000061 |
2008-05-06 03:03 | administrator | Note Added: 0000325 |
2008-07-14 15:43 | administrator | Note Edited: 0000061 |
2008-07-14 16:04 | administrator | Note Added: 0000411 |
2008-07-14 16:07 | administrator | Description Updated |
2008-07-14 16:54 | administrator | Tag Attached: XML |
2008-07-16 15:26 | Tann San | Note Added: 0000412 |
2008-07-16 15:46 | administrator | Note Added: 0000413 |
2008-07-19 13:57 | administrator | Issue cloned: 0000191 |
2008-07-19 13:57 | administrator | Relationship added | related to 0000191
2008-08-02 13:53 | administrator | Relationship added | related to 0000207
2008-08-02 15:13 | Tann San | Note Added: 0000455 |
2008-08-23 14:01 | administrator | Note Edited: 0000061 |
2009-01-06 16:58 | administrator | Note Edited: 0000061 |
2009-08-19 09:25 | administrator | Fixed in Version | 5.0.0 => 4.3.rc
2009-08-19 09:38 | administrator | Target Version | 5.0.0 => 4.3.rc
2010-03-08 10:12 | administrator | Category | Core => core