Automatically updating Google Sitemap files with a scheduler
You can automate the GSiteCrawler so that it crawls your site and uploads the Google Sitemap file via FTP on a schedule.
With the simple automation, it crawls the website using your URLs and rules, creates a Google Sitemap file (or several, depending on the size of the site and the settings), uploads it via FTP (if you have that set up), pings Google and optionally executes a command of your choice.
Here is a short list of the required steps:
- Make a copy of the GSiteCrawler shortcut, append " /auto" to the end of the command to execute, and name it "GSiteCrawler automation" (or similar) - this is the shortcut that will start the GSiteCrawler in automation mode.
- Select the projects you want to run automatically by enabling the simple automation in each project's settings.
- Test before you schedule: Start the shortcut created in the first step. It should go through the projects list, crawl the projects, create the sitemap files and upload them via FTP. At the end, it should execute the command you have entered (if any).
- When the test works, add the shortcut to your Windows scheduler (use the built-in one or any of the others available; I use the splinterware.com "Windows Scheduler Professional"). Make sure the task runs under the proper user account with the correct password. With missing or incorrect user rights, it might not run at all, might not work, or might just look like it's working while failing to upload the file...
- Test, test, test :-) (before you trust it)
- When it runs, it creates quite verbose logs of everything it does: a combined log for all projects in an automation run in the main program directory (e.g. "Automate_2005-09-21_01-19-20.LOG"), plus a copy of the project-specific log, "autolog.txt", in each project directory. Especially when run from the scheduler, it is good to check these to see if everything went according to plan.
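The shortcut and scheduler steps above can be sketched from the command line. This is only an example: the install path, task name, start time and user account are assumptions you need to adjust to your own setup (it uses the built-in Windows "schtasks" scheduler rather than a third-party one):

```bat
rem The automation shortcut from the first step is equivalent to this
rem command line (adjust the path to where GSiteCrawler is installed):
"C:\Program Files\GSiteCrawler\GSiteCrawler.exe" /auto

rem Schedule it with the built-in Windows scheduler, daily at 03:00.
rem /ru and /rp set the user account and password - see the note above
rem about tasks silently failing with missing or incorrect user rights.
schtasks /create /tn "GSiteCrawler automation" ^
  /tr "\"C:\Program Files\GSiteCrawler\GSiteCrawler.exe\" /auto" ^
  /sc daily /st 03:00 /ru MYDOMAIN\myuser /rp mypassword
```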
The command to execute afterwards can be used to finish up or clean up after the GSiteCrawler's work. I personally have a small batch file for each domain: it sends me a copy of the logs via email (I use "blat"), and when I run it locally on a test server, it copies the sitemap files into the proper server directory (mapped to the website).
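A cleanup batch file along those lines might look like this. The paths, email addresses and SMTP server are all placeholders, and "blat" must be installed separately - treat this as a sketch, not the exact script:

```bat
rem Hypothetical cleanup script (run_after.cmd) for one domain.
rem Adjust paths, addresses and the SMTP server to your environment.

rem Mail the project log to yourself using blat.
blat "C:\GSiteCrawler\projects\example.com\autolog.txt" ^
  -to me@example.com -s "Sitemap run for example.com" ^
  -server smtp.example.com -f gsitecrawler@example.com

rem On a local test server: copy the generated sitemap files into the
rem directory that is mapped to the website.
copy /Y "C:\GSiteCrawler\projects\example.com\sitemap*.xml" "W:\wwwroot\"
```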