How can I exclude image files from my Google sitemap file?
In general, you can submit the sitemap file with those files included (Google has said that they want all URLs). However, you can also either filter them out afterwards or not crawl them in the first place:
- Removing unwanted files after crawling:
Go to the "URL list" tab, click in the "search for.." text box and enter ".jpg" (or whichever file extension you wish to remove). Make sure search: "URL" is selected, then click "select" to select all matching URLs. Now click "delete" and they're gone. Repeat the process for ".gif" and any other URLs you wish to remove.
- Make sure you're not crawling unwanted files (otherwise they will come back the next time you crawl):
Go to the "Settings" tab, where you should have an entry for "File extensions to check (not to follow)". It shows all document types that are listed and stored in the sitemap file. Remove the image types from that text box, and they will no longer be listed when you re-crawl. You can also change this setting before you start your first crawl, or when you first set up the project with the wizard, to avoid listing the image files in the first place.
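If you already have an exported sitemap file and would rather clean it up outside the program, the same filtering idea can be sketched in a few lines of Python. This is only an illustration and not part of the crawler: the function names and the extension list are made up for the example, and it assumes a standard XML sitemap with <url> entries directly under <urlset>. Unlike the "search for.." box, which matches the text anywhere in the URL, this sketch only checks how the URL's path ends.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Assumed list of image extensions; adjust to whatever you want to drop.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".gif", ".png")


def is_image_url(url, extensions=IMAGE_EXTENSIONS):
    """Return True if the URL's path ends with one of the extensions."""
    path = urlparse(url).path.lower()
    return path.endswith(tuple(extensions))


def strip_image_urls(sitemap_xml, extensions=IMAGE_EXTENSIONS):
    """Return the sitemap XML with <url> entries for image files removed."""
    root = ET.fromstring(sitemap_xml)

    # Take the namespace from the root tag so that both older and newer
    # sitemap schemas are handled the same way.
    ns = root.tag[1:root.tag.index("}")] if root.tag.startswith("{") else ""
    prefix = "{" + ns + "}" if ns else ""
    if ns:
        # Keep the output free of generated "ns0:" prefixes.
        ET.register_namespace("", ns)

    # Copy the list first, because entries are removed while looping.
    for url_el in list(root.findall(prefix + "url")):
        loc = (url_el.findtext(prefix + "loc") or "").strip()
        if is_image_url(loc, extensions):
            root.remove(url_el)

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    example = (
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        "<url><loc>http://example.com/</loc></url>"
        "<url><loc>http://example.com/logo.gif</loc></url>"
        "</urlset>"
    )
    print(strip_image_urls(example))
```

Keep in mind that a file cleaned this way is overwritten the next time the sitemap is generated, so changing the crawl settings as described above is the more lasting fix.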