Generating Google Base bulk import files from your website using the GSiteCrawler

What is Google Base?

Google Base (http://base.google.com/base/) is a simple database that Google has made available to everyone for storing all sorts of information. There are several categories available, and you can even make your own. Items in Google Base can be used by other Google services, such as Google's search, Froogle, etc.

The idea behind Google Base is to let Google gather better-structured information, so that it can better serve users the things they're *really* looking for. It is very hard for a search engine to extract the relevant information from a web page, especially if it ignores the standard meta-tags :-). Google Base should help remedy that situation: users can submit strictly structured data, with an expiration date if they desire, and add "labels" (keywords for everyone else) to help end users browse and search for specific pages.

Why export your site to Google Base?

Your site, as crawled by the GSiteCrawler, has lots of information on it that Google and end-users might be interested in. The GSiteCrawler will crawl your pages and extract title, description, date and keywords information along with the URL itself. This data can be used as the basis for a Google Base bulk-submit file.

You can easily append other data to the generated file, add an image link per item, add more information to the description or to the keywords (ehm, "labels"), remove items, etc. The generated (and edited, if desired) file can then be submitted to Google Base as a bulk-import file.
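If you would rather script such edits than do them by hand, here is a minimal Python sketch that appends one extra column to every row of a tab-delimited file. It assumes the first line of the file is a header row. The function name add_column and the column names used in the example ("title", "link", "image_link") are only assumptions for illustration; check the template and Google's documentation for the actual attribute names.

```python
def add_column(tsv_text, column_name, value_for_row):
    """Return tsv_text with one extra column appended to every row.

    Assumes the first line is a header row and the cells are separated
    by tabs. value_for_row receives a dict mapping header names to the
    cell values of that row and returns the text for the new cell.
    """
    lines = [line for line in tsv_text.splitlines() if line.strip()]
    if not lines:
        return ""
    header = lines[0].split("\t")
    out = ["\t".join(header + [column_name])]
    for line in lines[1:]:
        cells = line.split("\t")
        # Pad short rows so the new cell lands in the new column.
        cells += [""] * (len(header) - len(cells))
        row = dict(zip(header, cells))
        out.append("\t".join(cells + [value_for_row(row)]))
    return "\n".join(out) + "\n"


# Example with hypothetical column names: derive an image URL from "link".
sample = "title\tlink\nHome\thttp://example.com/\n"
result = add_column(sample, "image_link", lambda row: row["link"] + "logo.png")
```

To use it on a real export, read the file into a string, call add_column, and write the result back out under a new name so the original export stays untouched.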

How to generate Google Base bulk-import files with the GSiteCrawler

To generate Google Base files, you will need to first download the export template for the GSiteCrawler:

Google-Base-Bulk.tsv - just right-click and choose "Save as..." (depending on the browser), and save it in the "templates" subdirectory within your GSiteCrawler folder.

To export your project for Google Base, start your GSiteCrawler and choose the project you want to use. Now click the small arrow next to "Generate" (at the top) and pick "URL List for export".

[Screenshot: Generate Google Base bulk import file with GSiteCrawler for Windows]

In the window that follows, choose "Google-Base-Bulk.tsv" (the file you downloaded and copied into the templates folder).

[Screenshot: Export Google Base bulk import file with GSiteCrawler for Windows]

Next you will be able to specify the file name for your Google Base bulk import file. Voilà - your Google Base file based on your website is done!

Currently Google will only accept one Google Base file per user, with each file limited to 1000 entries. The generated file is in tab-delimited format and can generally be opened and edited with Microsoft Excel. You can find more information about the bulk import function on Google's page: http://www.google.com/base/howtobulkupload.html.
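If your site has more pages than that, you will have to cut the file down before submitting it. As a rough sketch (not an official tool), the following Python function keeps the header line plus the first 1000 entries and tells you how many were dropped. It assumes the first line of the file is a header row that does not count as an entry; the function name limit_entries is made up for this example.

```python
def limit_entries(tsv_text, max_entries=1000):
    """Keep the header line and at most max_entries data rows.

    Assumes the first non-empty line is a header row. Returns a tuple
    of (trimmed text, number of rows that were dropped).
    """
    lines = [line for line in tsv_text.splitlines() if line.strip()]
    if not lines:
        return "", 0
    header, rows = lines[0], lines[1:]
    kept = rows[:max_entries]
    dropped = len(rows) - len(kept)
    return "\n".join([header] + kept) + "\n", dropped


# Example with a tiny limit so the effect is easy to see.
sample = "title\tlink\nA\t/a\nB\t/b\nC\t/c\n"
trimmed, dropped = limit_entries(sample, max_entries=2)
```

Which entries to keep is up to you, of course; you may prefer to delete the less important rows in Excel first rather than simply cutting off the end of the file.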

Let me know how it works so that I can update the information here!



© SOFTplus Entwicklungen GmbH