Thursday, September 1, 2005

Clear Google Web Cache - Delete 404 pages

Google lets you remove pages from its web search cache. Google refreshes its entire index automatically on a regular basis. When Googlebot encounters a 404 “Not Found” error for a page, it stops crawling that page and drops the outdated link from its index on the next crawl.

Even after you delete a page from your website, it remains in Google’s memory (read: cache), and Google users can still find the page and read the contents stored in the cache. Google removes the “dead” page from the cache only the next time it crawls your site.

If you do not wish to remain at the mercy of Googlebot and want to remove the deleted page from Google’s cache urgently, use the automatic URL removal system from Google itself.

Google will accept your removal request only if the page returns a true 404 error in the HTTP headers. Make sure the server sends a true 404 status even if you choose to display a more user-friendly HTML body for your visitors. It won’t help to return a page that says “File Not Found” if the HTTP headers still carry a status code of 200 (OK).
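
To make the difference between the status code and the page body concrete, here is a minimal sketch in Python using only the standard library. It is not Google's checker or any particular web server's configuration; the handler name `GoneHandler`, the helper `fetch_status`, the path `/deleted-page.html` and the friendly message are all made up for illustration. The handler sends a friendly HTML body together with a real 404 status line, and the helper reads back the status code that a crawler would see.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical friendly page shown to visitors of a deleted URL.
FRIENDLY_BODY = b"<html><body><h1>Sorry, that page is gone.</h1></body></html>"


class GoneHandler(BaseHTTPRequestHandler):
    """Illustrative handler that treats every path as a deleted page."""

    def do_GET(self):
        # The status line carries the 404; the body can say anything.
        self.send_response(404)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(FRIENDLY_BODY)))
        self.end_headers()
        self.wfile.write(FRIENDLY_BODY)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_status(host, port, path):
    """Return the HTTP status code the server sends for the given path."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        resp.read()  # drain the body; only the status matters here
        return resp.status
    finally:
        conn.close()


def demo():
    """Start a throwaway local server and check one deleted URL."""
    server = HTTPServer(("127.0.0.1", 0), GoneHandler)  # port 0: pick a free port
    port = server.server_address[1]
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        return fetch_status("127.0.0.1", port, "/deleted-page.html")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

Pointing a helper like `fetch_status` at your own host and the deleted path is a quick way to confirm the server really answers 404 before you submit a removal request. If it reports 200, the “File Not Found” page is what is sometimes called a soft 404, and the request is likely to be refused.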

Read more on removing content from Google’s index. There are techniques for removing an entire website, part of a website, or just the snippets, the short text descriptions that appear with the page title in Google search results. Webmasters can also prevent Google Images from indexing their copyrighted pictures, or remove only files of a specific type (for example, keeping .jpg images indexed while excluding .gif images).
