With large CMS or ecommerce websites, webmasters often shoot themselves in the foot (a quote from Matt Cutts) and cause all kinds of problems, such as:
- Missing pages
- Misconfigured pages
- Accidentally blocking users from pages
There are several ways to handle the case where a user follows a link to a missing page.
The most common approach is to serve a 404 error page if the problem is temporary. If the problem is permanent, a webmaster may serve a 410 error, and if a page needs to be blocked from some users, the webmaster may serve a 403 error.
These are some of the most common errors:
- 400 – Bad Request
- 401 – Authorization Required
- 403 – Forbidden
- 404 – Not Found
- 410 – Gone
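As a quick cross-check of the codes listed above, here is a minimal sketch that prints the standard reason phrase for each one using Python's built-in `http.HTTPStatus`. The helper name `describe` and the `COMMON_ERRORS` list are made up for this illustration. Note that the standard phrase for 401 is "Unauthorized"; some servers word it differently on their default error pages.

```python
from http import HTTPStatus

# The five client error codes discussed in this article.
COMMON_ERRORS = [400, 401, 403, 404, 410]


def describe(code: int) -> str:
    """Return the numeric code followed by its standard reason phrase."""
    status = HTTPStatus(code)
    return f"{status.value} {status.phrase}"


for code in COMMON_ERRORS:
    print(describe(code))
```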
Apache servers use the .htaccess file to configure 4xx errors. Other servers handle custom 4xx error pages differently, sometimes on a page level.
This all seems pretty straightforward and simple until you consider the sheer volume of 404 errors a large site encounters on a daily basis. Most webmasters periodically consult a tool (sometimes Google Webmaster Tools) to find and fix broken page links. They add redirects onto pages or into the .htaccess file of the server. The problem is that these redirects can become bulky and affect load time.
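One possible way to keep a growing redirect list from turning into a long chain of rules is to hold it in a lookup table, so each request costs a single lookup no matter how many entries exist. The sketch below is a plain Python WSGI handler, not Apache configuration, and it is only meant to run for requests that matched no real page. The paths in `REDIRECTS` and `GONE`, and the names `resolve` and `missing_page_app`, are hypothetical.

```python
# Hypothetical data: old URL path -> replacement URL path.
REDIRECTS = {"/old-product": "/products/new-product"}

# Hypothetical data: paths that were deleted on purpose.
GONE = {"/discontinued-item"}


def resolve(path):
    """Return (status line, headers, body) for a path with no live page."""
    if path in REDIRECTS:
        # Permanent redirect to the replacement page.
        return "301 Moved Permanently", [("Location", REDIRECTS[path])], b""
    if path in GONE:
        # The page was removed deliberately and will not come back.
        return (
            "410 Gone",
            [("Content-Type", "text/plain")],
            b"This page has been permanently removed.",
        )
    # Anything else is an ordinary missing page.
    return "404 Not Found", [("Content-Type", "text/plain")], b"Page not found."


def missing_page_app(environ, start_response):
    """WSGI entry point: answer with a 301, 410, or 404 response."""
    status, headers, body = resolve(environ.get("PATH_INFO", "/"))
    start_response(status, headers)
    return [body]
```

For local experiments, `missing_page_app` can be served with `wsgiref.simple_server.make_server` from the standard library.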
So what is the best practice for webmasters to use when fixing 404 error pages?
Webmasters need a plan.
- Don’t panic when your site has 404 errors
- Review all the 404 links in Google and Bing Webmaster Tools
- Add 301 redirects for the URLs which have a good number of referrals and get traffic on a regular basis (consult analytics to determine whether URLs get traffic)
- Change the header code to 410 for pages that have no traffic or very few referrals. If the page is permanently gone, Google will drop it from its index 24 hours after crawling the page. You can keep this page a 404 as well, because Google handles these two variations with very minor differences. In both cases, Google will reconfirm (by recrawling) before it removes the page.
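The triage in the plan above can be sketched as a small decision function. This is only an illustration: the thresholds `MIN_MONTHLY_VISITS` and `MIN_REFERRING_LINKS` are invented placeholders for "regular traffic" and "a good number of referrals", and the `BrokenUrl` record stands in for whatever your analytics and webmaster tools export.

```python
from dataclasses import dataclass

# Hypothetical thresholds; tune them to your own site's traffic levels.
MIN_MONTHLY_VISITS = 10
MIN_REFERRING_LINKS = 3


@dataclass
class BrokenUrl:
    path: str
    monthly_visits: int   # taken from analytics
    referring_links: int  # taken from webmaster tools reports


def choose_status(url: BrokenUrl) -> int:
    """Pick 301 for URLs that still earn traffic and referrals, else 410."""
    has_traffic = url.monthly_visits >= MIN_MONTHLY_VISITS
    has_referrals = url.referring_links >= MIN_REFERRING_LINKS
    if has_traffic and has_referrals:
        return 301  # worth redirecting to a live page
    return 410      # tell crawlers the page is gone for good


report = [
    BrokenUrl("/popular-old-page", monthly_visits=250, referring_links=12),
    BrokenUrl("/forgotten-page", monthly_visits=0, referring_links=1),
]
for url in report:
    print(url.path, choose_status(url))
```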
So in my humble opinion, you will speed up Google’s indexing process by up to three weeks if you serve 410 status codes from the server or site pages instead of 404 codes every time a page is actually deleted (and there is no real reason to use a 301 redirect). If you use 404s (or do nothing), it can take Google more than a month to remove error pages from its index.
(ref: https://support.google.com/webmasters/answer/2409439?hl=en & https://www.youtube.com/watch?v=xp5Nf8ANfOw) Currently Google treats 410s (Gone) the same as 404s (Not found).