Solving 404 Not Found Errors with Google Webmaster Tools

This site, in one iteration or another, has been around for nearly 5 years now.  Even with my somewhat lacking posting habits over the last year or so, there are over 740 posts.  As you can imagine, any small change to the structure or linking of the site has the potential to break a great deal of it.  Some of the resulting issues are little things that are not necessarily worth your time, but some are quite large.  One such issue is broken links.  When your links break, anyone following them gets a 404 page instead of what they are looking for.  That can be a pain, and it can push your bounce rate quite high.  Not good.

Several months ago, this site fell victim to an obnoxious hack.  It wasn’t anything serious, but it did mangle the links and permalinks.  After fixing and cleaning, I had to reset the permalinks for the site.  I did so, but also decided to change the permalink structure a bit.  The old structure had /archives/ in the URL, and the new structure drops it.  I didn’t like it there anymore.  All new posts have the new structure, so they never had an issue.  However, old posts now load only at the new structure, so any links using the old structure were getting 404 messages.

Using Google’s Webmaster Tools, I was able to pull a list of the URLs that were linked to my site and that were actively being clicked through.  From the dashboard, you can see the “Not Found” entry and the number of URLs that Google was having problems indexing.

As you can see, there were 66 URLs that were not found.  Clicking on the “Not Found” link gives us a list of the troublesome URLs.

You can see the URLs on the left with the /archives/ part of them circled in orange.  Obviously a problem.  On the right, you can see how many links in Google’s index point to each URL, as well as the date Google last tried to crawl it.  Circled in red, you can see that several of the URLs were fairly popular.  My hope is that fixing the issue will help with organic traffic from Google.
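If the list is long, it can help to pull out just the /archives/ URLs and put the most-linked ones first.  Here is a rough Python sketch of that idea.  The rows are made-up placeholders, not my real data, and the (path, link count) layout is an assumption rather than the format Webmaster Tools actually exports.

```python
# Hypothetical rows: (URL path, number of links pointing at it).
# These values are placeholders for illustration only.
not_found = [
    ("/archives/2008/01/example-post-a/", 12),
    ("/some-old-page/", 1),
    ("/archives/2009/03/example-post-b/", 40),
]


def archives_by_popularity(rows):
    """Keep only the /archives/ URLs, sorted with the most-linked first."""
    hits = [row for row in rows if row[0].startswith("/archives/")]
    return sorted(hits, key=lambda row: row[1], reverse=True)


for path, links in archives_by_popularity(not_found):
    print(links, path)
```

Anything left over that does not start with /archives/ is a different problem and needs its own fix.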

How did I fix them?  Simple.  A 301 redirect.  I added the following line into my .htaccess file that resides in the root of my webspace.

RedirectMatch 301 ^/archives/(.+)$ /$1

One line of code to rule them all.  Now, when someone requests one of the URLs with /archives/ in it, they are redirected to the same post at the new URL structure and no longer get a 404 page.  It will take a few weeks (or months) for Google to get around to recrawling those URLs, but the errors should eventually go away once they have been successfully crawled.
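If the regular expression looks cryptic, here is a small Python sketch of what the rule does to a requested path, assuming the redirect sends /archives/<rest> to /<rest>.  The function name and the sample paths are made up for illustration, and Apache's own matching may differ in small details.

```python
import re

# Same pattern as the RedirectMatch rule: capture everything after /archives/.
OLD_PERMALINK = re.compile(r"^/archives/(.+)$")


def redirect_target(path):
    """Return the new path for an old /archives/ URL, or None if the rule does not apply."""
    match = OLD_PERMALINK.match(path)
    if match is None:
        return None
    # The captured group plays the role of $1 in the .htaccess line.
    return "/" + match.group(1)


print(redirect_target("/archives/2009/05/sample-post/"))  # hypothetical old URL
print(redirect_target("/about/"))                         # not an /archives/ URL
```

The first call prints /2009/05/sample-post/ and the second prints None, which is the behavior you want: only the old-style URLs get moved and everything else is left alone.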

I have a very love-hate relationship with The Great Googleymoogly, but Webmaster Tools is one of the things that I find to be incredibly useful.  If you aren’t using it, I suggest giving it a try.  You never know, you might find a few things to fix and help yourself out a little.