Solving 404 Not Found Errors with Google Webmaster Tools

This site, in one iteration or another, has been around for nearly 5 years now.  Even with my somewhat lacking posting habits over the last year or so, there are over 740 posts.  As you can imagine, any small change to the structure or linking of the site has the potential to break a great deal of it.  And that can lead to some issues.  Some of them are little things that are not necessarily worth your time, but some are quite large.  One such issue is broken links.  When your links break, anyone following them gets a 404 page instead of what they are looking for.  That can be a pain, and it can cause a bounce rate that is quite high.  Not good.

Several months ago, this site fell victim to an obnoxious hack.  It wasn’t anything serious, but it did mangle the links and permalinks.  After fixing and cleaning, I had to reset the permalinks for the site.  I did so, but also decided to change the permalink structure a bit.  The old structure had /archives/ in the URL, and the new structure drops it.  I didn’t like it there anymore.  All new posts have the new structure, so they never had an issue.  However, any old posts would now load with the new structure, and any links with the old structure were getting 404 messages.

Using Google’s Webmaster Tools, I was able to pull a list of the URLs that were linked to my site and that were actively being clicked through.  From the dashboard, you can see the “Not Found” entry and the number of URLs that Google was having problems indexing.

As you can see, there were 66 URLs that were not found.  Clicking on the “Not Found” link gives us a list of the troublesome URLs.

You can see the URLs on the left with the /archives/ part of them circled in orange.  Obviously a problem.  On the right, you can see how many links in Google’s index point to each URL, as well as the date Google last tried to crawl it.  Circled in red, you can see that several of the URLs were fairly popular.  My hope is that fixing the issue will help with organic traffic from Google.

How did I fix them?  Simple.  A 301 redirect.  I added the following line into my .htaccess file that resides in the root of my webspace.

RedirectMatch 301 ^/archives/(.+)$ /$1

One line of code to rule them all.  Now, when someone requests one of the URLs with the /archives/ in it, they get redirected to the new-structure URL and no longer get a 404 page.  It will take a few weeks (or months) for Google to get around to recrawling those URLs, but the errors should eventually go away once they have been successfully crawled.
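To sanity check what that rule is meant to do, here is a minimal Python sketch that applies the same pattern, ^/archives/(.+)$, and rebuilds the path without the /archives/ prefix.  It only imitates the matching; it is not how Apache itself runs the rule, and the function name redirect_target and the example paths are made up for illustration.

```python
import re

# Same pattern as in the .htaccess line: capture everything after /archives/.
OLD_PERMALINK = re.compile(r"^/archives/(.+)$")


def redirect_target(path):
    """Return the path a request would be sent to, or None if the rule does not apply."""
    match = OLD_PERMALINK.match(path)
    if match is None:
        return None
    # Rebuild the URL path without the /archives/ prefix.
    return "/" + match.group(1)


if __name__ == "__main__":
    # Hypothetical paths, not taken from the real site.
    for path in ["/archives/some-old-post/", "/some-new-post/", "/archives/"]:
        print(path, "->", redirect_target(path))
```

Note that a bare /archives/ does not match, because (.+) needs at least one character after the prefix, so that URL would be left alone by the rule.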

I have a very love-hate relationship with The Great Googleymoogly, but Webmaster Tools is one of the things that I find to be incredibly useful.  If you aren’t using it, I suggest going and giving it a try.  You never know, you might find a few things to fix and help yourself out a little.

Tool to See Competitors’ AdWords Keywords

I got pinged this evening about a new free tool that has been floating around. It’s a plugin for Firefox called PPCSpy. Basically, you install it in Firefox, and it puts a little icon down in the lower right corner of your page.

Turn the plugin on by clicking on the icon and do a Google search for your desired niche. You’ll see a bunch of green boxes pop up under the AdWords ads. Click those boxes and you get a sampling of the keywords that the advertiser is using.

With the free version, you only get to see the first 10 keywords, but there are levels of paid use that allow for 50 and 100 keywords. There are plenty of other features as well that make it well worth picking up for free if you do any PPC work. Or at least it looks that way at first glance.

I’ll be giving it a bit of a test drive over the next few weeks.

Need to Track URLs that AdSense Shows Up On

So, I have a conundrum.  My AdSense account is showing clicks that aren’t showing as views.  They don’t appear in my URL channels or any of my other defined channels.  On the one hand, I don’t mind a few extra clicks here and there, but it’s a fine line.  If, for instance, those clicks and views are from a URL that has nudity or violence on it, I would be in violation of the AdSense TOS.  If they aren’t, then I could just let them go forever and rack up the money.

I don’t know whether they are from old sites that are cached, or revenue share sites that I submitted my ID to, or if someone has jacked my Publisher ID for some odd reason.  And I’d like to find out.  Which is where the problem surfaces.  I can’t find a good way to find the URLs.  I could turn on the “Allowed” URL feature in AdSense, but that feature has some limitations that I don’t really want to deal with.  The biggest is that it’s limited to only 100 allowed URLs.  I don’t have anywhere near 100 sites, but if you were to try and allow all the Google cache sites, you’d quickly surpass that limit and you’d be losing out on some very valid AdSense impressions and possibly some clicks too.

I’ve tried to search via Google with my publisher ID, but since the code is in JavaScript, I can’t find anything.  I found a site a few weeks ago that would somehow do it, but it was a paid site.  I’m cheap, so I passed it up.  Of course, I can’t remember what site it was either.  I’d still like to find a free way to do it.  If a paid site can do it, there has to be a way to do it, so what is it?

Anyone ever done anything like this?  I tried to ask Matt Cutts on Twitter (@mattcutts), but he didn’t respond.  Either he doesn’t know, he sicced the Google KGB on me for a shady question, he missed the question, or he’s ignoring me.  Any way you crack it, it doesn’t help me.

Repair Links to 404 Pages

Matt Cutts has a great post today on getting “Free links to your site”. It’s not so much about getting new “free” links to your site as about getting broken links repaired.

The basics of it are that Google’s Webmaster Tools has a report of the links it crawled that pointed to a URL it could not find on your website: a broken link that results in a 404. And unless you have a custom 404 page, any visitors that click on that link will get a nasty looking 404 page from your host and probably go on their merry way somewhere else.

So, instead, you can take a quick look at this report.  Jot down the URLs that are throwing errors and where the links are coming from.  Then you have two choices.  You can try to contact the webmaster of the referring site and have them change the link.  That’s probably the better way to do it, but also not very likely to work.  The second way, which will work 100% of the time, is to 301 redirect the not found URL to something worthwhile.  Perhaps a search results page for the keywords of the missing page?  Or maybe you moved the page and just need to redirect to the proper page.  Either way, you’ve turned a 404 link into a working link that could result in better ranking and higher traffic counts.
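If you end up with more than a handful of these, a small script can turn your jotted-down list into redirect lines to paste into .htaccess.  This is only a sketch: the function name redirect_lines and the example paths are hypothetical, it assumes the paths contain no spaces, and it uses Apache’s plain Redirect directive, which matches on the start of the path rather than the exact URL.

```python
def redirect_lines(moved):
    """Build one 'Redirect 301 old new' line per broken path.

    `moved` maps each not found path to the path it should redirect to.
    """
    lines = []
    # Sort so the output is stable and easy to compare between runs.
    for old_path, new_path in sorted(moved.items()):
        lines.append(f"Redirect 301 {old_path} {new_path}")
    return lines


if __name__ == "__main__":
    # Hypothetical entries, as you might copy them from the report.
    moved = {
        "/old-about-page/": "/about/",
        "/misspelled-post/": "/correctly-spelled-post/",
    }
    for line in redirect_lines(moved):
        print(line)
```

Checking the generated lines by eye before pasting them in is still a good idea, since one bad target just trades a 404 for a redirect to the wrong place.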

It’s also a really good time to double check that you don’t have the default 404 page.  The one here, for instance, is a pretty simple one that uses the base template of the site and then suggests a quick search of the site to help visitors find what they were looking for.  Not the best in the world, but better than the default.

P.S.  You can visit the link above for detailed instructions from Matt Cutts on how to find the report.  And I wonder if it occurred to him that he’s going to be picking up all kinds of links with the anchor text of “Free links to your site”?  Asking for spambots, I’d say…