Custom 404s
How many times have you been surfing the web and come across a plain old boring "default" 404 page? You know, the one that lists all the possible reasons for the error and gives you a link to the site's main domain. Wouldn't it be nice if you could let your visitors know that, even though the page they requested might not be there any more, the site is still around and can be found at such and such a place?
Well, it is possible, but how easy it is will depend on your server, its settings, and the amount of access you have to it (doesn't it always?). However, before configuring the server, you need to design and build your error pages.
Building a Custom Error Page
How you go about designing and building your custom error page will depend on the technologies you have available to you and the extent of your knowledge. If you just want a fancier version of the typical error message, one that matches and links to the other pages on your site, then you can simply use your page template and upload a static HTML page for each error. This is probably the easiest way.
It is also possible to have all your errors sent to a script that automatically emails you details of the error, writes them to a log, or even has a go at guessing where the user was trying to get to. Your lost visitor would surely appreciate that last feature!
If you are going to make your own custom error pages, then they should really say a little more than that there was such and such an error. Make sure that there are links to the most important areas of your site. If you have a site search engine, you may also wish to add a search box so visitors can search for whatever it was they were looking for. The more you make the error page (or script) "belong" to your site, the more likely people are to stay at your site and use it, rather than click back and go to the site that had the broken link.
However you design your page, there are some things you should take into consideration if you want to keep its use of server resources to a minimum.
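As a rough illustration of the "guessing" idea, here is a minimal sketch in Python. It assumes you keep a list of the paths that really exist on your site; the `KNOWN_PATHS` list, the function names and the log format are all made up for this example, and a real script would also need to be hooked up to your server.

```python
import difflib

# Hypothetical list of paths that really exist on the site.
KNOWN_PATHS = [
    "/index.html",
    "/about.html",
    "/downloads/index.html",
    "/gallery/photos.html",
]


def suggest_paths(requested_path, known_paths=KNOWN_PATHS, limit=3):
    """Return up to `limit` known paths that look similar to the request."""
    return difflib.get_close_matches(
        requested_path, known_paths, n=limit, cutoff=0.6
    )


def format_log_entry(requested_path, referrer):
    """Build one tab-separated log line; the referrer may be missing."""
    return f"404\t{requested_path}\t{referrer or '-'}"


if __name__ == "__main__":
    # A visitor mistypes /about.html
    print(suggest_paths("/abuot.html"))
    print(format_log_entry("/abuot.html", None))
```

The suggestions could be shown on the error page as "Did you mean...?" links, while the log line gives you something to check for broken links later.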
Spider Traps
One day in September 2001 I was stunned to get an email from my host telling me that they had taken my Petz site offline for using too much of the server's processing power. The thing was, I had not updated the site for a few months and so could not think of the reason why I'd suddenly use so much.
It turned out that a web bot was trying to spider links on my site that just didn't exist. Worse, when it got to the 404 error page, it spidered the links from that page too! This bot was trying to retrieve pages up to about 6 levels deep at the rate of about 10 a minute.
The error page was designed to log the URLs tried by anyone that could be a legitimate surfer, but since the bot didn't identify itself as an obvious bot, the page was logging each of its requests and checking every new request against the log. Before I could identify the problem, this log of bad referrers was fast approaching 500 KB; the longer the bot kept trying to retrieve pages, the longer each request took to deal with.
My advice to you is to make sure that you do two things whenever you design a new error page: use full URLs for all links back to your site (so a bot can't get trapped so easily), and hide the error pages and their links from web bots! Full URLs matter because a relative link on an error page is resolved against the address that was requested, so it can easily point to yet another page that doesn't exist.
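To see why full URLs help, the sketch below uses Python's standard `urllib.parse.urljoin` to resolve links the way a browser or bot would. The `www.example.com` addresses and the `resolve_link` helper are placeholders for illustration only.

```python
from urllib.parse import urljoin


def resolve_link(page_url, href):
    """Resolve `href` as a browser or bot would when reading `page_url`."""
    return urljoin(page_url, href)


if __name__ == "__main__":
    # The bot asked for a page that does not exist, several levels deep,
    # and was served the custom error page at that address.
    missing = "http://www.example.com/a/b/c/missing.html"

    # A relative link on the error page resolves under the missing path,
    # giving the bot another non-existent URL to try.
    print(resolve_link(missing, "links/index.html"))

    # A full URL always resolves to the same real page.
    print(resolve_link(missing, "http://www.example.com/links/index.html"))
```

The first call yields an address under `/a/b/c/`, which will itself return the error page, and so the trap repeats one level deeper each time.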
Hiding your Error Pages
More often than not, you will not want your error page(s) to show up in any search engine results. To hide these pages from spiders and crawlers, you should create a robots.txt file and upload it to your root directory. Below is an example of how you can hide a certain file or directory. Edit it to suit your own needs.
User-Agent: *
Disallow: /private_files/
Disallow: /404error.html
Disallow: /500error.asp
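If you want to check that rules like these do what you expect before uploading them, Python's standard `urllib.robotparser` module can read them. This is only a sketch: the `ExampleBot` name and the `is_allowed` helper are made up, and the result tells you how a well-behaved bot should act, not what every bot will actually do.

```python
from urllib.robotparser import RobotFileParser

# The same rules as the example robots.txt above.
ROBOTS_TXT = """\
User-Agent: *
Disallow: /private_files/
Disallow: /404error.html
Disallow: /500error.asp
"""


def is_allowed(path, user_agent="ExampleBot"):
    """Return True if the rules let `user_agent` fetch `path`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    for path in ("/index.html", "/404error.html", "/private_files/notes.html"):
        print(path, "allowed" if is_allowed(path) else "blocked")
```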
To stop any web bot that has stumbled onto your error page from getting any further, make sure that you also add a robots meta tag inside the <head> element of every error page.
<meta name="robots" content="noindex, nofollow">
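It is easy to forget the tag on one of several error pages, so a small check can help. The sketch below uses Python's standard `html.parser` to pull out the robots directives from a page; the class and function names are invented for this example.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect the directives from any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        if (attr_map.get("name") or "").lower() == "robots":
            content = attr_map.get("content") or ""
            for part in content.split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())


def robots_directives(html):
    """Return the set of robots directives found in `html`."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return finder.directives


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
    print(robots_directives(page))
```

An error page is ready to upload when the returned set contains both `noindex` and `nofollow`.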
File Size
It seems that Internet Explorer 5 for Windows will not display a custom error page if the page is too small, and shows its own error message instead. The size at which this happens varies depending on the error code. The table below lists each error code, its meaning, and the minimum size your custom error page needs to be to avoid this problem with IE5 users. If you use the browser yourself and want to know how to change this behaviour of IE5, then you can find more information in the Microsoft knowledge base (Q218155).
| Code | Description | Minimum File Size |
| --- | --- | --- |
| 400 | Bad Request | > 512 bytes |
| 403 | Forbidden | > 256 bytes |
| 404 | Not Found | > 512 bytes |
| 405 | Method Not Allowed | > 256 bytes |
| 406 | Not Acceptable | > 512 bytes |
| 408 | Request Time-out | > 512 bytes |
| 409 | Conflict | > 512 bytes |
| 410 | Gone | > 256 bytes |
| 500 | Internal Server Error | > 512 bytes |
| 501 | Not Implemented | > 512 bytes |
| 505 | HTTP Version Not Supported | > 512 bytes |
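The sizes in the table can be checked automatically before you upload a page. The following sketch copies the thresholds from the table; the function names are made up, and padding the page with an HTML comment is just one assumed way of making a short page bigger, not something the knowledge base article prescribes.

```python
# Thresholds copied from the table above: the page must be LARGER than this.
IE5_MIN_BYTES = {
    400: 512, 403: 256, 404: 512, 405: 256, 406: 512, 408: 512,
    409: 512, 410: 256, 500: 512, 501: 512, 505: 512,
}


def bytes_needed(status_code, body):
    """Return how many more bytes `body` needs to exceed the threshold."""
    threshold = IE5_MIN_BYTES.get(status_code)
    if threshold is None:
        return 0  # no known threshold for this code
    return max(0, threshold + 1 - len(body))


def pad_page(status_code, html, encoding="utf-8"):
    """Append a filler HTML comment if the page is too small (an assumption)."""
    shortfall = bytes_needed(status_code, html.encode(encoding))
    if shortfall == 0:
        return html
    return html + "<!-- " + "x" * shortfall + " -->"


if __name__ == "__main__":
    page = "<html><body><h1>Page not found</h1></body></html>"
    print(bytes_needed(404, page.encode("utf-8")))
    print(len(pad_page(404, page).encode("utf-8")))
```

In practice a page built from your site template, with navigation links and a search box, will usually be well over these sizes anyway.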
