Whenever I deploy a Rails site, I install the exception notification plugin so I get an email if a user provokes a bug I hadn’t found. It’s a piece of cake to install:
1. Install the plugin
ruby script/plugin install exception_notification
2. Include the plugin in your ApplicationController (in the application.rb file)
include ExceptionNotifiable
3. Add one line to your environment.rb file to specify where to send the email
ExceptionNotifier.exception_recipients = %w(person1@domain.com person2@domain.com)
4. Make sure you have ActionMailer configured (to use either sendmail or SMTP), which you’ll already have done if your app sends email for any purpose.
What’s this? Errors already?
A few minutes after I first installed the Exception Notification plugin, I checked my inbox and found a dozen error emails from the site! After a brief moment of panic, I realized that all were coming from search engine spiders, and that the URLs were all invalid. I had replaced an old, crufty, static HTML site, and the spiders were rechecking pages they had indexed in the past.
So the next question was what to do with the old URLs. I could remap each URL to the most appropriate page on the new site, but the old site got little traffic and there wasn’t a clear mapping between the two sets of pages, so it hardly seemed worth it. I decided to map index.html to the new home page, since many people might have bookmarked that page, and it was clear what it should map to. As for everything else, I wanted a way to tell the spiders to stop trying to index them, and to tell anyone who accessed them that this was no longer a valid URL and they should explore the new site.
The heart of the fix is to add a few lines to routes.rb. Fortunately, the old site design had put all the HTML pages except for the index page into a directory called html. There were also some PDF files, conveniently in a directory named web_pdfs.
Here are the routes, which I added at the end just before the default route:
map.connect '/index.html', :controller => 'page', :action => 'home'
map.connect '/html/*any', :controller => 'page', :action => 'oldpage'
map.connect '/web_pdfs/*any', :controller => 'page', :action => 'oldpage'
I have a controller called page_controller that manages the public parts of the site. The first route simply maps index.html to the home page action.
The second and third routes map any URL in the html and web_pdfs directories to a new page, which I called “oldpage”. The oldpage.rhtml file just has some text telling visitors that the page no longer exists, along with some suggestions for exploring the new site.
The “*any” in the route definition absorbs whatever comes after “/html/”. With this technique, I don’t have to worry about whether there might be subdirectories within html, or what the file name extensions might be.
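To make the glob behavior concrete, here is a rough plain-Ruby sketch of what the three routes do. It is not the Rails router, just a stand-in built from regular expressions, and the names OLD_SITE_ROUTES and match_old_route are made up for illustration. Patterns are tried in order and the first match wins, which is also why the placement relative to the default route matters.

```ruby
# Hypothetical stand-in for the three routes above; this is NOT Rails' router.
# Each entry pairs a pattern with the action it should reach.
OLD_SITE_ROUTES = [
  [%r{\A/index\.html\z},          'home'],
  [%r{\A/html/(?<any>.*)\z},      'oldpage'],
  [%r{\A/web_pdfs/(?<any>.*)\z},  'oldpage']
].freeze

# Returns [action, captured_remainder] for the first pattern that matches,
# or nil when none of the old-site patterns apply.
def match_old_route(path)
  OLD_SITE_ROUTES.each do |pattern, action|
    m = pattern.match(path)
    next unless m
    # Only the glob-style patterns define the named group "any".
    any = m.names.include?('any') ? m[:any] : nil
    return [action, any]
  end
  nil
end
```

With this sketch, a path such as /html/about/team.htm reaches oldpage with "about/team.htm" captured, subdirectory and extension included, while a path like /contact falls through to whatever routes come later.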
The final piece was to tell the spiders not to bother with these pages any more. I didn’t want to use a redirect, because the page to which I wanted to redirect the user wasn’t the new permanent page, but rather one that explained that the site had been changed. I decided to set the HTTP header to the “gone” status code (410). This just takes one line in the page controller oldpage action:
def oldpage
  render :template => 'page/oldpage', :status => :gone
end
Eventually the spiders will stop spidering these old pages, but since index.html will still lead to the new home page without any error code, they will find all the new pages from there.
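If you want to confirm what the spiders will actually see after deploying, a small script along these lines can help. It is only a sketch using Ruby's standard Net::HTTP library; the helper names http_status and crawler_hint are my own, the example.com URLs in the comments are placeholders, and the hints simply restate the intent described above (200 for the home page, 410 for retired pages).

```ruby
require 'net/http'
require 'uri'

# Fetches a URL (without following redirects) and returns the numeric status.
def http_status(url)
  Net::HTTP.get_response(URI.parse(url)).code.to_i
end

# Rough label for what each status is meant to signal to a crawler.
def crawler_hint(status)
  case status
  when 200 then :index   # page is live
  when 410 then :drop    # "Gone": page was removed on purpose
  else :other
  end
end

# Example use against a deployed site (placeholder host, not run here):
#   crawler_hint(http_status('http://example.com/index.html'))     # hoping for :index
#   crawler_hint(http_status('http://example.com/html/old.html'))  # hoping for :drop
```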