Tuckey URLRewrite How-To


Today I will walk through how to put the Tuckey URL Rewrite Java web filter into practice on an Apache Tomcat web server.

URL rewriting is the method of converting complex URL parameters into a more human-readable format, giving you simpler and more memorable URLs. This can be an important function once you start using frameworks or content management systems that automatically generate long and at times cryptic URLs. While URL rewriting on the more popular Apache HTTP Server is relatively easy to set up using the bundled mod_rewrite module, reproducing this functionality on Tomcat requires a little more work.

Standard URL: http://www.example.com/list.cfm?product=fruit&page=1&order=asc&perpage=30
Rewrite URL: http://www.example.com/list/fruit/asc/30/1
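
As a preview of where this article is heading, a rule along the following lines could produce that mapping. This is only a sketch in the urlrewrite.xml syntax covered further down. The pattern chosen for each path segment and the desc ordering value are my own assumptions, not part of the example above.

```xml
<!-- Illustrative sketch: maps /list/fruit/asc/30/1 onto list.cfm with its four parameters. -->
<rule enabled="true">
    <name>Product list pretty URL (illustrative)</name>
    <!-- $1 = product, $2 = order, $3 = perpage, $4 = page -->
    <from>^/list/([a-z]+)/(asc|desc)/([0-9]+)/([0-9]+)$</from>
    <!-- Ampersands must be written as &amp; inside an XML file -->
    <to type="passthrough">/list.cfm?product=$1&amp;order=$2&amp;perpage=$3&amp;page=$4</to>
</rule>
```

The browser keeps showing the short URL while the server receives the long one.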

Installation

A downloadable copy of URLRewrite can be found at one of two sources. The outdated website at http://www.tuckey.org/urlrewrite/ lists version 3.2 as the most recent release, but there is a more recent and, in my testing, still stable 4.0 beta at Google Code, http://code.google.com/p/urlrewritefilter/downloads/list, that contains some critical bug fixes. This article assumes you have downloaded the 4.0 beta and not the 3.2 stable release.

Extracting the downloaded URLRewrite archive reveals a single WEB-INF folder, which contains a lib folder and the file urlrewrite.xml. Both of these items need to be copied to the WEB-INF folder in your site's root directory on the Tomcat server. For example, if example.com were located in /var/www/www.example.com or c:\www\www.example.com, the lib folder and urlrewrite.xml would go in /var/www/www.example.com/WEB-INF/ or c:\www\www.example.com\WEB-INF.

Content of Tuckey URLRewrite 4.0's archive

Content of WEB-INF with URLRewrite

Content Of A Clean WEB.XML

For most default installations of Tomcat, the WEB-INF folder will contain only the single file web.xml. We will need to edit web.xml to enable URLRewrite on Tomcat. Because it is an XML text file that is machine parsed, I'd recommend editing it with a source code text editor such as Notepad++ on Windows or TextMate on OS X.

Add the following code to web.xml, anywhere within the <web-app></web-app> tags.

<!-- URL ReWriter -->
<filter>
    <filter-name>UrlRewriteFilter</filter-name>
    <filter-class>org.tuckey.web.filters.urlrewrite.UrlRewriteFilter</filter-class>
    <!-- set the amount of seconds the conf file will be checked for reload
         can be a valid integer (0 denotes check every time,
         empty/not set denotes no reload check) -->
    <init-param>
        <param-name>confReloadCheckInterval</param-name>
        <param-value>0</param-value>
    </init-param>
    <!-- you can disable status page if desired
         can be: true, false (default true) -->
    <init-param>
        <param-name>statusEnabled</param-name>
        <param-value>true</param-value>
    </init-param>
    <init-param>
        <param-name>logLevel</param-name>
        <param-value>DEBUG</param-value>
    </init-param>
    <init-param>
        <param-name>statusEnabledOnHosts</param-name>
        <param-value>localhost</param-value>
    </init-param>
</filter>

<filter-mapping>
    <filter-name>UrlRewriteFilter</filter-name>
    <url-pattern>/*</url-pattern>
    <dispatcher>REQUEST</dispatcher>
    <dispatcher>FORWARD</dispatcher>
</filter-mapping>

Edited WEB.XML

Let’s quickly go through these settings.

confReloadCheckInterval is a numeric value in seconds that tells how frequently URLRewrite should check your urlrewrite.xml rules for any changes. Normally with Tomcat the modification of an XML configuration file requires a restart before the changes are reflected. You can set this value to -1 to disable automatic checking, while our value of 0 will mean that URLRewrite will check the urlrewrite.xml on every HTTP request. It is a great setting while testing and developing but a resource waste if used on a production server.

statusEnabled is a Boolean value that enables a URLRewrite status page that is reachable via a web browser at http://www.example.com/rewrite-status. It is probably best to disable this feature on production servers.

logLevel sets how much logging URLRewrite produces. While the default setting is INFO, I suggest using DEBUG while you are testing. In a production environment, though, you will probably want to use ERROR or FATAL to limit logging, as URLRewrite can generate large log files very quickly at the more verbose levels.

statusEnabledOnHosts lets you set which IP addresses and hosts have access to the URLRewrite status page mentioned above.

Finally, the <filter-mapping> tag tells Tomcat which requests to pass through URLRewrite. The <url-pattern></url-pattern> tag should be left as is to apply URLRewrite to the whole site, while the two <dispatcher></dispatcher> tags mean that URLRewrite is applied both to incoming HTTP requests (REQUEST) and to internal server-side forwards (FORWARD).
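
Pulling together the production advice above, a filter definition for a live server might look something like the sketch below. It only changes values already discussed (-1 for the reload check, the status page switched off, and ERROR logging), so treat it as a starting point rather than a recommended configuration.

```xml
<!-- URL ReWriter, production-leaning sketch -->
<filter>
    <filter-name>UrlRewriteFilter</filter-name>
    <filter-class>org.tuckey.web.filters.urlrewrite.UrlRewriteFilter</filter-class>
    <!-- -1 disables the reload check, so rule changes need a restart -->
    <init-param>
        <param-name>confReloadCheckInterval</param-name>
        <param-value>-1</param-value>
    </init-param>
    <!-- hide the /rewrite-status page from visitors -->
    <init-param>
        <param-name>statusEnabled</param-name>
        <param-value>false</param-value>
    </init-param>
    <!-- keep the log files small -->
    <init-param>
        <param-name>logLevel</param-name>
        <param-value>ERROR</param-value>
    </init-param>
</filter>
```

The <filter-mapping> block stays exactly as it is in the development version.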

Once done, save your web.xml file and restart your Tomcat server. If all goes well, you should be able to point your browser to http://www.example.com/rewrite-status and see an UrlRewriteFilter 3.2.0 build 1 configuration overview. Yes, that 3.2 version number is incorrectly listed in 4.0. Next, point your browser to http://www.example.com/test/status/ and you should be automatically forwarded to /rewrite-status. If this works then congratulations, you now have URLRewrite enabled on your server.

Now I will give you some helpful example rules that may come in useful. These rules all go in between the <urlrewrite></urlrewrite> tags in the urlrewrite.xml file within the WEB-INF folder. Whenever a page is requested on your Tomcat server, URLRewrite processes ALL the rules contained in urlrewrite.xml in sequential order, unless a matching rule stops processing with last="true".
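
To show where the rules sit and how ordering plays out, here is a minimal sketch of a urlrewrite.xml body. The rule inside it and its paths are hypothetical placeholders. Keep whatever XML declaration and DOCTYPE lines your shipped file already has above the <urlrewrite> tag.

```xml
<urlrewrite>

    <!-- Rules are evaluated from top to bottom on every request. -->

    <rule enabled="true">
        <name>Specific rule (hypothetical)</name>
        <from>^/old-page\.html$</from>
        <!-- last="true" stops later rules from running once this one matches -->
        <to type="permanent-redirect" last="true">/new-page</to>
    </rule>

    <!-- Broad catch-all rules, such as a pretty URL pass-through, belong at the bottom. -->

</urlrewrite>
```

Putting narrow rules first and broad ones last keeps a catch-all from swallowing requests that a more specific rule was meant to handle.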

Pretty URL, SES Friendly URL Pass-Through

The most common use of URLRewrite is probably to enable a third-party framework or CMS to use pretty URLs. The rule below is a generic setup that can be adapted for many uses. Generally speaking, it should always be the LAST rule listed in your urlrewrite.xml rule set. The rule passes all URL requests to the index.cfm file, except requests whose URLs point to the files or folders listed in the regular expression of the <condition></condition> tag. So with this rule the URL http://www.example.com/list/apples is displayed as is in the user's browser, but URLRewrite actually passes http://www.example.com/index.cfm/list/apples to the Tomcat server. Make sure that the page named in the <to></to> tag is also listed in the (not equal) <condition></condition> value. Because the filter is also mapped to FORWARD requests, the rewritten request passes through the rules again, and without that exclusion you could run into an infinite loop.

<rule enabled="true">
    <name>Generic Pretty URLs Pass-through</name>
    <condition type="request-uri" operator="notequal">^/(index.cfm|robots.txt|osd.xml|flex2gateway|cfide|cfformgateway|railo-context|admin-context|files|images|jrunscripts|javascripts|miscellaneous|stylesheets)</condition>
    <from>^/(.*)$</from>
    <to type="passthrough">/index.cfm/$1</to>
</rule>

Permanent Redirection

This rule is for when you want to do a permanent redirection using the HTTP status code 301. If we break the rule down, the enabled attribute of <rule enabled=""> is a Boolean that lets you selectively turn the rule off without needing to comment it out. The <name></name> tag contains the label you wish to use to describe the rule. The <from></from> tag contains a regular expression that matches all requests for the documents.html page plus any trailing characters, while <to></to> is the new URL to redirect to. The type attribute tells URLRewrite to send a permanent redirect code to the browser requesting the URL, while the attribute last="true" tells URLRewrite not to process any further rules for this page request.

<rule enabled="true">
    <name>Permanent redirect example</name>
    <from>^/documents.html(.*)$</from>
    <to type="permanent-redirect" last="true">/file/list/document</to>
</rule>
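
If the old and new locations share part of their path, the captured text can be reused in the target, as in the following sketch. The /docs/ and /file/list/ paths are hypothetical, and $1 refers to whatever the first bracketed group in <from> matched.

```xml
<rule enabled="true">
    <name>Permanent redirect keeping the captured path (illustrative)</name>
    <!-- ([^/]+) captures a single path segment, e.g. "manual" in /docs/manual.html -->
    <from>^/docs/([^/]+)\.html$</from>
    <to type="permanent-redirect" last="true">/file/list/$1</to>
</rule>
```

Under these assumptions a request for /docs/manual.html would be redirected to /file/list/manual.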

Selective HTTPS Enforcement

If you have HTTPS set up on your server, you can use URLRewrite to force certain folders, URL paths or files to be served only over encrypted HTTPS. The <condition></condition> tag adds further tests that decide when the rule should be applied. Here the type attribute with a value of scheme and the operator attribute with a value of equal state that the rule applies when the URL scheme (http, https, ftp, etc.) is equal to http. Note that the HTTPS examples are listed with enabled="false", so set it to true to activate them.

<rule enabled="false">
    <name>Force HTTPS example</name>
    <note>Automatically redirects administration requests to a secure protocol.</note>
    <condition type="scheme" operator="equal">^http$</condition>
    <from>^/CFIDE/administrator/(.*)$</from>
    <to type="permanent-redirect" last="true">https://www.example.com/CFIDE/administrator/$1</to>
</rule>

The same approach enforces HTTPS for the Railo administrator under railo-context.

<rule enabled="false">
    <name>Force HTTPS example</name>
    <note>Automatically redirects administration requests to a secure protocol.</note>
    <condition type="scheme" operator="equal">^http$</condition>
    <from>^/railo-context/admin/(index|web|server).cfm$</from>
    <to type="permanent-redirect" last="true">https://www.example.com/railo-context/admin/$1.cfm</to>
</rule>
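
If you run both administrators on the same server, the two rules could probably be folded into one by capturing the folder as well. The sketch below is my own consolidation rather than something shipped with URLRewrite, and it is deliberately broader than the Railo rule above, since it redirects everything under railo-context/admin and not just the three .cfm pages.

```xml
<rule enabled="false">
    <name>Force HTTPS for both administrators (illustrative)</name>
    <note>Switch enabled to true once HTTPS is working on the server.</note>
    <condition type="scheme" operator="equal">^http$</condition>
    <!-- $1 = the administrator folder, $2 = the rest of the path -->
    <from>^/(CFIDE/administrator|railo-context/admin)/(.*)$</from>
    <to type="permanent-redirect" last="true">https://www.example.com/$1/$2</to>
</rule>
```

One rule is easier to maintain, at the cost of losing the tighter match on individual Railo pages.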

Conditions Based On URL Parameters

You can also apply conditions to user-supplied URL parameters. In the example below, the condition looks for the URL parameter named fruit and checks whether its value is kiwi, apple or orange. If the value matches, the rule redirects to a replacement URL that also incorporates the parameter. The URL request http://www.example.com/list.html?fruit=apple would redirect to http://www.example.com/list/fruit/apple.

<rule enabled="true">
    <name>Selective fruit example redirect</name>
    <condition type="parameter" name="fruit" operator="equal">^(apple|kiwi|orange)$</condition>
    <from>^/list.html(.*)$</from>
    <to type="permanent-redirect" last="true">/list/fruit/%{parameter:fruit}</to>
</rule>
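
Several conditions can be stacked on a single rule, in which case all of them have to hold before the rule is applied. The sketch below extends the fruit example with a second, numeric page parameter. The page parameter and the longer target path are assumptions made purely for illustration.

```xml
<rule enabled="true">
    <name>Fruit list with page number (illustrative)</name>
    <!-- Both conditions must match before the rule is applied -->
    <condition type="parameter" name="fruit" operator="equal">^(apple|kiwi|orange)$</condition>
    <condition type="parameter" name="page" operator="equal">^[0-9]+$</condition>
    <from>^/list\.html$</from>
    <to type="permanent-redirect" last="true">/list/fruit/%{parameter:fruit}/%{parameter:page}</to>
</rule>
```

A rule like this would need to sit above the single-parameter fruit rule, otherwise the simpler rule would match first and its last="true" would stop processing.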