URL Rewrite
1. Mod Rewrite Overview
We’re hearing more and more about mod rewrite every day. E-commerce applications are building add-ons to support it, content management systems are using it and, most of all, blogs are taking it to the next level as they become more and more popular.
So what is mod rewrite, what can it do and how can I use it to further the optimization of my website?
Mod rewrite (mod_rewrite) is just one of the many modules that can be added to the Apache web server. Its main purpose is to detect incoming URI requests (file names) and rewrite them to a redirect or to an alias file.
What does mod_rewrite really rewrite?
First let’s start off by killing the implied meaning of a mod rewrite.
- A mod rewrite DOES NOT rewrite the url in the browser. A mod rewrite is not a magical url changer. You cannot flip a switch and hope to change all your dynamic urls to static urls.
- A mod rewrite does its rewriting on the server AFTER a file has been requested.
In the sample below we’re telling the mod rewrite that if the file called file_name.htm is requested, it should be rewritten to its alias file_name.php.
Rewrite Sample: RewriteRule ^file_name\.htm$ /file_name.php [L]
SEOs use mod rewrite more as a translator. An SEO can instruct his web developer to output clean urls for a dynamic application and then use mod rewrite to translate those clean urls back to their dynamic form on the server.
In this example the mod rewrite is looking for the directory named mod/rewrite/tutorial/. It makes three references, one for each pair of parentheses (), then calls those references and rewrites the folder to its alias some_file.php?name=mod_rewrite_tutorial
Rewrite Sample:
#rule to rewrite /mod/rewrite/tutorial/ to its dynamic
#query string some_file.php?name=mod_rewrite_tutorial
RewriteRule ^(mod)/(rewrite)/(tutorial)/ /some_file.php?name=$1_$2_$3 [L]
So let’s dig in and start learning the mod rewrite syntax.
2. Mod Rewrite syntax
The key to good mod rewriting is patterns. Patterns in your urls are how we are going to distinguish what to rewrite and what not to rewrite. We’ll get to that later; first we need to go over the basics of the mod rewrite syntax.
RewriteRules
RewriteRules are the heart and soul of the mod rewrite. Here is where you declare the file to be rewritten, where it is to be rewritten to, and tack on any special commands.
Rewrite rules are broken down into 4 simple blocks. I’ll refer to these blocks as the Call to action, Pattern, Rewrite and Command Flag.
Example:
RewriteRule ^dir/([0-9]+)/?$ /index.php?id=$1 [L]
Call to action: RewriteRule
Pattern: ^dir/([0-9]+)/?$
Rewrite: /index.php?id=$1
Command Flag: [L]
Between each of these blocks of the rewrite rule there should be a space. With that being said let’s go ahead and break down each of these 4 blocks and discuss what they do.
Call to action Block
The only way to screw this up is to spell RewriteRule incorrectly or leave out the space between it and the start of the pattern block. If you do spell it incorrectly you’ll trigger an error and the server will return a 500 error. Note: if you ever see a 500 error on your site, it is most likely due to a bad line of code in your .htaccess file.
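One thing the four blocks take for granted is that the rewrite engine is switched on. Here is a minimal sketch of a complete .htaccess file, assuming mod_rewrite is loaded and your host allows .htaccess overrides. The file names about.htm and about.php are made up for illustration.

```apache
# Turn the rewrite engine on; while it is off, RewriteRule lines are not processed
RewriteEngine On

# Call to action, pattern, rewrite and command flag, each separated by a space
# (about.htm and about.php are hypothetical file names)
RewriteRule ^about\.htm$ /about.php [L]
```

If this file produces a 500 error, check the spelling of the directives and the spaces between the blocks first.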
Pattern Block
This one little piece of the mod rewrite is where the power is. In the pattern block of the rewrite rule we use regular expressions to detect the requested file name or uri and from this we can extract key parts to pass to the rewrite block.
Pay attention because this is the hardest part of mod rewrite.
Regular expressions are just a method to detect letters, numbers and symbols using special characters. These special characters are called metacharacters.
Pattern Matching metacharacter Definitions
| Char. | Definition |
| \ | Use before any of the following characters to escape it, i.e. cancel its special meaning: \* \. \$ \+ \[ \] |
| ^ | Start matching at this point |
| $ | End point of the match |
| . | Any single character |
| [] | Starts a character class |
| \| | Starts an alternative match: this\|that means match this or that |
| () | Groups a pattern and starts a back reference point |
| ? | Quantifier: match 0 or 1 time |
| + | Quantifier: match 1 or more times |
| * | Quantifier: match 0 to infinite times |
| {} | Quantifier: match minimum to maximum times; {0,3} matches up to 3 times |
Metacharacters inside a class []
| Char. | Definition |
| ^ | Negates the class when it comes first. [^A-Z]+ means match anything except uppercase letters |
| \ | Use before a character to escape it, i.e. cancel its special meaning: [\+]+ |
| - | Range for matching: [0-9]+ [a-zA-Z]+ |
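To see a few of these metacharacters at work, here is a short sketch. All script names and url shapes in it are invented for illustration, so treat it as a pattern cheat sheet rather than rules to copy as is.

```apache
RewriteEngine On

# \. is a literal dot; the ? makes the trailing "l" optional (page.htm or page.html)
RewriteRule ^page\.html?$ /page.php [L]

# (news|blog) is an alternative match and a back reference; [0-9]{4} is exactly four digits
RewriteRule ^(news|blog)/([0-9]{4})/?$ /archive.php?section=$1&year=$2 [L]

# [^/]+ is a negated class: one or more characters that are not a slash
RewriteRule ^user/([^/]+)/?$ /profile.php?name=$1 [L]
```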
I’ll show a few quick samples just so you understand how to use all of the above. Then we’re going to move right on to the Rewrite Block since we’ll be going over all of this in our basic section.
In this example we just need the numbers in the urls below to pass through the mod rewrite to make our query. First we have to ask ourselves, “What is the common pattern in these urls?”
Example 1
In this example there are two common patterns that we can match against. The first one is they all start with category/. The second is they all end in .htm. This should be an easy match.
- category/1.htm
- category/56.htm
- category/092340923.htm
- category/9334.htm
So to use regular expressions to match all of the urls above we need to set our starting point to ^category/.
Now we need to tell the rewrite rule to look for any number 1 or more times. We’ll use a character class to do this: [0-9]+. Since we need this number to complete our rewrite block we’re going to tell the mod to reference it so we can use it later. We do this by surrounding the [0-9]+ with parentheses like this: ([0-9]+).
To finish the match we’re going to escape the . (remember, an unescaped . means any 1 character). Writing it as \. makes it read as a literal dot, and then we finish the match with htm$.
RewriteRule ^category/([0-9]+)\.htm$ /category.php?cat_id=$1 [L]
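Read piece by piece, the same rule can be annotated like this. The comments are only a reading aid and assume the rule sits in the .htaccess of the site root.

```apache
# ^category/   the requested path must start with "category/"
# ([0-9]+)     one or more digits, captured as $1
# \.htm$       a literal ".htm" at the very end of the path
# So category/56.htm is answered by /category.php?cat_id=56
RewriteRule ^category/([0-9]+)\.htm$ /category.php?cat_id=$1 [L]
```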
Example 2
In this example we’re going to pass a name through the rewrite. The name we want to use is the name of the first folder. So like before we need to find a pattern so we can match and extract the name of the first folder.
- kitchen-ware/spoons.htm
- bathware2/towels/duck-patterns.htm
- dinnerware-pieces/
The only thing we have to work with that is common among all the examples is the trailing slash /. This is kind of tricky since you can type in the 3rd url without the trailing slash and it would still show up in your browser. We’ll get to the trailing slash in a bit though. Let’s start with the collection of the words and numbers before the /.
There are a few ways to do this. We can do a wild card match which picks up everything: (.+) or (.*). We can make a class that looks for all letters, dashes and numbers: ([-a-zA-Z0-9]+). Or we can use a negated class which will look for anything but a / like this: ([^/]+). We’ll use the latter even though all of the above can be made to do the job.
Note: The best one to use is the negated class. .+ will also pick up a /, since a / counts as any given character, so it can run right past the first folder. [-a-zA-Z0-9]+ works, but only as long as your folder names stick to the characters you listed. A search for everything but a / ([^/]+) is simple, fast and stops exactly where we want it to.
Our final result to pick up everything before the first slash then would look like this: ^([^/]+)
Next we’ll need to account for the possible missing trailing slash. For this we have 2 options. The first option is the min max {min,max} metacharacter. If we write /{0,1} this tells the pattern to look for a / 0 to 1 times. That would match both dinnerware-pieces/ and dinnerware-pieces every time. But the easier way to do this is to use the ? metacharacter. ? just means match the preceding character 0 or 1 times and we don’t have to type as much.
So up to this point our pattern block should look like this: ^([^/]+)/?
Then we can tack a $ on to the end so the match stops there whether or not the trailing slash is found. And we get our final rewrite rule below.
RewriteRule ^([^/]+)/?$ /catalog.php?product_id=$1 [L]
A word of warning: if you plan to use folder names, especially the first folder, as a variable that will be passed through the mod, you’d better know that it’s going to pass all your real folders through to be rewritten as well. images/, includes/, css/, img/ and cgi-bin/ are all perfect matches for ^([^/]+)/?$. If this is your first time doing mod rewrite you may want to put your variables in file names instead of 1st tier folders. We go over how to bypass the rewriting of all our static folders in the advanced tutorials. For now just keep this in mind.
It all looks like nonsense, I know. I’ve been here before, scratching my head trying to figure it all out. Just memorize these 3 pattern matches because you’ll use them the most: ([0-9]+), ([^/]+) and (.*). These translate to match any number, match any folder name, or match everything. Be careful with that last one though! A RewriteRule ^(.*)$ matches every request, including the file it rewrites to, so it can send the server into a loop and shoot a 500 error faster than lightning. Always use .* with another pattern that can be matched, like RewriteRule ^(.*)\.htm$.
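One common way to keep real folders and files out of a catch-all rule like this is to test the filesystem first with RewriteCond. This is only a sketch, assuming the rule lives in the .htaccess of the site root; the advanced tutorials may well take a different approach.

```apache
RewriteEngine On

# Only rewrite when the request does not point at an existing file...
RewriteCond %{REQUEST_FILENAME} !-f
# ...and does not point at an existing folder (images/, css/, img/ and so on)
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^/]+)/?$ /catalog.php?product_id=$1 [L]
```

The two conditions apply only to the RewriteRule that directly follows them, so every catch-all rule needs its own pair.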
A few more things about the pattern block
You cannot use a RewriteRule to match a query string from a dynamic url. RewriteRule is for request_uri matching. In the url below the requested uri is the some/folder/index.php part; everything after the ? is the query string.
www.somesite.com/some/folder/index.php?id=23&name=foo
You can however get variables from a RewriteCond but we cover how to use RewriteCond together with RewriteRule in the medium tutorials.
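As a small preview, here is a hedged sketch of how a RewriteCond can read the query string of the url above. The target name item-23.htm is invented for illustration, and a 302 is used so nothing gets cached permanently while you experiment.

```apache
RewriteEngine On

# %{QUERY_STRING} holds everything after the "?"; the RewriteRule pattern never sees it
RewriteCond %{QUERY_STRING} (^|&)id=([0-9]+)(&|$)
# %2 is the second group captured by the RewriteCond above (the value of id).
# The trailing "?" in the target drops the old query string from the redirect.
# (item-<id>.htm is a hypothetical naming scheme)
RewriteRule ^some/folder/index\.php$ /some/folder/item-%2.htm? [R=302,L]
```

Note that references from a RewriteCond are called with %, while references from the RewriteRule pattern are called with $.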
Ok that’s enough for now.
Rewrite Block
This part is a piece of cake. Now that we’ve used the pattern block to reference our matches ([0-9]+) we need to rewrite to the url and add the references as needed.
Remember a reference is anything that was picked up in the () in the pattern.
To call a reference you just add a $ followed by the reference number. This all goes in order like so. Below we’ll make 3 references.
RewriteRule ^dir/(.*)/(.*)\.(htm|html)$ /$1/$2.$3 [R=301,L]
Rewrites using a 301 redirect
dir/some/folder/file.htm to /some/folder/file.htm
You can mix up the references if you want like so:
RewriteRule ^dir/(.*)/(.*)\.(htm|html)$ /$2/$1.$3 [R=301,L]
You can also not call a reference like so:
RewriteRule ^dir/(.*)/(.*)\.(htm|html)$ /$2/$1.php [R=301,L]
So let’s recap a bit. The rewrite block serves 2 purposes: 1. it finalizes the mod rewrite by declaring where to rewrite or redirect to, and 2. it allows us to call the back references we collected in the Pattern Block.
Note: We can use the RewriteBase to set a base directory that we want to rewrite to so you don’t always have to write it in your rules.
Example: RewriteBase /dir/
RewriteRule ^somefile-([0-9]+)\.htm$ index.php?id=$1 [L]
is the same as
RewriteRule ^somefile-([0-9]+)\.htm$ /dir/index.php?id=$1 [L]
So if you are doing all your rewrites to the same directory, save some time and declare your RewriteBase before all your rules. You can even declare / as your base.
Command Flag Block (Optional)
Ok I didn’t tell you this is optional because half of you would skip this part. Learning the different Command Flags is a must.
The command flag definitions are as follows:
| Flag | Definition |
| [R] | Redirect. You can add =301 or =302 to change the type. |
| [F] | Forces the url to be forbidden. 403 header |
| [G] | Forces the url to be gone. 410 header |
| [L] | Last rule. (You should use this on all your rules that don’t link together) |
| [N] | Next round. Rerun the rules again from the start |
| [C] | Chains a rewrite rule together with the next rule. |
| [T] | Use T=MIME-type to force the file to be a mime type |
| [NS] | No subrequest. Skips the rule if the request is an internal sub request |
| [NC] | Makes the rule case INsensitive |
| [QSA] | Query String Append. Use to add to an existing query string |
| [NE] | Turns off the normal escaping that the RewriteRule applies by default |
| [PT] | Pass through to the next handler (together with mod alias) |
| [S] | Skip the next rule. S=3 skips the next 3 rules |
| [E] | E=var:value sets an environment variable that can be called by other rules |
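Flags are combined inside one pair of square brackets, separated by commas. Here is a short sketch of a few of them in use; the folder and script names are invented for illustration.

```apache
RewriteEngine On

# [NC] ignores case, [QSA] keeps any query string the visitor sent, [L] stops here
RewriteRule ^product/([0-9]+)/?$ /product.php?id=$1 [NC,QSA,L]

# A "-" in the rewrite block means "change nothing"; [F] answers 403 Forbidden
RewriteRule ^private/ - [F]

# [G] answers 410 Gone for a page that has been removed for good
RewriteRule ^old-catalog\.html$ - [G]
```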
Ok, next up are the tutorials. If you are confused about any of the above don’t be scared to move along. We will recap everything so we don’t get confused. I know for myself I had to see it work and see the code before I could grasp the full mod rewrite experience.
3. Mod Rewrite Basic tutorials
The majority of you came to this site to find out how to do one of these basic tutorials. So let’s go ahead and line up what we’ll be learning.
- Simple Static to Dynamic Rewrites
- 301 and 302 redirects using mod rewrite
- File and file Extension rewrites
Tutorial 1
Simple Static to Dynamic Mod Rewrites
It’s easier than you think. Out of all the rewrites I’ve done I’d say I still do this type of rewrite the most. If you are working on a non-commercial web application then you should be able to implement this type of mod rewrite in just a few minutes.
Scenario: We have a dynamic site that outputs query strings in the urls. We want to change these dynamic urls to static search engine friendly urls. All the current web pages use a .php extension. We want the urls to look like a plain html page with a .html ext.
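A minimal sketch of how this scenario could be handled, not necessarily the way the full tutorial does it. The script names item.php and article.php and their parameters are placeholders for your own application.

```apache
RewriteEngine On

# Visitor asks for /item-42.html; the server quietly runs /item.php?id=42
# (item.php and the id parameter are hypothetical)
RewriteRule ^item-([0-9]+)\.html$ /item.php?id=$1 [L]

# Two variables: /article-seo-17.html is served by /article.php?topic=seo&id=17
# (article.php, topic and id are hypothetical)
RewriteRule ^article-([a-z]+)-([0-9]+)\.html$ /article.php?topic=$1&id=$2 [L]
```

Remember that the rules only translate incoming requests. The application itself still has to print the new .html urls in its links.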
Tutorial 2
301 and 302 redirects using mod rewrite
Redirects are some of the easiest rewrites to do. They are very powerful, so be careful. I’ll just go over the most popular and then let anyone that wants to write a note about redirects post away after the tutorial.
Scenario 1: We have changed the file name /blue-widgets.html to /awesome-blue-widgets.html. We need to set up a 301 permanent redirect so any user or search engine bot that visits the old file will be redirected to the new.
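For Scenario 1, a sketch along these lines should do, assuming both files sit in the site root next to the .htaccess:

```apache
RewriteEngine On

# Old file name to new file name, announced to browsers and bots as permanent (301)
RewriteRule ^blue-widgets\.html$ /awesome-blue-widgets.html [R=301,L]
```

For a temporary move, swap R=301 for R=302.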
Scenario 2: We’ve changed our domain name. We’ve set up every single file on the new domain. All the files will keep the same name, directory structure and extensions. We need to redirect users and search engine bots to our new domain via 301 so the new site starts getting picked up.
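For Scenario 2, one possible sketch follows. The host names old-domain.example and new-domain.example are placeholders, and the rule is assumed to live in the .htaccess at the root of the old site.

```apache
RewriteEngine On

# Apply only when the request arrived on the old host name, with or without www
# (old-domain.example is a placeholder)
RewriteCond %{HTTP_HOST} ^(www\.)?old-domain\.example$ [NC]
# Send the same path to the new domain; $1 is everything after the leading slash
# (new-domain.example is a placeholder)
RewriteRule ^(.*)$ http://www.new-domain.example/$1 [R=301,L]
```

Here .* is safe to use on its own because the RewriteCond stops the rule from matching once the visitor is on the new host.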
Tutorial 3
File and file Extension rewrites
These rewrites in this tutorial are helpful when you are changing old static .html files into a more dynamic server side scripting file like .asp, .jsp, .cfm or .php. Some of these also fall under the “if I did this it would look cool” category. Enjoy!
Scenario 1: We have a dynamic site using .php extensions. We just want to make the site look less dynamic and more like hand made. All the file names are staying the same, we just want to change the extensions to .
Scenario 2: We own a music store online. For our cd catalog online we want to change the file extensions from .php to .cds. We think users will get a laugh if they call up the file /catalog/metalica.cds .
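A sketch for Scenario 2, assuming the real scripts stay where they are as .php files inside /catalog/:

```apache
RewriteEngine On

# /catalog/metalica.cds is answered by the real script /catalog/metalica.php
RewriteRule ^catalog/([^/]+)\.cds$ /catalog/$1.php [L]
```

The links on the site have to be changed to the .cds names by hand or by the application, since the rule only handles requests that come in.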
4. Mod Rewrite medium tutorials
Here we’re going to learn how to combine the RewriteCond and the RewriteRule so we can do conditional rewrites. For you programmers out there, you can use RewriteCond like an if statement.
- Preventing 3rd party sites from using your images
- Advanced 301 and 302 redirects using mod rewrite
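To give a feel for RewriteCond as an if statement, here is a common sketch for the first item in the list. The host name www.example.com stands in for your own site, and the list of image extensions is only an assumption.

```apache
RewriteEngine On

# IF the request carries a Referer header at all (direct visits have none)...
RewriteCond %{HTTP_REFERER} !^$
# ...AND that Referer is not one of our own pages (example.com is a placeholder)...
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com/ [NC]
# ...THEN answer any image request with 403 Forbidden
RewriteRule \.(gif|jpe?g|png)$ - [F,NC]
```

Several RewriteCond lines in a row are joined with AND by default, and they all belong to the one RewriteRule that follows them.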
5. Mod Rewrite Advanced tutorials
Original Source : http://www.webforgers.net/mod-rewrite/