Redirect, Change URLs or Redirect HTTP to HTTPS in Apache - Everything You Ever Wanted to Know About Mod_Rewrite Rules but Were Afraid to Ask
This is a Canonical Question about Apache's mod_rewrite.
Changing a request URL or redirecting users to a different URL than the one they originally requested is done using mod_rewrite. This includes such things as:
- Changing HTTP to HTTPS (or the other way around)
- Changing a request for a page which no longer exists to a new replacement.
- Modifying a URL format (such as ?id=3433 to /id/3433 )
- Presenting a different page based on the browser, based on the referrer, based on anything possible under the moon and sun.
- Anything else you want to do with a URL
Everything You Ever Wanted to Know about Mod_Rewrite Rules but Were Afraid to Ask!
How can I become an expert at writing mod_rewrite rules?
- What is the fundamental format and structure of mod_rewrite rules?
- What form/flavor of regular expressions do I need to have a solid grasp of?
- What are the most common mistakes/pitfalls when writing rewrite rules?
- What is a good method for testing and verifying mod_rewrite rules?
- Are there SEO or performance implications of mod_rewrite rules I should be aware of?
- Are there common situations where mod_rewrite might seem like the right tool for the job but isn't?
- What are some common examples?
A place to test your rules
The htaccess tester web site is a great place to play around with your rules and test them. It even shows the debug output so you can see what matched and what did not.
apache-2.2 mod-rewrite redirect redirection 301-redirect
9
The idea behind this question is to give a close path for all the endless mod_rewrite questions that drive our more regular users crazy. This is very similar to what was done with subnetting at serverfault.com/questions/49765/how-does-subnetting-work .
– Kyle Brandt
Dec 20 '10 at 17:00
1
Also, I don't really want too many upvotes on this question, rather they should go to the answer. I don't want to CW this because I want to make sure the poster gets full credit for what I am hoping is the mod_rewrite answer to end all mod_rewrite questions.
– Kyle Brandt
Dec 20 '10 at 17:09
4
Sorry, I upvoted the question. ;-) I really think it needs to show up at (or near) the top of mod-rewrite tag searches/filters.
– Steven Monday
Dec 20 '10 at 19:07
Someone Else (tm) should handle the common use-cases. I don't know them well enough to do it justice.
– sysadmin1138♦
Dec 20 '10 at 20:55
Perhaps this question should be linked into the mod-rewrite tag wiki to make the path even shorter.
– beldaz
Jan 1 '11 at 3:50
edited Feb 13 '15 at 1:26 by HopelessN00b
asked Dec 20 '10 at 16:59 by Kyle Brandt
5 Answers
mod_rewrite syntax order
mod_rewrite has some specific ordering rules that affect processing. Before anything gets done, the RewriteEngine On directive needs to be given, as this turns on mod_rewrite processing. This should come before any other rewrite directives.
A RewriteCond preceding a RewriteRule makes that ONE rule subject to the conditional. Any following RewriteRules will be processed as if they were not subject to conditionals.
RewriteEngine On
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$)
RewriteRule ^/blog/(.*)\.html /blog/$1.sf.html
In this simple case, if the HTTP referrer is from serverfault.com, redirect blog requests to special serverfault pages (we're just that special). However, if the above block had an extra RewriteRule line:
RewriteEngine On
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$)
RewriteRule ^/blog/(.*)\.html /blog/$1.sf.html
RewriteRule ^/blog/(.*)\.jpg /blog/$1.sf.jpg
All .jpg files would go to the special serverfault pages, not just the ones with a referrer indicating it came from here. This is clearly not the intent of how these rules are written. It could be done with multiple RewriteCond rules:
RewriteEngine On
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$)
RewriteRule ^/blog/(.*)\.html /blog/$1.sf.html
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$)
RewriteRule ^/blog/(.*)\.jpg /blog/$1.sf.jpg
But probably should be done with some trickier replacement syntax.
RewriteEngine On
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$)
RewriteRule ^/blog/(.*)\.(html|jpg) /blog/$1.sf.$2
The more complex RewriteRule contains the conditionals for processing. The last parenthetical, (html|jpg), tells RewriteRule to match either html or jpg, and to represent the matched string as $2 in the rewritten string. This is logically identical to the previous block with its two RewriteCond/RewriteRule pairs; it just does it in two lines instead of four.
Multiple RewriteCond lines are implicitly ANDed, and can be explicitly ORed. To handle referrers from both ServerFault and Super User (explicit OR):
RewriteEngine On
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$) [OR]
RewriteCond %{HTTP_REFERER} ^https?://superuser\.com(/|$)
RewriteRule ^/blog/(.*)\.(html|jpg) /blog/$1.sf.$2
To serve ServerFault referred pages with Chrome browsers (implicit AND):
RewriteEngine On
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$)
RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*Chrome.*$
RewriteRule ^/blog/(.*)\.(html|jpg) /blog/$1.sf.$2
RewriteBase is also order-specific, as it specifies how the following RewriteRule directives handle their processing. It is very useful in .htaccess files. If used, it should be the first directive under "RewriteEngine On" in an .htaccess file. Take this example:
RewriteEngine On
RewriteBase /blog
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$)
RewriteRule ^(.*)\.(html|jpg) $1.sf.$2
This tells mod_rewrite that the URL it is currently handling was arrived at by way of http://example.com/blog/ instead of the physical directory path (/home/$Username/public_html/blog), and to treat it accordingly. Because of this, the RewriteRule considers its string-start to be after the "/blog" in the URL. Here is the same thing written two different ways, one without RewriteBase and one with it:
RewriteEngine On
##Example 1: No RewriteBase##
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$)
RewriteRule /home/assdr/public_html/blog/(.*)\.(html|jpg) $1.sf.$2
##Example 2: With RewriteBase##
RewriteBase /blog
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$)
RewriteRule ^(.*)\.(html|jpg) $1.sf.$2
As you can see, RewriteBase allows rewrite rules to leverage the web-site path to content rather than the web-server path, which can make them more intelligible to those who edit such files. It can also make the directives shorter, which has an aesthetic appeal.
RewriteRule matching syntax
RewriteRule itself has a complex syntax for matching strings. I'll cover the flags (things like [PT]) in another section. Because sysadmins learn by example more often than by reading a man-page, I'll give examples and explain what they do.
RewriteRule ^/blog/(.*)$ /newblog/$1
The .* construct matches any single character (.) zero or more times (*). Enclosing it in parentheses tells it to provide the string that was matched as the $1 variable.
RewriteRule ^/blog/.*/(.*)$ /newblog/$1
In this case, the first .* was NOT enclosed in parens so isn't provided to the rewritten string. This rule removes a directory level on the new blog-site. (/blog/2009/sample.html becomes /newblog/sample.html).
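Because .* is greedy, the first .* in the rule above can swallow several path components, so only the last component reaches $1. If the intent is to drop exactly one directory level, a negated character class is one way to say so. A minimal sketch, reusing the /blog and /newblog paths from the examples here:

```apache
# [^/]+ matches a single path component (any run of characters except "/"),
# so /blog/2009/sample.html    is rewritten to /newblog/sample.html
# and /blog/2009/12/sample.html is rewritten to /newblog/12/sample.html
RewriteRule ^/blog/[^/]+/(.*)$ /newblog/$1
```

With the greedy version, the second request would instead end up as /newblog/sample.html, which may or may not be what is wanted.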
RewriteRule ^/blog/(2008|2009)/(.*)$ /newblog/$2
In this case, the first parenthesis expression sets up a matching group. This becomes $1, which is not needed and therefore not used in the rewritten string.
RewriteRule ^/blog/(2008|2009)/(.*)$ /newblog/$1/$2
In this case, we use $1 in the rewritten string.
RewriteRule ^/blog/(20[0-9][0-9])/(.*)$ /newblog/$1/$2
This rule uses a special bracket syntax that specifies a character range. [0-9] matches the numerals 0 through 9. This specific rule will handle years from 2000 to 2099.
RewriteRule ^/blog/(20[0-9]{2})/(.*)$ /newblog/$1/$2
This does the same thing as the previous rule, but the {2} portion tells it to match the previous character (a bracket expression in this case) two times.
RewriteRule ^/blog/([0-9]{4})/([a-z]*)\.html /newblog/$1/$2.shtml
This case will match any lower-case letter in the second matching expression, and do so for as many characters as it can. The \. construct tells it to treat the period as an actual period, not the special character it is in previous examples. It will break if the file-name has dashes in it, though.
RewriteRule ^/blog/([0-9]{4})/([-a-z]*)\.html /newblog/$1/$2.shtml
This traps file-names with dashes in them. However, as - is a special character in bracket expressions, it should be placed first (or last) in the expression so that it is taken literally.
RewriteRule ^/blog/([0-9]{4})/([-0-9a-zA-Z]*)\.html /newblog/$1/$2.shtml
This version traps any file name with letters, numbers or the - character in the file-name. This is how you specify multiple character sets in a bracket expression.
RewriteRule flags
The flags on rewrite rules have a host of special meanings and use cases.
RewriteRule ^/blog/([0-9]{4})/([-a-z]*)\.html /newblog/$1/$2.shtml [L]
The flag is the [L] at the end of the above expression. Multiple flags can be used, separated by a comma. The linked documentation describes each one, but here they are anyway:
L = Last. Stop processing RewriteRules once this one matches. Order counts!
C = Chain. Continue processing the next RewriteRule. If this rule doesn't match, then the next rule won't be executed. More on this later.
E = Set environment variable. Apache has various environment variables that can affect web-server behavior.
F = Forbidden. Returns a 403-Forbidden error if this rule matches.
G = Gone. Returns a 410-Gone error if this rule matches.
H = Handler. Forces the request to be handled by the specified content handler.
N = Next. Restarts rule processing from the first rule, using the rewritten URL. BE CAREFUL! Loops can result.
NC = No case. Allows jpg to match both jpg and JPG.
NE = No escape. Prevents the rewriting of special characters (. ? # & etc) into their hex-code equivalents.
NS = No subrequests. If you're using server-side includes, this will prevent matches on the included files.
P = Proxy. Forces the rule to be handled by mod_proxy. Transparently provide content from other servers, because your web-server fetches it and re-serves it. This is a dangerous flag, as a poorly written one will turn your web-server into an open proxy and That is Bad.
PT = Pass Through. Take into account Alias statements in RewriteRule matching.
QSA = QSAppend. When the original request contains a query string (http://example.com/thing?asp=foo) and the rewritten string supplies one of its own, append the original query string instead of discarding it. Important for dynamic content.
R = Redirect. Provide an HTTP redirect to the specified URL. Can also provide an exact redirect code [R=303]. Very similar to RedirectMatch, which is faster and should be used when possible.
S = Skip. Skip the next N rules (written [S=N]) if this rule matches.
T = Type. Specify the MIME-type of the returned content. Very similar to the AddType directive.
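As an illustration of combining flags, here is a minimal sketch of the HTTP-to-HTTPS redirect that this question's title asks about. It assumes the rule sits in the port-80 virtual host or in an .htaccess file, and that the site answers on the same host name over TLS; neither detail comes from the answer itself.

```apache
RewriteEngine On
# Only act on requests that did not arrive over TLS.
RewriteCond %{HTTPS} !=on
# R=301 issues a permanent redirect; L stops further rule processing.
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]
```

The pattern ^ matches every request without capturing anything, since the target is built entirely from server variables. While testing, R=302 may be the safer choice, because browsers tend to cache 301 responses.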
You know how I said that RewriteCond applies to one and only one rule? Well, you can get around that by chaining.
RewriteEngine On
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$)
RewriteRule ^/blog/(.*)\.html /blog/$1.sf.html [C]
RewriteRule ^/blog/(.*)\.jpg /blog/$1.sf.jpg
Because the first RewriteRule has the Chain flag, the second rewrite-rule will execute when the first does, which is when the previous RewriteCond rule is matched. Handy if Apache regular-expressions make your brain hurt. However, the all-in-one-line method I point to in the first section is faster from an optimization point of view.
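One thing to keep in mind with [C]: a chained rule is only tried when the rule before it matched, so a request for a .jpg file, which does not match the first pattern, never reaches the second rule. If the goal is to put several independent rules under one condition, an inverted condition with the S (skip) flag is one option. This is a sketch of that idiom rather than a drop-in replacement; the count in S=2 has to be kept in step with the number of rules it guards.

```apache
RewriteEngine On
# If the referrer is NOT serverfault.com, jump over the next two rules.
RewriteCond %{HTTP_REFERER} !^https?://serverfault\.com(/|$)
RewriteRule ^ - [S=2]
# These two rules are therefore only reached for serverfault.com referrers.
RewriteRule ^/blog/(.*)\.html$ /blog/$1.sf.html [L]
RewriteRule ^/blog/(.*)\.jpg$ /blog/$1.sf.jpg [L]
```

The "-" substitution means "leave the URL unchanged", so the first rule exists only to carry the condition and the skip flag.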
RewriteRule ^/blog/([0-9]{4})/([-0-9a-zA-Z]*)\.html /newblog/$1/$2.shtml
This can be made simpler through flags:
RewriteRule ^/blog/([0-9]{4})/([-0-9a-z]*)\.html /newblog/$1/$2.shtml [NC]
Some flags also apply to RewriteCond, notably NoCase:
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$) [NC]
This will match "ServerFault.com".
9
Well done. [filler]
– EEAA
Dec 20 '10 at 19:20
3
Very nice mod_rewrite and regex primer. +1.
– Steven Monday
Dec 20 '10 at 23:24
3
It's sometimes useful to know that the RewriteCond is actually processed after the RewriteRule is matched. You might want to say "more on that later" near the top where you say "RewriteCond preceding RewriteRule makes that ONE rule subject to the conditional." You might want to mention that the regexes are Perl-compatible regular expressions. Also you have an extraneous apostrophe in "...the RewriteRule considers it's string-start..."
– Dennis Williamson
Dec 20 '10 at 23:57
2
RewriteRule ^/blog/.*/(.*)$ /newblog/$1 does not match the first directory component - rewriterules are greedy by default. /.*/(.*) matches both /1/(2)/ and /1/2/3/4/5/(6)/, so you need /[^/]*/ to only match the FIRST path component.
– adaptr
Apr 12 '12 at 12:55
1
@sysadmin1138, I think this answer is good but it can be better if you elaborate more on the flags E, N, NS, P, PT, and S with examples because those flags ain't obvious how they work etc.
– Pacerier
Aug 5 '13 at 2:17
What is the fundamental format and structure of mod_rewrite rules?
I'll defer to sysadmin1138's excellent answer on these points.
What form/flavor of regular expressions do I need to have a solid grasp of?
In addition to the syntax order, syntax matching/regular expressions, and RewriteRule flags outlined by sysadmin1138, I believe it bears mentioning that mod_rewrite exposes Apache environment variables based on HTTP request headers and Apache's configuration.
I would recommend AskApache's mod_rewrite Debug Tutorial for a comprehensive list of variables which may be available to mod_rewrite.
What are the most common mistakes/pitfalls when writing rewrite rules?
Most problems with RewriteRules stem from a misunderstanding of PCRE syntax, a failure to properly escape special characters, or a lack of insight into the content of the variable(s) used for matching.
Typical problems and recommended troubleshooting:
500 - Internal Server Error - Remove Windows carriage returns in configuration file(s) if present, make sure mod_rewrite is enabled (wrap directives in an IfModule conditional to avoid this scenario), check directive syntax, comment out directives until the problem is identified
Redirect loop - Make use of RewriteLog and RewriteLogLevel, comment out directives until the problem is identified
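The IfModule wrapper mentioned for the 500 case can be sketched as follows. The file names old-page.html and new-page.html are placeholders invented for this example, and the rule is written for .htaccess context (no leading slash in the pattern).

```apache
<IfModule mod_rewrite.c>
    RewriteEngine On
    # Hypothetical rule: send a retired page to its replacement.
    RewriteRule ^old-page\.html$ /new-page.html [R=301,L]
</IfModule>
```

The trade-off is that when mod_rewrite is not loaded the rules are silently skipped instead of raising an error, so a missing module can go unnoticed.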
What is a good method for testing and verifying mod_rewrite rules?
First, look at the contents of the environment variable(s) you plan to match against - if you have PHP installed, this is as simple as adding the following block to your application:
<?php
var_dump($_SERVER);
?>
... then write your rules (preferably for testing on a development server) and note any inconsistent matching or activity in your Apache ErrorLog file.
For more complex rules, use mod_rewrite's RewriteLog directive to log activity to a file and set RewriteLogLevel 3.
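A minimal sketch of that logging setup for Apache 2.2, the version this question is tagged with. The log path is an assumption; use whatever location the server can write to.

```apache
# Apache 2.2: allowed in server or virtual host config only, not in .htaccess
RewriteLog "/var/log/apache2/rewrite.log"
RewriteLogLevel 3

# Apache 2.4 removed both directives; the rough equivalent there is:
# LogLevel alert rewrite:trace3
```

Higher levels produce a great deal of output and slow the server down, so this is best left off outside of debugging sessions.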
Are there SEO or performance implications of mod_rewrite rules I should be aware of?
AllowOverride all impacts server performance as Apache must check for .htaccess files and parse directives with each request - if possible, keep all directives in the VirtualHost configuration for your site or enable .htaccess overrides only for the directories which need them.
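Limiting overrides to the directories that need them might look like the sketch below. The paths are hypothetical, and FileInfo is chosen because it is the override class that covers mod_rewrite directives.

```apache
# Hypothetical paths; adjust to the actual document root.
<Directory "/var/www/example">
    # No .htaccess lookup for this tree unless a more specific
    # section re-enables it.
    AllowOverride None
</Directory>

<Directory "/var/www/example/blog">
    # Permit only the directive group that mod_rewrite needs.
    AllowOverride FileInfo
</Directory>
```

Per-directory rewriting also requires Options FollowSymLinks (or SymLinksIfOwnerMatch) to be in effect for the directory concerned.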
Google's Webmaster Guidelines explicitly state: "Don't deceive your users or present different content to search engines than you display to users, which is commonly referred to as 'cloaking.'" - avoid creating mod_rewrite directives which filter for search engine robots.
Search engine robots prefer a 1:1 content:URI mapping (this is the basis for ranking links to content) - if you are using mod_rewrite to create temporary redirects or you are serving the same content under multiple URIs, consider specifying a canonical URI within your HTML documents.
Are there common situations where mod_rewrite might seem like the right tool for the job but isn't?
This is a huge (and potentially contentious) topic in its own right - better (IMHO) to address uses on a case-by-case basis and let askers determine whether the resolutions suggested are appropriate to their needs.
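One case that does come up often is a simple redirect on a server where mod_rewrite is not otherwise in use; mod_alias can handle that without the rewrite engine. A hedged sketch, with www.example.com and the page names as placeholders:

```apache
# mod_alias: no RewriteEngine needed
# Redirect a single retired URL.
Redirect permanent /old-page.html https://www.example.com/new-page.html
# Redirect a whole family of URLs with a regular expression.
RedirectMatch 301 ^/blog/(.*)\.htm$ https://www.example.com/blog/$1.html
```

Whether this is preferable still depends on what other directives are already in place, as noted above.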
What are some common examples?
AskApache's mod_rewrite Tricks and Tips covers just about every common use-case that pops up regularly. However, the "correct" solution for a given user may depend upon the sophistication of the user's configuration and existing directives (which is why it is generally a good idea to see which other directives a user has in place whenever a mod_rewrite question comes up).
Thanks for the AskApache link. It's what I was looking for!
– sica07
Nov 23 '11 at 22:14
The AskApache clown is officially unsupported by the ASF. Much of what he says is debatable or plain wrong.
– adaptr
Apr 12 '12 at 12:59
@adaptr Please share the superior resources which you are apparently aware of.
– danlefree
Apr 13 '12 at 0:56
"common situations where mod_rewrite might seem like the right tool for the job but isn't?" - simple redirects, where mod_rewrite is not already being used. Use mod_alias Redirect or RedirectMatch instead. See also the Apache docs: When not to use mod_rewrite
– MrWhite
Dec 8 '16 at 15:27
Like many admins/developers, I've been fighting the intricacies of rewrite rules for years and am unhappy with the existing Apache documentation. So I decided, as a personal project, to get to the bottom of how mod_rewrite actually works and interacts with the rest of the Apache core. Over the last few months I've been instrumenting test cases with strace and drilling into the source code to get a handle on all of this.
Here are some key comments that rewrite rule developers need to consider:
- Some aspects of rewriting are common to server config, virtual host, directory and .htaccess processing; however,
- Some processing is very different for the root config (server config, virtual host and directory) as opposed to the PerDir (.htaccess) processing.
- Worse, because PerDir processing can almost indiscriminately trigger INTERNAL REDIRECT cycling, the root config elements have to be written aware that such PerDir processing can trigger this.
I would go as far as to say that because of this you almost need to split the rewrite user communities into two categories and treat them as entirely separate:
- Those with root access to the Apache config. These are typically admins/developers with an application-dedicated server/VM, and the message here is quite simple: avoid using .htaccess files if at all possible; do everything in your server or vhost config. Debugging is reasonably easy since the developer can set debugging and has access to the rewrite.log files.
- Users of a shared hosted service (SHS).
  - Such users have to use .htaccess / PerDir processing as there is no alternative available.
  - Worse, the skill level of such users (as far as using the regexp-driven ladder-logic of mod_rewrite) is generally significantly less than that of experienced admins.
  - Apache and the hosting providers offer no debugging / diagnostic support. The only diagnostic information is a successful redirection, a redirection to the wrong URI, or a 404/500 status code. This leaves them confused and helpless.
  - Apache is extremely weak at explaining how rewriting works for this use case. For example, it does not provide a clear explanation of which PerDir .htaccess file is selected and why. It does not explain the intricacies of PerDir cycling and how to avoid this.
There is possibly a third community: admin and support staff in SHS providers who end up with a foot in both camps and have to suffer the consequences of the above.
I have written a couple of article-style blog posts (e.g. More on using Rewrite rules in .htaccess files) which cover a lot of detailed points which I won't repeat here to keep this post short. I have my own shared service as well as supporting some dedicated & VM FLOSS projects. I started out using a standard LAMP VM as a test vehicle for my SHS account, but in the end I found it better to do a proper mirror VM (described here).
However, in terms of how the admin community should support .htaccess users, I feel that we need to develop and to offer:
- A coherent description of how the rewrite system actually works in PerDir processing
- A set of guidelines/best practices on how to write .htaccess rewrite rules
- A simple web-based rewrite script parser, similar to the W3C HTML parsers, by which users can input test URIs or test vectors of the same and get an immediate log of the rewrite logic flow.
- Hints on how to get built-in diagnostics from your rules, e.g.:
  - Use [E=VAR:EXPR], exploiting the fact that EXPR will expand backreferences ($N or %N) to make them available as diagnostics to the target script.
  - If you topically order your rewrite rules using [OR], [C], [SKIP] and [L] flags so that the entire rewrite scheme works without the need to exploit internal redirection, then you can add the following as rule 1 to avoid all looping hassle:
RewriteCond %{ENV:REDIRECT_STATUS} !=""
RewriteRule . - [L]
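As a rough illustration of the two hints above, the sketch below combines the loop guard with an [E=VAR:EXPR] diagnostic. It assumes an .htaccess file in the document root; the URL layout, the script name article.php and the variable name RW_SLUG are hypothetical.

```apache
RewriteEngine On

# Loop guard: once an internal redirect has happened, REDIRECT_STATUS is
# non-empty, so stop rewriting on the second pass.
RewriteCond %{ENV:REDIRECT_STATUS} !=""
RewriteRule . - [L]

# Hypothetical rule: capture the slug, rewrite to a script, and copy the
# backreference into an environment variable for diagnostics.
RewriteRule ^articles/([-a-z0-9]+)$ article.php?slug=$1 [E=RW_SLUG:$1,L]
```

After the internal redirect that per-directory rewriting triggers, Apache typically prefixes such variables with REDIRECT_, so the target script would look for REDIRECT_RW_SLUG rather than RW_SLUG.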
This is well-documented. Why do you say the documentation does not explain this?
– adaptr
Apr 12 '12 at 12:57
2
All you have to do is to subscribe to the .htaccess topics and you will see. Most beginners get hopelessly confused -- most of these have their first experience of a LAMP service and mod_rewrite on a shared service and therefore have no root access to the system/vhost configs and have to use per dir processing through .htaccess files. There are important differences which the beginner has to "bleed over". I would regard myself as a power-user and am still discovering subtleties. As I say, I've had to use strace and source-code scanning to work out some aspects. Shouldn't be needed. :-(
– TerryE
Apr 13 '12 at 16:25
I totally agree. "We need to split the rewrite user communities into two categories and treat them as entirely separate." Some users are using shared hosting and need to rely on .htaccess, which is terribly fragile, complicated, and confusing, even for experts. I'm STILL having trouble.
– Ryan
Jul 19 '17 at 17:05
Using rewritemap
There are lots of things you can do with rewrite maps. They are declared using the RewriteMap directive, and can then be used both in RewriteCond evaluations and in RewriteRule substitutions.
The general syntax for RewriteMap is:
RewriteMap MapName MapType:MapSource
For example:
RewriteMap examplemap txt:/path/to/file/map.txt
You can then use the map name in constructs like this:
${examplemap:key}
The map contains key/value pairs. If the key is found, the value is substituted. Simple maps are just plain text files, but you can use hash maps, and even SQL queries. More details are in the docs:
http://httpd.apache.org/docs/2.2/mod/mod_rewrite.html#rewritemap
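A minimal sketch of a text map in use follows. The map contents, the /shop/ URL layout and the fallback target are hypothetical; note also that RewriteMap itself can only be declared in the server or virtual-host config, not in .htaccess.

```apache
RewriteEngine On
RewriteMap examplemap txt:/path/to/file/map.txt

# /path/to/file/map.txt holds one "key value" pair per line, e.g.:
#   widgets  /products/1234
#   gadgets  /products/5678

# Look up the captured word; fall back to /products/ when the key is not in the map.
RewriteRule ^/shop/([a-z]+)$ ${examplemap:$1|/products/} [R,L]
```

The part after the | is the default value that is used when the lookup finds nothing.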
Unescaping strings.
There are four internal maps you can use to do some manipulations. Unescaping strings in particular can come in handy.
For example: I want to test for the string "café" in the query string. However, the browser will escape this before sending it to my server, so I'll need to either figure out what the URL-escaped version is for every string I wish to match, or I can just unescape it...
RewriteMap unescape int:unescape
RewriteCond %{QUERY_STRING} (location|place)=(.*)
RewriteCond ${unescape:%2} café
RewriteRule ^/find/$ /find/1234? [L,R]
Note how I use one RewriteCond to just capture the argument to the query string parameter, and then use the map in the second RewriteCond to unescape it. This then gets compared.
Also note how I need to use %2 as the key in the rewrite map, as %1 will contain either "location" or "place". When you use parentheses to group patterns they will also be captured, whether you plan to use the result of the capture or not...
The last sentence isn't quite true. The mod_rewrite regexp engine supports non-capturing groups such as (?:location|place) and this will only have one capture in the example.
– TerryE
Mar 10 '17 at 23:32
What are the most common mistakes/pitfalls when writing rewrite rules?
A really easy pitfall is when you rewrite URLs that alter the apparent path, e.g. from /base/1234/index.html to /base/script.php?id=1234. Any images or CSS with relative paths to the script location will not be found by the client. A number of options to resolve this can be found on this faq.
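To make the pitfall concrete, here is a minimal sketch of such a rewrite in server or virtual-host context. The numeric id pattern and the style.css file name are assumptions for illustration.

```apache
RewriteEngine On
# The browser requests /base/1234/index.html; the server answers from script.php.
RewriteRule ^/base/([0-9]+)/index\.html$ /base/script.php?id=$1 [L]
# The browser still believes the page lives in /base/1234/, so a relative
# reference such as "style.css" in the generated HTML is requested as
# /base/1234/style.css, which does not sit next to script.php.
```

Root-relative references such as /base/style.css are not affected, because the browser does not resolve them against the apparent directory.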
1
Thanks for the link. Particularly when working with other team members that are not familiar with rewriting, I find adding a <base> tag to be most easy to follow and still enable relative paths.
– kontur
May 20 '12 at 11:12
mod_rewrite syntax order
mod_rewrite has some specific ordering rules that affect processing. Before anything gets done, the RewriteEngine On directive needs to be given as this turns on mod_rewrite processing. This should be before any other rewrite directives.
RewriteCond preceding RewriteRule makes that ONE rule subject to the conditional. Any following RewriteRules will be processed as if they were not subject to conditionals.
RewriteEngine On
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$)
RewriteRule ^/blog/(.*)\.html$ /blog/$1.sf.html
In this simple case, if the HTTP referrer is from serverfault.com, redirect blog requests to special serverfault pages (we're just that special). However, if the above block had an extra RewriteRule line:
RewriteEngine On
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$)
RewriteRule ^/blog/(.*)\.html$ /blog/$1.sf.html
RewriteRule ^/blog/(.*)\.jpg$ /blog/$1.sf.jpg
All .jpg files would go to the special serverfault pages, not just the ones with a referrer indicating it came from here. This is clearly not the intent of how these rules are written. It could be done with multiple RewriteCond rules:
RewriteEngine On
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$)
RewriteRule ^/blog/(.*)\.html /blog/$1.sf.html
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$)
RewriteRule ^/blog/(.*)\.jpg /blog/$1.sf.jpg
But probably should be done with some trickier replacement syntax.
RewriteEngine On
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$)
RewriteRule ^/blog/(.*)\.(html|jpg) /blog/$1.sf.$2
The more complex RewriteRule contains the conditionals for processing. The last parenthetical, (html|jpg), tells RewriteRule to match either html or jpg, and to represent the matched string as $2 in the rewritten string. This is logically identical to the previous block with its two RewriteCond/RewriteRule pairs; it just does it on two lines instead of four.
Multiple RewriteCond lines are implicitly ANDed, and can be explicitly ORed. To handle referrers from both ServerFault and Super User (explicit OR):
RewriteEngine On
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$) [OR]
RewriteCond %{HTTP_REFERER} ^https?://superuser\.com(/|$)
RewriteRule ^/blog/(.*)\.(html|jpg) /blog/$1.sf.$2
To serve ServerFault referred pages with Chrome browsers (implicit AND):
RewriteEngine On
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$)
RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*Chrome.*$
RewriteRule ^/blog/(.*)\.(html|jpg) /blog/$1.sf.$2
RewriteBase is also order specific as it specifies how following RewriteRule directives handle their processing. It is very useful in .htaccess files. If used, it should be the first directive under "RewriteEngine on" in an .htaccess file. Take this example:
RewriteEngine On
RewriteBase /blog
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$)
RewriteRule ^(.*)\.(html|jpg) $1.sf.$2
This is telling mod_rewrite that the directory holding this .htaccess file is reached by way of http://example.com/blog/ rather than by its physical directory path (/home/$Username/public_html/blog), and to treat it accordingly. In an .htaccess file the RewriteRule pattern is matched against the path relative to that directory, so its string-start comes after the "/blog/" in the URL, and RewriteBase supplies the URL prefix that is put back in front of a relative substitution such as $1.sf.$2. Here is the same thing written two different ways. One with RewriteBase, the other without:
RewriteEngine On
##Example 1: No RewriteBase##
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$)
RewriteRule ^(.*)\.(html|jpg) /blog/$1.sf.$2
##Example 2: With RewriteBase##
RewriteBase /blog
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$)
RewriteRule ^(.*)\.(html|jpg) $1.sf.$2
As you can see, RewriteBase lets the substitutions stay relative to the web-site path of the directory instead of spelling that path out in every rule (or depending on the web-server's physical path), which can make them more intelligible to those who edit such files. Also, it can make the directives shorter, which has an aesthetic appeal.
RewriteRule matching syntax
RewriteRule itself has a complex syntax for matching strings. I'll cover the flags (things like [PT]) in another section. Because Sysadmins learn by example more often than by reading a man-page I'll give examples and explain what they do.
RewriteRule ^/blog/(.*)$ /newblog/$1
The .* construct matches any single character (.) zero or more times (*). Enclosing it in parentheses tells it to provide the string that was matched as the $1 variable.
RewriteRule ^/blog/.*/(.*)$ /newblog/$1
In this case, the first .* was NOT enclosed in parens so isn't provided to the rewritten string. This rule removes a directory level on the new blog-site. (/blog/2009/sample.html becomes /newblog/sample.html).
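One caveat: .* is greedy, so the unparenthesised .* above swallows as many directory levels as it can, not just one. A minimal sketch, assuming you only want to drop the first directory level, uses a negated bracket expression instead:

```apache
# [^/]+ matches one or more characters that are not a slash,
# i.e. exactly one path component.
RewriteRule ^/blog/[^/]+/(.*)$ /newblog/$1
```

With this pattern /blog/2009/sample.html still becomes /newblog/sample.html, while /blog/2009/06/sample.html becomes /newblog/06/sample.html rather than losing both levels.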
RewriteRule ^/blog/(2008|2009)/(.*)$ /newblog/$2
In this case, the first parenthesis expression sets up a matching group. This becomes $1, which is not needed and therefore not used in the rewritten string.
RewriteRule ^/blog/(2008|2009)/(.*)$ /newblog/$1/$2
In this case, we use $1 in the rewritten string.
RewriteRule ^/blog/(20[0-9][0-9])/(.*)$ /newblog/$1/$2
This rule uses a special bracket syntax that specifies a character range. [0-9] matches the numerals 0 through 9. This specific rule will handle years from 2000 to 2099.
RewriteRule ^/blog/(20[0-9]{2})/(.*)$ /newblog/$1/$2
This does the same thing as the previous rule, but the {2} portion tells it to match the previous element (a bracket expression in this case) two times.
RewriteRule ^/blog/([0-9]{4})/([a-z]*)\.html /newblog/$1/$2.shtml
This case will match any lower-case letter in the second matching expression, and do so for as many characters as it can. The \. construct tells it to treat the period as an actual period, not the special character it is in previous examples. It will break if the file-name has dashes in it, though.
RewriteRule ^/blog/([0-9]{4})/([-a-z]*)\.html /newblog/$1/$2.shtml
This traps file-names with dashes in them. However, as - is a special character in bracket expressions, it has to be the first (or last) character in the expression to be taken literally.
RewriteRule ^/blog/([0-9]{4})/([-0-9a-zA-Z]*)\.html /newblog/$1/$2.shtml
This version traps any file name with letters, numbers or the - character in the file-name. This is how you specify multiple character sets in a bracket expression.
RewriteRule flags
The flags on rewrite rules have a host of special meanings and use cases.
RewriteRule ^/blog/([0-9]{4})/([-a-z]*)\.html /newblog/$1/$2.shtml [L]
The flag is the [L] at the end of the above expression. Multiple flags can be used, separated by a comma. The linked documentation describes each one, but here they are anyway:
L = Last. Stop processing RewriteRules once this one matches. Order counts!
C = Chain. Ties this rule to the next RewriteRule. If this rule doesn't match, then the next rule won't be executed. More on this later.
E = Set environmental variable. Apache has various environmental variables that can affect web-server behavior.
F = Forbidden. Returns a 403-Forbidden error if this rule matches.
G = Gone. Returns a 410-Gone error if this rule matches.
H = Handler. Forces the request to be handled by the specified content handler.
N = Next. Restarts rule processing from the first rule, using the result of the rewrites so far. BE CAREFUL! Loops can result.
NC = No case. Allows jpg to match both jpg and JPG.
NE = No escape. Prevents the rewriting of special characters (. ? # & etc) into their hex-code equivalents.
NS = No subrequests. If you're using server-side-includes, this will prevent matches to the included files.
P = Proxy. Forces the rule to be handled by mod_proxy. Transparently provide content from other servers, because your web-server fetches it and re-serves it. This is a dangerous flag, as a poorly written one will turn your web-server into an open-proxy and That is Bad.
PT = Pass Through. Hands the rewritten URL back to URL mapping so that Alias (and similar) statements are applied to the result.
QSA = QSAppend. When the substitution has its own query string and the original request also contains one (http://example.com/thing?asp=foo), append the original query string to the rewritten string. Without the flag it would be discarded in that case. Important for dynamic content.
R = Redirect. Provide an HTTP redirect to the specified URL. Can also provide exact redirect code [R=303]. Very similar to RedirectMatch, which is faster and should be used when possible.
S = Skip. Skip the next N rules ([S=N]) if this rule matches.
T = Type. Specify the MIME-type of the returned content. Very similar to the AddType directive.
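A few of the less obvious flags, sketched in server-context syntax. The paths, the variable name IS_IMAGE and the script item.php are hypothetical and only there to show the shape of each rule.

```apache
RewriteEngine On

# F: refuse anything under a (hypothetical) private directory with a 403.
RewriteRule ^/private/ - [F]

# G: answer 410 Gone for a retired page.
RewriteRule ^/old-news\.html$ - [G]

# E: set an environment variable without changing the URL ("-" = no substitution).
RewriteRule \.(gif|jpg|png)$ - [E=IS_IMAGE:1]

# QSA: /item/42?color=red becomes /item.php?id=42&color=red
RewriteRule ^/item/([0-9]+)$ /item.php?id=$1 [QSA,L]

# R with an explicit status code: external permanent redirect.
RewriteRule ^/blog/(.*)$ http://example.com/newblog/$1 [R=301,L]
```

Without QSA, the rule for /item/ would replace the original query string with id=42, because the substitution carries a query string of its own.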
You know how I said that RewriteCond applies to one and only one rule? Well, you can get around that by chaining.
RewriteEngine On
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$)
RewriteRule ^/blog/(.*)\.html /blog/$1.sf.html [C]
RewriteRule ^/blog/(.*)\.jpg /blog/$1.sf.jpg
Because the first RewriteRule has the Chain flag, the second rewrite-rule is only evaluated when the first one matches, which in turn requires the preceding RewriteCond to be satisfied. Be aware that a chained rule works on the output of the rule before it: a request that fails the first rule (a .jpg request here) skips the rest of the chain, so chaining fits rules that build on each other rather than alternatives. Handy if Apache regular-expressions make your brain hurt. However, the all-in-one-line method I point to in the first section is faster from an optimization point of view.
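Another way to make one condition govern several rules, offered here as a sketch rather than as part of the original example, is to invert the condition and jump over the rules with [S=N]:

```apache
RewriteEngine On
# If the referrer is NOT serverfault.com, skip the next two rules.
RewriteCond %{HTTP_REFERER} !^https?://serverfault\.com(/|$)
RewriteRule ^ - [S=2]
RewriteRule ^/blog/(.*)\.html$ /blog/$1.sf.html [L]
RewriteRule ^/blog/(.*)\.jpg$ /blog/$1.sf.jpg [L]
```

The count in [S=2] has to be kept in step with the number of rules being guarded, which is the main maintenance cost of this approach.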
RewriteRule ^/blog/([0-9]{4})/([-0-9a-zA-Z]*)\.html /newblog/$1/$2.shtml
This can be made simpler through flags:
RewriteRule ^/blog/([0-9]{4})/([-0-9a-z]*)\.html /newblog/$1/$2.shtml [NC]
Some flags also apply to RewriteCond. Notably, NoCase.
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$) [NC]
Will match "ServerFault.com"
9
Well done. [filler]
– EEAA
Dec 20 '10 at 19:20
3
Very nice mod_rewrite and regex primer. +1.
– Steven Monday
Dec 20 '10 at 23:24
3
It's sometimes useful to know that the RewriteCond is actually processed after the RewriteRule is matched. You might want to say "more on that later" near the top where you say "RewriteCond preceding RewriteRule makes that ONE rule subject to the conditional." You might want to mention that the regexes are Perl-compatible regular expressions. Also you have an extraneous apostrophe in "...the RewriteRule considers it's string-start..."
– Dennis Williamson
Dec 20 '10 at 23:57
2
RewriteRule ^/blog/.*/(.*)$ /newblog/$1 does not match the first directory component - rewriterules are greedy by default. /.*/(.*) matches both /1/(2)/ and /1/2/3/4/5/(6)/, so you need /[^/]*/ to only match the FIRST path component.
– adaptr
Apr 12 '12 at 12:55
1
@sysadmin1138, I think this answer is good but it can be better if you elaborate more on the flags E, N, NS, P, PT, and S with examples because those flags ain't obvious how they work etc.
– Pacerier
Aug 5 '13 at 2:17
PT = Pass Through. Take into account Alias statements in RewriteRule matching.
QSA = QSAppend. When the original string contains a query (http://example.com/thing?asp=foo) append the original query string to the rewritten string. Normally it would be discarded. Important for dynamic content.
R = Redirect. Provide an HTTP redirect to the specified URL. Can also provide exact redirect code [R=303]. Very similar to RedirectMatch
, which is faster and should be used when possible.
S = Skip. Skip this rule.
T = Type. Specify the mime-type of the returned content. Very similar to the AddType
directive.
You know how I said that RewriteCond
applies to one and only one rule? Well, you can get around that by chaining.
RewriteEngine On
RewriteCond %HTTP_REFERER ^https?://serverfault.com(/|$)
RewriteRule ^/blog/(.*).html /blog/$1.sf.html [C]
RewriteRule ^/blog/(.*).jpg /blog/$1.sf.jpg
Because the first RewriteRule has the Chain flag, the second rewrite-rule will execute when the first does, which is when the previous RewriteCond rule is matched. Handy if Apache regular-expressions make your brain hurt. However, the all-in-one-line method I point to in the first section is faster from an optimization point of view.
RewriteRule ^/blog/([0-9]4)/([-0-9a-zA-Z]*).html /newblog/$1/$2.shtml
This can be made simpler through flags:
RewriteRule ^/blog/([0-9]4)/([-0-9a-z]*).html /newblog/$1/$2.shtml [NC]
Also, some flags also apply to RewriteCond. Notably, NoCase.
RewriteCond %HTTP_REFERER ^https?://serverfault.com(/|$) [NC]
Will match "ServerFault.com"
mod_rewrite syntax order
mod_rewrite has some specific ordering rules that affect processing. Before anything gets done, the RewriteEngine On directive needs to be given as this turns on mod_rewrite processing. This should be before any other rewrite directives.
RewriteCond preceding RewriteRule makes that ONE rule subject to the conditional. Any following RewriteRules will be processed as if they were not subject to conditionals.
RewriteEngine On
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$)
RewriteRule ^/blog/(.*)\.html /blog/$1.sf.html
In this simple case, if the HTTP referrer is from serverfault.com, rewrite blog requests to special serverfault pages (we're just that special). However, if the above block had an extra RewriteRule line:
RewriteEngine On
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$)
RewriteRule ^/blog/(.*)\.html /blog/$1.sf.html
RewriteRule ^/blog/(.*)\.jpg /blog/$1.sf.jpg
All .jpg files would go to the special serverfault pages, not just the ones with a referrer indicating it came from here. This is clearly not the intent of how these rules are written. It could be done with multiple RewriteCond rules:
RewriteEngine On
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$)
RewriteRule ^/blog/(.*)\.html /blog/$1.sf.html
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$)
RewriteRule ^/blog/(.*)\.jpg /blog/$1.sf.jpg
But probably should be done with some trickier replacement syntax.
RewriteEngine On
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$)
RewriteRule ^/blog/(.*)\.(html|jpg) /blog/$1.sf.$2
The more complex RewriteRule contains the conditionals for processing. The last parenthetical, (html|jpg), tells RewriteRule to match either html or jpg, and to represent the matched string as $2 in the rewritten string. This is logically identical to the previous block with its two RewriteCond/RewriteRule pairs; it just does it on two lines instead of four.
Multiple RewriteCond lines are implicitly ANDed, and can be explicitly ORed. To handle referrers from both ServerFault and Super User (explicit OR):
RewriteEngine On
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$) [OR]
RewriteCond %{HTTP_REFERER} ^https?://superuser\.com(/|$)
RewriteRule ^/blog/(.*)\.(html|jpg) /blog/$1.sf.$2
To serve ServerFault referred pages with Chrome browsers (implicit AND):
RewriteEngine On
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$)
RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*Chrome.*$
RewriteRule ^/blog/(.*)\.(html|jpg) /blog/$1.sf.$2
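The two forms can be combined. A minimal sketch, assuming the same /blog/ layout as above: an [OR] flag only joins a condition to the one directly after it, so the block below should read as (ServerFault OR Super User) AND Chrome.

```apache
RewriteEngine On
# Either referrer is acceptable ...
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$) [OR]
RewriteCond %{HTTP_REFERER} ^https?://superuser\.com(/|$)
# ... and, in addition, the browser must identify itself as Chrome
RewriteCond %{HTTP_USER_AGENT} Chrome
RewriteRule ^/blog/(.*)\.(html|jpg)$ /blog/$1.sf.$2
```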
RewriteBase is also order specific, as it specifies how the following RewriteRule directives handle their processing. It is very useful in .htaccess files. If used, it should be the first directive under "RewriteEngine On" in an .htaccess file. Take this example:
RewriteEngine On
RewriteBase /blog
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$)
RewriteRule ^(.*)\.(html|jpg) $1.sf.$2
This is telling mod_rewrite that the URL it is currently handling was arrived at by way of http://example.com/blog/ instead of the physical directory path (/home/$Username/public_html/blog), and to treat it accordingly. Because of this, the RewriteRule considers its string-start to be after the "/blog" in the URL. Here is the same thing written two different ways, one with RewriteBase, the other without:
RewriteEngine On
##Example 1: No RewriteBase##
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$)
RewriteRule /home/assdr/public_html/blog/(.*)\.(html|jpg) $1.sf.$2
##Example 2: With RewriteBase##
RewriteBase /blog
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$)
RewriteRule ^(.*)\.(html|jpg) $1.sf.$2
As you can see, RewriteBase allows rewrite rules to leverage the web-site path to content rather than the web-server path, which can make them more intelligible to those who edit such files. It can also make the directives shorter, which has an aesthetic appeal.
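As a sketch of how this plays out in practice, assume a hypothetical .htaccess file stored in the directory that is served as /blog/. In per-directory context the pattern is matched against the path with the directory prefix already stripped, and RewriteBase names the URL prefix that mod_rewrite puts back in front of a relative substitution. The extra REQUEST_URI condition is an addition to the examples above: per-directory rules are run again on the rewritten URL, and without a guard the rule could match its own output.

```apache
# Hypothetical file: blog/.htaccess under the document root
RewriteEngine On
RewriteBase /blog

# Only for visitors referred by serverfault.com
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$)
# Guard (assumption): leave URLs alone that already carry the .sf suffix
RewriteCond %{REQUEST_URI} !\.sf\.(html|jpg)$
# For /blog/post.html the pattern sees "post.html"; the relative result
# "post.sf.html" gets the RewriteBase prefix back: /blog/post.sf.html
RewriteRule ^(.*)\.(html|jpg)$ $1.sf.$2 [L]
```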
RewriteRule matching syntax
RewriteRule itself has a complex syntax for matching strings (the patterns are Perl-compatible regular expressions). I'll cover the flags (things like [PT]) in another section. Because sysadmins learn by example more often than by reading a man-page, I'll give examples and explain what they do.
RewriteRule ^/blog/(.*)$ /newblog/$1
The .* construct matches any single character (.) zero or more times (*). Enclosing it in parentheses tells it to provide the string that was matched as the $1 variable.
RewriteRule ^/blog/.*/(.*)$ /newblog/$1
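In this case, the first .* was NOT enclosed in parens so isn't provided to the rewritten string. This rule removes a directory level on the new blog-site (/blog/2009/sample.html becomes /newblog/sample.html). Note that .* is greedy, so on deeper paths it swallows every directory except the last path component; [^/]* in its place would match a single directory level only.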
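To see the difference on a deeper path, here is a sketch of both variants side by side (the sample path /blog/2009/12/sample.html is made up for illustration, and the two rules are alternatives, not meant to be used together):

```apache
# Greedy form: for /blog/2009/12/sample.html, .* swallows "2009/12"
# and $1 is just "sample.html"
RewriteRule ^/blog/.*/(.*)$ /newblog/$1

# One level only: [^/]* stops at the first slash, so for the same
# request $1 is "12/sample.html"
RewriteRule ^/blog/[^/]*/(.*)$ /newblog/$1
```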
RewriteRule ^/blog/(2008|2009)/(.*)$ /newblog/$2
In this case, the first parenthesis expression sets up a matching group. This becomes $1, which is not needed and therefore not used in the rewritten string.
RewriteRule ^/blog/(2008|2009)/(.*)$ /newblog/$1/$2
In this case, we use $1 in the rewritten string.
RewriteRule ^/blog/(20[0-9][0-9])/(.*)$ /newblog/$1/$2
This rule uses a special bracket syntax that specifies a character range. [0-9] matches the numerals 0 through 9. This specific rule will handle years from 2000 to 2099.
RewriteRule ^/blog/(20[0-9]{2})/(.*)$ /newblog/$1/$2
This does the same thing as the previous rule, but the {2} portion tells it to match the previous element (a bracket expression in this case) two times.
RewriteRule ^/blog/([0-9]{4})/([a-z]*)\.html /newblog/$1/$2.shtml
This case will match any lower-case letter in the second matching expression, and do so for as many characters as it can. The \. construct tells it to treat the period as an actual period, not the special character it is in previous examples. It will break if the file-name has dashes in it, though.
RewriteRule ^/blog/([0-9]{4})/([-a-z]*)\.html /newblog/$1/$2.shtml
This traps file-names with dashes in them. However, as - is a special character in bracket expressions, it has to be the first (or last) character in the expression.
RewriteRule ^/blog/([0-9]{4})/([-0-9a-zA-Z]*)\.html /newblog/$1/$2.shtml
This version traps any file name with letters, numbers or the - character in the file-name. This is how you specify multiple character sets in a bracket expression.
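One detail the patterns above leave open is the end of the string: without a trailing $, the .html part can also match in the middle of a longer name. A hedged sketch of an anchored version, using the same /blog/ and /newblog/ paths:

```apache
# {4} = exactly four digits, + = at least one character from the set,
# $ = nothing may follow ".html" (so "post.html.bak" is not rewritten)
RewriteRule ^/blog/([0-9]{4})/([-0-9a-zA-Z]+)\.html$ /newblog/$1/$2.shtml
```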
RewriteRule flags
The flags on rewrite rules have a host of special meanings and use cases.
RewriteRule ^/blog/([0-9]{4})/([-a-z]*)\.html /newblog/$1/$2.shtml [L]
The flag is the [L] at the end of the above expression. Multiple flags can be used, separated by a comma. The linked documentation describes each one, but here they are anyway:
L = Last. Stop processing RewriteRules once this one matches. Order counts!
C = Chain. Continue processing the next RewriteRule. If this rule doesn't match, then the next rule won't be executed. More on this later.
E = Set environmental variable. Apache has various environmental variables that can affect web-server behavior.
F = Forbidden. Returns a 403-Forbidden error if this rule matches.
G = Gone. Returns a 410-Gone error if this rule matches.
H = Handler. Forces the request to be processed by the specified content handler.
N = Next. Restarts rule processing from the first rule, using the rewritten URL. BE CAREFUL! Loops can result.
NC = No case. Allows jpg to match both jpg and JPG.
NE = No escape. Prevents the rewriting of special characters (. ? # & etc) into their hex-code equivalents.
NS = No subrequests. If you're using server-side-includes, this will prevent matches to the included files.
P = Proxy. Forces the rule to be handled by mod_proxy. Transparently provide content from other servers, because your web-server fetches it and re-serves it. This is a dangerous flag, as a poorly written one will turn your web-server into an open-proxy and That is Bad.
PT = Pass Through. Passes the rewritten URL back to URL mapping so that Alias statements (and similar directives) are applied to it.
QSA = QSAppend. When the original request contains a query string (http://example.com/thing?asp=foo) and the rewritten string adds one of its own, append the original query string to it. Normally the original would be discarded in that case. Important for dynamic content.
R = Redirect. Provide an HTTP redirect to the specified URL. Can also provide exact redirect code [R=303]. Very similar to RedirectMatch, which is faster and should be used when possible.
S = Skip. Skip the next N rules ([S=N]) if this rule matches.
T = Type. Specify the mime-type of the returned content. Very similar to the AddType directive.
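A few of these flags in context. This is only a sketch: every path, host name and variable name below is a made-up placeholder, and the rules are independent examples rather than one working rule set.

```apache
RewriteEngine On

# R + L: permanent redirect, then stop processing further rules
RewriteRule ^/old-blog/(.*)$ /blog/$1 [R=301,L]

# F: refuse access; "-" as the substitution means "leave the URL alone"
RewriteRule ^/private/ - [F]

# G: tell clients the resource is gone for good
RewriteRule ^/retired/ - [G]

# NC: match .jpg, .JPG, .Jpg ...
RewriteRule ^/images/(.*)\.jpg$ /img/$1.jpg [NC]

# QSA: /item/42?ref=sf becomes /item.php?id=42&ref=sf
RewriteRule ^/item/([0-9]+)$ /item.php?id=$1 [QSA]

# E: set an environment variable (hypothetical name from_blog) without rewriting
RewriteRule ^/blog/ - [E=from_blog:1]

# Alternative to the first rule, done with mod_alias instead of mod_rewrite
RedirectMatch 301 ^/old-blog/(.*)$ http://www.example.com/blog/$1
```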
You know how I said that RewriteCond applies to one and only one rule? Well, you can get around that by chaining.
RewriteEngine On
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$)
RewriteRule ^/blog/(.*)\.html /blog/$1.sf.html [C]
RewriteRule ^/blog/(.*)\.jpg /blog/$1.sf.jpg
Because the first RewriteRule has the Chain flag, the second rule is only evaluated when the first one matches, which in turn requires the preceding RewriteCond to match. Keep in mind that a chain is all-or-nothing: when the first rule does not match, every rule chained after it is skipped as well. Handy if Apache regular-expressions make your brain hurt. However, the all-in-one-line method I point to in the first section is faster from an optimization point of view.
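When several independent rules need to share one condition, a commonly used alternative is to invert the condition and jump over the rules with the S flag. A minimal sketch, assuming the same two rules as above:

```apache
RewriteEngine On
# If the referrer is NOT serverfault.com, jump over the next two rules
RewriteCond %{HTTP_REFERER} !^https?://serverfault\.com(/|$)
RewriteRule ^ - [S=2]
# Reached only for serverfault.com referrers
RewriteRule ^/blog/(.*)\.html$ /blog/$1.sf.html [L]
RewriteRule ^/blog/(.*)\.jpg$ /blog/$1.sf.jpg [L]
```

The count in [S=2] has to be kept in sync with the number of rules it guards.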
RewriteRule ^/blog/([0-9]{4})/([-0-9a-zA-Z]*)\.html /newblog/$1/$2.shtml
This can be made simpler through flags:
RewriteRule ^/blog/([0-9]{4})/([-0-9a-z]*)\.html /newblog/$1/$2.shtml [NC]
Some flags also apply to RewriteCond, notably NoCase:
RewriteCond %{HTTP_REFERER} ^https?://serverfault\.com(/|$) [NC]
This will also match "ServerFault.com".
edited Mar 17 '14 at 20:57
BMDan
answered Dec 20 '10 at 17:44
sysadmin1138♦
9
Well done. [filler]
– EEAA
Dec 20 '10 at 19:20
3
Very nice mod_rewrite and regex primer. +1.
– Steven Monday
Dec 20 '10 at 23:24
3
It's sometimes useful to know that the RewriteCond is actually processed after the RewriteRule is matched. You might want to say "more on that later" near the top where you say "RewriteCond preceding RewriteRule makes that ONE rule subject to the conditional." You might want to mention that the regexes are Perl-compatible regular expressions. Also you have an extraneous apostrophe in "...the RewriteRule considers it's string-start..."
– Dennis Williamson
Dec 20 '10 at 23:57
2
RewriteRule ^/blog/.*/(.*)$ /newblog/$1 does not match the first directory component - rewriterules are greedy by default. /.*/(.*) matches both /1/(2)/ and /1/2/3/4/5/(6)/, so you need /[^/]*/ to only match the FIRST path component.
– adaptr
Apr 12 '12 at 12:55
1
@sysadmin1138, I think this answer is good but it can be better if you elaborate more on the flags E, N, NS, P, PT, and S with examples because those flags ain't obvious how they work etc.
– Pacerier
Aug 5 '13 at 2:17
What is the fundamental format and structure of mod_rewrite rules?
I'll defer to sysadmin1138's excellent answer on these points.
What form/flavor of regular expressions do I need to have a solid grasp of?
In addition to the syntax order, syntax matching/regular expressions, and RewriteRule flags outlined by sysadmin1138, I believe it bears mentioning that mod_rewrite exposes Apache environment variables based on HTTP request headers and Apache's configuration.
I would recommend AskApache's mod_rewrite Debug Tutorial for a comprehensive list of variables which may be available to mod_rewrite.
What are the most common mistakes/pitfalls when writing rewrite rules?
Most problems with RewriteRules stem from a misunderstanding of PCRE syntax, a failure to properly escape special characters, or a lack of insight into the content of the variable(s) used for matching.
Typical problems and recommended troubleshooting:
500 - Internal Server Error - Remove Windows carriage returns in configuration file(s) if present, make sure mod_rewrite is enabled (wrap directives in an IfModule conditional to avoid this scenario), check directive syntax, comment out directives until the problem is identified
Redirect loop - Make use of RewriteLog and RewriteLogLevel, comment out directives until the problem is identified
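The IfModule wrapper mentioned above looks roughly like this (a sketch; the rule inside is only a placeholder). If mod_rewrite is not loaded, Apache ignores the whole block instead of failing on an unknown directive:

```apache
<IfModule mod_rewrite.c>
    RewriteEngine On
    # Placeholder rule, replace with your own
    RewriteRule ^/blog/(.*)$ /newblog/$1
</IfModule>
```

The trade-off is that the rules are then skipped silently when the module is missing, so the site may misbehave without any error pointing at the cause.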
What is a good method for testing and verifying mod_rewrite rules?
First, look at the contents of the environment variable(s) you plan to match against - if you have PHP installed, this is as simple as adding the following block to your application:
<?php
var_dump($_SERVER);
?>
... then write your rules (preferably for testing on a development server) and note any inconsistent matching or activity in your Apache ErrorLog file.
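For more complex rules, use mod_rewrite's RewriteLog directive to log activity to a file and set RewriteLogLevel 3. (Both directives exist only in Apache 2.2 and earlier; Apache 2.4 replaced them with per-module logging through the LogLevel directive, for example LogLevel alert rewrite:trace3.)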
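As a sketch of both logging styles (the log file path is an assumption; these directives belong in the server or virtual host configuration, not in .htaccess, and only the variant matching your Apache version should be used):

```apache
# Apache 2.2 and earlier: dedicated rewrite log
RewriteLog "/var/log/apache2/rewrite.log"
RewriteLogLevel 3

# Apache 2.4: rewrite tracing is written to the regular error log
LogLevel alert rewrite:trace3
```

Higher levels produce a lot of output, so this is best switched off again once the rules behave.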
Are there SEO or performance implications of mod_rewrite rules I should be aware of?
AllowOverride all impacts server performance as Apache must check for .htaccess files and parse directives with each request - if possible, keep all directives in the VirtualHost configuration for your site or enable .htaccess overrides only for the directories which need them.
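A minimal sketch of the second option, with hypothetical paths: overrides are switched off for the site as a whole and enabled only where per-directory rewrite rules are actually needed.

```apache
# No .htaccess processing for the site in general ...
<Directory "/var/www/example">
    AllowOverride None
</Directory>

# ... except in the one directory that needs its own rewrite rules.
# mod_rewrite directives fall under the FileInfo override class.
<Directory "/var/www/example/blog">
    AllowOverride FileInfo
</Directory>
```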
Google's Webmaster Guidelines explicitly state: "Don't deceive your users or present different content to search engines than you display to users, which is commonly referred to as 'cloaking.'" - avoid creating mod_rewrite directives which filter for search engine robots.
Search engine robots prefer a 1:1 content:URI mapping (this is the basis for ranking links to content) - if you are using mod_rewrite to create temporary redirects or you are serving the same content under multiple URIs, consider specifying a canonical URI within your HTML documents.
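On the redirect side, note that a bare [R] flag issues a temporary (302) redirect. Where the move is meant to be permanent, a sketch like the following (www.example.com is a placeholder host) sends every other host name to one canonical host with a 301 instead:

```apache
RewriteEngine On
# Any host name other than the canonical one ...
RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]
# ... is redirected permanently, keeping the requested path
RewriteRule ^ http://www.example.com%{REQUEST_URI} [R=301,L]
```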
Are there common situations where mod_rewrite might seem like the right tool for the job but isn't?
This is a huge (and potentially contentious) topic in its own right - better (IMHO) to address uses on a case-by-case basis and let askers determine whether the resolutions suggested are appropriate to their needs.
What are some common examples?
AskApache's mod_rewrite Tricks and Tips covers just about every common use-case that pops up regularly. However, the "correct" solution for a given user may depend upon the sophistication of the user's configuration and existing directives, which is why it is generally a good idea to see which other directives a user has in place whenever a mod_rewrite question comes up.
edited Dec 21 '10 at 1:10
answered Dec 21 '10 at 1:00 by danlefree
Thanks for the AskApache link. It's what I was looking for!
– sica07
Nov 23 '11 at 22:14
The AskApache clown is officially unsupported by the ASF. Much of what he says is debatable or plain wrong.
– adaptr
Apr 12 '12 at 12:59
@adaptr Please share the superior resources which you are apparently aware of.
– danlefree
Apr 13 '12 at 0:56
"common situations where mod_rewrite might seem like the right tool for the job but isn't?" - simple redirects, where mod_rewrite is not already being used. Use mod_alias Redirect or RedirectMatch instead. See also the Apache docs: When not to use mod_rewrite
– MrWhite
Dec 8 '16 at 15:27
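For illustration, the mod_alias alternatives named in the comment above could be used roughly like this; the host name and paths are hypothetical:

```apache
# mod_alias only; no RewriteEngine needed.

# Prefix match: /old/anything -> https://www.example.com/new/anything
Redirect permanent /old https://www.example.com/new

# Regex match with a backreference:
# /docs/foo.html -> https://www.example.com/manual/foo
RedirectMatch 301 ^/docs/(.+)\.html$ https://www.example.com/manual/$1
```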
Like many admins/developers, I've been fighting the intricacies of rewrite rules for years and am unhappy with the existing Apache documentation, so I decided as a personal project to get to the bottom of how mod_rewrite actually works and interacts with the rest of the Apache core. Over the last few months I've been instrumenting test cases with strace and drilling into the source code to get a handle on all of this.
Here are some key comments that rewrite rule developers need to consider:
- Some aspects of rewriting are common to server config, virtual host, directory and .htaccess processing; however,
- some processing is very different for the root config (server config, virtual host and directory) as opposed to the PerDir (.htaccess) processing.
- Worse, because PerDir processing can almost indiscriminately trigger INTERNAL REDIRECT cycling, the root config elements have to be written with the awareness that such PerDir processing can trigger this.
I would go as far as to say that because of this you almost need to split the rewrite user communities into two categories and treat them as entirely separate:
- Those with root access to the Apache config. These are typically admins/developers with an application-dedicated server/VM, and the message here is quite simple: avoid using .htaccess files if at all possible; do everything in your server or vhost config. Debugging is reasonably easy since the developer can set debugging and has access to the rewrite.log files.
- Users of a shared hosted service (SHS).
  - Such users have to use .htaccess / PerDir processing as there is no alternative available.
  - Worse, the skill level of such users (as far as using the regexp-driven ladder-logic of mod_rewrite) is generally significantly less than that of experienced admins.
  - Apache and the hosting providers offer no debugging / diagnostic support. The only diagnostic information is a successful redirection, a redirection to the wrong URI, or a 404/500 status code. This leaves them confused and helpless.
  - Apache is extremely weak at explaining how rewriting works for this use case. For example, it does not provide a clear explanation of which PerDir .htaccess file is selected and why. It does not explain the intricacies of PerDir cycling and how to avoid this.
There is possibly a third community: admin and support staff in SHS providers who end up with a foot in both camps and have to suffer the consequences of the above.
I have written a couple of article-style blog posts (e.g. More on using Rewrite rules in .htaccess files) which cover a lot of detailed points that I won't repeat here, to keep this post short. I have my own shared service as well as supporting some dedicated & VM FLOSS projects. I started out using a standard LAMP VM as a test vehicle for my SHS account, but in the end I found it better to do a proper mirror VM (described here).
However, in terms of how the admin community should support .htaccess users, I feel that we need to develop and to offer:
- A coherent description of how the rewrite system actually works in PerDir processing
- A set of guidelines/best practices on how to write .htaccess rewrite rules
- A simple web-based rewrite script parser, sort of similar to the W3C HTML parsers, but by which users can input test URIs or test vectors of the same and get an immediate log of the rewrite logic flow
- Hints on how to get built-in diagnostics from your rules, e.g.:
  - Use [E=VAR:EXPR], exploiting the fact that EXPR will expand backreferences ($N or %N) to make them available as diagnostics to the target script.
  - If you topically order your rewrite rules using [OR], [C], [SKIP] and [L] flags so that the entire rewrite scheme works without the need to exploit internal redirection, then you can add the following as rule 1 to avoid all looping hassle:
    RewriteCond %{ENV:REDIRECT_STATUS} !=""
    RewriteRule . - [L]
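The two hints above can be combined in a small .htaccess sketch. The URL layout, the script name index.php and the variable names DBG_HOST and DBG_ID are assumptions made up for illustration:

```apache
RewriteEngine On

# Rule 1: stop as soon as we are inside an internal redirect, so the
# rules below are applied only once per request.
RewriteCond %{ENV:REDIRECT_STATUS} !=""
RewriteRule . - [L]

# Capture the host in %1 and the article id in $1, then export both as
# environment variables alongside the rewrite.
RewriteCond %{HTTP_HOST} ^(.+)$
RewriteRule ^article/(\d+)$ index.php?id=$1 [E=DBG_HOST:%1,E=DBG_ID:$1,L]
```

The target script can then inspect its environment (in PHP, $_SERVER) to see what the rule actually matched. After an internal redirect the variables usually appear with a REDIRECT_ prefix, e.g. REDIRECT_DBG_ID.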
This is well-documented. Why do you say the documentation does not explain this?
– adaptr
Apr 12 '12 at 12:57
All you have to do is to subscribe to the .htaccess topics and you will see. Most beginners get hopelessly confused -- most of these have their first experience of a LAMP service and mod_rewrite on a shared service and therefore have no root access to the system/vhost configs and have to use per-dir processing through .htaccess files. There are important differences which the beginner has to "bleed over". I would regard myself as a power-user and am still discovering subtleties. As I say, I've had to use strace and source-code scanning to work out some aspects. Shouldn't be needed. :-(
– TerryE
Apr 13 '12 at 16:25
I totally agree. "We need to split the rewrite user communities into two categories and treat them as entirely separate." Some users are using shared hosting and need to rely on .htaccess, which is terribly fragile, complicated, and confusing, even for experts. I'm STILL having trouble.
– Ryan
Jul 19 '17 at 17:05
edited Jan 14 '12 at 19:13 by freiheit
answered Jan 14 '12 at 16:50 by TerryE
Using RewriteMap
There are lots of things you can do with rewrite maps. Maps get declared using the RewriteMap directive, and can then be used both in RewriteCond evaluations and in RewriteRule substitutions.
The general syntax for RewriteMap is:
RewriteMap MapName MapType:MapSource
For example:
RewriteMap examplemap txt:/path/to/file/map.txt
You can then use the map name in constructs like this:
${examplemap:key}
The map contains key/value pairs. If the key is found, the value is substituted. Simple maps are just plain text files, but you can use hash maps, and even SQL queries. More details are in the docs:
http://httpd.apache.org/docs/2.2/mod/mod_rewrite.html#rewritemap
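As a sketch of how a text map might be put to use: the map contents, the /product/ URL layout and the prods.php script below are made-up examples, while the map path is the one from the example above.

```apache
# Assumed contents of /path/to/file/map.txt (one "key value" pair per
# line):
#
#   television 993
#   stereo     198
#   fishingrod 043

# RewriteMap has to be declared in server or virtual host context; it
# cannot be declared in .htaccess.
RewriteMap examplemap txt:/path/to/file/map.txt

# /product/stereo -> /prods.php?id=198; keys missing from the map fall
# back to the default given after the "|", here NOTFOUND.
RewriteRule ^/product/(.*)$ /prods.php?id=${examplemap:$1|NOTFOUND} [PT]
```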
Unescaping strings.
There are four internal maps (toupper, tolower, escape and unescape) you can use to do some manipulations. Unescaping strings in particular can come in handy.
For example: I want to test for the string "café" in the query string. However, the browser will escape this before sending it to my server, so I'll need to either figure out what the URL-escaped version is for every string I wish to match, or I can just unescape it...
RewriteMap unescape int:unescape
RewriteCond %{QUERY_STRING} (location|place)=(.*)
RewriteCond ${unescape:%2} café
RewriteRule ^/find/$ /find/1234? [L,R]
Note how I use one RewriteCond just to capture the argument to the query string parameter, and then use the map in the second RewriteCond to unescape it. This then gets compared.
Also note how I need to use %2 as the key in the rewrite map, as %1 will contain either "location" or "place". When you use parentheses to group patterns they will also be captured, whether you plan to use the result of the capture or not...
The last sentence isn't quite true. The mod_rewrite regexp engine supports non-capturing groups such as (?:location|place) and this will only have one capture in the example.
– TerryE
Mar 10 '17 at 23:32
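Applied to the example from the answer, the non-capturing group suggested in this comment would look roughly like this (a sketch; the rest of the rule set is unchanged):

```apache
RewriteMap unescape int:unescape
# (?:...) groups without capturing, so the parameter value is now in %1.
RewriteCond %{QUERY_STRING} (?:location|place)=(.*)
RewriteCond ${unescape:%1} café
RewriteRule ^/find/$ /find/1234? [L,R]
```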
answered Apr 6 '13 at 11:57
Krist van Besien
The last sentence isn't quite true. The mod_rewrite regexp engine supports non-capturing groups such as (?:location|place) and this will only have one capture in the example.
– TerryE
Mar 10 '17 at 23:32
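As a sketch of what this comment describes, the answer's example could be written with a non-capturing group, so that the parameter value becomes the first and only capture. It assumes the same /find/ URLs and server-level configuration as the answer:

```apache
RewriteEngine On
RewriteMap unescape int:unescape

# (?:...) groups the alternatives without capturing, so (.*) is now %1
RewriteCond %{QUERY_STRING} (?:location|place)=(.*)
RewriteCond ${unescape:%1} café
RewriteRule ^/find/$ /find/1234? [L,R]
```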
What are the most common mistakes/pitfalls when writing rewrite rules?
A really easy pitfall is when you rewrite URLs that alter the apparent path, e.g. from /base/1234/index.html to /base/script.php?id=1234. Any images or CSS with relative paths to the script location will not be found by the client. A number of options to resolve this can be found on this faq.
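To illustrate the pitfall, here is a rough sketch in server-level configuration. The first rule is the rewrite from the example; the second is only one possible workaround, and its list of file extensions is an assumption made for illustration:

```apache
RewriteEngine On

# Internally map /base/1234/index.html to /base/script.php?id=1234
RewriteRule ^/base/([0-9]+)/index\.html$ /base/script.php?id=$1 [L]

# The browser still sees /base/1234/index.html, so a relative reference
# such as "style.css" is requested as /base/1234/style.css.
# Possible workaround: map such asset requests back to the real directory.
RewriteRule ^/base/[0-9]+/(.+\.(?:css|png|jpg|gif))$ /base/$1 [L]
```

Using absolute paths for assets in the generated HTML avoids the problem without any extra rule.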
answered Jan 1 '11 at 4:02
beldaz
Thanks for the link. Particularly when working with other team members who are not familiar with rewriting, I find adding a <base> tag to be the easiest to follow while still enabling relative paths.
– kontur
May 20 '12 at 11:12
protected by Chris S Feb 27 '14 at 16:16
The idea behind this question is to give a close path for all the endless mod_rewrite questions that drive our more regular users crazy. This is very similar to what was done with subnetting at serverfault.com/questions/49765/how-does-subnetting-work .
– Kyle Brandt
Dec 20 '10 at 17:00
Also, I don't really want too many upvotes on this question, rather they should go to the answer. I don't want to CW this because I want to make sure the poster gets full credit for what I am hoping is the mod_rewrite answer to end all mod_rewrite questions.
– Kyle Brandt
Dec 20 '10 at 17:09
Sorry, I upvoted the question. ;-) I really think it needs to show up at (or near) the top of mod-rewrite tag searches/filters.
– Steven Monday
Dec 20 '10 at 19:07
Someone Else (tm) should handle the common use-cases. I don't know them well enough to do it justice.
– sysadmin1138♦
Dec 20 '10 at 20:55
Perhaps this question should be linked into the mod-rewrite tag wiki to make the path even shorter.
– beldaz
Jan 1 '11 at 3:50