
Tokenize .htaccess files

Bet you didn't see this coming? ;)

So, a project of mine requires that I specifically read and make sense out of .htaccess files.

Sadly, searching on Google only yields the infinite woes of people trying to get their own .htaccess to work (sorry, couldn't resist the comment).

Anyway, I'm a bit wary of trying to extract this from open-source projects that use it. In the past few weeks I wasted a lot of time trying to solve my problems that way, only to find that I would have done better to read the RFCs and specs and build the thing my way.

So, if you know of a library, or any (hopefully clean!) code that does this, please do share. In the meantime, if you know of any articles about the .htaccess file format, I'm sure they'll be very handy. Thanks.

NB: I'm pretty much multilingual and could make use of any codebase, even though the end code will be Delphi. I know I'm asking too much, but I'd love to see less of C++. Just think of my mental health before sharing C++ code. :)

Edit: Well, I think I'm just going to do this manually myself. The file structure seems to be:

directive arg1 arg2 argN
<begin directive section>
</end directive section>
# single line comment

Christian asked Aug 18 '11 18:08



1 Answer

.htaccess files use exactly the same grammar as the main Apache configuration files, and parsers for that format already exist.

If you're looking to write your own, you are mostly correct on the format. Remember that section tags can be nested and can take parameters (like <Location />, where "/" is the parameter, not an XML-style self-closing marker).

Parsing method, in plain English:

For each line in the file:
  Strip whitespace from the beginning and end of the line.
  If the line is empty:
    Skip it.

  Else, if the line starts with a '#':
    Parse it as a comment (or skip it).

  Else, if the line starts with a '<':
    If the next character is a '/', the line is a closing tag:
      Seek to the next '>' to get the tag name, and pop it from the tag stack.
    Else, the line is an opening tag:
      Seek to the next '>' to get the tag contents.
      If the contents, trimmed, contain whitespace:
        Split on the first whitespace. The left side is the tag name, the right side is the params.
        (IfModule, Location, etc. use this)

      Push the tag name onto the tag stack.

  Else, the line is a directive:
    Split the line on whitespace. The first item is the directive name, the rest are its params.
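
The steps above could be sketched in Python roughly like this. It is only an illustration, not a reference implementation: the function name parse_htaccess and the dict-based tree layout are made up for the example, tag names are compared case-insensitively on the assumption that Apache does not care about their case, and arguments are split naively on whitespace (no quote handling yet).

```python
def parse_htaccess(text):
    """Parse .htaccess-style text into a nested tree of dicts (hypothetical layout)."""
    root = {"type": "root", "children": []}
    stack = [root]  # stack[-1] is the innermost section that is still open

    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines carry no information

        if line.startswith("#"):
            stack[-1]["children"].append(
                {"type": "comment", "text": line[1:].strip()})

        elif line.startswith("</"):
            end = line.find(">")
            if end == -1:
                raise ValueError(f"malformed closing tag: {line}")
            name = line[2:end].strip()
            # Assumption: tag names match case-insensitively.
            if len(stack) == 1 or stack[-1]["name"].lower() != name.lower():
                raise ValueError(f"unexpected closing tag: {line}")
            stack.pop()

        elif line.startswith("<"):
            # Assumption: use the last '>' so a '>' inside the params
            # does not cut the tag short.
            end = line.rfind(">")
            inner = line[1:end].strip() if end != -1 else ""
            if not inner:
                raise ValueError(f"malformed opening tag: {line}")
            parts = inner.split(None, 1)  # split on the first run of whitespace
            section = {
                "type": "section",
                "name": parts[0],
                "params": parts[1] if len(parts) > 1 else "",
                "children": [],
            }
            stack[-1]["children"].append(section)
            stack.append(section)

        else:
            parts = line.split()  # naive: quoted arguments are not kept together
            stack[-1]["children"].append(
                {"type": "directive", "name": parts[0], "args": parts[1:]})

    if len(stack) != 1:
        raise ValueError(f"unclosed section: <{stack[-1]['name']}>")
    return root
```

Calling parse_htaccess(open(".htaccess").read()) would then give you a tree you can walk recursively through each node's "children" list. Keeping the open sections on a stack is what makes nesting work: every new node is attached to whatever section is currently on top.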

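Just add quote handling (an argument wrapped in quotes may contain spaces and must stay in one piece) and you're set. Also note that Apache lets a backslash at the very end of a line continue the directive on the next line, so join such lines before parsing.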
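
For the quote handling, here is a minimal sketch of a splitter that keeps quoted arguments together. The name split_args is hypothetical, and the escaping rule (a backslash only escapes the quote character that opened the string, every other backslash is kept as-is) is an assumption made for the example, so check it against Apache's actual behaviour before relying on it.

```python
def split_args(line):
    """Split a directive line into words, keeping quoted strings in one piece."""
    words = []
    i, n = 0, len(line)
    while i < n:
        if line[i].isspace():
            i += 1
            continue

        if line[i] in "\"'":
            quote = line[i]
            i += 1
            buf = []
            while i < n and line[i] != quote:
                # Assumption: a backslash escapes only the active quote character;
                # other backslashes (e.g. in regex patterns) are left untouched.
                if line[i] == "\\" and i + 1 < n and line[i + 1] == quote:
                    i += 1
                buf.append(line[i])
                i += 1
            if i >= n:
                raise ValueError("unterminated quote")
            i += 1  # skip the closing quote
            words.append("".join(buf))
        else:
            start = i
            while i < n and not line[i].isspace():
                i += 1
            words.append(line[start:i])
    return words
```

For example, split_args('RewriteRule "^old page$" /new [L,R=301]') keeps "^old page$" as a single argument with the quotes stripped, where a plain whitespace split would break it in two.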

lunixbochs answered Sep 19 '22 21:09
