
Indenting for code-generation

Often, programmers write code that generates other code.

(The technical term is metaprogramming, but it is more common than merely cross-compilers; think about every PHP web-page that generates HTML or every XSLT file.)

One area I find challenging is coming up with techniques to ensure that both the hand-written source file and the computer-generated object file are clearly indented to aid debugging. The two goals often seem to compete.

I find this particularly challenging in the PHP/HTML combination. I think that is because:

  • there is sometimes more of the HTML code in the source file than the generating PHP
  • HTML files tend to be longer than, say, SQL statements, and need better indenting
  • HTML has space-sensitive features (e.g. between tags)
  • the result is more publicly visible HTML than SQL statements, so there is more pressure to do a reasonable job.

What techniques do you use to address this?


Edit: I accept that there are at least three arguments for not bothering to generate pretty HTML code:
  • Complexity of generating code is increased.
  • Makes no difference to rendering by browser; developers can use Firebug or similar to view it nicely.
  • Minor performance hit - increased download time for whitespace characters.

I have certainly sometimes generated code without thought to the indenting (especially SQL).

However, there are a few arguments pushing the other way:

  • I find, in practice, that I do frequently read generated code - having extra steps to access it is inconvenient.
  • HTML has some space-sensitivity issues that bite occasionally.

For example, consider the code:

<div class="foo">
    <?php
        $fooHeader();
        $fooBody();
        $fooFooter();
    ?>
</div>

It is clearer than the following code:

<div class="foo"><?php
        $fooHeader();
        $fooBody();
        $fooFooter();
?></div>

However, it also renders differently, because of the extra whitespace included in the HTML.

Oddthinking asked Oct 20 '08 01:10


6 Answers

In the more general case, I have written XSLT code that generates C++ database interface code. Although at first I tried to output correctly indented code from the XSLT, this quickly became untenable. My solution was to completely ignore formatting in the XSLT output, and then run the resulting very long line of code through GNU indent. This produced a reasonably formatted C++ source file suitable for debugging.

I can imagine the problem gets a lot more prickly when dealing with combined source such as HTML and PHP.

Greg Hewgill answered Sep 28 '22 16:09


Generate an AST, then traverse it in order and emit properly formatted source code.
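As a rough illustration of that idea, here is a minimal sketch in Python. The `Stmt` and `Block` node types and the `emit` function are hypothetical names invented for this example, not part of any library, and the tree only models one-line statements and header-plus-body blocks. The point it tries to show is that indentation is derived purely from tree depth, so the code that builds the tree never has to think about whitespace.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Stmt:
    """A single-line statement, e.g. 'x += 1' (hypothetical node type)."""
    text: str


@dataclass
class Block:
    """A header line (e.g. 'while x < 3') followed by an indented body."""
    header: str
    body: List["Node"] = field(default_factory=list)


Node = Union[Stmt, Block]


def emit(node: Node, depth: int = 0, indent: str = "    ") -> List[str]:
    """Walk the tree depth-first; indentation comes only from the depth."""
    pad = indent * depth
    if isinstance(node, Stmt):
        return [pad + node.text]
    lines = [pad + node.header + ":"]
    for child in node.body:
        lines.extend(emit(child, depth + 1, indent))
    return lines


tree = Block("while x < 3", [
    Stmt("x += 1"),
    Block("if x == 2", [Stmt("print(x)")]),
])
generated = "\n".join(emit(tree))
print(generated)
```

A real generator would need more node types (and would have to handle empty bodies), but the separation between building the tree and formatting it stays the same.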

Watson Ladd answered Sep 28 '22 17:09


A technique that I use when the generating code dominates over the generated code is to pass an indent parameter around.

e.g., in Python, generating more Python.

def generateWhileLoop(condition, block, indentPrefix=""):
    # Emit the loop header at the current indent, then the body one level deeper.
    print(indentPrefix + "while " + condition + ":")
    generateBlock(block, indentPrefix + "    ")

Alternatively, depending on my mood:

spacesPerIndent = 4  # assumed width of one indent level

def generateWhileLoop(condition, block, indentLevel=0):
    print(" " * (indentLevel * spacesPerIndent) + "while " + condition + ":")
    generateBlock(block, indentLevel + 1)

Note the assumption that condition is a short piece of text that fits on the same line, while block is on a separate indented line. If this code can't be sure of whether the sub-items need to be indented, this method starts to fall down.
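The `generateBlock` helper is not shown above. The following is one possible sketch of it, under two assumptions made only for this example: a block is a list whose items are either one-line statements (strings) or `(condition, sub_block)` tuples for nested loops, and the functions return lists of lines rather than printing, which makes the result easier to inspect.

```python
def generateWhileLoop(condition, block, indentPrefix=""):
    """Return the lines of a while loop whose body is one level deeper."""
    lines = [indentPrefix + "while " + condition + ":"]
    lines.extend(generateBlock(block, indentPrefix + "    "))
    return lines


def generateBlock(block, indentPrefix=""):
    """Hypothetical helper: emit each item of a block at the given indent.

    A plain string is a one-line statement; a (condition, sub_block) tuple
    is a nested while loop. This data layout is an assumption of the sketch.
    """
    lines = []
    for item in block:
        if isinstance(item, tuple):
            condition, subBlock = item
            lines.extend(generateWhileLoop(condition, subBlock, indentPrefix))
        else:
            lines.append(indentPrefix + item)
    return lines


program = generateWhileLoop("i < 2", [
    "j = 0",
    ("j < 2", ["j += 1"]),
    "i += 1",
])
print("\n".join(program))
```

Each level only ever adds to the prefix it was handed, so nesting depth is tracked by the call stack rather than by any global state.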

Also, this technique isn't nearly as useful for sprinkling relatively small amounts of PHP into HTML.

[Edit to clarify: I wrote the question and also this answer. I wanted to seed the answers with one technique that I do use and is sometimes useful, but this technique fails me for typical PHP coding, so I am looking for other ideas like it.]

Oddthinking answered Sep 28 '22 17:09


I have found that ignoring indenting during generation is best. I have written a generic 'code formatting' engine that post-processes all generated code. This way, I can define indenting rules and code syntax rules separately from the generator. There are clear benefits to this separation.
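To make the idea concrete, here is a deliberately small sketch of such a post-processing pass in Python. It is not the engine described above, just an illustration: the `reindent` function is a hypothetical name, it only understands brace-delimited code that already has line breaks, and it does not recognise braces inside strings or comments.

```python
def reindent(source: str, indent: str = "    ") -> str:
    """Re-indent brace-delimited code, ignoring whatever indentation it had.

    Simplification (assumption): braces inside strings or comments are not
    recognised, so this only suits simple generated output.
    """
    depth = 0
    out = []
    for raw in source.splitlines():
        line = raw.strip()
        if not line:
            out.append("")
            continue
        starts_with_closer = line[0] == "}"
        # A line that starts with a closer belongs to the outer level.
        if starts_with_closer:
            depth = max(depth - 1, 0)
        out.append(indent * depth + line)
        # The remaining braces on the line set the depth for following lines.
        opens = line.count("{")
        closes = line.count("}") - (1 if starts_with_closer else 0)
        depth = max(depth + opens - closes, 0)
    return "\n".join(out)


messy = "int f(int x) {\nif (x > 0) {\n      return x;\n}\nreturn -x;\n}"
pretty = reindent(messy)
print(pretty)
```

Because the generator's output is treated as unformatted input, the indent width or brace style can be changed in one place without touching the generating code.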

Jack answered Sep 28 '22 17:09


I agree with oddthinking's answer.

Sometimes it's best to solve the problem by inverting it. If you find yourself generating a whole lot of text, consider whether it's easier to write the text as a template with small bits of intelligent generation code. Alternatively, see whether you can break the problem down into a series of small templates which you assemble, and then indent each template as a whole.
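A minimal sketch of the "small templates, indented as a whole" approach, using Python's standard `textwrap` and `html` modules. The `PAGE` and `ITEM` templates and the `render_page` function are made up for this example. Each template is written flush-left in its own terms, and the assembled fragment is shifted to the right depth in a single step.

```python
import html
import textwrap

# Outer template; {body} sits at column 0 because the inserted fragment
# arrives already indented.
PAGE = textwrap.dedent("""\
    <html>
      <body>
    {body}
      </body>
    </html>""")

# Small template, written as if it were the whole document.
ITEM = textwrap.dedent("""\
    <div class="item">
      <p>{text}</p>
    </div>""")


def render_page(texts, body_indent="    "):
    """Fill each small template, then indent the joined result as one unit."""
    items = "\n".join(ITEM.format(text=html.escape(t)) for t in texts)
    return PAGE.format(body=textwrap.indent(items, body_indent))


page = render_page(["first", "second"])
print(page)
```

Note that `textwrap.indent` shifts every non-blank line, so this would need care around whitespace-sensitive content such as `<pre>` blocks.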

Schwern answered Sep 28 '22 16:09


When making websites in PHP, I find mixing HTML with function-specific PHP problematic: it limits the overview and makes debugging harder. One way to avoid the mixing is to use template-driven content; see Smarty, for example. Besides better indentation, templating is useful for other things, such as faster patching. If a customer requires a change in the layout, that particular layout issue can be quickly found and fixed without touching the functional PHP code that generates the data (and the other way around).

Hannes Landeholm answered Sep 28 '22 16:09