
How to escape/strip special characters in the LaTeX document?

We have implemented an online service that generates PDFs with a predefined structure. The user chooses a LaTeX template and compiles it with the appropriate inputs.

Our concern is security: a malicious user must not be able to gain shell access by injecting special instructions into the LaTeX document.

We need a workaround for this, or at least a list of special characters that we should strip from the input data.

The preferred language is PHP, but any suggestions, constructions and links are very welcome.

PS. In a few words, we're looking for a mysql_real_escape_string for LaTeX.

Igor asked Mar 29 '10 22:03




3 Answers

Here's some code to implement the Geoff Reedy answer. I place this code in the public domain.

<?php

$test = "Test characters: # $ % & ~ _ ^ \ { }.";
header( "content-type:text/plain" );
print latexSpecialChars( $test );
exit;

function latexSpecialChars( $string )
{
    $map = array( 
            "#"=>"\\#",
            "$"=>"\\$",
            "%"=>"\\%",
            "&"=>"\\&",
            "~"=>"\\~{}",
            "_"=>"\\_",
            "^"=>"\\^{}",
            // The trailing {} ends the macro name, so "\foo" cannot become "\textbackslashfoo".
            "\\"=>"\\textbackslash{}",
            "{"=>"\\{",
            "}"=>"\\}",
    );
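    // strtr() replaces in a single pass, so the backslashes and braces it inserts
    // are never escaped a second time. It also avoids preg_replace's /e modifier,
    // which was deprecated in PHP 5.5 and removed in PHP 7.
    return strtr( $string, $map );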
}

Christopher Gutteridge answered Oct 04 '22 04:10


As far as I know, the only way to perform harmful operations using LaTeX is to call external commands using \write18. This only works if you run LaTeX with the --shell-escape or --enable-write18 argument (depending on your distribution).

So as long as you do not run it with one of these arguments, you should be safe without needing to filter anything out.

Besides that, a document can still write other files using the \newwrite, \openout and \write commands. Letting users create and (over)write files may be unwanted, so you could filter out occurrences of these commands. But blacklists of commands are prone to fail, since someone with bad intentions can easily hide the actual command by obfuscating the input document.

Edit: Running the LaTeX command under a limited account (i.e. one that cannot write to directories unrelated to LaTeX or the project), in combination with disabling \write18, might be easier and more secure than keeping a blacklist of 'dangerous' commands.
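
A minimal PHP sketch of such a locked-down run, assuming a TeX Live style pdflatex that accepts -no-shell-escape (MiKTeX spells the option differently). The function names buildLatexCommand() and runLatex() are hypothetical, and the limited account itself still has to be set up at the operating-system level.

```php
<?php
// Sketch only: buildLatexCommand() and runLatex() are hypothetical helpers.

// Build the argument list for a pdflatex run with shell escape disabled.
function buildLatexCommand(string $texFile, string $outputDir): array
{
    return [
        'pdflatex',
        '-no-shell-escape',          // keep \write18 disabled
        '-interaction=nonstopmode',  // never wait for user input on errors
        '-halt-on-error',            // stop at the first error
        '-output-directory=' . $outputDir,
        $texFile,
    ];
}

// Run pdflatex inside $outputDir and return its exit code.
// Assumes PHP 7.4+, where proc_open() accepts an array and starts the
// process directly, without passing the command through a shell.
function runLatex(string $texFile, string $outputDir): int
{
    $descriptors = [
        1 => ['file', $outputDir . '/stdout.log', 'w'],
        2 => ['file', $outputDir . '/stderr.log', 'w'],
    ];
    $process = proc_open(buildLatexCommand($texFile, $outputDir), $descriptors, $pipes, $outputDir);
    if (!is_resource($process)) {
        throw new RuntimeException('Could not start pdflatex');
    }
    return proc_close($process);
}
```

Passing the command as an array means the file name never reaches a shell, so it needs no shell quoting.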

Veger answered Oct 04 '22 05:10


According to http://www.tug.org/tutorials/latex2e/Special_Characters.html, the special characters in LaTeX are # $ % & ~ _ ^ \ { }. Most can be escaped with a simple backslash, but ~ ^ and \ need special treatment.

For caret use \^{} (or \textasciicircum), for tilde use \~{} (or \textasciitilde), and for backslash use \textbackslash.

If you want the user input to appear as typewriter text, there is also the \verb command, which can be used like \verb+asdf$$&\~^+. The + delimiter can be replaced by any other character, as long as that character does not appear in the text.
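
A minimal PHP sketch of the \verb approach, assuming the input fits on one line. The helper name latexVerb() and its list of candidate delimiters are hypothetical; it simply picks the first delimiter that does not occur in the text.

```php
<?php
// Sketch only: latexVerb() is a hypothetical helper, not part of any library.

function latexVerb(string $text): string
{
    // \verb cannot span lines, so refuse input that contains line breaks.
    if (strpbrk($text, "\r\n") !== false) {
        throw new InvalidArgumentException('\verb text must fit on one line');
    }
    // Try candidate delimiters until one does not occur in the text.
    // '*' is left out on purpose because \verb* is a separate command.
    foreach (str_split('+|!/=@;:') as $delimiter) {
        if (strpos($text, $delimiter) === false) {
            return '\verb' . $delimiter . $text . $delimiter;
        }
    }
    throw new InvalidArgumentException('No free delimiter for \verb');
}

echo latexVerb('asdf$$&\~^'), "\n";  // \verb+asdf$$&\~^+
echo latexVerb('1+1=2'), "\n";       // \verb|1+1=2|
```

Keep in mind that \verb generally cannot be used inside the argument of another command, so this only suits templates where the input lands in ordinary running text.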

Geoff Reedy answered Oct 04 '22 04:10