GraphQL

Query structured data through an API and perform mutations with authorization

Enumeration

GraphQL is an alternative to a REST API. It exposes data through a single endpoint and lets the client query exactly the fields it needs. It is also possible to write data, fully replacing the need for regular API endpoints. Of course, this should be guarded by authorization checks to ensure you cannot read or change data you're not supposed to.

While using an application with GraphQL, the client-side JavaScript code will make fetches to a /graphql endpoint. Note that it may be in a subdirectory or renamed, but you should find it in your request history after browsing some data.

Introspection

Once you have found such an endpoint, you want to get the "documentation" to understand what kind of queries you can write. There is a built-in feature called introspection: you send a special kind of query, which the server recognizes and answers with a description of its schema. Not all servers have this enabled, but if yours does, it will make your life much easier.

Below is an example request to check if a GraphQL endpoint has introspection enabled:

POST /graphql HTTP/2
Host: example.com
Content-Type: application/json

{"query": "query { __schema { types { name } } }"}

As you can see, the JSON body has a query key whose value is the query as a string. Schema-wide introspection queries go through the __schema field, and here we request the names of all types. A successful response would look something like the following:

{
  "data": {
    "__schema": {
      "types": [
        {"name": "Boolean"},
        {"name": "CustomType1"},
        {"name": "Float"},
        {"name": "ID"},
        {"name": "Int"},
        {"name": "Query"},
        {"name": "SomeOtherCustomType"},
        {"name": "String"},
        {"name": "StringQueryOperatorInput"},
        {"name": "__Directive"},
        ...
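
As a rough sketch of the check above, the following Python builds the same request body and filters a response down to the type names that are likely specific to the target. The helper names (build_introspection_body, custom_type_names) are made up for illustration, and the sample response only mirrors the example shown above.

```python
import json

INTROSPECTION_QUERY = "query { __schema { types { name } } }"

# Built-in scalars defined by the GraphQL specification.
BUILTIN_SCALARS = {"Boolean", "Float", "ID", "Int", "String"}


def build_introspection_body() -> str:
    """Return the JSON body for the minimal introspection probe."""
    return json.dumps({"query": INTROSPECTION_QUERY})


def custom_type_names(response: dict) -> list:
    """List type names that are neither built-in scalars nor introspection types (__*)."""
    schema = (response.get("data") or {}).get("__schema") or {}
    names = [t["name"] for t in schema.get("types", [])]
    return [n for n in names if n not in BUILTIN_SCALARS and not n.startswith("__")]


# Hypothetical response, shaped like the example above.
sample = {"data": {"__schema": {"types": [
    {"name": "Boolean"},
    {"name": "CustomType1"},
    {"name": "Query"},
    {"name": "__Directive"},
]}}}

print(build_introspection_body())
print(custom_type_names(sample))
```

If introspection is disabled, the response typically has no usable data key, in which case the helper simply returns an empty list.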

Instead of exploring these manually (which you can), tools exist that send these introspection queries to build a schema. You can then read the schema and write queries with auto-completion.

If you're lucky, your target has a URL like /graphiql or responds to GET /graphql with a playground where you can test the API. In more hardened environments this is often not the case. You can still point a regular tool like Apollo Sandbox at your target's URL to send and receive data through a nice UI.

To aid in this, I created a simple wrapper where you can specify your own URL. You can open this in an empty browser profile with web security disabled to allow CORS without the target having to configure it. Apollo Sandbox allows you to add custom required headers and you can copy over the cookies from your regular authenticated session on your target.

With an introspection response, you can let the following tool generate all possible queries to play around with if you don't want to manually write these queries (although Apollo Sandbox can help with this too):

Tip: if you encounter or receive a SDL-formatted GraphQL schema (type syntax), you can turn it into JSON introspection data using this simple tool:

Guessing Schema with Hints

There are reasons for GraphQL APIs to disable introspection. In that case, the tools above won't be able to auto-complete queries or fields. What you can do instead is fuzz for the right keywords: these APIs often still return suggestions when a name in your query is not recognized. With a good wordlist, you can often recover a large portion of the API with this method.

The following tool implements this:

$ clairvoyance --help
usage: clairvoyance [-h] [-v] [-i <file>] [-o <file>] [-d <string>] [-H <header>] [-c <int>] [-w <file>] [-wv] [-x <string>] [-k]
                    [-m <int>] [-b <int>] [-p {slow,fast}] [--progress]
                    url

positional arguments:
  url

options:
  -h, --help            show this help message and exit
  -v, --verbose
  -i <file>, --input-schema <file>
                        Input file containing JSON schema which will be supplemented with obtained information
  -o <file>, --output <file>
                        Output file containing JSON schema (default to stdout)
  -d <string>, --document <string>
                        Start with this document (default query { FUZZ })
  -H <header>, --header <header>
  -c <int>, --concurrent-requests <int>
                        Number of concurrent requests to send to the server
  -w <file>, --wordlist <file>
                        This wordlist will be used for all brute force effots (fields, arguments and so on)
  -wv, --validate       Validate the wordlist items match name Regex
  -x <string>, --proxy <string>
                        Define a proxy to use for all requests. For more info, read
                        https://docs.aiohttp.org/en/stable/client_advanced.html?highlight=proxy
  -k, --no-ssl          Disable SSL verification
  -m <int>, --max-retries <int>
                        How many retries should be made when a request fails
  -b <int>, --backoff <int>
                        Exponential backoff factor. Delay will be calculated as: `0.5 * backoff**retries` seconds.
  -p {slow,fast}, --profile {slow,fast}
                        Select a speed profile. fast mod will set lot of workers to provide you quick result but if the server as
                        some rate limit you may want to use slow mod.
  --progress            Enable progress bar

After running the tool and receiving an output schema.json file, you can upload this to GraphiQL Explorer together with your endpoint to receive auto-completion and view the schema while querying.
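
To illustrate what suggestion-based guessing relies on, here is a minimal sketch that pulls the suggested names out of an error message. It assumes the message format commonly produced by graphql-js based servers ("Did you mean ..."); other implementations may word their hints differently, and extract_suggestions is a hypothetical helper, not part of clairvoyance.

```python
import re

# Names allowed by the GraphQL grammar: a letter or underscore, then letters, digits, underscores.
QUOTED_NAME = re.compile(r'"([_A-Za-z][_0-9A-Za-z]*)"')


def extract_suggestions(message: str) -> list:
    """Return the names offered after 'Did you mean' in an error message, if any."""
    _, found, hint = message.partition("Did you mean")
    if not found:
        return []
    return QUOTED_NAME.findall(hint)


# Assumed example of an error message for a misspelled field.
error = 'Cannot query field "usr" on type "Query". Did you mean "user" or "users"?'
print(extract_suggestions(error))
```

Each recovered name can be fed back into the next round of guesses, which is roughly how a wordlist turns into a partial schema.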

For better results, it is recommended to create a custom wordlist from as much information as you can find from your target. This can be as simple as running a \w+ regex over the text to find and extract all unique words that may potentially be query names or fields. Use the -w option to provide it to clairvoyance.
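
The wordlist step could be sketched as follows. It applies the \w+ regex mentioned above and then drops anything that is not a valid GraphQL name (names cannot start with a digit). The function name and sample text are illustrative only.

```python
import re

WORD = re.compile(r"\w+")
GRAPHQL_NAME = re.compile(r"[_A-Za-z][_0-9A-Za-z]*")


def build_wordlist(text: str) -> list:
    """Extract unique words from text that could be GraphQL query or field names."""
    words = set(WORD.findall(text))
    return sorted(w for w in words if GRAPHQL_NAME.fullmatch(w))


# Hypothetical snippet of text scraped from a target's JavaScript.
scraped = "getUser(id) { user_name 2fa getUser }"
wordlist = build_wordlist(scraped)
print("\n".join(wordlist))
```

Writing the printed lines to a file gives you something to pass to the -w option.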

Note that while looking at the target's JavaScript files, you can often already find some GraphQL queries stored in there, as it is always the browser that requests them. Search for keywords like query or mutation.

Features

The basic concepts of GraphQL are explained in the tutorial below:

In summary, you have types with fields. You can query these types for exactly the fields that you require, or call specific mutations that have server-side logic implemented for them.

Arguments & Variables

Fields can also have arguments, which are commonly used for filtering results. In your query, you fill in these arguments with values.

A query can also declare variables, so the query text stays generic and the values are supplied in a separate variables parameter. In a request, this looks like:

Query with $name variable

query ExampleQuery($name: String!) {
  someQuery(arg: $name) {
    id
  }
}

Request

POST /graphql HTTP/2
Host: example.com
Content-Type: application/json

{"query":"query ExampleQuery(...", "variables": {"name": "value"}}

This is a common pattern for applications because the query text stays the same and can be cached, while only the variable data differs per request.
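
A small sketch of how such a request body can be assembled, reusing the ExampleQuery from above. build_request_body is a hypothetical helper. Passing values through variables means they are JSON-encoded rather than pasted into the query text.

```python
import json

QUERY = """
query ExampleQuery($name: String!) {
  someQuery(arg: $name) {
    id
  }
}
"""


def build_request_body(query: str, variables: dict) -> str:
    """Serialize a query and its variables into the JSON body of the POST request."""
    return json.dumps({"query": query, "variables": variables})


# The query text never changes; only the variables differ per request.
for name in ["value", 'va"lue']:
    print(build_request_body(QUERY, {"name": name}))
```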

Mutations

The server can implement functions to handle changes in data, which you can call from GraphQL. These mutations often also use variables as explained above, and have a very similar structure to queries:

Mutation with $name variable

mutation ExampleMutation($name: String!) { 
  createUser(name: $name) {
    id
    name
  }
}

Request

POST /graphql HTTP/2
Host: example.com
Content-Type: application/json

{"query":"mutation ExampleMutation(...", "variables": {"name": "value"}}

The variables will be substituted into the query and the server will perform whatever logic it has implemented. The fields id and name, specified inside the braces after the call, will be returned when it is done.

You can run multiple mutations in series by providing multiple aliases for different function calls:

Multiple mutations

mutation { 
  firstUser: deleteUser(id: "42")
  secondUser: deleteUser(id: "1337")
}

More information about the HTTP requirements for a standard server endpoint can be found in the documentation below:

WebSockets

Instead of HTTP, there is also a common library that adds communication via WebSockets:

GitHub - enisdenjo/graphql-ws: Coherent, zero-dependency, lazy, simple, GraphQL over WebSocket Protocol compliant server and client.

The structure and handlers of this are slightly different from the regular HTTP API, so you may see different behavior like one allowing introspection while the other does not.

Apart from some protocol differences in how messages are wrapped, the queries themselves are exactly the same. Below is an example client that queries another server over WebSockets:

<script type="module">
import { createClient } from 'https://cdn.jsdelivr.net/npm/graphql-ws/+esm'

const client = createClient({
  url: "ws://localhost:4000/graphql",
});
console.log("Client connected", client);

(async () => {
  const query = client.iterate({
    query: "{ hello }",
  });

  const { value } = await query.next();
  console.log(value); // { hello: "world" }
})().catch((e) => console.error(e.message));
</script>
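
Under the hood, the client above exchanges JSON messages with the server. The sketch below builds the messages a client would send, assuming the graphql-transport-ws subprotocol that graphql-ws implements. The message shapes are written from that protocol's description and should be compared against what your target actually sends. The helper names are hypothetical.

```python
import json

SUBPROTOCOL = "graphql-transport-ws"  # WebSocket subprotocol name used by graphql-ws


def client_messages(query: str, operation_id: str = "1") -> list:
    """Messages the client sends: an init handshake, then the query as a 'subscribe'."""
    return [
        json.dumps({"type": "connection_init"}),
        # To be sent only after the server has answered with {"type": "connection_ack"}.
        json.dumps({"type": "subscribe", "id": operation_id, "payload": {"query": query}}),
    ]


def extract_result(raw: str):
    """Return the payload of a 'next' message from the server, or None for other types."""
    message = json.loads(raw)
    return message["payload"] if message.get("type") == "next" else None


for raw in client_messages("{ hello }"):
    print(raw)
print(extract_result('{"type": "next", "id": "1", "payload": {"data": {"hello": "world"}}}'))
```

Seeing the raw messages is useful when replaying or modifying them in a proxy instead of going through the library.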

Attacks

Data Leak & IDOR

One common mistake in GraphQL is accidentally exposing too many properties. You should enumerate all fields for every object in every query. Developers may unintentionally expose properties that should be internal, like a password hash, reset token or 2FA secret.

You can use Introspection to get an exhaustive list, or fuzz with Guessing Schema with Hints.
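
With an introspection result at hand, a first pass over the field names can be automated. The sketch below flags fields whose names contain a sensitive-looking keyword. The keyword list and the sample schema are assumptions for illustration, so tune them to your target.

```python
# Introspection query that also returns the field names of every type.
FIELDS_QUERY = "query { __schema { types { name fields { name } } } }"

# Illustrative keywords only; extend this list with what you see in the application.
SENSITIVE_HINTS = ("password", "hash", "token", "secret", "twofactor", "2fa")


def flag_sensitive_fields(response: dict) -> list:
    """Return 'Type.field' for every field whose name contains a sensitive keyword."""
    hits = []
    for gql_type in response["data"]["__schema"]["types"]:
        # Scalars and enums report fields as null, hence the 'or []'.
        for field in gql_type.get("fields") or []:
            if any(hint in field["name"].lower() for hint in SENSITIVE_HINTS):
                hits.append(f"{gql_type['name']}.{field['name']}")
    return hits


# Hypothetical introspection response.
sample = {"data": {"__schema": {"types": [
    {"name": "User", "fields": [{"name": "id"}, {"name": "passwordHash"}, {"name": "resetToken"}]},
    {"name": "String", "fields": None},
]}}}
print(flag_sensitive_fields(sample))
```

A match only tells you the field exists in the schema. Whether you are actually allowed to read it still has to be tested with a query.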

Your own user and another user are two very different types. You should be able to see almost all properties of your user, but only a few minimal ones of other users. A naive implementation may just return all properties for all users, potentially exposing too much information if you can get a reference to another user.

Additionally, protections may be set on certain queries rather than on fields. The effect is that directly requesting something you are not authorized to see may fail, while reaching the same field indirectly through some other reference may still be allowed.

This combines well with Insecure Direct Object Reference (IDOR) vulnerabilities if you need to specify an identifier of some kind in a query/mutation argument.

Lastly, it is good to know that a mutation returns data. This is often the object you mutated, but may also expose too many properties. The following syntax gets properties of the result of a mutation:

Return data from mutation

mutation {
  sendMessage(user_id: 1337, message: "Hi!") {
    user {
      password_hash
    }
  }
}

Batching

In a single GraphQL request, you can send multiple queries or mutations. If they have the same name, you can differentiate them using an alias, which is a name: prefix. This can be useful for bypassing per-request rate limiting because a single request may contain many actions. Below is an example of brute-forcing a login; only the alias that was successful will return a valid token in the response:

Batch with aliases

mutation {
  a: login(username: "admin", password: "admin")
  b: login(username: "admin", password: "123456")
  c: login(username: "admin", password: "password")
}
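
Writing such a batch by hand gets tedious, so it can be generated. The following sketch builds one aliased mutation from a list of candidate passwords, assuming the same login(username, password) mutation as in the example. It uses JSON string escaping for the values, which is close to GraphQL's string syntax for typical input. build_login_batch is a hypothetical helper.

```python
import json


def build_login_batch(username: str, passwords: list) -> str:
    """Build a single mutation with one aliased login call per candidate password."""
    lines = [
        # Aliases a0, a1, ... keep the repeated field names apart in the response.
        f"  a{i}: login(username: {json.dumps(username)}, password: {json.dumps(password)})"
        for i, password in enumerate(passwords)
    ]
    return "mutation {\n" + "\n".join(lines) + "\n}"


batch = build_login_batch("admin", ["admin", "123456", "password"])
print(batch)
```

The alias index in the response maps back to the position in the password list, which tells you which guess succeeded.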

CSRF

Cross-Site Request Forgery (CSRF) is a technique where you send a request from an attacker's site straight to the target site, which will be automatically authenticated by the browser adding cookies.

Because GraphQL mutations happen via a simple POST request to a /graphql endpoint, implementations may also be vulnerable. It is crucial to check whether cookies alone provide authentication, without any required custom headers. If so, check the SameSite= attribute of the cookie. See the dedicated CSRF page for details on which cases are exploitable and how.

By default, the query and variables are sent with a Content-Type: application/json header. A cross-origin request is not allowed to set this directly, so the browser will first send a preflight request. Only if the response to this OPTIONS request says that the JSON content type may be used will the real fetch() request you set up be sent. There are ways around this, especially if the cookie is SameSite=None or empty: you can confuse the server's content type parsing by providing an alternative Content-Type header and a cleverly constructed body.

GraphQL also uses a POST request, which causes SameSite=Lax cookies not to be sent, even in a top-level form navigation. It may, however, be possible to change the method to GET and put the query parameter in the URL, such as:

GET /graphql?query=mutation%20{...
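
A minimal sketch for producing such a URL with proper encoding is shown below. The endpoint and the deleteUser mutation are placeholders taken from earlier examples, and whether the server accepts mutations over GET at all is something you have to test.

```python
import json
from typing import Optional
from urllib.parse import parse_qs, quote, urlencode, urlsplit


def build_get_url(endpoint: str, query: str, variables: Optional[dict] = None) -> str:
    """Encode a GraphQL query (and optional variables) into URL parameters."""
    params = {"query": query}
    if variables:
        params["variables"] = json.dumps(variables)
    # quote_via=quote encodes spaces as %20 instead of '+'.
    return f"{endpoint}?{urlencode(params, quote_via=quote)}"


url = build_get_url("https://example.com/graphql", 'mutation { deleteUser(id: "42") }')
print(url)

# Round trip: decoding the URL gives back the original query text.
print(parse_qs(urlsplit(url).query)["query"][0])
```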

WebSocket Hijacking

If the server uses WebSockets and only requires SameSite=None or empty cookies to authenticate, you can connect to it cross-site. The best part is that CORS doesn't apply here, so unless the server validates the Origin header, you can always read the response!

Note that SameSite=Strict cookies will still be sent on requests coming from subdomains, because those count as same-site. An XSS or subdomain takeover there would be enough to compromise the main site in such a case.

All you have to do is connect with the WebSocket, send it a query that will be authenticated as the signed-in victim, and then read the response (more info).

XS-Search via Timing

If you are able to perform CSRF, but there aren't any interesting mutations, you may still get lucky if there are queries that search private data. These are inherently vulnerable to XS-Leaks, where you send a request from the attacker's site using fetch() and then measure the time it took to resolve. The timing difference can be amplified with Batching to slowly leak the data matched by a search query in GraphQL.


Last updated 6 months ago