
Frontend security and Cross-Site Scripting (XSS) for Ruby on Rails developers

Cross-Site Scripting is a security hole that allows attackers to inject and execute JavaScript on your website. The article highlights some important security aspects that affect Ruby on Rails development. Transcript of a presentation I gave at my employer 9elements.

At the company I work for, I gave an internal presentation about frontend security. 9elements is a software development agency that builds server-side Ruby on Rails applications and client-side JavaScript applications, as well as native mobile applications for iOS and Android.

The presentation isn’t a complete introduction to Cross-Site Scripting, but highlights some important aspects that affect Ruby on Rails development. The following text is a transcript of the presentation.

Frontend security concepts to learn

Before I start with the main topic, I’d like to give an overview of security concepts every Rails and JavaScript developer should learn sooner or later. I won’t go into detail here, but I’ll mention them so you can read up on them on your own.

Not a thorough introduction

In this presentation, I won’t explain all things XSS. I will link to articles which do. I will explain the background and then focus on how XSS affects Ruby on Rails frontend development. Later I will show two real-world examples to illustrate how nasty and tricky XSS is.

What is Cross-Site Scripting (XSS)?

A security hole that allows attackers to inject and execute JavaScript on your website.

The cause of the problems: Data changes context

XSS is a very specific problem, but it’s caused by a general issue that affects all computer systems and programming languages:

  1. Applications process data using different programming languages and formats (for example Ruby, JavaScript, SQL; plain text, HTML, JSON, CSV).
  2. Data moves from one context into another because languages and formats are nested or chained.
  3. Data that has a specific meaning in one context gets a different meaning when put into another context.
  4. In context one, data is just plain text. In context two, it may be interpreted as code.

This is a broad and abstract description; we’ll get into the details soon.
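To make this concrete, here is a minimal Ruby sketch (my own illustration, not from the presentation): the same characters are inert text in one context and executable code in the next.

```ruby
# In the Ruby context, this is nothing but character data in a string.
comment = "<script>alert(1)</script>"

# Naively embedded into the HTML context, the same characters
# form an executable script element.
html = "<p>#{comment}</p>"
# html is now: <p><script>alert(1)</script></p>
```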

Untrusted content

Web applications deal with untrusted content all the time. This is data that isn’t created by the service provider, developers or trusted parties. It may contain errors, it may be incomplete, it may not comply with syntactical rules. In addition, it may contain malicious code.

Sources of untrusted content include:

  • Everything in the HTTP request:

    • URL: path, query string parameters etc.
    • Headers: cookies, user agent etc.
    • Request body: form data with user input, uploaded files etc.
  • Data from third-party web services and APIs

An important rule of web application development is: Always mistrust user input!

Code injection

Untrusted content can cause processing errors in the backend and frontend, but why is it a security concern?

Untrusted content gets into the database and eventually into the HTML, CSS or JavaScript code. If it isn’t treated correctly, this opens the door to code injection.

Code injection is a serious security threat, especially the injection of JavaScript code. The injected code typically runs with the same privileges as the developer’s code. Such code can hijack user sessions, forge HTTP requests, read and expose private data, change content, spread misinformation, steal money etc.

“Exploits of a Mom”

Code injection is explained in a comic strip of the famous web comic XKCD:

Exploits of a Mom

In this story, a mom receives a phone call from her son’s school. The school has lost student records because she had added SQL code to her son’s name, probably when filling out a web form. The web application embeds the name directly into an SQL statement without considering that it can contain SQL code itself.

The comic is about SQL injection and not XSS, but the fundamental problem is the same.
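The comic’s scenario can be reproduced with plain Ruby strings (a sketch of the injection itself; no real database involved):

```ruby
# Untrusted input containing SQL code, as in the comic.
name = "Robert'); DROP TABLE Students;--"

# Naive string interpolation embeds the input directly into the statement.
sql = "INSERT INTO Students (name) VALUES ('#{name}');"

# The quote in the input terminates the SQL string literal early, so the
# statement now carries a second, attacker-controlled command:
#   INSERT INTO Students (name) VALUES ('Robert'); DROP TABLE Students;--');
```

The remedy is the same idea we will see for HTML: never concatenate untrusted input into a statement. Use parameterized queries (in Rails, ActiveRecord’s placeholder syntax) and let the database driver do the escaping.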

Language syntax and escaping

To understand the background of XSS, we need to understand the nesting of data. Every programming language and data format has this problem in its own syntax.

For example: When does a string literal start and end? Typically, there is a delimiter character that marks the beginning and end of a string. In Ruby and JavaScript, this is a single or double quotation mark. Example:

"A string uses specific delimiters."

The parser reads this code, recognizes the delimiters and treats characters in between as part of the string, not as code.

But what happens if the string contains the delimiters? This won’t work:

"A string that contains "delimiters"."

This code has a syntax error because the second quotation mark already terminates the string. The parser would try to process “delimiters” as code again.

The solution is:

"The \"delimiters\" need to be escaped with a backslash. The escape character \\ needs to be escaped as well."

The resulting string is “The "delimiters" need to be escaped with a backslash. The escape character \ needs to be escaped as well.”

These character sequences starting with a backslash are called escape sequences. They tell the parser to treat the following character verbatim and not as code. They neutralize the special meaning of the character.

Language nesting

A typical Ruby on Rails software stack nests and chains languages – one language generates another or is translated into another language. For example, HAML is compiled to HTML and can contain Ruby, CSS and JavaScript. Sass is compiled to CSS; CoffeeScript to JavaScript. JavaScript itself can contain HTML and CSS. Most of these languages may contain common formats like URL, JSON and SVG.

Language nesting is a potential security problem because data changes its context and needs to be treated correctly to prevent errors and code injection.

A typical Ruby on Rails application uses the templating languages ERB and HAML. Since they concatenate strings to generate HTML, they just make a vague guess about the target context. They treat HTML as one context, which it isn’t, as we will see later.

HAML understands the HTML syntax a bit better than ERB. It can distinguish between elements, attributes and text content. For safe embedding of JavaScript and CSS, it has filters like :javascript and :css.

A templating language designed with security in mind should know the different contexts of the target language so it can escape appropriately. For example, an XML-/XSLT-based templating language is parsed into a tree. The processor is able to understand the nesting of languages correctly, for example JavaScript embedded into HTML.

General HTML escaping

In HTML element content and attribute values, some characters have a special meaning. They need to be escaped so the browser processes them as plain text, not markup. Replace these characters with character references, either entity references or numerical references:

Character   Escaped character
<           &lt; or &#60;
>           &gt; or &#62;
"           &quot; or &#34;
'           &apos; or &#39;
&           &amp; or &#38;

ERB and HAML perform this HTML escaping by default.
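The same escaping is available in Ruby’s standard library as CGI.escapeHTML (ERB’s h helper behaves equivalently), so the table above can be verified directly:

```ruby
require "cgi"

# Escapes the five special characters with entity/numerical references.
CGI.escapeHTML(%q{<script>alert('XSS')</script>})
# => "&lt;script&gt;alert(&#39;XSS&#39;)&lt;/script&gt;"
```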

Let’s assume there is malicious input with HTML and JavaScript code:

input = "<script>alert('XSS')</script>"

In the ERB template, the input is written to the document:

<p><%= input %></p>

Generated output:

<p>&lt;script&gt;alert(&#39;XSS&#39;)&lt;/script&gt;</p>

Thanks to ERB’s automatic HTML escaping, the script injection was prevented. But usually it’s more complicated!

Context matters

The HTML syntax is complex and input may be embedded into different contexts, for example:

  • HTML element content: <p>input</p>
  • HTML attribute values: <p title="input">
  • Inside the start tag, in the attribute list: <p input>
  • Some HTML elements have special parsing rules:

    • <title>input</title>
    • <script>input</script>
  • Embedded CSS and JavaScript: They have their own complex syntax with different contexts.

    • <style> h1 { color: input; }</style>
    • <script> var data = 'input'; </script>
    • <p onclick="someFunction('input')">

The described general HTML escaping is not applicable to all contexts. Just using ERB’s <%= input %> in these places is not safe. Context-specific escaping is necessary!
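A small Ruby sketch (my own hypothetical example) of why context matters: an unquoted attribute value can be broken out of without using any of the five characters that general HTML escaping replaces.

```ruby
require "cgi"

# The payload contains none of the characters that HTML escaping replaces.
input = "x onmouseover=alert(1)"
CGI.escapeHTML(input) == input  # => true, escaping changes nothing

# In an unquoted attribute value, the space ends the value and the
# rest is parsed as a new attribute: an event handler.
html = "<p title=#{CGI.escapeHTML(input)}>"
# => <p title=x onmouseover=alert(1)>
```

Quoting every attribute value in templates closes this particular gap; the general rule remains that each context needs its own escaping.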

HTML parsing

To understand the processing of HTML in the browser, we need to take a brief look at the history of HTML.

In 1998, HTML 4 was defined as an application of SGML, a meta-language for defining the syntax and semantics of markup languages. But no browser followed the rigid SGML specification – since only a few web pages were SGML-compliant. Instead, every browser invented its own ultra-liberal “tag soup” parser to deal with erroneous markup.

After the attempts to use SGML- and XML-based languages for web pages failed, HTML 5 finally standardized a custom syntax and liberal parsing. The HTML 5 parser is a complex state machine with well-defined output and error handling. This algorithm is now implemented in all major browsers, but older browsers have quirky parsers with different XSS attack vectors.

In recent Internet Explorers, attackers can enable the “compatibility mode” using a meta tag. For example, IE 11 has an IE 5 mode:

<meta http-equiv="X-UA-Compatible" content="IE=5">

In this mode, IE 11 emulates several errors and the HTML parsing quirks of IE 5. This way even the most recent Internet Explorer is vulnerable to old browser-specific XSS attacks.

Types of XSS

This presentation won’t mention all XSS principles, but we need to distinguish between two types of XSS:

Reflected XSS

Code is injected via the HTTP request and is only present in the associated HTTP response. The attack vector is mostly the URL. Every user who opens a crafted link containing the injected code is affected.

Reflected XSS is typically considered the less severe type, but don’t underestimate it. Social media and e-mail spam make it easy to spread prepared URLs.

Let’s have a look at a simple example of Reflected XSS. Assume there is a URL that contains malicious code (HTML with JavaScript) in the query string:

http://example.com/?id=<script>alert(1)</script>

Assume there is a PHP script on the server that outputs the input without context-specific escaping:

<p>ID: <?php echo $_GET['id'] ?></p>

This creates a Reflected XSS hole because the server “reflects” the input in the output without filtering malicious code.

This is just a simple example – most of the time it’s more complex and the security vulnerability is not that obvious.

Modern browsers try to mitigate Reflected XSS by refusing to execute JavaScript code that originates from the URL or from form data. Browsers get suspicious when both input and output contain the same JavaScript code.

Persistent XSS (aka Stored XSS)

Persistent XSS means that malicious code is stored on the server, for example in the database, and is sent to other users with every response to a specific URL. Therefore, Persistent XSS potentially affects all users visiting a site. In contrast to Reflected XSS, the malicious code doesn’t need to be part of each request once it has been placed on the server.

There are multiple attack vectors for Persistent XSS. Data from all parts of the HTTP request (the URL, headers like “Cookie”, form data…) can be harmful when it is stored on the server and output again without treatment.

Also content that is loaded from third parties, especially HTML and JavaScript, may inject code persistently. This includes JavaScript libraries loaded from Content Delivery Networks (CDN), as well as advertisement and web analytics scripts.

XSS is the beginning of larger attacks

XSS allows an attacker to take over user accounts in order to steal money, manipulate information or access private data. This is severe enough, but it may not be the attacker’s ultimate goal. A lot of bigger security breaches start with a rather small XSS hole; from there, the attacker drills deeper into the IT infrastructure.

For example, an attacker compromises a user profile on a social network using XSS. The malicious code replicates itself and spreads to other profiles (see the MySpace worm “Samy” and the recent Tweetdeck worm). First an attacker targets normal users, then they gain access to admin accounts. Using social engineering, an attacker manages to access servers and private databases that are not directly connected to the web application.

Rails does not save us from XSS holes

Rails 4, ERB and HAML have good defaults that prevent simple XSS attacks. They create SafeBuffers and HTML-escape input by default. But the devil lies in the details. Most likely, all non-trivial Ruby on Rails applications are affected by XSS; we just don’t know it yet because such holes aren’t easy to find.

Places where XSS holes hide in a Rails application:

  • Rails view helpers that create HTML code dynamically, but do not correctly escape the input data.
  • User-generated HTML isn’t filtered correctly, for example from a web-based rich text editor.
  • HTML is crawled from a third-party API and embedded into the page without filtering. To attack a well-secured site by XSS, an attacker just needs to compromise the weakest third-party script provider. They can pick one from up to 40 services.

In the rest of this presentation, I will describe two less obvious examples of XSS.

Example 1: XSS through jQuery DOM insertion

HTML5 data attributes are a common way to embed data into HTML that isn’t directly visible, but can be used by JavaScript later. A data attribute can contain HTML tags as long as they are escaped correctly. JavaScript can read the attribute value and insert it into the document when necessary.

There are several ways to embed hidden content into a document to read it later with JavaScript. These are not necessarily XSS holes, but they may become holes if they aren’t implemented with care.

The vulnerability demonstrated here: If content is inserted into the document using common jQuery functions, jQuery executes all scripts in the input string.

Assume there is a Rails User model that has a “profile” attribute. This is a text field that can be edited by the user, so it contains arbitrary, untrusted data. Let’s assume an attacker enters HTML with JavaScript so we have this User model:

user = User.new(profile: '<script>alert(1)</script>')

The profile text isn’t visible directly, but it is shown using JavaScript when a button is clicked. The ERB template:

<button class="show-profile"
  data-profile="<%= user.profile %>">
  Show user profile
</button>

The profile text is embedded as a data attribute. The HTML output:

<button class="show-profile"
  data-profile="&lt;script&gt;alert(1)&lt;/script&gt;">
  Show user profile
</button>

The attribute value was escaped correctly, so this is not a security vulnerability yet. It becomes an XSS hole through this jQuery code:

$('.show-profile').click(function(event) {
  var profileText = $(event.target).data('profile');
  $('.content').html( profileText );
});

When the button is clicked, the data attribute is read into a string. This contains unescaped HTML and JavaScript code again: '<script>alert(1)</script>'. Then the string is added to the document using jQuery’s html() method.

The DOM insertion methods of jQuery have a questionable feature: They automatically find script elements in the input string and execute their content as JavaScript. Therefore, the jQuery code above contains an XSS vulnerability. jQuery evaluates the injected script alert(1).

All jQuery functions that accept HTML strings are affected: $('HTML code'), html, append(To), prepend(To), before/insertBefore, after/insertAfter etc.

jQuery offers parseHTML to parse an HTML string into a DOM tree without executing scripts. This would prevent the attack in the example above, but it does not protect from all XSS attacks. There are other ways to inject JavaScript into HTML, for example:

<img src="bogus" onerror="alert(1)">

When this image is added to the DOM, the browser tries to fetch the source URL “bogus”. Since the source does not exist, the error event is fired. Thus the code in the onerror attribute is automatically executed without user interaction.

A workaround is to use jQuery’s text() function whenever possible. This function does not parse HTML or execute embedded scripts; it treats the input as plain text.

The reliable solution is to always sanitize input. A proper HTML sanitizer either removes HTML completely or filters known XSS attack vectors.

Example 2: Embedding JSON into HTML

Another tricky example of possible XSS is the embedding of JSON into HTML.

In JavaScript web applications, it is common practice to embed the data the application needs on startup directly into the HTML as JSON. This can be a complete database record with untrusted data. Let’s assume we have a User model with a forged name:

user = User.new(name: "</script> <script>alert(1)</script>")

The User model needs to be passed to the JavaScript application. This is achieved by embedding a JSON serialization into the HTML. A naive approach would be (ERB template):

<script>
var user = <%= @user.to_json %>;
</script>

The resulting output:

<script>
var user = {&quot;name&quot;:
  &quot;&lt;/script&gt; &lt;script&gt;alert(1)&lt;/script&gt;&quot;};
</script>

This does not work and produces a JavaScript syntax error. The JSON code is escaped using HTML character references (&lt;, &gt; etc.), but character references inside of script elements are not parsed. For the content of script elements, special parsing rules apply. The general HTML escaping by ERB/HAML is not applicable to the particular script context.

We could continue with the naive approach and use string.html_safe or raw(string) to mark the JSON string as “safe” so it is not HTML-escaped:

<script>
var user = <%= @user.to_json.html_safe %>;
</script>

Don’t do this! It creates an XSS hole in Rails 3. Output:

<script>
var user = {"name": "</script> <script>alert(1)</script>"};
</script>

The </script> in the user name closes the script element, leaving the JavaScript context. The rest of the string is then parsed as normal HTML. It contains a new script with JavaScript code by the attacker.

So how do we solve this problem without creating XSS vulnerabilities? We need to tell Rails to escape all dangerous characters at the JSON string level.

In Rails 3, there are at least two ways to achieve this. There is an application-wide configuration option that changes the behavior of all to_json methods. In config/application.rb, we can add:

ActiveSupport::JSON::Encoding.escape_html_entities_in_json = true

Since this changes the output of all to_json calls, it may be more suitable to apply the fix on a lower level. Pass the output of to_json to json_escape:

<script>
var user = <%= json_escape(@user.to_json) %>;
</script>

In both cases, the HTML output is:

<script>
var user = {"name":
  "\u003C/script\u003E \u003Cscript\u003Ealert(1)\u003C/script\u003E"};
</script>

As you can see, dangerous characters are escaped on the JavaScript string level using Unicode escape sequences. This is valid JavaScript and prevents XSS. You still need to make sure that the JavaScript application treats the user name carefully – for example, do not pass it to jQuery’s html().

In case you are using Rails 4, you’re lucky and do not have to apply the fixes above. Since Rails 4, to_json always escapes special HTML characters (<, >…) in strings using the \uXXXX notation. The configuration option escape_html_entities_in_json is enabled by default. This is a good move that makes Rails applications more secure.
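The idea behind json_escape and the Rails 4 encoder can be sketched in plain Ruby (an illustration of the principle, not the actual Rails implementation):

```ruby
require "json"

# Replace HTML-significant characters inside the JSON output with
# \uXXXX escapes so the sequence "</script>" can never occur.
def escape_json_for_html(json)
  json.gsub(/[<>&]/) { |char| format('\u%04X', char.ord) }
end

json = JSON.generate(name: "</script> <script>alert(1)</script>")
escaped = escape_json_for_html(json)
# escaped contains no "</script>", yet JSON.parse(escaped) still yields
# the original name, because \u003C is a valid JSON string escape.
```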

The cause of this XSS vulnerability is the special parsing of script elements. To avoid it, don’t put JSON in a script element, but in a hidden div or span element instead. For these elements, general HTML escaping is sufficient:

<div id="user-json" style="display: none"><%= @user.to_json %></div>

Then read the content with JavaScript and parse the JSON manually:

var userJSON = document.getElementById('user-json').textContent;
var user = JSON.parse(userJSON);

A more effective solution is to sanitize untrusted data before it appears in the frontend. Remove or filter HTML using well-proven sanitizers.

Conclusion: Comprehensive input validation

A web application needs to mistrust input in every respect. Input needs to be validated before it is stored and processed further. This includes several checks and processing steps:

  • Text encoding (enforce UTF-8)
  • Length and range (minimum/maximum)
  • Syntax (names, identifiers, numbers, fixed sets of these)
  • Allowed characters, words, texts (uniqueness, disallow swear words, filter spam, prevent duplicate content etc.)
  • Remove or filter possible code injection (HTML, CSS, JavaScript etc.)
  • Whenever data changes context, apply context-specific escaping or use tools that do so automatically
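As a sketch of the first three checks, a hypothetical validator for a name field could look like this (the helper name and rules are my own example, not from the presentation):

```ruby
# Hypothetical example: validate a user-supplied name field.
def validate_name(input)
  # Text encoding: enforce valid UTF-8.
  unless input.encoding == Encoding::UTF_8 && input.valid_encoding?
    raise ArgumentError, "name must be valid UTF-8"
  end
  # Length and range.
  raise ArgumentError, "name length out of range" unless (1..50).cover?(input.length)
  # Allowed characters: letters, combining marks, apostrophes, spaces, hyphens.
  unless input.match?(/\A[\p{L}\p{M}' -]+\z/)
    raise ArgumentError, "name contains disallowed characters"
  end
  input
end
```

With these rules, "Anne-Marie O'Neill" passes, while the comic’s "Robert'); DROP TABLE Students;--" is rejected before it ever reaches a database or a template.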

These checks are necessary to ensure the reliability, robustness and security of your web application. My advice is: Try to break your own web app. Try to break into your own web app. Every day.

Resources

More than XSS

As mentioned at the beginning, XSS is not the only frontend security threat. These links cover related security issues:

Acknowledgement

Thanks to the 9elements team for feedback on the presentation.

I’d love to hear your feedback! Send an email to zapperlott@gmail.com or message me on Twitter: @molily.