<<

Security

Cross-Site Scripting (XSS)

By David Epler (Bio)
gist

Cross-site Scripting (XSS) is the most prevalent web application security flaw and occurs when user supplied data is sent to the browser without properly validating or escaping that content. XSS flaws can allow the attacker to:

There are three types of XSS:

  • Reflected
    Reflected attacks are where the injected code is sent as part of the request and shown in the response. Common locations for reflected XSS are in error messages or search results. Reflected attacks require getting the user to click on the specially crafted URL or form with the injection. They are usually embedded in phishing emails or hidden through URL shorteners. Reflected XSS is also referred to as first order XSS.
  • Stored
    Stored attacks are where the injected code is permanently stored in the web application. Common locations for stored XSS are in message forums, blog comments, or comment fields. The stored attack is sent to the user when they access the information. Stored XSS is also referred to as Persistent or second order XSS.
  • DOM Based
    As the name suggests, the DOM based attack directly manipulates the browser through the DOM. DOM based attacks are different in that the response from the server is not manipulated, but the client side scripting is manipulated to modify how it runs.

Dealing with XSS requires properly decoding the input, validating the input, and then encoding the input for the context it will be used.

How Many Ways to Encode < as an HTML Entity?

So how many ways can you think of? Probably, &lt; maybe &#60;. There are actually 68 variations when using UTF-8. The problem only gets worse when attackers encode a string multiple times and mix the encodings to bypass validation filters. Below are some examples of how < can be encoded using different techniques:

  • &#x26;lt&#59 - double encoding
  • %26lt%3b - double encoding with multiple schemes
  • %252525253C - multiple encoding
  • &%6ct; - nested encoding with multiple schemes

It becomes very clear that trying to create black lists of patterns is futile when <script> can be encoded 1,677,721,600,000,000 ways. ColdFusion 10 introduced a new function, Canonicalize(), which reduces an encoded string down to its simplest form. Canonicalize takes three arguments; the first is the string to be processed, the next is a boolean to indicate whether multiple encodings should be restricted, and the third is another boolean to indicate whether mixed encodings should be restricted. If either of the boolean flags are set to true, ColdFusion will throw an error if the string contains a given restriction. Input that is encoded multiple times and/or contains mixed encodings is something a normal user would not do and should be regarded as an attack on the web application.

Below is an example function that will canonicalize all the keys of a structure, blocking any input that contains multiple or mixed encodings and logging them.

<cffunction name="decodeScope" access="public" returntype="void" output="false">
    <cfargument name="scope" type="struct" required="true" />

    <cfset var key = "" />

    <cfloop collection="#arguments.scope#" item="key">
        <cfif IsSimpleValue(arguments.scope[key])>
            <cftry>
                <!--- do not allow multiple and mixed encodings, most likely an attack --->
                <cfset arguments.scope[key] = canonicalize(arguments.scope[key], true, true) />

            <cfcatch type="any">
                <!--- if logging actual string, encoded it properly for context. 
                      Truncating cfcatch.logmessage so value is not included --->
                <cflog file="encodingErrors" text="#key# - #Left(cfcatch.LogMessage, Find(' in', cfcatch.LogMessage))#" type="error" application="true" />
                <cfset arguments.scope[key] = "" />
            </cfcatch>

            </cftry>
        </cfif>
    </cfloop>
</cffunction>

<cfset decodeScope(url) />
<cfset decodeScope(form) />
<cfset decodeScope(cookie) />
<cfset decodeScope(cgi) />

Canonicalize makes it easier to do validation now that the input has the encoding simplified. Do not output any values that have been canonicalized without properly encoding them for context which they will be used.

Validation

Now that the input has been canonicalized, it needs to be validated. As stated before, the best approach is for a white list, where you only allow good known patterns of the input through; anything that does not match the pattern is rejected. ColdFusion's IsValid() is the best choice since it has many predefined types and the ability to use regular expressions. The input should be checked for length and if it is in a valid range in the case of drop-downs, radio buttons, and checkboxes using values from the database.

<cfscript>
    // IsValid using pre-defined type for US Zip Code
    if (NOT IsValid("zipcode", form.zipcode)) {
        variables.error += "Bad Zip Code<br />";
    }

    // IsValid using regex
    if (NOT IsValid("regex", form.zipcode, "^\d{5}(-\d{4})?$")) {
        variables.error += "Bad Zip Code<br />";
    }

    // Another IsValid using regex limiting characters (only alphanumeric, space, ., -, ', and ,)
    if (NOT IsValid("regex", form.address, "^[a-zA-Z0-9 \.\-,']+$")) {
        variables.error += "Bad Address<br />";
    }
</cfscript>

There are also many validation functions available through CFLib. Validation should always take place on the server side. Client side validation should not be relied upon since it can be circumvented.

What About Rich Text Editors?

Inevitably, there will be a requirement for the web application to accept rich text formatting in textarea, which involves accepting HTML markup; this can be dangerous because it is harder to filter XSS. If possible, only accept rich text from users that are logged in to the web application. Filtering HTML used for rich text is difficult, and it is recommend to use OWASP AntiSamy instead of trying to create your own filters. There are several blog postings on how to integrate AntiSamy with ColdFusion; the best one is by Pete Freitag, Using AntiSamy with ColdFusion.

Another approach would be to use something other than HTML to markup the content. There are several lightweight markup languages such as Markdown, BBCode, or various wiki text formats which could be used. It is a little more complex to implement, but since the web applications are directly controlling the formatting and rendering of the markup, it becomes significantly harder to introduce XSS to the content.

Proper Encoding for Output

Now that the input has been canonicalized and validated, is it safe to use it? Actually, the answer is no; it is still possible that the validation let something through, as attackers are always finding new ways to create XSS attacks. Any input that is used after it has been canonicalized and validated needs to be encoded for the context it will be used in to ensure that it is not interpreted to be anything other than text. In previous versions of ColdFusion, one would use HTMLEditFormat(), XMLFormat(), or URLEncodedFormat(), but they do not properly encode everything correctly if you are using the input in Javascript or CSS. ColdFusion 10 introduced several functions using the OWASP ESAPI library which addressed these shortcomings.

Context Function Example
HTML EncodeForHTML() <p>Hello #EncodeForHTML(url.name)#</p>
HTML Attribute EncodeForHTMLAttribute() <div id="#EncodeForHTMLAttribute(url.name)#">
Javascript EncodeForJavascript() <a onclick="welcome('#EncodeForJavascript(url.name)#');" />
<script>var name = #EncodeForJavascript(url.name)#;</script>
CSS EncodeForCSS() <div style="background-color: #EncodeForCSS(url.color)#;"></div>
<style>.customColor {background-color: #EncodeForCSS(url.color)#};</style>
URL EncodeForURL() <a href="http://www.example.com/edit.cfm?id=#EncodeForURL(url.id)#">Edit</a>
XML EncodeForXML() <userinfo><name>#EncodeForXML(url.name)#</name></userinfo>

Unfortunately, Adobe did not implement all the encoders that are available in ESAPI, which would be particularly useful for EncodeForXMLAttribute, EncodeForLDAP, EncodeForXPath.

Because the older functions like HTMLEditFormat, URLDecode, URLEncodedFormat, and XMLFormat do not encode and decode as well as the new functions based upon ESAPI, Adobe is likely to deprecate them in favor of the newer functions.

<!--- Using OWASP ZAP 1.4.0.1 fuzzing with jbrofuzz/XSS - all selected (223 patterns) and URL encoded --->
<cfif NOT StructIsEmpty(form)>
    <cfoutput>
    <dl>
        <dt>Raw:</dt>
        <dd>#form.myvalue#</dd>
        <!--- blocked 42 patterns with Script Protect ALL --->

        <dt>HTMLEditFormat:</dt>
        <dd>#HTMLEditFormat(form.myvalue)#</dd>
        <!--- blocked 204 patterns --->

        <dt>EncodeForHTML:</dt>
        <dd>#EncodeForHTML(form.myvalue)#</dd>
        <!--- blocked all 223 patterns --->
    </dl>
    </cfoutput>
</cfif>

<form name="myForm" id="myForm" method="post">
    <input name="myvalue" type="text" value="" />
    <input type="submit" />
</form>

Older Versions of ColdFusion

As seen above, older functions that were relied upon to properly encode and decode data do not work as well as the new ESAPI based functions in ColdFusion 10. If ColdFusion 8 or 9 has been patched with APSB11-04 or higher, the ESAPI Java library can be used by calling the Java library. There is another excellent blog post by Pete Freitag, ColdFusion's Built in Enterprise Security API; here, he addresses the method that CFBackport uses and provides the same encode functionality for ColdFusion 8 and 9 along with other functions introduced in ColdFusion 10. Another alternative would be to use CFESAPI, which is a full implementation of ESAPI in ColdFusion.

Additional Resources: