ModSecurity: Features: PDF Universal XSS Protection

Introduction

The GNUCITIZEN website gives a concise overview of this problem. The Universal PDF XSS issue was discovered by Stefano Di Paola and Giorgio Fedon and was presented at the 23C3 security conference. The vulnerability affects Adobe Acrobat Reader, which is widely used by businesses, other organizations, and individuals. By abusing Acrobat’s open parameters feature, an attacker can make otherwise well-protected sites vulnerable to cross-site scripting attacks simply because they host PDF documents. This issue is very serious: unless you update your reader or change the way your browser handles PDF documents, you remain exposed. Fortunately, ModSecurity 2.5 users can mitigate this issue by using the new PDF protection directives.

Background

It all started back in 2005 when Amit Klein published DOM Based Cross Site Scripting or XSS of the Third Kind. Amit observed that XSS does not necessarily need a vulnerable server-side program to manifest itself. Everything can take place in the browser itself. He also observed that the # character can conveniently be used to avoid sending the attack payload to the server, because browsers do not transmit the fragment part of a URL. DOM-based XSS typically uses JavaScript.

Example (taken from Amit’s paper):

<HTML><TITLE>Welcome!</TITLE>
Hi
<SCRIPT> var pos = document.URL.indexOf("name=") + 5; document.write(document.URL.substring(pos,document.URL.length)); </SCRIPT>
</HTML>

Normally invoked with:

http://www.example.com/welcome.html?name=Joe

Does not work equally well when invoked with:

http://www.example.com/welcome.html?name=<script>alert(document.cookie)</script>

Enter Acrobat Reader Universal PDF XSS

In December 2006 Stefano Di Paola and friends spoke about the universal XSS flaw in the Acrobat Reader plug-in on Windows. The world found out when the advisory went out on January 3rd, 2007. (The flaw was already fixed in Reader v8 in early December 2006.) The word spread like wildfire among security bloggers (pdp) and on the mailing lists. RSnake discovered that the attack can also be used against PDF files hosted on the local filesystem.

For many people this was the last straw. They acknowledged that the end of the World is near...

So What Was The Problem?

It turns out the Reader plug-in loved JavaScript so much it would execute it when a link in the following format is encountered:

http://www.example.com/file.pdf#a=javascript:alert('Alert')

Uh-oh, notice the # character!
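To see why the # character matters, here is a minimal Python sketch (not part of the original material) that splits the example link the way a browser does before it sends a request. The helper name request_target is made up for illustration.

```python
from urllib.parse import urlsplit


def request_target(url: str) -> str:
    """Return the path (plus query) a browser puts on the request line.

    The fragment stays in the browser and is never sent to the server.
    """
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return target


url = "http://www.example.com/file.pdf#a=javascript:alert('Alert')"
print(request_target(url))     # what the server sees: /file.pdf
print(urlsplit(url).fragment)  # what only the browser and the plug-in see
```

The server receives a perfectly ordinary request for /file.pdf; the payload lives entirely in the part of the URL that never leaves the client.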

Threat Assessment

  • Discoverability - 10
  • Reproducibility - 10
  • Exploitability - 7
    • Attack code not trivial but not very difficult to write.
    • Victim must click a link (email) or visit a malicious web site. Both attack vectors are examples of CSRF.
  • Affected Users - 10
    • PDF is a standard for printable documentation.
    • Most computers have Adobe Reader installed.
    • Most sites carry PDF files.
  • Damage Potential - 8
    • After a successful attack the code is executed in the context of the site that hosts the PDF file.
    • The attacker is in full control of the victim’s browser (think session hijacking, request forgery, etc.).
    • Individual users are fully compromised.
    • System compromise is possible through escalation.
    • When a locally-hosted PDF file is targeted attackers can gain access to the workstation (requires further tricks to be used, e.g. the QTL hack, but doable).
    • Damage potential depends on site content.

Where Are The Exploits?

The potential for damage is there, all right, but where are the exploits?

  • Many have expected doom and gloom.
  • But no large-scale attacks have been reported.
  • Why?

Where Do We Stand Today?

  • The excitement is gone.
  • Security-aware people have fixed the problems.
  • But how many vulnerable people and sites remain?
  • This problem is as dangerous as it was when it came out at the beginning of 2007.

Fixing The Problem - Users

In many ways this is a simple problem to solve. Just upgrade the client-side software:

  • Adobe Reader 8 not vulnerable.
  • Internet Explorer 7 not vulnerable.
  • Other PDF viewers (e.g. Foxit Reader) not vulnerable.

Alternatively, you can configure the browser not to open PDF files at all. However, we know that many users will not upgrade.

Fixing The Problem - Sites

The most important point to understand is this: it is not possible to detect the attack on the server, because the fragment identifier that carries the payload is never sent. Therefore our only option is to “protect” all PDF files, whether or not they are being attacked. The proposed mitigations revolve around three ideas:

  • Moving PDF files to some other domain name.
  • Preventing browsers from recognising PDF files. (Some are very stubborn in this regard.)
  • Forcing browsers to download PDF files.

This can be done via header modification in web server configuration (all files) or application (dynamic files only).

  • Key headers:
    Content-Type: application/octet-stream
    Content-Disposition: attachment; filename=x.pdf
    
  • Apache Fix (requires mod_headers):
    AddType application/octet-stream .pdf
    <FilesMatch "\.pdf$">
    Header set Content-Disposition "attachment; filename=document.pdf"
    </FilesMatch>
    

    Detailed instructions available from Adobe: http://www.adobe.com/support/security/advisories/apsa07-02.html
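For dynamic files the same two headers have to be set by the application. The following is a minimal sketch of how that could look as a Python WSGI wrapper; it is an illustration only, and the names force_pdf_download, demo_app and response_headers are hypothetical.

```python
def force_pdf_download(app, filename="document.pdf"):
    """Wrap a WSGI app so that PDF responses are sent as opaque attachments."""
    def wrapped(environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            content_type = ""
            for name, value in headers:
                if name.lower() == "content-type":
                    content_type = value.split(";", 1)[0].strip().lower()
            if content_type == "application/pdf":
                # Drop the original headers, then add the two key headers.
                headers = [(n, v) for n, v in headers
                           if n.lower() not in ("content-type",
                                                "content-disposition")]
                headers.append(("Content-Type", "application/octet-stream"))
                headers.append(("Content-Disposition",
                                'attachment; filename="%s"' % filename))
            return start_response(status, headers, exc_info)
        return app(environ, patched_start_response)
    return wrapped


def demo_app(environ, start_response):
    """Hypothetical stand-in for an application that generates a PDF."""
    start_response("200 OK", [("Content-Type", "application/pdf")])
    return [b"%PDF-1.4 demo"]


def response_headers(app):
    """Call a WSGI app once and return the headers it sent, as a dict."""
    seen = {}

    def start_response(status, headers, exc_info=None):
        seen.update(headers)

    app({}, start_response)
    return seen


print(response_headers(force_pdf_download(demo_app)))
```

Because the wrapper looks at the outgoing Content-Type rather than the request URI, it also covers PDF files that are generated on the fly.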

Analysis Of The Solution So Far

  • Advantages:
    • The web server configuration-based approach is very easy to implement, although it may not be possible to use it in all environments.
  • Weaknesses:
    • Changing application code can be time consuming.
    • Forcing downloads of PDF files is not very user friendly (many users will get confused).
    • Dynamically-generated PDF files are easy to forget (and thus miss).

Sidebar: Approaches That Do Not Work

  • Trying to detect attack from the server.
    • Not possible to see the attack from the server.
  • Relying on the Referer request header.
    • It’s not always there.
    • Can be forged.
  • Changing Content-Type only.
    • IE will sniff the content to determine the Content-Type.
  • URI Encryption & Requiring sessions:
    • Defeated using session fixation.
    • Not usable on public sites anyway.

Using Redirection

Amit Klein proposed a defence mechanism, which was subsequently discussed and refined on the mailing lists. While searching for a better solution many people noticed that it is possible to overwrite the attack payload using redirection and a harmless fragment identifier.

If we get:

http://example.com/test.pdf#x=ATTACK

We redirect to:

http://example.com/test.pdf#neutralise

Preventing Loops

But how do we tell that we have already redirected the user? If we cannot, we will just end up with an endless redirection loop. We can use one-time tokens as flags.

So this:

http://example.com/test.pdf#x=ATTACK

Is now redirected to:

http://example.com/test.pdf?TOKEN=XXXXXXX#neutralise
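The redirection step can be sketched in a few lines of Python. This is only an illustration of the idea; the function names and the default parameter name TOKEN are assumptions, and the token itself is treated as an opaque string here.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit


def has_token(request_target, token_name="TOKEN"):
    """True if the request already carries a token, i.e. was redirected once."""
    query = parse_qsl(urlsplit(request_target).query, keep_blank_values=True)
    return any(name == token_name for name, _ in query)


def redirect_location(request_target, token, token_name="TOKEN",
                      fragment="neutralise"):
    """Build the Location value: same path, token added, harmless fragment.

    request_target is the path and query as seen by the server; the attack
    fragment never arrives, so all we can do is overwrite it blindly.
    """
    parts = urlsplit(request_target)
    query = [(name, value)
             for name, value in parse_qsl(parts.query, keep_blank_values=True)
             if name != token_name]
    query.append((token_name, token))
    return "%s?%s#%s" % (parts.path, urlencode(query), fragment)


print(redirect_location("/test.pdf", "XXXXXXX"))
```

The browser replaces whatever fragment it had with the one in the Location header, and the presence of the token tells us not to redirect a second time.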

Token Generation

If we generate a completely random token then we’d have to start keeping state on the server (i.e. token repository, garbage collection of expired tokens).

  • It’s a fine approach.
  • But it can have non-negligible impact on the performance and maintenance of non-trivial sites.
  • It can also affect cacheability.

Alternatively, we can store state on the client.

  • Use cryptography to validate tokens.
  • Embed the expiry time.

Token Hijacking?

Unfortunately, our solution is not foolproof yet as the attacker can simply generate a number of tokens to use against his victims. We have to associate tokens with clients somehow. It would be nice to use the application session but not all sites have them.

  • Exploitation possible through session fixation.
  • Thus we have no choice but to use the IP address.

But what happens if the IP address changes (user behind a proxy)?

  • We fall back to forced download.
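A stateless token of this kind could look roughly like the Python sketch below. It is an assumption-laden illustration, not the algorithm ModSecurity uses: the token layout (digest, a "|" separator, then the expiry time), the choice of HMAC-SHA256, and all names are made up for the example.

```python
import hashlib
import hmac
import time


def _digest(client_ip, expiry, secret):
    """Keyed hash over the client address and the expiry time."""
    message = ("%s|%d" % (client_ip, expiry)).encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def make_token(client_ip, secret, timeout=10, now=None):
    """Create a token that is only valid for one IP address until it expires."""
    now = int(time.time()) if now is None else int(now)
    expiry = now + timeout
    return "%s|%d" % (_digest(client_ip, expiry, secret), expiry)


def token_is_valid(token, client_ip, secret, now=None):
    """Check the digest and the embedded expiry time; no server state needed."""
    now = int(time.time()) if now is None else int(now)
    digest, sep, expiry_text = token.partition("|")
    if not sep:
        return False
    try:
        expiry = int(expiry_text)
    except ValueError:
        return False
    expected = _digest(client_ip, expiry, secret)
    # Compare as bytes in constant time, then check the expiry.
    matches = hmac.compare_digest(digest.encode("utf-8"),
                                  expected.encode("utf-8"))
    return matches and now <= expiry


secret = b"hypothetical-secret-value"
token = make_token("192.0.2.10", secret, timeout=10, now=1000)
print(token_is_valid(token, "192.0.2.10", secret, now=1005))  # same client, in time
print(token_is_valid(token, "192.0.2.99", secret, now=1005))  # different client
```

When validation fails, for example because the client now arrives from a different proxy address, the fallback is the forced download described above.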

It's Not Foolproof!

There are still holes in our solution! If the attacker shares the same IP address as the victim (proxy, NAT) he will be able to obtain tokens to use in attacks.

  • The timeout feature does not help much.
  • If the attacker can get the victim to browse a malicious web site he can:
    • Generate responses dynamically while…
    • …obtaining valid tokens behind the scenes.

At best, we can prevent mass-exploitation, however focused attacks remain an issue.

A Foolproof Protection Mechanism Would...

A foolproof protection mechanism would:

  • Associate tokens with client SSL certificates. (Or to session IDs where sessions have already been associated with client SSL certificates.)
  • This would prevent session fixation.

And it would only work on:

  • Sites that have sessions, and
  • We would have to know where the session ID resides.

Not usable as a general purpose protection method.

Implementation Details

Most protection mechanisms rely on detecting the PDF extension in the request URI. Let’s have a look at some request types:

  • GET /innocent.pdf
  • GET /download.php/innocent.pdf
  • GET /download.php?file=innocent.pdf
  • GET /download.php?fileid=619
  • POST /generateReport.php (with a bunch of parameters in the request body)

To catch the last three cases we have to inspect the outgoing headers:

Content-Type: application/pdf
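The difference between the two detection strategies can be shown with a short Python sketch. The helper names are hypothetical, and the response headers are assumed to be available as a list of (name, value) pairs.

```python
from urllib.parse import urlsplit


def has_pdf_extension(request_target):
    """Request-side check: does the path end in .pdf?"""
    return urlsplit(request_target).path.lower().endswith(".pdf")


def is_pdf_response(response_headers):
    """Response-side check: is the outgoing Content-Type application/pdf?"""
    for name, value in response_headers:
        if name.lower() == "content-type":
            return value.split(";", 1)[0].strip().lower() == "application/pdf"
    return False


targets = [
    "/innocent.pdf",
    "/download.php/innocent.pdf",
    "/download.php?file=innocent.pdf",
    "/download.php?fileid=619",
    "/generateReport.php",
]
for target in targets:
    print(target, has_pdf_extension(target))
```

Only the first two targets are caught by the extension check; for the rest, the Content-Type of the response is the only reliable signal.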

Potential Performance Issue

There is a potential performance issue if we redirect a GET request based on what we see in the response headers.

  • The PDF is going to have to be generated twice.
  • Think long-running reports… not good.

There is a way to solve this but it is a bit of a stretch:

  • Store the response (PDF) into a temporary file.
  • Redirect request, serving the PDF (from the temporary file, without invoking the backend) when we see the corresponding token again.
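As a rough illustration of the store-and-redirect idea, the Python sketch below keeps generated responses keyed by token so that the backend is invoked only once. For brevity it uses an in-memory dictionary instead of temporary files, and the class name ResponseStash and its methods are invented for this example.

```python
import time


class ResponseStash:
    """Hold generated PDF bodies until the redirected request collects them."""

    def __init__(self, ttl=10):
        self.ttl = ttl
        self._items = {}  # token -> (expiry time, body)

    def put(self, token, body, now=None):
        """Store the body generated for the first (unredirected) request."""
        now = time.time() if now is None else now
        self._items[token] = (now + self.ttl, body)

    def pop(self, token, now=None):
        """Return the body once for a live token; otherwise return None."""
        now = time.time() if now is None else now
        expiry, body = self._items.pop(token, (0, None))
        return body if now <= expiry else None


stash = ResponseStash(ttl=10)
stash.put("token-1", b"%PDF-1.4 long-running report", now=100)
print(stash.pop("token-1", now=105))  # served without invoking the backend
print(stash.pop("token-1", now=106))  # None: a token can be collected only once
```

A real deployment would also need to purge entries that are never collected, which is part of why this approach is a stretch.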

Can We Deal With POST Requests?

No; all redirections are to a GET.

  • We lose POST parameters.

Well, strictly speaking, there is a way:

  • We could respond with a page that contains a self-submitting form with original parameters.
  • Or, as described in the previous section, store the response and issue a GET with a token to fetch it.

But that would be a bit too much.

  • It could break applications in subtle ways.
  • It’s probably “cheaper” to simply force PDF download in such cases.

Redirection Defense Implementations

There may be others... Let us know if you find any.

ModSecurity PDF Protect Directives

SecPdfProtect

Description: Enables the PDF XSS protection functionality.

Once enabled, access to PDF files is tracked. Direct access attempts are redirected to links that contain one-time tokens. Requests with valid tokens are allowed through unmodified. Requests with invalid tokens are also allowed through but with forced download of the PDF files. This implementation uses response headers to detect PDF files and thus can be used with dynamically generated PDF files that do not have the .pdf extension in the request URI.

SecPdfProtectMethod

Description: Configure desired protection method to be used when requests for PDF files are detected.

Possible values are TokenRedirection and ForcedDownload. The token redirection approach will attempt to redirect with tokens where possible. This allows PDF files to continue to be opened inline but only works for GET requests. Forced download always causes PDF files to be delivered as opaque binaries and attachments. The latter will always be used for non-GET requests. Forced download is considered to be more secure but may cause usability problems for users ("This PDF won't open anymore!").

Default: TokenRedirection
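Read together, the descriptions of SecPdfProtect and SecPdfProtectMethod amount to a small decision table. The Python sketch below is one reading of that text, not ModSecurity's actual code; the Action names and the choose_action function are hypothetical.

```python
from enum import Enum


class Action(Enum):
    PASS_THROUGH = "allow through unmodified"
    REDIRECT_WITH_TOKEN = "redirect to a link with a one-time token"
    FORCE_DOWNLOAD = "allow through, but force download"


def choose_action(is_pdf, http_method, has_token, token_valid,
                  protect_method="TokenRedirection"):
    """Pick what to do with a response, following the directive descriptions."""
    if not is_pdf:
        return Action.PASS_THROUGH
    # Forced download is used when configured and for all non-GET requests.
    if protect_method == "ForcedDownload" or http_method != "GET":
        return Action.FORCE_DOWNLOAD
    if not has_token:
        return Action.REDIRECT_WITH_TOKEN
    return Action.PASS_THROUGH if token_valid else Action.FORCE_DOWNLOAD


print(choose_action(True, "GET", has_token=False, token_valid=False))
print(choose_action(True, "POST", has_token=False, token_valid=False))
```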

SecPdfProtectSecret

Description: Defines the secret that will be used to construct one-time tokens.

You should use a reasonably long value for the secret (e.g. 16 characters is good). Once selected, the secret should not be changed, as that will break the tokens that were sent prior to the change. But it's not a big deal even if you do change it. It will just force download of PDF files with tokens that were issued in the last few seconds.

SecPdfProtectTimeout

Description: Defines the token timeout.

After a token expires it can no longer be used to allow access to a PDF file. The request will be allowed through but the PDF will be delivered as an attachment. Default: 10 (seconds)

SecPdfProtectTokenName

Description: Defines the name of the token.

The only reason you would want to change the name of the token is if you wanted to hide the fact you are running ModSecurity. It's a good reason but it won't really help as the adversary can look into the algorithm used for PDF protection and figure it out anyway. It does raise the bar slightly so go ahead if you want to. Default: PDFTOKEN

Testing Injection Rules

Let's run a quick test with the following ruleset:

# PDF Protection
SecPdfProtect On
SecPdfProtectTimeout 10
SecPdfProtectSecret 3790918688a87dc76496a5de6811ac1f
SecPdfProtectTokenName PDFPROTECT

These rules enable the PDF protection mechanisms. Now, if a client requests a PDF file on the protected site, their original request:

GET /documents/sample.pdf HTTP/1.1
Host: www.example.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive

Is intercepted by the SecPdfProtect directive, and a response similar to the following would be sent:

HTTP/1.1 307 Temporary Redirect
Date: Sun, 30 Sep 2007 07:51:19 GMT
Location: /documents/sample.pdf?PDFPROTECT=0f0cecf605568c08e7cb99d7cbeff8164d571d7d|1191138689#PDFP
Content-Length: 308
Keep-Alive: timeout=5, max=50
Connection: Keep-Alive
Content-Type: text/html; charset=iso-8859-1

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>307 Temporary Redirect</title>
</head><body>
<h1>Temporary Redirect</h1>
<p>The document has moved <a href="/documents/sample.pdf?PDFPROTECT=0f0cecf605568c08e7cb99d7cbeff8164d571d7d|1191138689#PDFP">here</a>.</p>
</body></html>

Once the client's browser follows the 307 Temporary Redirect location, ModSecurity validates the PDFPROTECT token. If the token is still valid, the request is allowed to continue unmodified. If the token is not valid (either because the client IP address does not match or because the allowed timeout has passed), the request is still allowed through, but the PDF is delivered as a forced download.
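The redirect in the example can be taken apart with a few lines of Python. The sketch assumes that the number after the "|" is a Unix timestamp marking the token's expiry; the example does not say so explicitly, but the arithmetic is consistent with the configured timeout.

```python
from email.utils import parsedate_to_datetime
from urllib.parse import parse_qs, urlsplit

# Values copied from the example response above.
location = ("/documents/sample.pdf"
            "?PDFPROTECT=0f0cecf605568c08e7cb99d7cbeff8164d571d7d|1191138689"
            "#PDFP")
date_header = "Sun, 30 Sep 2007 07:51:19 GMT"

parts = urlsplit(location)
token = parse_qs(parts.query)["PDFPROTECT"][0]
digest, expiry_text = token.split("|")

# Seconds between the response Date and the assumed expiry timestamp.
issued = int(parsedate_to_datetime(date_header).timestamp())
lifetime = int(expiry_text) - issued

print(parts.fragment)  # the harmless fragment that replaces any payload
print(len(digest))     # length of the hexadecimal hash part
print(lifetime)        # 10, matching SecPdfProtectTimeout 10
```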

Conclusion

There is no perfect solution - only a trade-off between security, usability, and performance. Isn't everything? Flaws to be aware of:

  • Does not protect from attackers sharing IP address with you.
  • Must fall back to forced download for dynamic requests.

In general:

  • Carefully examine your chosen defence method to understand exactly when you are protected!