Matt McCutchen's Web Site

Confused-deputy attacks in shared web hosting

Status: dormant; 2011 (supersedes any conflicting remarks left on this page; see the home page for definitions)

Many popular shared web hosting environments feature a single web server that runs under its own user ID and serves content from mutually untrusting customers who have shell access. This is a recipe for disastrous confused-deputy attacks unless special precautions are taken. One would think that any unix wizard who thinks about the setup for a little while would realize this, but the issue does not seem to be widely known (as of 2011-02-24), so I am attempting to publicize it here. I welcome comments.

The attack

I'll discuss Apache because I am most familiar with it; I have not looked into whether other popular web servers differ in ways that would hinder this class of attacks.

Suppose an Apache process running as user httpd serves sites http://alice.example.com/ and http://bob.example.com/ from document roots /home/alice/www and /home/bob/www, which belong to customers Alice and Bob, who have shell accounts of the same names. Alice creates a secret page at /home/alice/www/secret/index.html and configures password protection via /home/alice/www/secret/.htaccess. In order for the web server to serve the page to web clients bearing the password, she grants httpd read permission on these files.

Bob wants to read Alice's page but does not know the password, so he makes a symlink at /home/bob/www/oops.html pointing to /home/alice/www/secret/index.html and browses to http://bob.example.com/oops.html. When the web server opens /home/bob/www/oops.html, the OS transparently follows the symlink and opens Alice's page, since the httpd user has read access to it, and the web server sends the page to Bob. Apache applies htaccess files only from the directories in the original path, /home/bob/www/oops.html; it has no occasion to apply Alice's htaccess file that specifies the password protection.
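The mechanics can be reproduced without Apache. The following is a minimal Python sketch, not a model of any real server: naive_serve and build_demo are hypothetical names, and a temporary directory stands in for /home/alice and /home/bob. It shows only that a plain open() under Bob's document root follows his symlink into Alice's directory.

```python
import os
import tempfile


def naive_serve(docroot, url_path):
    """Serve a file the way a simple server might: join the path and open it.

    open() follows symlinks transparently, so the file that is actually
    read may live outside docroot.
    """
    full_path = os.path.join(docroot, url_path.lstrip("/"))
    with open(full_path, "rb") as f:
        return f.read()


def build_demo(base):
    """Create hypothetical document roots for alice and bob under base."""
    alice_secret = os.path.join(base, "alice", "www", "secret")
    bob_root = os.path.join(base, "bob", "www")
    os.makedirs(alice_secret)
    os.makedirs(bob_root)
    secret_page = os.path.join(alice_secret, "index.html")
    with open(secret_page, "wb") as f:
        f.write(b"alice's secret page")
    # Bob's symlink into Alice's password-protected directory.
    os.symlink(secret_page, os.path.join(bob_root, "oops.html"))
    return bob_root


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as base:
        bob_root = build_demo(base)
        # The request names only Bob's document root.
        print(naive_serve(bob_root, "/oops.html"))
```

Nothing in naive_serve ever looks at Alice's directory as a directory, which is why a per-directory access file there would never be consulted.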

There are many variations of this attack. Bob could steal the htaccess file itself and then the password file it refers to. He could make a symlink to Alice's directory from within the scope of an htaccess file of his own that will cause Alice's directory to be served in a compromising way. E.g., if Alice created an index.html to suppress automatic indexing of a directory, Bob could specify DirectoryIndex dummy and then grab a listing of the directory. If Bob has access to a web application platform that will run his code as httpd (CGI without suexec, mod_php, etc.), he may be able to simply write a program that reads Alice's files.

Symlink testing?

Apache and some web application platforms are capable of checking that the paths they use do not go through any symlinks. In Apache, this checking is activated by disabling the FollowSymLinks option. However, most implementations of symlink testing contain time-of-check-to-time-of-use race conditions that make them suitable only for mitigating webmaster mistakes, not preventing confused-deputy attacks by a malicious webmaster. I commend the Apache manual for calling this out.
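To make the race concrete, here is a small Python sketch of a check-then-open sequence. It is an illustration under stated assumptions, not Apache's actual code: has_symlink, racy_serve, and demo_race are hypothetical names, and the window callback is an artificial stand-in for whatever an attacker manages to do between the check and the open. A real attack has to win the timing; the hook just makes the outcome deterministic.

```python
import os
import tempfile


def has_symlink(docroot, rel_path):
    """Return True if any component of rel_path under docroot is a symlink."""
    current = docroot
    for part in rel_path.split("/"):
        current = os.path.join(current, part)
        if os.path.islink(current):
            return True
    return False


def racy_serve(docroot, rel_path, window=None):
    """Check for symlinks, then open: two separate steps.

    `window` is a hypothetical hook that runs after the check and before
    the open, standing in for the attacker's concurrent activity.
    """
    if has_symlink(docroot, rel_path):
        raise PermissionError("symlink in path")
    if window is not None:
        window()  # time of check has passed; time of use has not arrived
    with open(os.path.join(docroot, rel_path), "rb") as f:
        return f.read()


def demo_race(base):
    """Swap a regular file for a symlink inside the check/use window."""
    secret = os.path.join(base, "secret.html")
    docroot = os.path.join(base, "bob")
    os.mkdir(docroot)
    with open(secret, "wb") as f:
        f.write(b"secret")
    page = os.path.join(docroot, "page.html")
    with open(page, "wb") as f:
        f.write(b"harmless")

    def swap():
        os.remove(page)
        os.symlink(secret, page)

    return racy_serve(docroot, "page.html", window=swap)
```

In this sketch demo_race returns the secret file's contents even though the same request would be refused once the symlink is already in place at check time.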

Defeating the symlink testing in these systems is challenging but quite possible with appropriate techniques, foremost among them filesystem mazes. Inspired by a seminal paper on the topic, I prepared an exploit kit for a symlink-testing race in MySQL. Later, I successfully adapted it for the symlink testing in one popular proprietary web application platform. There is every reason to expect it could be adapted to Apache and other web application platforms.

One approach to symlink testing that does not have a race condition is to emulate path walking in userspace by checking and opening one component at a time, using O_NOFOLLOW to catch a non-symlink that is suddenly replaced with a symlink. However, I imagine this would be very slow, and I am unaware of it being used in practice.
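A rough Python sketch of that component-at-a-time walk follows. It assumes a Unix-like system where os.open supports dir_fd, treats the document root itself as trusted (it is opened normally), and rejects "." and ".." outright as a simplification of my own. The name open_no_symlinks is hypothetical. I have not measured its cost, so it says nothing about the performance concern.

```python
import os
import tempfile


def open_no_symlinks(docroot, rel_path):
    """Open rel_path under docroot, refusing symlinks in every component.

    Each component is opened relative to the descriptor of its parent
    with O_NOFOLLOW, so the symlink check and the open happen in one
    system call. Returns a file descriptor; raises OSError if a
    component is a symlink.
    """
    parts = [p for p in rel_path.split("/") if p]
    if not parts or any(p in (".", "..") for p in parts):
        raise ValueError("unsupported path")
    # Assumption: docroot itself is set up by the admin and is trusted.
    dir_fd = os.open(docroot, os.O_RDONLY | os.O_DIRECTORY)
    try:
        for part in parts[:-1]:
            next_fd = os.open(
                part,
                os.O_RDONLY | os.O_DIRECTORY | os.O_NOFOLLOW,
                dir_fd=dir_fd,
            )
            os.close(dir_fd)
            dir_fd = next_fd
        return os.open(parts[-1], os.O_RDONLY | os.O_NOFOLLOW, dir_fd=dir_fd)
    finally:
        os.close(dir_fd)
```

A caller would wrap the returned descriptor, for example with os.fdopen(fd, "rb"), and treat any OSError as a refusal to serve the request.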

Countermeasures

I can think of several ways a web server (likewise, web application platform) could prevent the attacks:

1. While running on behalf of Bob, temporarily take on a UID specific to Bob. Alice's document root would allow access only to the UID that the web server uses when running on her behalf. There are a few variants:

   a. setuid to Bob. This may be risky, but perhaps not much more so than suexec for CGI programs. This is a privileged operation; the web server would need to be running as root (risky) or otherwise be authorized to change to that specific ID (mainstream OSes do not have such a feature).
   b. setuid to a designated UID for Bob's document root, different from his own. Again a privileged operation.
   c. Adopt Bob's UID as an additional constraint on filesystem operations. This feature is not in mainstream OSes, but if it were added, it would not need to be a privileged operation and so would not require special arrangements for the web server to be able to use it.

2. Like #1c, but emulate the secondary permission check in the web server instead of changing the process credentials. To do this fully, the web server would have to emulate path walking as mentioned above to perform the incident permission checks. Checking only the ultimate target via an fstat after open would prevent most significant cases of data theft but would still leak the existence of files via error messages. The obvious approaches to plug this leak can be defeated via races again.

3. While running on behalf of Bob, chroot to a directory that does not contain Alice's document root. It may take some work to set up the environment for the web server to function correctly. In current OSes, chroot is a privileged operation due to interactions with hard links to setuid executables, but in principle that is easy to fix by subjecting processes that have undergone an unprivileged chroot to the same restrictions as processes being ptraced.
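The simpler form of the second countermeasure, checking only the ultimate target with fstat after open, might look like the Python sketch below. The function name open_if_owned_by is hypothetical, and reducing the secondary permission check to a bare owner comparison is my simplification; a fuller emulation would also consider group and mode bits. As noted above, it does not hide whether a file exists.

```python
import os
import tempfile


def open_if_owned_by(path, expected_uid):
    """Open path, then verify the opened file belongs to expected_uid.

    The check uses fstat on the descriptor, so it applies to the file
    that was actually opened, whatever symlinks the path went through.
    Returns a file descriptor; raises PermissionError on a mismatch.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        if os.fstat(fd).st_uid != expected_uid:
            raise PermissionError("file is not owned by the site's user")
    except BaseException:
        os.close(fd)
        raise
    return fd
```

When serving Bob's site, the server would pass Bob's UID as expected_uid, so a symlink leading to a file owned by Alice is opened but then refused before any data is sent.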

There is one other way to prevent the attacks, which does not involve modifying any software: add a secret random string to the path of each document root. However, one must be extremely diligent to prevent the string from leaking to other customers, e.g., in web application error messages, or in the server configuration or error log if they can be stolen by the same method. The admin should provide a convenience symlink to each document root so that the customer does not have to remember the string and will be less likely to leak it via command-line arguments on OSes where those are public. Obviously, this symlink must not be accessible to the web server.
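As an illustration of the provisioning step, here is a hedged Python sketch. The layout is my own assumption, not something prescribed above: document roots live at web_base/<random string>/<user>, web_base is made searchable but not listable (mode 0711) so the strings cannot be read with a directory listing, and the convenience symlink goes into a mode-0700 directory named private in the customer's home. The function name create_secret_docroot is hypothetical, and a real deployment would also need to set ownership, which is omitted here.

```python
import os
import secrets
import tempfile


def create_secret_docroot(web_base, home_dir, user):
    """Create a document root under an unguessable path component.

    Returns (docroot, link), where link is a convenience symlink kept
    in a directory that only the customer can enter.
    """
    token = secrets.token_urlsafe(32)  # secret random path component
    docroot = os.path.join(web_base, token, user)
    os.makedirs(docroot)
    # Assumption: others may traverse web_base but not list it.
    os.chmod(web_base, 0o711)

    private_dir = os.path.join(home_dir, "private")
    os.makedirs(private_dir, exist_ok=True)
    # Keep the symlink out of reach of the web server and other users.
    os.chmod(private_dir, 0o700)
    link = os.path.join(private_dir, "www")
    os.symlink(docroot, link)
    return docroot, link
```

The customer then works through the symlink, and the random string never needs to appear on a command line.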
