Status: dormant; 2011
Many popular shared web hosting environments feature a single web server that runs under its own user ID and serves content from mutually untrusting customers who have shell access. This is a recipe for disastrous confused-deputy attacks unless special precautions are taken. One would think that any Unix wizard who thinks about the setup for a little while would realize this, but the issue does not seem to be widely known (as of 2011-02-24), so I am attempting to publicize it here. I welcome comments.
I'll discuss Apache because I am most familiar with it; I have not looked into whether other popular web servers differ in ways that would hinder this class of attacks.
Suppose an Apache process running as user httpd serves sites http://alice.example.com/ and http://bob.example.com/ from document roots /home/alice/www and /home/bob/www, which belong to customers Alice and Bob, who have shell accounts of the same names. Alice creates a secret page at /home/alice/www/secret/index.html and configures password protection via /home/alice/www/secret/.htaccess. In order for the web server to serve the page to web clients bearing the password, she grants httpd read permission on these files.
Bob wants to read Alice's page but does not know the password, so he makes a symlink at /home/bob/www/oops.html pointing to /home/alice/www/secret/index.html and browses to http://bob.example.com/oops.html. When the web server opens /home/bob/www/oops.html, the OS transparently follows the symlink and opens Alice's page, since the httpd user has read access to it, and the web server sends the page to Bob. Apache applies htaccess files only from the directories in the original path, /home/bob/www/oops.html; it has no occasion to apply Alice's htaccess file that specifies the password protection.
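The mechanism can be reproduced without a web server. The Python sketch below uses a temporary directory in place of /home and a hypothetical serve() function standing in for the server: it looks for an access-control file only in the directories named by the request and then opens the file with an ordinary open(), which follows symlinks. It illustrates the attack under those assumptions and is not a model of Apache's actual code.

```python
import os
import tempfile


def serve(docroot, url_path):
    """Toy stand-in for a web server (hypothetical, not Apache's logic).

    Refuses the request if a ".htaccess" file exists in any directory of the
    *requested* path, then opens the file normally.
    """
    parts = [p for p in url_path.split("/") if p]
    directory = docroot
    checked = [docroot]
    for part in parts[:-1]:
        directory = os.path.join(directory, part)
        checked.append(directory)
    # Only directories on the requested path are consulted.
    for d in checked:
        if os.path.exists(os.path.join(d, ".htaccess")):
            raise PermissionError("password required")
    # open() follows symlinks transparently.
    with open(os.path.join(docroot, *parts)) as f:
        return f.read()


def demo():
    with tempfile.TemporaryDirectory() as home:
        alice = os.path.join(home, "alice", "www")
        bob = os.path.join(home, "bob", "www")
        os.makedirs(os.path.join(alice, "secret"))
        os.makedirs(bob)
        secret_page = os.path.join(alice, "secret", "index.html")
        with open(secret_page, "w") as f:
            f.write("alice's secret")
        with open(os.path.join(alice, "secret", ".htaccess"), "w") as f:
            f.write("placeholder for password protection\n")

        # A direct request through Alice's site is refused.
        try:
            serve(alice, "/secret/index.html")
            direct = "served"
        except PermissionError:
            direct = "refused"

        # Bob's symlink makes the same file reachable through his own site.
        os.symlink(secret_page, os.path.join(bob, "oops.html"))
        leaked = serve(bob, "/oops.html")
        return direct, leaked
```

Calling demo() shows the direct request being refused while the request through Bob's symlink returns Alice's page.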
There are many variations of this attack. Bob could steal the htaccess file itself and then the password file it refers to. He could make a symlink to Alice's directory from within the scope of an htaccess file of his own that will cause Alice's directory to be served in a compromising way. For example, if Alice created an index.html to suppress automatic indexing of a directory, Bob could specify DirectoryIndex dummy and then grab a listing of the directory. If Bob has access to a web application platform that will run his code as httpd (CGI without suexec, mod_php, etc.), he may be able to simply write a program that reads Alice's files.
Apache and some web application platforms are capable of checking that the paths they use do not go through any symlinks. In Apache, this checking is activated by disabling the FollowSymLinks option. However, most implementations of symlink testing contain time-of-check-to-time-of-use race conditions that make them suitable only for mitigating webmaster mistakes, not preventing confused-deputy attacks by a malicious webmaster. I commend the Apache manual for calling this out.
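To make the race concrete, here is a Python sketch of a check-then-open routine. The function names and the between_check_and_open hook are hypothetical; the hook exists only so that the example can trigger the swap deterministically, whereas a real attacker has to win the timing. It is not taken from any particular server's implementation.

```python
import os
import tempfile


def has_symlink(root, rel_path):
    """Return True if any component of rel_path under root is a symlink."""
    current = root
    for part in rel_path.split("/"):
        current = os.path.join(current, part)
        if os.path.islink(current):
            return True
    return False


def racy_open(root, rel_path, between_check_and_open=lambda: None):
    """Check for symlinks, then open: two separate steps, which is the flaw."""
    if has_symlink(root, rel_path):                  # time of check
        raise PermissionError("symlink in path")
    between_check_and_open()                         # the attacker's window
    with open(os.path.join(root, rel_path)) as f:    # time of use
        return f.read()


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        secret = os.path.join(tmp, "secret.html")
        with open(secret, "w") as f:
            f.write("alice's secret")
        bob = os.path.join(tmp, "bob")
        os.mkdir(bob)
        page = os.path.join(bob, "page.html")
        with open(page, "w") as f:
            f.write("harmless")

        def swap():
            # Replace the regular file with a symlink after the check passed.
            os.remove(page)
            os.symlink(secret, page)

        leaked = racy_open(bob, "page.html", swap)

        # With the symlink already in place before the check, it is caught.
        try:
            racy_open(bob, "page.html")
            caught = False
        except PermissionError:
            caught = True
        return leaked, caught
```

The check is sound only as long as nothing changes between the two steps, which is exactly what a malicious webmaster with shell access can arrange.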
Defeating the symlink testing in these systems is challenging but quite possible with appropriate techniques, foremost among them filesystem mazes. Inspired by a seminal paper on the topic, I prepared an exploit kit for a symlink-testing race in MySQL. Later, I successfully adapted it for the symlink testing in one popular proprietary web application platform. There is every reason to expect it could be adapted to Apache and other web application platforms.
One approach to symlink testing that does not have a race condition is to emulate path walking in userspace by checking and opening one component at a time, using O_NOFOLLOW to catch a non-symlink that is suddenly replaced with a symlink. However, I imagine this would be very slow, and I am unaware of it being used in practice.
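A minimal Python sketch of such a component-by-component walk follows. It is a variant of the idea rather than a literal transcription: instead of a separate check, each component is opened relative to the previous directory's file descriptor (Python's dir_fd parameter) with O_NOFOLLOW, so a component that is or becomes a symlink makes the open itself fail. The function name is hypothetical, the root directory is assumed to be trusted, and rejecting ".." is an extra simplification of mine.

```python
import os
import tempfile


def open_no_symlinks(root, rel_path):
    """Open rel_path beneath root read-only, refusing symlinks in every component.

    Returns a file descriptor. Raises OSError if a component is a symlink
    (the exact errno varies between operating systems).
    """
    parts = [p for p in rel_path.split("/") if p not in ("", ".")]
    if not parts or ".." in parts:
        raise ValueError("unsupported path")
    # Assumption: root itself is trusted, so it is opened normally.
    dir_fd = os.open(root, os.O_RDONLY | os.O_DIRECTORY)
    try:
        for part in parts[:-1]:
            next_fd = os.open(
                part,
                os.O_RDONLY | os.O_DIRECTORY | os.O_NOFOLLOW,
                dir_fd=dir_fd,
            )
            os.close(dir_fd)
            dir_fd = next_fd
        return os.open(parts[-1], os.O_RDONLY | os.O_NOFOLLOW, dir_fd=dir_fd)
    finally:
        os.close(dir_fd)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        alice_secret = os.path.join(tmp, "alice", "secret")
        os.makedirs(alice_secret)
        with open(os.path.join(alice_secret, "index.html"), "w") as f:
            f.write("alice's secret")
        bob = os.path.join(tmp, "bob")
        os.makedirs(os.path.join(bob, "pub"))
        with open(os.path.join(bob, "pub", "index.html"), "w") as f:
            f.write("bob's page")
        # One symlink to a file and one to a directory.
        os.symlink(os.path.join(alice_secret, "index.html"),
                   os.path.join(bob, "oops.html"))
        os.symlink(alice_secret, os.path.join(bob, "dirlink"))

        results = {}
        for rel in ("pub/index.html", "oops.html", "dirlink/index.html"):
            try:
                with os.fdopen(open_no_symlinks(bob, rel)) as f:
                    results[rel] = f.read()
            except OSError:
                results[rel] = "refused"
        return results
```

The cost is one system call per path component, which is consistent with the concern about speed.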
I can think of several ways a web server (likewise, web application platform) could prevent the attacks:

- setuid to Bob. This may be risky, but perhaps not much more so than suexec for CGI programs. This is a privileged operation; the web server would need to be running as root (risky) or otherwise be authorized to change to that specific ID (mainstream OSes do not have such a feature).
- setuid to a designated UID for Bob's document root, different from his own. Again a privileged operation.
- fstat after open would prevent most significant cases of data theft but would still leak the existence of files via error messages. The obvious approaches to plug this leak can be defeated via races again.
- chroot to a directory that does not contain Alice's document root. It may take some work to set up the environment for the web server to function correctly. In current OSes, chroot is a privileged operation due to interactions with hard links to setuid executables, but in principle that is easy to fix by subjecting processes that have undergone an unprivileged chroot to the same restrictions as processes being ptraced.

There is one other way to prevent the attacks, which does not involve modifying any software: add a secret random string to the path of each document root. However, one must be extremely diligent to prevent the string from leaking to other customers, e.g., in web application error messages, or in the server configuration or error log if they can be stolen by the same method. The admin should provide a convenience symlink to each document root so that the customer does not have to remember the string and will be less likely to leak it via command-line arguments on OSes where those are public. Obviously, this symlink must not be accessible to the web server.
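Of the server-side options listed above, the fstat-after-open check is the easiest to sketch without privileges. The Python below assumes that the check is an ownership comparison, i.e. the server knows which UID owns the site being served and refuses any opened file owned by someone else; the function name and that policy are my assumptions, not something the list spells out.

```python
import os
import tempfile


def open_if_owned_by(path, expected_uid):
    """Open path, then verify ownership on the open descriptor.

    Using fstat() on the descriptor, rather than stat() on the path, means
    the check applies to the very file that was opened.
    """
    fd = os.open(path, os.O_RDONLY)  # follows symlinks, like a normal open
    try:
        if os.fstat(fd).st_uid != expected_uid:
            raise PermissionError("file is not owned by the site's customer")
    except BaseException:
        os.close(fd)
        raise
    return fd


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        page = os.path.join(tmp, "index.html")
        with open(page, "w") as f:
            f.write("hello")
        me = os.getuid()

        # The owner's own file is served.
        with os.fdopen(open_if_owned_by(page, me)) as f:
            own = f.read()

        # Pretend the site belongs to a different UID (me + 1, hypothetical).
        outcomes = []
        for path in (page, os.path.join(tmp, "missing.html")):
            try:
                os.close(open_if_owned_by(path, me + 1))
                outcomes.append("served")
            except PermissionError:
                outcomes.append("wrong owner")
            except FileNotFoundError:
                outcomes.append("not found")
        return own, outcomes
```

The two different failure outcomes in demo() are the existence leak mentioned in the list: a caller can tell a foreign file that exists from a path that does not.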
I took a quick look for previous discussion of the issue on 2011-03-13, and here is what I found. Apparently it has been discussed before, but I think I am the first to state it in full generality.