Matt McCutchen's Filesystem Enhancements
========================================

The filesystem support in my custom kernel has a number of enhancements
that address what I consider to be deficiencies in functionality and
security.

  - Traversing sticky directories
  - Creating entries in sticky directories
  - Hard-linking others' files
  - Moving others' directories
  - Meaningful symlink permissions
  - lchmod
  - lutimes
  - writelink (not implemented)
  - Userspace support

There is a section below documenting each. Finally there is a section
on userspace support.

Traversing sticky directories
-----------------------------

Using this kernel, to access a file in someone else's sticky directory,
you must own the target file or have some permission (r, w, or x) on it.
If you try to look up a file that isn't yours and on which you have no
permission, you get EPERM. Readdir still returns all directory entries
(and on some filesystems it gives you the files' i-numbers and types);
in a future version of the custom kernel, readdir may omit entries you
aren't allowed to traverse.

If you have only execute permission on a directory, the nonexistence of
files in the directory is concealed: if you try to look up a nonexistent
file, you get EPERM instead of ENOENT. If you have read permission on
the directory, you can list it with readdir and see whether a given file
name is listed, while if you have write permission, you can attempt to
create a new file with that name and see whether you get EACCES because
of an existing, unwritable file. Concealing a file's nonexistence would
be pointless in either of these cases, so you get ENOENT if you have
read and/or write permission in addition to execute.

Why is this behavior useful? If you want to let everyone read a folder
in your home directory, you can set its permissions to 755. To get to
the folder, however, people need execute on your home directory, and
giving them execute invites lots of abuse.
People can guess filenames and see if those files exist in your home
directory; if the files exist, people can stat them and learn their
atimes, mtimes, and sizes. Maybe they won't hit upon any of your
personal files, but the names of your mailbox and your dotfiles are
likely to be well-known. People can find out when you last got mail and
how big the mail was (if they've been watching the size of your
mailbox), what programs have written configuration files recently, and
so forth.

If you use my kernel, you can stop the abuse by making your home
directory sticky (mode 1711). People can still get to the public folder,
but trying to access any other name will give them EPERM. They can't
even learn whether files of given names exist, let alone stat them.

Alternatively, you can set your home directory to mode 1755, in which
case others can list the names of all files but only see stat
information for the public ones. If ls fails to stat a file, it shows
question marks for the file's attributes and gives the name a red
background. Maybe you've already seen this if you've listed a directory
of the silly mode 600. Here's what another user's home directory might
look like:

    drwxr-xr-t 81 matt matt 4096 Jan 18 16:13 .
    drwxr-xr-x  3 root root 4096 Jan  8 16:46 ..
    ?---------  ? ?    ?       ?            ? .bashrc
    drwxr-xr-x  1 matt matt   18 Jan 18 16:14 public
    ?---------  ? ?    ?       ?            ? private

If you use qmail, making your home directory sticky will tell qmail to
hold your email. I recommend you modify qmail to use the setuid bit
instead of the sticky bit to hold email since setuid on directories
currently does nothing. I created such a modified qmail for my computer.

Creating entries in sticky directories
--------------------------------------

In the standard kernel, linking or moving someone else's file into
someone else's sticky directory is legal but irreversible. My kernel
forbids this.
Just as it doesn't let you delete an entry for someone else's file from
someone else's sticky directory, it doesn't let you create such an
entry.

Hard-linking others' files
--------------------------

My school's Linux server hosts a number of Web sites for various school
clubs. The members of each club belong to a group with permission to
write to their Web site. This was set up long before default ACLs were
available, so the sysadmins needed a way to allow people to write to
files added to Web sites by other people with restrictive umasks. So
they wrote a cron job that would forcibly run "chmod -R g+w" on each Web
site. They were lucky that nobody was devious enough to hard-link
/etc/passwd into a Web site, let its group-write bit get turned on, and
compromise the system.

One can avoid this scenario by restricting hard linking. This kernel
only lets you create a hard link to a file you don't own if you
"control" the directory entry by which you name the file, meaning that
neither stickiness nor write permission of the containing directory
prevents you from deleting the entry. In 99% of cases, this means you
can hard link to a file if and only if you can move it. Immutable and
append-only attributes on the containing directory affect moving the
file but not hard-linking it.

The upshot is that the only risk you take by running a recursive
permission resetter is that a user who controls a directory entry to one
of your files may be able to cause that file's permissions to change to
the new value you are applying. If every file of yours whose directory
entry is controlled by someone else also grants that person read and
write permission, this risk is not a problem. This is almost always the
case, but watch out for programs dumping core in untrusted directories.

It appears to me that most legitimate purposes for hard links to others'
files (e.g., saving disk space) are served equally well by symlinks.

Moving others' directories
--------------------------

At my school, people taking a certain computer class once copied their
programs into the teacher's dropbox on the above-mentioned Linux server.
The submitted directories got 755 permissions, and the teacher could not
delete them when he was finished grading them!

In general, if you own a directory, you should be able to delete any
file from it. However, you can only delete a directory if you have
enough write permission on the stuff inside to empty out the directory
first. I considered having a system-wide trash area to which people can
move offending directories and a cron job to clean out the area as root,
but a restriction on rename(2) ruins this approach: since moving a
directory causes its .. entry to change, you can only move directories
that you can write.

My custom kernel lifts the restriction so that a system-wide trash area
can be implemented. If you wish to implement one, keep in mind that you
need a trash area on each filesystem that is writable by non-root users,
and your move-to-trash command needs to select the trash area on the
same filesystem as the file being trashed.

Meaningful symlink permissions
------------------------------

This is the most exciting of my enhancements, but it is also potentially
the most disruptive. On reiserfs filesystems mounted with the new
"symlink-perms" mount option, my kernel allows symlinks to have
permissions and/or access ACLs just like other files. A newly created
symlink gets its permissions from the creator's umask or the directory's
default ACL, as usual. Permissions have the following meanings:

  - readlink requires read permission
  - writelink (not implemented) will require write permission
  - traversal requires execute permission

This way, you can let people access some of the symlinks in a directory
but not others.
Or, if you are using files of secret names in conjunction with
directories that grant only execute permission, you can give others
execute-only symlinks that let them access the files but not learn their
names. In addition, if you're using a symlink as a convenient miniature
text file, you can make it non-executable so people don't try to follow
it. (Unfortunately, this currently can't be done when you create a
link.)

Symlink permissions are a nice complement to the enhanced sticky bit.
Using earlier versions of the custom kernel, if you opened your home
directory to others (mode 1711), there was still no way to hide
symlinks: they always appeared normally in the file listing. Now you can
hide them by giving them 700 permissions.

My /usr/local/bin has root:wheel ownership, 2775 permissions, and a
default ACL of 775. Few installers respect the default ACL, so I
occasionally have to fix the permissions of installed files. Now I can
scan down the permissions column of the directory listing without being
distracted by "lrwxrwxrwx" entries.

Again, symlink permissions are only enforced, initialized as above, and
changeable by users on reiserfs filesystems mounted with the option
"symlink-perms". On other filesystems, new symlinks get 777 permissions,
and anyone who can stat a symlink can traverse and readlink it. (The
necessary changes to support symlink permissions are split between the
VFS layer and individual filesystems, and I didn't feel like modifying
every single filesystem implementation, so I modified only reiserfs
because it is my favorite.) (I stopped enforcing symlink permissions on
all filesystems when it interfered with readlinking /proc/*/fd/* entries
for files open for writing only.)

If you use RPM, I recommend that you do not enable symlink permissions
on your root filesystem because they might confuse RPM verification. On
the other hand, users would probably like symlink permissions supported
on their home directories.
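
The meaning of the three permission bits on a symlink can be sketched as
a small C model. This is an illustration only, not the kernel code: the
enum and function names are hypothetical, and it assumes that the
owner/group/other class is chosen the same way as for ordinary files. It
ignores ACLs, supplementary groups, and root.

```c
#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical names for the three operations on a symlink. */
enum link_op { LINK_READLINK, LINK_WRITELINK, LINK_TRAVERSE };

/* Map each operation to the permission bit the text says it needs. */
mode_t link_required_bit(enum link_op op)
{
    switch (op) {
    case LINK_READLINK:  return 04;  /* read */
    case LINK_WRITELINK: return 02;  /* write (not implemented) */
    case LINK_TRAVERSE:  return 01;  /* execute */
    }
    return 0;  /* unknown operation: nothing is granted */
}

/*
 * Would the caller be allowed to perform `op` on a symlink with the
 * given mode and ownership?  Class selection follows the usual Unix
 * order: owner first, then group, then other.
 */
bool link_op_allowed(mode_t link_mode, uid_t link_uid, gid_t link_gid,
                     uid_t caller_uid, gid_t caller_gid, enum link_op op)
{
    mode_t bits;
    mode_t need = link_required_bit(op);

    if (caller_uid == link_uid)
        bits = (link_mode >> 6) & 07;
    else if (caller_gid == link_gid)
        bits = (link_mode >> 3) & 07;
    else
        bits = link_mode & 07;

    return need != 0 && (bits & need) == need;
}
```

In this model, a symlink of mode 700 is usable only by its owner, a
symlink of mode 711 can be traversed by others but not read with
readlink, and a symlink of mode 644 can be read but not followed.
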

lchmod
------

This system call changes the permissions on a file without following it
if it happens to be a symlink. (Plain chmod will change the permissions
on the target of a symlink.) Of course, you must own the file. If the
file is indeed a symlink but symlink permissions are disabled on the
filesystem, you get ENOTSUP.

    #313: int lchmod(const char *linkpath, mode_t mode);
    #314: int lchmodat(int relative_to_fd, const char *linkpath, mode_t mode);

By the way, you can change a symlink's access ACL with lsetxattr(2) or
"setfattr -h"; if ACLs and/or symlink permissions are disabled, you get
ENOTSUP.

lutimes
-------

As lchmod is a counterpart to chmod that does not follow symlinks,
lutimes is a counterpart to utimes that does not follow symlinks. You
can use it to change the atime and mtime of a symlink on any filesystem.
(Originally symlink times could only be changed on reiserfs, but this
capability doesn't seem too dangerous; a filesystem implementation that
really can't handle changing symlink times should complain in setattr.)
As with utimes, you must have write permission to set the atime and
mtime to the current time, and you must own the file to set the atime
and mtime arbitrarily.

    #315: int lutimes(const char *path, const struct timeval times[2]);
    #316: int lutimesat(int relative_to_fd, const char *path, const struct timeval times[2]);

writelink (not implemented)
---------------------------

My kernel adds two system calls to let you change a symlink's target
in-place. They are not yet implemented; they follow the path to the link
but then give ENOSYS. Their declarations and system-call ID numbers are
as follows:

    #311: int writelink(const char *linkpath, const char *new_target);
    #312: int writelinkat(int relative_to_fd, const char *linkpath, const char *new_target);

I plan to consult the reiserfs people to learn how to implement changing
symlink targets in-place.
If this is practical, I will then add writelink support for reiserfs,
controlled by a mount option "writelink". You probably wouldn't want to
enable "writelink" on filesystems that still create their symlinks with
777 permissions.

Userspace support
-----------------

I have made all the necessary changes to the kernel to support the
enhancements described here. Some, which merely tighten security, show
up in userspace only as additional errors. Others require corresponding
changes to userspace tools and libraries to be useful. For example, to
call any of the six new system calls, you must either use syscall(2) and
provide the system call number given here or use a customized C library
that knows about the calls. I have made a customized glibc.

Some command-line utilities could also use enhancement. Eventually I
plan to customize coreutils to add support for "chmod -h", "touch -h",
"getfacl -h", and "setfacl -h" and to make the "+" indicating nontrivial
ACLs appear when it should on symlinks in "ls -l" output. In the
meantime, I have prepared a small collection of proof-of-concept
userspace utilities that work but are rather inconvenient to use.

There's lchmod:

    $ lchmod 0700 mylink
    $ lchmod 0775 mylink

There's lutimes:

    $ lutimes myfile 1146518442 0 1146518442 0  # atime{s,ns} mtime{s,ns}
    $ lutimes myfile                            # both to current time

And there's even writelink:

    $ writelink mylink newtarget
    writelink: Function not implemented

--------------
Matt McCutchen
hashproduct@gmail.com
http://kepreon.com/~matt/