Matt McCutchen's Web SiteUtilities  (Top, Basic file management, RPM software management, Web log analysis, Stow, retex, ftc, ftx, gitar, patchsync, xsltdepcomp, continusync, Isolated Firefox, ntfsresizecopy, git subtree-lite, Bottom).  Email me about this page.


Status: intermittently active, parts obsolete; 2005-present (supersedes any conflicting remarks left on this page; see the home page for definitions)

Here I collect utilities I use or previously used on my Linux system. Some of them may work on other Unix-like systems. They vary widely in maturity: caveat emptor!  Your feedback and contributions are welcome: please email me.

Since 2020-09-02, the actual files are in this git repository for your convenience in keeping your copies up to date and merging modifications, except as indicated on this page.  The main documentation is still maintained on this page, although many of the files also have comments in them.

Basic file management

RPM software management

Maintenance of traditional mutable root filesystems

I use these tools to help manage my traditional Fedora system on which RPM transactions mutate the root filesystem over time.  They have some limitations and make some assumptions specific to my setup, but the latter should be easy to change.  I'd like to move to a system that generates the root filesystem reproducibly from a specification, likely reusing code or at least ideas from Fedora Silverblue (Silverblue doesn't appear to do everything I need out of the box), which would make these tools obsolete.  In the meantime, rpm-audit keeps at least the set of installed packages reproducible (with minor caveats).  With the packaged versions of configuration files saved by rpmconf-matt, it wouldn't be hard to write a tool that diffs the root filesystem against the packaged state, though I haven't done so yet.  (If you do, let me know!)  However, the problem remains that scriptlets mutate things from their packaged state, so one has to know what changes from scriptlets are expected (or what expected changes from scriptlets are missing!).  Efficient re-running of scriptlets (ideally incremental, if there's any way to achieve that) is a major problem that any tool for reproducible generation of the root filesystem would have to solve.

The simple system-update script brings together all three tools to do a complete system upgrade, audit, and file merge.

If you want to install some packages from an updates-testing or similar repository (or even a different Fedora release with dnf --releasever=X) without permanently enabling the whole repository, you'll need to dnf download the packages to a permanently enabled local repository in order to pass rpm-audit.  See the next subsection for a consideration about doing that.

Downloading all RPMs built from a given SRPM

If you copy RPMs from a repository that you don't enable by default into your own repository, it's best to copy the complete set of packages built from each SRPM.  Otherwise, if you later install another package built from the same SRPM without first adding it to your repository, you may end up with mismatching RPMs.  Unfortunately, as of this writing (2020-09-18), dnf repoquery does not have a built-in option to list all RPMs built from given SRPMs, but you can use this dnf-repoquery-by-srpm tool instead.  So the command to download all packages built from given SRPMs to the current directory would be something like:

dnf download [OPTIONS] $(dnf-repoquery-by-srpm [OPTIONS] N-V-R.src.rpm [N2-V2-R2.src.rpm ...])

Note that options like --enablerepo or --releasever must be passed in two places.
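
Because the options have to appear twice, it can help to keep them in one variable.  The following is a minimal sketch, not one of the tools on this page: the function name download_by_srpm and the variable DNF_OPTS are hypothetical, and it assumes that no option contains whitespace.

```shell
#!/bin/sh
# Sketch: pass the same dnf options to both commands.
# DNF_OPTS and download_by_srpm are hypothetical names, not part of
# dnf or dnf-repoquery-by-srpm.
DNF_OPTS=${DNF_OPTS:-"--enablerepo=updates-testing"}

download_by_srpm() {
    # Arguments: one or more N-V-R.src.rpm names.
    # $DNF_OPTS is deliberately unquoted so that it splits into
    # separate options; this assumes no option contains whitespace.
    pkgs=$(dnf-repoquery-by-srpm $DNF_OPTS "$@") || return 1
    if [ -z "$pkgs" ]; then
        echo "no RPMs found for: $*" >&2
        return 1
    fi
    # Unquoted again: one package per word.
    dnf download $DNF_OPTS $pkgs
}
```

It would be called as download_by_srpm N-V-R.src.rpm, with DNF_OPTS set beforehand if the default shown here is not what you want.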

Alternatively, if you want an entire Fedora update (which may include RPMs built from more than one SRPM), you can use dnf updateinfo --list, but many third-party repositories do not have an analogous update system that publishes metadata for dnf updateinfo.

RPM building against the local dnf configuration

If you want to build custom RPMs locally, I highly recommend Mock (available in Fedora in the mock package).  It builds in an isolated environment, helping you achieve functional reproducibility.  (Note that achieving bit-for-bit reproducibility often requires much more legwork.  Also, Mock is not designed to protect your system from malicious build inputs; I recommend using a separate Qubes OS VM for that.)

However, an obstacle you're likely to encounter, at least in Fedora, is that by default, Mock builds against a standard dnf configuration for the Fedora repositories.  You'd probably prefer to build against your own system's dnf repository configuration, which might include third-party repositories or even your own previously built custom RPMs.  For RPM source repositories formatted like the official Fedora ones, you can achieve this with this Mock configuration file, which should be placed in /etc/mock.  You can select it by passing -r host to Mock or by symlinking /etc/mock/default.cfg to host.cfg.  If you are using fedpkg mockbuild, you must pass --mock-config host, since fedpkg's default choice of Mock configuration is based on the package rather than the system default.

In the past, I've used a similar Mock configuration to build custom Qubes OS RPMs for VMs rather than using the official qubes-builder, which works somewhat differently.  If and when I need that configuration again and bring it up to date, I will post it here.  (Building RPMs for dom0 requires a different process since your VM repository configuration won't match dom0's.)

As a reminder, in addition to using one of these configuration files, you probably want to set the following in your ~/.config/mock.cfg:

config_opts['macros']['%packager'] = 'YOUR_NAME <YOUR_EMAIL>'

Web log analysis

Here are the tools I use to analyze the server logs for this web site to understand how people are using the site and prioritize improvements, since a quick web search didn't find another tool that did what I wanted.  Notable features:

Sample overview output (some portions omitted and comments added):

=== STATUS CODE 200 ===
  86 "GET /site/style.css"
    64 ""
     9 ""
  77 "GET /bigint/"
    40 "-"
    23 ""
   # More concise output if all requests for the same URL have the same referrer
   4 "GET /escape/icon.png" ""
=== STATUS CODE 404 ===
   2 "GET //wordpress/wp-includes/wlwmanifest.xml" "-"    # abuse
   # Oops... site misconfiguration fixed on 2020-09-16
   1 "GET /app-downloads/escapesetup-windows-201609050.mattmccutchen202008280.exe" ""

The obvious site-specific parameters are taken from a separate configuration file, but there are plenty of other assumptions specific to my setup that you may need to change in order to use this.

Stow

My modified version of Stow 1.3.3 (very old) with the following enhancements:

retex

retex is a TeX wrapper script that makes TeX compiling fit more nicely into build processes.  For example, it exits nonzero immediately if an error occurs, and it repeats until a fixed point is reached in order to handle LaTeX references correctly. Better tools may exist.
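
To illustrate the "repeat until a fixed point" idea, here is a minimal sketch.  It is not the retex code itself: the function name run_until_fixed_point, the choice of comparing checksums of one auxiliary file, and the limit of five runs are all assumptions made for the example.

```shell
#!/bin/sh
# Sketch of a "repeat until fixed point" loop; NOT the actual retex code.
# run_until_fixed_point CMD AUXFILE [MAX_RUNS]
#   CMD      shell command line that performs one TeX run
#   AUXFILE  file whose content must stop changing (e.g. paper.aux)
run_until_fixed_point() {
    cmd=$1
    aux=$2
    max=${3:-5}
    prev=
    i=0
    while [ "$i" -lt "$max" ]; do
        i=$((i + 1))
        # Stop immediately with a nonzero status if the run fails.
        sh -c "$cmd" || return 1
        # Checksum of the auxiliary file after this run (empty if missing).
        cur=$(cksum "$aux" 2>/dev/null) || cur=
        # Unchanged since the previous run: fixed point reached.
        if [ "$i" -gt 1 ] && [ "$cur" = "$prev" ]; then
            return 0
        fi
        prev=$cur
    done
    echo "no fixed point after $max runs" >&2
    return 2
}
```

For a hypothetical paper.tex, it might be called as run_until_fixed_point 'pdflatex -interaction=nonstopmode -halt-on-error paper.tex' paper.aux.  The sketch always runs the command at least twice, which is wasteful when the auxiliary file was already up to date.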

ftc, ftx

ftc packages a file tree in a single file of a simplistic format that I designed, and ftx extracts such package files.  They have the same purposes as tar -c and tar -x respectively but have no bells or whistles.  They handle binary files safely, but a package of only text files is itself a text file.

gitar

gitar ("git archive") uses the git backend to make really small packages out of file trees with lots of redundancy.  ungitar unpacks the packages; it requires ftx.

A .gitar package consists of an ftc package containing a bare git repository whose HEAD is the original tree and whose objects are all stored in a single pack.  Hence, git will represent similar files in the original tree as deltas.
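
The structure described above can be approximated with stock tools.  This is a rough sketch under stated assumptions, not the gitar script: tar stands in for ftc, the tree is recorded as an ordinary commit, and the names pack_tree and repo.git are made up for the example.

```shell
#!/bin/sh
# Sketch of the idea behind a .gitar package; NOT the actual gitar script.
# Assumptions: tar stands in for ftc, and the tree is stored as a commit.
# pack_tree SRC_DIR OUT_FILE
pack_tree() {
    src=$(cd "$1" && pwd) || return 1
    out=$2
    work=$(mktemp -d) || return 1
    repo=$work/repo.git
    git init -q --bare "$repo" &&
    (
        # Stage and commit the tree from SRC_DIR into the bare repository.
        cd "$src" &&
        git --git-dir="$repo" --work-tree=. add -A . &&
        git --git-dir="$repo" --work-tree=. \
            -c user.name=pack_tree -c user.email=pack_tree@localhost \
            commit -q --allow-empty -m "packed tree"
    ) &&
    # Collect all objects into a single pack so that similar files
    # can be stored as deltas against each other.
    git --git-dir="$repo" repack -a -d -q &&
    tar -cf "$out" -C "$work" repo.git
    status=$?
    rm -rf "$work"
    return $status
}
```

To unpack such a file, extract it with tar -xf and then run git clone repo.git on the result to get the tree back.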

gitar is great for compressing together several versions of the same piece of software.  I had seven versions of my custom rsync lying around, each about 585 KB as a tar-bz2 package.  I unpacked them all inside a single folder and gitared that folder; the resulting gitar package was only 865 KB.  Of course, if you can be bothered to import the sequence of versions into git as a proper sequence of commits, that's much better.

Note (2008-06-01): A while ago, git gained a standardized binary format for "bundles"; I should change gitar and ungitar to use bundles rather than my ftc format.

patchsync

patchsync synchronizes a trunk, a branch, and a patch that contains the differences between the two.  If the trunk or patch changes, it updates the branch; if the branch changes, it updates the patch.  I developed patchsync to help me follow branches of rsync, but I no longer use it for that purpose.  Depending on your situation, you may prefer a more sophisticated patch-management tool such as StGIT.

To set up a patchsync staging directory, run:

patchsync --new trunk patch branch where-to-create-staging

Then, to synchronize, run:

patchsync staging

Read the gigantic comment at the top of patchsync for much more information.
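
The two update directions can be pictured with plain diff and patch.  This is only a sketch of the concept, not how patchsync is implemented: the function names are hypothetical, and it assumes that the trunk and the branch are directories named without slashes in the current directory.

```shell
#!/bin/sh
# Sketch of the two update directions; NOT the actual patchsync logic.
# Assumption: TRUNK and BRANCH are plain directory names (no slashes)
# in the current directory, so that "patch -p1" strips exactly them.

# update_patch TRUNK BRANCH PATCH: regenerate PATCH from the two trees.
update_patch() {
    # diff exits 0 (no differences) or 1 (differences); 2 means trouble.
    diff -urN "$1" "$2" > "$3" || [ $? -eq 1 ]
}

# update_branch TRUNK PATCH BRANCH: rebuild BRANCH as TRUNK plus PATCH.
update_branch() {
    rm -rf "$3" &&
    cp -R "$1" "$3" || return 1
    # An empty patch means the branch is identical to the trunk.
    [ -s "$2" ] || return 0
    # -p1 strips the leading directory name that diff -r recorded.
    patch -s -p1 -d "$3" < "$2"
}
```

After editing the branch, one would run update_patch trunk branch my.diff; after the trunk or the patch changes, update_branch trunk my.diff branch.  Deciding which side changed is left out of this sketch.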

Version log

From 2006.12.16 to 2006.12.24, a development version of patchsync was mistakenly identified as version 2.2.  As of 2006.12.24, the real version 2.2 is posted.

xsltdepcomp

xsltdepcomp (named after depcomp) runs an XSL transform with xsltproc and generates a dependency makefile for the files read, thanks to xsltproc --load-trace. It is used in the build system for this web site.

continusync

continusync is a Perl script around inotifywait and rsync that performs continuous mirroring, as suggested by Buck Huppmann. It is currently experimental and rather inefficient, but it does appear to work in simple cases.  If you want to use it, I would be much obliged if you improved it as necessary and sent me the improved version.

Isolated Firefox

firefox-isolated is a Firefox wrapper script that creates and uses a disposable profile.  You can make it harder for people to correlate your activities across multiple Web sites by browsing each site with a separate Firefox profile created by this script.  This script was inspired by the Facebook Beacon outrage.

To use this script, install it in your $PATH and name your master Firefox profile (from which the disposable ones will be copied) 00000000.master; then run firefox-isolated.  Your mileage may vary.
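
The disposable-profile idea looks roughly like the following.  This is a sketch, not the firefox-isolated script: the function name firefox_disposable and the variable PROFILE_ROOT are hypothetical, and the default location ~/.mozilla/firefox for the master profile is an assumption.

```shell
#!/bin/sh
# Sketch of the disposable-profile idea; NOT the actual firefox-isolated
# script. PROFILE_ROOT is a hypothetical variable; its default is an
# assumption about where the master profile lives.
firefox_disposable() {
    root=${PROFILE_ROOT:-$HOME/.mozilla/firefox}
    master=$root/00000000.master
    if [ ! -d "$master" ]; then
        echo "missing master profile: $master" >&2
        return 1
    fi
    tmp=$(mktemp -d) || return 1
    # Copy the master profile's contents into a throwaway directory.
    cp -R "$master/." "$tmp/" || { rm -rf "$tmp"; return 1; }
    # --no-remote keeps this instance separate from any running Firefox.
    firefox --no-remote --profile "$tmp" "$@"
    status=$?
    # Discard the profile once Firefox exits.
    rm -rf "$tmp"
    return $status
}
```

Each call starts from a fresh copy of the master profile, so nothing from the session is kept once Firefox exits.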

Last update 2007-12-02: Initial posting.  Seems to work.

As of 2020-09-02, Firefox containers are much more convenient to use, although it's possible there are some Firefox features (perhaps some of these?) that containers and the like do not yet properly isolate but a separate profile would.

ntfsresizecopy

ntfsresizecopy copies an NTFS filesystem from one block device to another, resizing it to the size of the destination device in the process.  (It uses ntfsprogs.)  This is EXPERIMENTAL; after using this script, you should mount the destination read-only and check that everything looks intact.

An expanding copy is just done with ntfsclone followed by ntfsresize.  A shrinking copy is done by running ntfsclone and ntfsresize on devices specially crafted with the Linux device-mapper (requires dmsetup and losetup); you may save time by checking first that the shrinkage is possible with `ntfsresize -n -s SIZE SRC'.

The special shrinking technique should be applicable to any filesystem type that has an in-place shrinking command that doesn't write outside the new size.  Just change the calls to ntfsclone and ntfsresize; ntfsclone can be replaced by a dd of the beginning of the source for filesystems that don't have a sparse clone command.
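
The expanding case described above amounts to two commands.  The following sketch is not the ntfsresizecopy script; the function name expand_copy and the RUN dry-run switch are hypothetical, and by default it only prints what it would run.

```shell
#!/bin/sh
# Sketch of the expanding case only; NOT the actual ntfsresizecopy script.
# RUN is a hypothetical dry-run switch: by default commands are printed,
# not executed. Set RUN= (empty) to really run them, at your own risk.
RUN=${RUN-echo}

# expand_copy SRC_DEVICE DEST_DEVICE
expand_copy() {
    src=$1
    dest=$2
    # Clone the filesystem onto the (larger) destination device.
    $RUN ntfsclone --overwrite "$dest" "$src" || return 1
    # With no --size, ntfsresize grows the filesystem to the device size.
    $RUN ntfsresize "$dest"
}
```

As with the real script, mount the destination read-only afterwards and check that everything looks intact.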

Change log

git subtree-lite

git subtree-lite is a tool to manage modified versions of content imported from other git repositories, now deprecated in favor of Braid, which is roughly equivalent but more mature.  The source repository remains available for historical interest.

Modification time of this page's main source file: 2020-09-18 19:19:44 +0000
Except where otherwise noted, Matt McCutchen waives his copyright to the content of this site.  This site comes with absolutely no warranty.  Why?