for people who want to generate the file list using a find(1)
command or a script.
+File list structure in memory
+
+ Rather than one big array, perhaps have a tree in memory mirroring
+ the directory tree.
+
+ This might make sorting much faster! (I'm not sure it's a big CPU
+ problem, mind you.)
+
+ It might also reduce memory use in storing repeated directory names
+ -- again I'm not sure this is a problem.
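
A minimal sketch of the tree idea, in Python for brevity rather than rsync's C. The names `DirNode`, `add_path` and `walk_sorted` are made up for illustration, and the "files before subdirectories" ordering is an assumption, not rsync's actual sort order. Each directory name is stored once, and sorting works on one directory's entries at a time instead of on one big array.

```python
class DirNode:
    """One directory: its name is stored once, however many files it holds."""

    def __init__(self, name):
        self.name = name
        self.subdirs = {}   # child directory name -> DirNode
        self.files = []     # plain file names in this directory


def add_path(root, path):
    """Insert a slash-separated relative file path into the tree."""
    *dirs, filename = path.split("/")
    node = root
    for d in dirs:
        node = node.subdirs.setdefault(d, DirNode(d))
    node.files.append(filename)


def walk_sorted(node, prefix=""):
    """Yield full paths, sorting each directory's entries separately."""
    for name in sorted(node.files):
        yield prefix + name
    for name in sorted(node.subdirs):
        yield from walk_sorted(node.subdirs[name], prefix + name + "/")


root = DirNode("")
for p in ["src/main.c", "README", "src/util/io.c", "doc/rsync.1", "src/flist.c"]:
    add_path(root, p)

listing = list(walk_sorted(root))
```

Whether this is actually faster or smaller than a flat array would need measuring, as the note above says.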
Performance
not sure this makes sense with modern mallocs. At any rate it will
make us allocate a huge amount of memory for large file lists.
- We can try using the GNU/SVID/XPG mallinfo() function to get some
- heap statistics.
-
Hard-link handling
can end up with many empty directories. We might avoid this by
lazily creating such directories.
+
zlib
- Perhaps don't use our own zlib. Will we actually be incompatible,
- or just be slightly less efficient?
+ Perhaps don't use our own zlib.
+
+ Advantages:
+
+ - will automatically be up to date with bugfixes in zlib
+
+ - can leave it out for small rsync on e.g. recovery disks
+
+ - can use a shared library
+
+ - avoids people breaking rsync by trying to do this themselves and
+ messing up
+
+ Should we ship zlib for systems that don't have it, or require
+ people to install it separately?
+
+ Apparently this will make us incompatible with versions of rsync
+ that use the patched version of zlib. Probably the simplest way to
+ handle this is to just disable gzip (with a warning) when talking to
+ old versions.
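
The fallback described above could look roughly like the following sketch (Python for brevity). The function name and the cutoff constant are hypothetical: this TODO does not say which protocol version would be the first to use stock zlib, so `99` is only a placeholder.

```python
import warnings

# Hypothetical cutoff: the first protocol version assumed to use stock zlib.
# The real number is not given anywhere in this TODO.
FIRST_STOCK_ZLIB_PROTOCOL = 99


def use_compression(requested, remote_protocol):
    """Return True if compression should be enabled for this connection."""
    if not requested:
        return False
    if remote_protocol < FIRST_STOCK_ZLIB_PROTOCOL:
        # Old peers expect the patched zlib stream, so fall back to no
        # compression and tell the user why.
        warnings.warn("remote rsync is too old for stock zlib; compression disabled")
        return False
    return True
```

The transfer still succeeds against an old peer, just without compression, which seems preferable to refusing the connection.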
+
logging
Add --with-socks, and then perhaps a command-line option to put them
on or off. This might be more reliable than LD_PRELOAD hacks.
+Better statistics:
+
+ <Rasmus> mbp: hey, how about an rsync option that just gives you the
+ summary without the list of files? And perhaps gives more
+ information like the number of new files, number of changed,
+ deleted, etc. ?
+ <mbp> Rasmus: nice idea
+ <mbp> there is --stats
+ <mbp> but at the moment it's very tridge-oriented
+ <mbp> rather than user-friendly
+ <mbp> it would be nice to improve it
+ <mbp> that would also work well with --dry-run
+
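As a rough illustration of the summary Rasmus asks for, the sketch below counts new, changed, deleted and unchanged files from two listings. The `summarize` function and its dict-of-tuples input are assumptions made for this example, not existing rsync code, and "changed" here simply means that size or modification time differs.

```python
def summarize(source, dest):
    """Count new, changed, deleted and unchanged files.

    Both arguments map a relative path to a (size, mtime) tuple.
    """
    new = sum(1 for p in source if p not in dest)
    deleted = sum(1 for p in dest if p not in source)
    changed = sum(1 for p in source if p in dest and source[p] != dest[p])
    unchanged = len(source) - new - changed
    return {"new": new, "changed": changed, "deleted": deleted, "unchanged": unchanged}


source = {"a.txt": (10, 100), "b.txt": (20, 200), "c.txt": (30, 300)}
dest = {"b.txt": (20, 200), "c.txt": (31, 250), "old.txt": (5, 50)}
summary = summarize(source, dest)
```

Because nothing is transferred to compute these counts, the same summary could be printed after a dry run.
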
+TDB:
+
+ Rather than storing the file list in memory, store it in a TDB.
+
+ This *might* make memory usage lower while building the file list.
+
+ Hashtable lookup will mean files are not transmitted in order,
+ though... hm.
+
+ This would neatly eliminate one of the major post-fork shared data
+ structures.
+
+
PLATFORMS ------------------------------------------------------------
Win32
we are correct to call close(), because shutdown() discards
untransmitted data.
+DEVELOPMENT ----------------------------------------------------------
+
+Splint
+
+ Build rsync with Splint to try to find security holes. Add
+ annotations as necessary. Keep track of the number of warnings
+ found initially, and see how many of them are real bugs, or real
+ security bugs. Knowing the percentage of likely hits would be
+ really interesting for other projects.
+
+Torture test
+
+ Something that just keeps running rsync continuously over a data set
+ likely to generate problems.
+
+Cross-testing
+
+ Run current rsync versions against significant past releases.
+
+Memory debugger
+
+ jra recommends Valgrind:
+
+ http://devel-home.kde.org/~sewardj/
+
DOCUMENTATION --------------------------------------------------------
Update README
Indicate whether files are new, updated, or deleted
+ At end of transfer, show how many files were or were not transferred
+ correctly.
+
internationalization
Change to using gettext(). Probably need to ship this for platforms
fairly directly into rsync commands: it just needs to remember the
current host, directory and so on. We can probably even do
completion of remote filenames.
-
-%K%