Reorganized and cleaned up TODO list.
-*- indented-text -*-

BUGS ---------------------------------------------------------------
Fix hardlink reporting 2002/03/25
Fix progress indicator to not corrupt log
lchmod question
Do not rely on having a group called "nobody"
Incorrect timestamps (Debian #100295)
Win32

FEATURES ------------------------------------------------------------
server-imposed bandwidth limits
rsyncd over ssh
Use chroot only if supported
Allow supplementary groups in rsyncd.conf 2002/04/09
Handling IPv6 on old machines
Other IPv6 stuff:
Add ACL support 2001/12/02
Lazy directory creation
Conditional -z for old protocols
proxy authentication 2002/01/23
SOCKS 2002/01/23
FAT support
Allow forcing arbitrary permissions 2002/03/12
--diff david.e.sewell 2002/03/15
Add daemon --no-detach and --no-fork options

DOCUMENTATION --------------------------------------------------------
Update README
Keep list of open issues and todos on the web site
Update web site from CVS
Perhaps redo manual as SGML

LOGGING --------------------------------------------------------------
Make dry run list all updates 2002/04/03
Memory accounting
Improve error messages
Better statistics: Rasmus 2002/03/08
Perhaps flush stdout like syslog
Log daemon sessions that just list modules
Log child death on signal
Keep stderr and stdout properly separated (Debian #23626)
Log errors with function that reports process of origin
verbose output David Stein 2001/12/20
Add reason for transfer to file logging
debugging of daemon 2002/04/08
internationalization

DEVELOPMENT --------------------------------------------------------
Handling duplicate names
Use generic zlib 2002/02/25
TDB: 2002/03/12
Splint 2002/03/12
Memory debugger
Create release script
Add machines to build farm

PERFORMANCE ----------------------------------------------------------
File list structure in memory
Traverse just one directory at a time
Hard-link handling
Allow skipping MD4 file_sum 2002/04/08
Accelerate MD4
String area code

TESTING --------------------------------------------------------------
Torture test
Cross-test versions 2001/08/22
Test on kernel source
Test large files
Create mutator program for testing
Create configure option to enable dangerous tests
If tests are skipped, say why.
Test daemon feature to disallow particular options.
Create pipe program for testing
Create test makefile target for some tests
Test "refuse options" works

RELATED PROJECTS -----------------------------------------------------
rsyncsh
http://rsync.samba.org/rsync-and-debian/
rsyncable gzip patch
rsyncsplit as alternative to real integration with gzip?
reverse rsync over HTTP Range



BUGS ---------------------------------------------------------------

Fix hardlink reporting 2002/03/25
  (was: There seems to be a bug with hardlinks)

  mbp/2 build$ ls -l /tmp/a /tmp/b -i
  /tmp/a:
  total 32
  2568307 -rw-rw-r-- 3 mbp mbp 29 Mar 25 17:30 a1
  2568307 -rw-rw-r-- 3 mbp mbp 29 Mar 25 17:30 a2
  2568307 -rw-rw-r-- 3 mbp mbp 29 Mar 25 17:30 a3
  2568310 -rw-rw-r-- 5 mbp mbp 29 Mar 25 17:30 a4
  2568310 -rw-rw-r-- 5 mbp mbp 29 Mar 25 17:30 a5
  2568310 -rw-rw-r-- 5 mbp mbp 29 Mar 25 17:30 b1
  2568310 -rw-rw-r-- 5 mbp mbp 29 Mar 25 17:30 b2
  2568310 -rw-rw-r-- 5 mbp mbp 29 Mar 25 17:30 b3

  /tmp/b:
  total 32
  2568309 -rw-rw-r-- 3 mbp mbp 29 Mar 25 17:30 a1
  2568309 -rw-rw-r-- 3 mbp mbp 29 Mar 25 17:30 a2
  2568309 -rw-rw-r-- 3 mbp mbp 29 Mar 25 17:30 a3
  2568311 -rw-rw-r-- 5 mbp mbp 29 Mar 25 17:30 a4
  2568311 -rw-rw-r-- 5 mbp mbp 29 Mar 25 17:30 a5
  2568311 -rw-rw-r-- 5 mbp mbp 29 Mar 25 17:30 b1
  2568311 -rw-rw-r-- 5 mbp mbp 29 Mar 25 17:30 b2
  2568311 -rw-rw-r-- 5 mbp mbp 29 Mar 25 17:30 b3
  mbp/2 build$ rm -r /tmp/b && ./rsync -avH /tmp/a/ /tmp/b
  building file list ... done
  created directory /tmp/b
  ./
  a1
  a4
  a2 => a1
  a3 => a2
  wrote 350 bytes read 52 bytes 804.00 bytes/sec
  total size is 232 speedup is 0.58
  mbp/2 build$ rm -r /tmp/b
  mbp/2 build$ ls -l /tmp/b
  ls: /tmp/b: No such file or directory
  mbp/2 build$ rm -r /tmp/b && ./rsync -avH /tmp/a/ /tmp/b
  rm: cannot remove `/tmp/b': No such file or directory
  mbp/2 build$ rm -f -r /tmp/b && ./rsync -avH /tmp/a/ /tmp/b
  building file list ... done
  created directory /tmp/b
  ./
  a1
  a4
  a2 => a1
  a3 => a2
  wrote 350 bytes read 52 bytes 804.00 bytes/sec
  total size is 232 speedup is 0.58
  mbp/2 build$ ls -l /tmp/b
  total 32
  -rw-rw-r-- 3 mbp mbp 29 Mar 25 17:30 a1
  -rw-rw-r-- 3 mbp mbp 29 Mar 25 17:30 a2
  -rw-rw-r-- 3 mbp mbp 29 Mar 25 17:30 a3
  -rw-rw-r-- 5 mbp mbp 29 Mar 25 17:30 a4
  -rw-rw-r-- 5 mbp mbp 29 Mar 25 17:30 a5
  -rw-rw-r-- 5 mbp mbp 29 Mar 25 17:30 b1
  -rw-rw-r-- 5 mbp mbp 29 Mar 25 17:30 b2
  -rw-rw-r-- 5 mbp mbp 29 Mar 25 17:30 b3
  mbp/2 build$ ls -l /tmp/a
  total 32
  -rw-rw-r-- 3 mbp mbp 29 Mar 25 17:30 a1
  -rw-rw-r-- 3 mbp mbp 29 Mar 25 17:30 a2
  -rw-rw-r-- 3 mbp mbp 29 Mar 25 17:30 a3
  -rw-rw-r-- 5 mbp mbp 29 Mar 25 17:30 a4
  -rw-rw-r-- 5 mbp mbp 29 Mar 25 17:30 a5
  -rw-rw-r-- 5 mbp mbp 29 Mar 25 17:30 b1
  -rw-rw-r-- 5 mbp mbp 29 Mar 25 17:30 b2
  -rw-rw-r-- 5 mbp mbp 29 Mar 25 17:30 b3

  -- --


Fix progress indicator to not corrupt log

  Progress indicator can produce corrupt output when transferring directories:

  main/binary-arm/
  main/binary-arm/admin/
  main/binary-arm/base/
  main/binary-arm/comm/8.56kB/s 0:00:52
  main/binary-arm/devel/
  main/binary-arm/doc/
  main/binary-arm/editors/
  main/binary-arm/electronics/s 0:00:53
  main/binary-arm/games/
  main/binary-arm/graphics/
  main/binary-arm/hamradio/
  main/binary-arm/interpreters/
  main/binary-arm/libs/6.61kB/s 0:00:54
  main/binary-arm/mail/
  main/binary-arm/math/
  main/binary-arm/misc/

  -- --


lchmod question

  I don't think we handle this properly on systems that don't have the
  call. Are there any such?

  -- --


Do not rely on having a group called "nobody"

  http://www.linuxbase.org/spec/refspecs/LSB_1.1.0/gLSB/usernames.html

  On Debian it's "nogroup"

  -- --


Incorrect timestamps (Debian #100295)

  A bit hard to believe, but apparently it happens.

  -- --


Win32

  Don't detach, because this messes up --srvany.

  http://sources.redhat.com/ml/cygwin/2001-08/msg00234.html

  -- --

FEATURES ------------------------------------------------------------

server-imposed bandwidth limits

  -- --


rsyncd over ssh

  There are already some patches to do this.

  BitKeeper uses a server whose login shell is set to bkd. That's
  probably a reasonable approach.

  -- --


Use chroot only if supported

  If the platform doesn't support it, then don't even try.

  If running as non-root, then don't fail, just give a warning.
  (There was a thread about this a while ago?)

    http://lists.samba.org/pipermail/rsync/2001-August/thread.html
    http://lists.samba.org/pipermail/rsync/2001-September/thread.html

  -- --


Allow supplementary groups in rsyncd.conf 2002/04/09

  Perhaps allow supplementary groups to be specified in rsyncd.conf;
  then make the first one the primary gid and all the rest be
  supplementary gids.

  -- --


Handling IPv6 on old machines

  The KAME IPv6 patch is nice in theory but has proved a bit of a
  nightmare in practice. The basic idea of their patch is that rsync
  is rewritten to use the new getaddrinfo()/getnameinfo() interface,
  rather than gethostbyname()/gethostbyaddr() as in rsync 2.4.6.
  Systems that don't have the new interface are handled by providing
  our own implementation in lib/, which is selectively linked in.

  The problem with this is that it is really hard to get right on
  platforms that have a half-working implementation, so redefining
  these functions clashes with system headers, and leaving them out
  breaks. This affects at least OSF/1, RedHat 5, and Cobalt, which
  are moderately important.

  Perhaps the simplest solution would be to have two different files
  implementing the same interface, and choose either the new or the
  old API. This is probably necessary for systems that e.g. have
  IPv6, but gethostbyaddr() can't handle it. The Linux manpage claims
  this is currently the case.

  In fact, our internal sockets interface (things like
  open_socket_out(), etc) is much narrower than the getaddrinfo()
  interface, and so probably simpler to get right. In addition, the
  old code is known to work well on old machines.

  We could drop the rather large lib/getaddrinfo files.

  -- --


Other IPv6 stuff:

  Implement suggestions from http://www.kame.net/newsletter/19980604/
  and ftp://ftp.iij.ad.jp/pub/RFC/rfc2553.txt

  If a host has multiple addresses, then try to connect to all of
  them in order until we get through. (getaddrinfo may return
  multiple addresses.) This is kind of implemented already.

  Possibly also when starting as a server we may need to listen on
  multiple passive addresses. This might be a bit harder, because we
  may need to select on all of them. Hm.

  Define a syntax for IPv6 literal addresses. Since they include
  colons, they tend to break most naming systems, including ours.
  Based on the HTTP IPv6 syntax, I think we should use

    rsync://[::1]/foo/bar
    [::1]::bar

  which should just take a small change to the parser code.
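
  As an illustration only, the bracket syntax proposed above could be
  recognised roughly as follows. This is a Python sketch, not rsync's
  parser: parse_spec() and both regular expressions are hypothetical
  names, and only the rsync:// URL form and the host::module form are
  handled.

```python
import re

# Hypothetical patterns: a host is either a bracketed IPv6 literal or a
# plain name that contains no colon, slash or bracket.
_URL = re.compile(r"^rsync://(?:\[([^\]]+)\]|([^/:\[\]]+))(?::(\d+))?/(.*)$")
_DAEMON = re.compile(r"^(?:\[([^\]]+)\]|([^:\[\]/]+))::(.*)$")


def parse_spec(spec):
    """Return (host, port, path) for rsync:// URLs and host::module specs.

    port is None when the spec does not name one.  Anything else raises
    ValueError.
    """
    m = _URL.match(spec)
    if m:
        host = m.group(1) or m.group(2)
        port = int(m.group(3)) if m.group(3) else None
        return host, port, m.group(4)
    m = _DAEMON.match(spec)
    if m:
        return (m.group(1) or m.group(2)), None, m.group(3)
    raise ValueError("unrecognized spec: %r" % spec)
```

  The brackets are what keep the colons inside the address from being
  mistaken for the "::" module separator or for a port number.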

  -- --


Add ACL support 2001/12/02

  Transfer ACLs. Need to think of a standard representation.
  Probably better not to even try to convert between NT and POSIX.
  Possibly can share some code with Samba.

  -- --


Lazy directory creation

  With the current common --include '*/' --exclude '*' pattern, people
  can end up with many empty directories. We might avoid this by
  lazily creating such directories.

  -- --


Conditional -z for old protocols

  After we get the @RSYNCD greeting from the server, we know its
  version but we have not yet sent the command line, so we could just
  remove the -z option if the server is too old.

  For ssh invocation it's not so simple, because we actually use the
  command line to start the remote process. However, we only actually
  do compression in token.c, and we could therefore, once we discover
  the remote version, emit an error if it's too old. I'm not sure if
  that's a good tradeoff or not.
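
  A rough Python sketch of the daemon case described above, for
  illustration only. The cut-off MIN_COMPRESS_PROTOCOL is a made-up
  value, both function names are hypothetical, and options bundled
  together (such as -avz) are deliberately not handled.

```python
MIN_COMPRESS_PROTOCOL = 20  # hypothetical threshold, not taken from rsync


def parse_greeting(line):
    """Extract the protocol version from an '@RSYNCD: <n>' greeting line."""
    prefix = "@RSYNCD: "
    if not line.startswith(prefix):
        raise ValueError("not an rsync daemon greeting: %r" % line)
    # Tolerate a ".sub" suffix after the main version number if present.
    return int(line[len(prefix):].strip().split(".")[0])


def adjust_options(options, remote_version):
    """Return the option list, without -z if the server is too old."""
    if remote_version >= MIN_COMPRESS_PROTOCOL:
        return list(options)
    return [opt for opt in options if opt not in ("-z", "--compress")]
```

  The point of the sketch is only the ordering: the greeting is read
  first, and the option list is edited before anything is sent.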

  -- --


proxy authentication 2002/01/23

  Allow RSYNC_PROXY to be http://user:pass@proxy.foo:3128/, and do
  HTTP Basic Proxy-Authentication.

  Multiple schemes are possible, up to and including the insanity that
  is NTLM, but Basic probably covers most cases.
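
  To make the proposal concrete, here is a Python sketch of how such a
  value might be split and turned into a Basic credential. The function
  name proxy_settings() is hypothetical, and accepting a bare host:port
  value alongside the URL form is an assumption of this sketch.

```python
import base64
from urllib.parse import unquote, urlsplit


def proxy_settings(rsync_proxy):
    """Split an RSYNC_PROXY value into (host, port, auth_header_or_None)."""
    if "://" not in rsync_proxy:
        rsync_proxy = "http://" + rsync_proxy  # bare host:port form
    parts = urlsplit(rsync_proxy)
    header = None
    if parts.username is not None:
        user = unquote(parts.username)
        password = unquote(parts.password or "")
        # Basic authentication is base64("user:password").
        token = base64.b64encode(("%s:%s" % (user, password)).encode("utf-8"))
        header = "Proxy-Authorization: Basic " + token.decode("ascii")
    return parts.hostname, parts.port, header
```

  The header would accompany the CONNECT request sent to the proxy.
  Basic credentials are only encoded, not encrypted, so they cross the
  network in recoverable form.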

  -- --


SOCKS 2002/01/23

  Add --with-socks, and then perhaps a command-line option to put them
  on or off. This might be more reliable than LD_PRELOAD hacks.

  -- --


FAT support

  rsync to a FAT partition on a Unix machine doesn't work very well at
  the moment. I think we get errors about invalid filenames and
  perhaps also trying to do atomic renames.

  I guess the code to do this is currently #ifdef'd on Windows;
  perhaps we ought to intelligently fall back to it on Unix too.

  -- --


Allow forcing arbitrary permissions 2002/03/12

  On 12 Mar 2002, Dave Dykstra <dwd@bell-labs.com> wrote:
  > If we would add an option to do that functionality, I
  > would vote for one that was more general which could mask
  > off any set of permission bits and possibly add any set of
  > bits. Perhaps a chmod-like syntax if it could be
  > implemented simply.

  I think that would be good too. For example, people uploading files
  to a web server might like to say

    rsync -avzP --chmod a+rX ./ sourcefrog.net:/home/www/sourcefrog/

  Ideally the patch would implement as many of the GNU chmod semantics
  as possible. I think the mode parser should be a separate function
  that passes back something like a (mask,set) description to the rest
  of the program. For bonus points there would be a test case for the
  parser.

  Possibly also --chown

  (Debian #23628)
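
  A sketch of what a separate mode parser might look like, written in
  Python for brevity. It is an assumption-laden illustration, not the
  proposed patch: parse_mode() and apply_mode() are hypothetical names,
  only the classes ugoa, the operators + - = and the bits r w x X are
  understood, and an omitted class is treated like "a" (GNU chmod would
  consult the umask). Because X depends on the file, each clause
  carries a flag in addition to the mask and the bits to set.

```python
import re

_CLAUSE = re.compile(r"^([ugoa]*)([-+=])([rwxX]*)$")
_WHO = {"u": 0o700, "g": 0o070, "o": 0o007, "a": 0o777}
_PERM = {"r": 0o444, "w": 0o222, "x": 0o111}


def parse_mode(spec):
    """Parse e.g. 'a+rX,go-w' into a list of (who_mask, op, bits, cond_x)."""
    clauses = []
    for text in spec.split(","):
        m = _CLAUSE.match(text)
        if not m:
            raise ValueError("bad mode clause: %r" % text)
        who, op, perms = m.groups()
        who_mask = 0
        for w in who or "a":  # simplification: no class means "a"
            who_mask |= _WHO[w]
        bits = 0
        for p in perms.replace("X", ""):
            bits |= _PERM[p]
        clauses.append((who_mask, op, bits & who_mask, "X" in perms))
    return clauses


def apply_mode(mode, clauses, is_dir):
    """Apply parsed clauses to a permission word and return the new one."""
    for who_mask, op, bits, cond_x in clauses:
        # X grants execute only to directories and already-executable files.
        if cond_x and (is_dir or mode & 0o111):
            bits |= 0o111 & who_mask
        if op == "+":
            mode |= bits
        elif op == "-":
            mode &= ~bits
        else:  # "=" replaces the bits of the selected classes
            mode = (mode & ~who_mask) | bits
    return mode & 0o7777
```

  With this sketch, "a+rX" turns a 0600 file into 0644 and a 0700
  directory into 0755, which is the web-server case mentioned above.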

  -- --


--diff david.e.sewell 2002/03/15

  Allow people to specify the diff command. (Might want to use wdiff,
  gnudiff, etc.)

  Just diff the temporary file with the destination file, and delete
  the tmp file rather than moving it into place.

  Interaction with --partial.

  Security interactions with daemon mode?

  -- --


Add daemon --no-detach and --no-fork options

  Very useful for debugging. Also good when running under a
  daemon-monitoring process that tries to restart the service when the
  parent exits.

  -- --

DOCUMENTATION --------------------------------------------------------

Update README

  -- --


Keep list of open issues and todos on the web site

  -- --


Update web site from CVS

  -- --


Perhaps redo manual as SGML

  The man page is getting rather large, and there is more information
  that ought to be added.

  Texinfo source is probably a dying format.

  Linuxdoc looks like the most likely contender. I know DocBook is
  favoured by some people, but it's so bloody verbose, even with emacs
  support.

  -- --

LOGGING --------------------------------------------------------------

Make dry run list all updates 2002/04/03

  --dry-run is too dry

  Mark Santcroos points out that -n fails to list files which have
  only metadata changes, though it probably should.

  There may be a Debian bug about this as well.

  -- --


Memory accounting

  At exit, show how much memory was used for the file list, etc.

  Also we do a weird exponential-growth allocation in flist.c. I'm
  not sure this makes sense with modern mallocs. At any rate it will
  make us allocate a huge amount of memory for large file lists.

  -- --


Improve error messages

  If we hang or get SIGINT, then explain where we were up to. Perhaps
  have a static buffer that contains the current function name, or
  some kind of description of what we were trying to do. This is a
  little easier on people than needing to run strace/truss.

  "The dungeon collapses! You are killed." Rather than "unexpected
  eof" give a message that is more detailed if possible and also more
  helpful.

  If we get an error writing to a socket, then we should perhaps
  continue trying to read to see if an error message comes across
  explaining why the socket is closed. I'm not sure if this would
  work, but it would certainly make our messages more helpful.

  What happens if a directory is missing -x attributes? Do we lose
  our load? (Debian #28416) Probably fixed now, but a test case would
  be good.

  -- --


Better statistics: Rasmus 2002/03/08

  <Rasmus>
    hey, how about an rsync option that just gives you the
    summary without the list of files? And perhaps gives
    more information like the number of new files, number
    of changed, deleted, etc. ?

  <mbp>
    nice idea there is --stats but at the moment it's very
    tridge-oriented rather than user-friendly it would be
    nice to improve it that would also work well with
    --dry-run

  -- --


Perhaps flush stdout like syslog

  Perhaps flush stdout after each filename, so that people trying to
  monitor progress in a log file can do so more easily. See
  http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=48108

  -- --


Log daemon sessions that just list modules

  At the moment, connections that just get a list of modules are not
  logged, but they should be.

  -- --


Log child death on signal

  If a child of the rsync daemon dies with a signal, we should notice
  that when we reap it and log a message.
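
  The check itself is small. The following Python sketch shows the
  idea using the standard wait-status macros; the function names and
  the wording of the log lines are invented for illustration, and the
  log argument stands in for whatever logging call the daemon uses.

```python
import os
import signal


def describe_child_exit(pid, status):
    """Return a log line for a reaped child, or None if it exited cleanly."""
    if os.WIFSIGNALED(status):
        signum = os.WTERMSIG(status)
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = "signal %d" % signum
        return "rsync: child %d died with %s" % (pid, name)
    if os.WIFEXITED(status) and os.WEXITSTATUS(status) != 0:
        return "rsync: child %d exited with status %d" % (
            pid, os.WEXITSTATUS(status))
    return None


def reap_children(log):
    """Reap every finished child without blocking, logging abnormal exits."""
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            return  # no children left
        if pid == 0:
            return  # children exist but none has finished yet
        message = describe_child_exit(pid, status)
        if message:
            log(message)
```

  A daemon would typically call something like reap_children() from
  its main loop after SIGCHLD has been noticed.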

  -- --


Keep stderr and stdout properly separated (Debian #23626)

  -- --


Log errors with function that reports process of origin

  Use a separate function for reporting errors; prefix it with
  "rsync:" or "rsync(remote)", or perhaps even "rsync(local
  generator): ".

  -- --


verbose output David Stein 2001/12/20

  Indicate whether files are new, updated, or deleted

  At end of transfer, show how many files were or were not transferred
  correctly.

  -- --


Add reason for transfer to file logging

  Explain *why* every file is transferred or not (e.g. "local mtime
  123123 newer than 1283198")
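
  As a sketch of the kind of message meant here, the following Python
  function compares size and modification time only and reports which
  test decided the outcome. The name transfer_reason(), the tuple
  layout and the message wording are assumptions for illustration.

```python
def transfer_reason(src, dst):
    """Explain why a file would (or would not) be transferred.

    src and dst are (size, mtime) tuples; dst is None when the
    destination file does not exist.  Returns (transfer, reason).
    """
    if dst is None:
        return True, "destination does not exist"
    src_size, src_mtime = src
    dst_size, dst_mtime = dst
    if src_size != dst_size:
        return True, "size %d differs from %d" % (src_size, dst_size)
    if src_mtime != dst_mtime:
        relation = "newer" if src_mtime > dst_mtime else "older"
        return True, "local mtime %d %s than %d" % (
            src_mtime, relation, dst_mtime)
    return False, "size and mtime match"
```

  Returning the reason next to the decision keeps the two from
  drifting apart, since the log line is produced by the same test that
  made the choice.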

  -- --


debugging of daemon 2002/04/08

  Add an rsyncd.conf parameter to turn on debugging on the server.

  -- --


internationalization

  Change to using gettext(). Probably need to ship this for platforms
  that don't have it.

  Solicit translations.

  Does anyone care? Before we bother modifying the code, we ought to
  get the manual translated first, because that's possibly more useful
  and at any rate demonstrates desire.

  -- --

DEVELOPMENT --------------------------------------------------------

Handling duplicate names

  We need to be careful of duplicate names getting into the file list.
  See clean_flist(). This could happen if multiple arguments include
  the same file. Bad.

  I think duplicates are only a problem if they're both flowing
  through the pipeline at the same time. For example we might have
  updated the first occurrence after reading the checksums for the
  second. So possibly we just need to make sure that we don't have
  both in the pipeline at the same time.

  Possibly if we did one directory at a time that would be sufficient.

  Alternatively we could pre-process the arguments to make sure no
  duplicates will ever be inserted. There could be some bad cases
  when we're collapsing symlinks.

  We could have a hash table.

  The root of the problem is that we do not want more than one file
  list entry referring to the same file. At first glance there are
  several ways this could happen: symlinks, hardlinks, and repeated
  names on the command line.

  If names are repeated on the command line, they may be present in
  different forms, perhaps by traversing directory paths in different
  ways, traversing paths including symlinks. Also we need to allow
  for expansion of globs by rsync.

  At the moment, clean_flist() requires having the entire file list in
  memory. Duplicate names are detected just by a string comparison.

  We don't need to worry about hard links causing duplicates because
  files are never updated in place. Similarly for symlinks.

  I think even if we're using a different symlink mode we don't need
  to worry.

  Unless we're really clever this will introduce a protocol
  incompatibility, so we need to be able to accept the old format as
  well.

  -- --


Use generic zlib 2002/02/25

  Perhaps don't use our own zlib.

  Advantages:

   - will automatically be up to date with bugfixes in zlib

   - can leave it out for small rsync on e.g. recovery disks

   - can use a shared library

   - avoids people breaking rsync by trying to do this themselves and
     messing up

  Should we ship zlib for systems that don't have it, or require
  people to install it separately?

  Apparently this will make us incompatible with versions of rsync
  that use the patched version of zlib. Probably the simplest way to
  do this is to just disable gzip (with a warning) when talking to old
  versions.

  -- --


TDB: 2002/03/12

  Rather than storing the file list in memory, store it in a TDB.

  This *might* make memory usage lower while building the file list.

  Hashtable lookup will mean files are not transmitted in order,
  though... hm.

  This would neatly eliminate one of the major post-fork shared data
  structures.

  -- --


Splint 2002/03/12

  Build rsync with SPLINT to try to find security holes. Add
  annotations as necessary. Keep track of the number of warnings
  found initially, and see how many of them are real bugs, or real
  security bugs. Knowing the percentage of likely hits would be
  really interesting for other projects.

  -- --


Memory debugger

  jra recommends Valgrind:

    http://devel-home.kde.org/~sewardj/

  -- --


Create release script

  Script would:

    Update spec files

    Build tar file; upload

    Send announcement to mailing list and c.o.l.a.

    Make freshmeat announcement

    Update web site

  -- --


Add machines to build farm

  Cygwin (on different versions of Win32?)

  HP-UX variants (via HP?)

  SCO

  -- --

PERFORMANCE ----------------------------------------------------------

File list structure in memory

  Rather than one big array, perhaps have a tree in memory mirroring
  the directory tree.

  This might make sorting much faster! (I'm not sure it's a big CPU
  problem, mind you.)

  It might also reduce memory use in storing repeated directory names
  -- again I'm not sure this is a problem.

  -- --


Traverse just one directory at a time

  Traverse just one directory at a time. Tridge says it's possible.

  At the moment rsync reads the whole file list into memory at the
  start, which makes us use a lot of memory and also not pipeline
  network access as much as we could.

  -- --


Hard-link handling

  At the moment hardlink handling is very expensive, so it's off by
  default. It does not need to be so.

  Since most of the solutions are rather intertwined with the file
  list it is probably better to fix that first, although fixing
  hardlinks is possibly simpler.

  We can rule out hardlinked directories since they will probably
  screw us up in all kinds of ways. They simply should not be used.

  At the moment rsync only cares about hardlinks to regular files. I
  guess you could also use them for sockets, devices and other beasts,
  but I have not seen them.

  When trying to reproduce hard links, we only need to worry about
  files that have more than one name (nlinks>1 && !S_ISDIR).
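
  The selection rule above can be sketched in a few lines of Python.
  This is an illustration of the idea, not rsync's data structure:
  hardlink_groups() is a hypothetical name, and the table is simply a
  dictionary keyed by device and inode number.

```python
import os
import stat
from collections import defaultdict


def hardlink_groups(paths):
    """Group paths that name the same file, keyed by (st_dev, st_ino)."""
    groups = defaultdict(list)
    for path in paths:
        st = os.lstat(path)
        # The condition from the text: nlinks > 1 && !S_ISDIR
        if st.st_nlink > 1 and not stat.S_ISDIR(st.st_mode):
            groups[(st.st_dev, st.st_ino)].append(path)
    # A file whose other names lie outside the list needs no link work.
    return [sorted(names) for names in groups.values() if len(names) > 1]
```

  Files with a single link never enter the table, so its size depends
  on the number of multiply-linked files rather than on the length of
  the whole file list.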

  The basic point of this is to discover alternate names that refer to
  the same file. All operations, including creating the file and
  writing modifications to it need only to be done for the first name.
  For all later names, we just create the link and then leave it
  alone.

  If hard links are to be preserved:

    Before the generator/receiver fork, the list of files is received
    from the sender (recv_file_list), and a table for detecting hard
    links is built.

    The generator looks for hard links within the file list and does
    not send checksums for them, though it does send other metadata.

    The sender sends the device number and inode with file entries, so
    that files are uniquely identified.

    The receiver goes through and creates hard links (do_hard_links)
    after all data has been written, but before directory permissions
    are set.

  At the moment device and inum are sent as 4-byte integers, which
  will probably cause problems on large filesystems. On Linux the
  kernel uses 64-bit ino_t's internally, and people will soon have
  filesystems big enough to use them. We ought to follow NFS4 in
  using 64-bit device and inode identification, perhaps with a
  protocol version bump.

  Once we've seen all the names for a particular file, we no longer
  need to think about it and we can deallocate the memory.

  We can also have the case where there are links to a file that are
  not in the tree being transferred. There's nothing we can do about
  that. Because we rename the destination into place after writing,
  any hardlinks to the old file are always going to be orphaned. In
  fact that is almost necessary because otherwise we'd get really
  confused if we were generating checksums for one name of a file and
  modifying another.

  At the moment the code seems to make a whole second copy of the file
  list, which seems unnecessary.

  We should have a test case that exercises hard links. Since it
  might be hard to compare ./tls output where the inodes change we
  might need a little program to check whether several names refer to
  the same file.
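
  Such a checking program could be very small. The Python sketch below
  is one possible shape for it; the names same_file() and main() and
  the exit-code convention (0 same, 1 different, 2 usage error) are
  assumptions made for the example.

```python
import os
import sys


def same_file(paths):
    """True if every path refers to one file (same device and inode)."""
    ids = {(st.st_dev, st.st_ino) for st in map(os.lstat, paths)}
    return len(ids) == 1


def main(argv):
    """Exit status 0 if all names are one file, 1 if not, 2 on misuse."""
    if len(argv) < 3:
        print("usage: samefile NAME NAME...", file=sys.stderr)
        return 2
    return 0 if same_file(argv[1:]) else 1
```

  A test script would pass sys.argv to main() and use the result as
  its exit status. Comparing identities this way avoids depending on
  the inode numbers themselves, which differ between source and
  destination.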

  -- --


Allow skipping MD4 file_sum 2002/04/08

  If we're doing a local transfer, or using -W, then perhaps don't
  send the file checksum. If we're doing a local transfer, then
  calculating MD4 checksums uses 90% of CPU and is unlikely to be
  useful.

  Indeed for transfers over zlib or ssh we can also rely on the
  transport to have quite strong protection against corruption.

  Perhaps we should have an option to disable this,
  analogous to --whole-file, although it would default to
  disabled. The file checksum takes up a definite space in
  the protocol -- we can either set it to 0, or perhaps just
  leave it out.

  -- --


Accelerate MD4

  Perhaps borrow an assembler MD4 from someone?

  Make sure we call MD4 with properly-sized blocks whenever possible
  to avoid copying into the residue region?

  -- --


String area code

  Test whether this is actually faster than just using malloc(). If
  it's not (anymore), throw it out.

  -- --

TESTING --------------------------------------------------------------

Torture test

  Something that just keeps running rsync continuously over a data set
  likely to generate problems.

  -- --


Cross-test versions 2001/08/22

  Part of the regression suite should be making sure that we
  don't break backwards compatibility: old clients vs new
  servers and so on. Ideally we would test both up and down
  from the current release to all old versions.

  Run current rsync versions against significant past releases.

  We might need to omit broken old versions, or versions in which
  particular functionality is broken.

  It might be sufficient to test downloads from well-known public
  rsync servers running different versions of rsync. This will give
  some testing and also be the most common case for having different
  versions and not being able to upgrade.

  The new --protocol option may help in this.

  -- --


Test on kernel source

  Download all versions of kernel; unpack, sync between them. Also
  sync between uncompressed tarballs. Compare directories after
  transfer.

  Use local mode; ssh; daemon; --whole-file and --no-whole-file.

  Use awk to pull out the 'speedup' number for each transfer. Make
  sure it is >= x.
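
  The same extraction can be done without awk. The Python sketch below
  is a substitute for the awk step, not a transcription of it; it
  assumes the summary line has the form "total size is N speedup is
  X", as in the hardlink transcript earlier in this file, and the
  function names are hypothetical.

```python
import re

_SPEEDUP = re.compile(r"speedup is ([0-9]+(?:\.[0-9]+)?)")


def speedup(output):
    """Return the 'speedup' figure from rsync's summary, or None."""
    m = _SPEEDUP.search(output)
    return float(m.group(1)) if m else None


def check_speedup(output, minimum):
    """True if a speedup figure was found and it is at least minimum."""
    value = speedup(output)
    return value is not None and value >= minimum
```

  Treating a missing figure as a failure means a transfer that died
  before printing its summary cannot pass the check by accident.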

  -- --


Test large files

  Sparse and non-sparse

  -- --


Create mutator program for testing

  Insert bytes, delete bytes, swap blocks, ...
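
  A minimal Python sketch of the three edits listed above. Everything
  here is an assumption for illustration: the name mutate(), the
  default block size of 16 bytes, and the choice to fall back to an
  insertion when the input is too short for the other edits.

```python
import random


def mutate(data, rng, block=16):
    """Apply one random edit to a bytes object; return (kind, result)."""
    kind = rng.choice(["insert", "delete", "swap"])
    if kind == "insert" or len(data) < 2 * block:
        # Insert 1..block random bytes at a random position.
        pos = rng.randrange(len(data) + 1)
        count = rng.randrange(1, block + 1)
        extra = bytes(rng.randrange(256) for _ in range(count))
        return "insert", data[:pos] + extra + data[pos:]
    if kind == "delete":
        # Remove one block-sized run.
        pos = rng.randrange(len(data) - block + 1)
        return "delete", data[:pos] + data[pos + block:]
    # Swap two non-overlapping blocks of equal size.
    a = rng.randrange(len(data) - 2 * block + 1)
    b = rng.randrange(a + block, len(data) - block + 1)
    return "swap", (data[:a] + data[b:b + block] + data[a + block:b]
                    + data[a:a + block] + data[b + block:])
```

  Passing in a seeded random.Random makes a failing test repeatable:
  the same seed produces the same sequence of mutations.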

  -- --


Create configure option to enable dangerous tests

  -- --


If tests are skipped, say why.

  -- --


Test daemon feature to disallow particular options.

  -- --


Create pipe program for testing

  Create pipe program that makes slow/jerky connections for
  testing. Versions of read() and write() that corrupt the
  stream, or abruptly fail.

  -- --


Create test makefile target for some tests

  Separate makefile target to run rough tests -- or perhaps
  just run them every time?

  -- --


Test "refuse options" works

  What about for --recursive?

  If you specify an unrecognized option here, you should get an error.

  We need a test case for this...

  Was this broken when we changed to popt?

  -- --

RELATED PROJECTS -----------------------------------------------------

rsyncsh

  Write a small emulation of interactive ftp as a Python program
  that calls rsync. Commands such as "cd", "ls", "ls *.c" etc map
  fairly directly into rsync commands: it just needs to remember the
  current host, directory and so on. We can probably even do
  completion of remote filenames.
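
  To show how direct the mapping is, here is a sketch of the core of
  such a program. It only builds the rsync argument list and does not
  run anything. The class name RsyncShell, its methods, and the "get"
  command are hypothetical; "cd" and "ls" are the commands named above.

```python
import posixpath
import shlex


class RsyncShell:
    """Remember host and directory; turn ftp-like commands into rsync argv."""

    def __init__(self, host, cwd="/"):
        self.host = host
        self.cwd = cwd

    def _remote(self, name=""):
        path = posixpath.normpath(posixpath.join(self.cwd, name))
        return "%s:%s" % (self.host, path)

    def command(self, line):
        """Return the rsync argv for one input line, or None for 'cd'."""
        words = shlex.split(line)
        if not words:
            return None
        verb, args = words[0], words[1:]
        if verb == "cd":
            self.cwd = posixpath.normpath(posixpath.join(self.cwd, args[0]))
            return None
        if verb == "ls":
            # A lone source argument makes rsync list instead of copy.
            target = self._remote(args[0] if args else "")
            if not args and not target.endswith("/"):
                target += "/"  # list the directory's contents
            return ["rsync", target]
        if verb == "get":  # hypothetical extra command
            return ["rsync", "-av", self._remote(args[0]), "."]
        raise ValueError("unknown command: %s" % verb)
```

  A real program would hand the returned list to subprocess.run() and
  wrap command() in a read loop, which is also where completion of
  remote filenames would hook in.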

  -- --


http://rsync.samba.org/rsync-and-debian/

  -- --


rsyncable gzip patch

  Exhaustive, tortuous testing

  Cleanups?

  -- --


rsyncsplit as alternative to real integration with gzip?

  -- --


reverse rsync over HTTP Range

  Goswin Brederlow suggested this on Debian; I think tridge and I
  talked about it previously in relation to rproxy.

  -- --
