Commit | Line | Data |
---|---|---|
4f69fe59 MP |
1 | -*- indented-text -*- |
2 | ||
3 | Notes towards a new version of rsync | |
4 | Martin Pool <mbp@samba.org> | |
5 | ||
6 | ||
7 | Good things about the current implementation: | |
8 | ||
9 | - Widely known and adopted. | |
10 | ||
11 | - Fast/efficient, especially for moderately small sets of files over | |
12 | slow links (transoceanic or modem.) | |
13 | ||
14 | - Fairly reliable. | |
15 | ||
16 | - The choice of running over a plain TCP socket or tunneling over | |
17 | ssh. | |
18 | ||
19 | - rsync operations are idempotent: you can always run the same | |
20 | command twice to make sure it worked properly without any fear. | |
21 | (Are there any exceptions?) | |
22 | ||
23 | - Small changes to files cause small deltas. | |
24 | ||
25 | - There is a way to evolve the protocol to some extent. | |
26 | ||
27 | - rdiff and rsync --write-batch allow generation of standalone patch | |
28 | sets. rsync+ is pretty cheesy, though. xdelta seems cleaner. | |
29 | ||
30 | - Process triangle is creative, but seems to provoke OS bugs. | |
31 | ||
32 | - "Morning-after property": you don't need to know anything on the | |
33 | local machine about the state of the remote machine, or about | |
34 | transfers that have been done in the past. | |
35 | ||
36 | - You can easily push or pull simply by switching the order of | |
37 | files. | |
38 | ||
39 | ||
40 | Bad things about the current implementation: | |
41 | ||
42 | - Persistent and hard-to-diagnose hang bugs remain | |
43 | ||
44 | - Protocol is sketchily documented, tied to this implementation, and | |
45 | hard to modify/extend | |
46 | ||
47 | - Both the program and the protocol assume a single non-interactive | |
48 | one-way transfer | |
49 | ||
50 | - A list of all files is held in memory for the entire transfer, | |
51 | which cripples scalability to large file trees | |
52 | ||
53 | - Opening a new socket for every operation causes problems, | |
54 | especially when running over SSH with password authentication. | |
55 | ||
56 | - Renamed files are not handled: the old file is removed, and the | |
57 | new file created from scratch. | |
58 | ||
59 | - The versioning approach assumes that future versions of the | |
60 | program know about all previous versions, and will do the right | |
61 | thing. | |
62 | ||
63 | - People always get confused about ':' vs '::' | |
64 | ||
65 | - Error messages can be cryptic. | |
66 | ||
67 | ||
68 | Protocol philosophy: | |
69 | ||
70 | *The* big difference between rsync and protocols like HTTP, FTP, and | |
71 | NFS is that their fundamental operations are "read this file", | |
72 | "delete this file", and "make this directory", whereas rsync's is | |
73 | "make this directory like this one". | |
74 | ||
75 | ||
76 | Questionable features: | |
77 | ||
78 | These are neat, but not necessarily clean or worth preserving. | |
79 | ||
80 | - The remote rsync can be wrapped by some other program, such as in | |
81 | tridge's rsync-mail scripts. The general feature of sending and | |
82 | retrieving mail over rsync is good, but this is perhaps not the | |
83 | right way to implement it. | |
84 | ||
85 | ||
86 | Desirable features: | |
87 | ||
88 | These don't really require architectural changes; they're just | |
89 | something to keep in mind. | |
90 | ||
91 | - Synchronize ACLs and extended attributes | |
92 | ||
93 | - Anonymous servers should be efficient | |
94 | ||
95 | - Code should be portable to non-UNIX systems | |
96 | ||
97 | - Should be possible to document the protocol in RFC form | |
98 | ||
99 | - --dry-run option | |
100 | ||
101 | - IPv6 support. Pretty straightforward. | |
102 | ||
103 | - Allow the basis and destination files to be different. For | |
104 | example, you could use this when you have a CD-ROM and want to | |
105 | download an updated image onto a hard drive. | |
106 | ||
107 | - Efficiently interrupt and restart a transfer. We can write a | |
108 | checkpoint file that says where we're up to in the filesystem. | |
109 | Alternatively, as long as transfers are idempotent, we can just | |
110 | restart the whole thing. [NFSv4] | |
111 | ||
112 | - Scripting support. | |
113 | ||
114 | - Propagate atimes and do not modify them. This is very ugly on | |
115 | Unix. It might be better to try to add O_NOATIME to kernels, and | |
116 | call that. | |
117 | ||
118 | - VFS. Useful? | |
119 | ||
120 | - Unicode. Probably just use UTF-8 for everything. | |
121 | ||
122 | ||
123 | Hard links: | |
124 | ||
125 | At the moment, we can recreate hard links, but it's a bit | |
126 | inefficient: it depends on holding a list of all files in the tree. | |
127 | Every time we see a file with a linkcount >1, we need to search for | |
128 | another known name that has the same (fsid,inum) tuple. We could do | |
129 | that more efficiently by keeping a list of only files with | |
130 | linkcount>1, and removing files from that list as all their names | |
131 | become known. | |
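
The bookkeeping described above could be sketched roughly as follows. This is a minimal illustration, not rsync's implementation: the function name link_targets and the (path, fsid, inum, linkcount) tuple layout are assumptions made for the example.

```python
def link_targets(entries):
    """Map each path to an earlier name of the same file, or None.

    entries: iterable of (path, fsid, inum, linkcount) tuples in
    traversal order (a hypothetical layout chosen for this sketch).
    Returns (result, pending); pending holds only multiply-linked
    files whose names have not all been seen yet.
    """
    pending = {}  # (fsid, inum) -> [first_path, names_still_unseen]
    result = {}
    for path, fsid, inum, linkcount in entries:
        if linkcount <= 1:
            # Singly-linked files never enter the table.
            result[path] = None
            continue
        key = (fsid, inum)
        if key not in pending:
            # First name seen for this file: remember it.
            pending[key] = [path, linkcount - 1]
            result[path] = None
        else:
            first, remaining = pending[key]
            result[path] = first
            if remaining == 1:
                # All names are now known, so drop the entry.
                del pending[key]
            else:
                pending[key][1] = remaining - 1
    return result, pending
```

On a real tree the tuples could come from os.lstat() (st_dev, st_ino, st_nlink). Entries left in pending at the end correspond to files with further names outside the transferred tree.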
132 | ||
133 | ||
134 | Scripting issues: | |
135 | ||
136 | - Perhaps support multiple scripting languages: candidates include | |
137 | Perl, Python, Tcl, Scheme (guile?), sh, ... | |
138 | ||
139 | - Simply running a subprocess and looking at its stdout/exit code | |
140 | might be sufficient, though it could also be pretty slow if it's | |
141 | called often. | |
142 | ||
143 | - There are security issues about running remote code, at least if | |
144 | it's not running in the user's own account. So we can either | |
145 | disallow it, or use some kind of sandbox system. | |
146 | ||
147 | ||
148 | Scripting hooks: | |
149 | ||
150 | - Whether to transfer a file | |
151 | ||
152 | - What basis file to use | |
153 | ||
154 | - Logging | |
155 | ||
156 | - Whether to allow transfers (for public servers) | |
157 | ||
158 | - Authentication | |
159 | ||
160 | - Locking | |
161 | ||
162 | ||
163 | Interactive interface: | |
164 | ||
165 | - Something like ncFTP, or integration into GNOME-vfs. Probably | |
166 | hold a single socket connection open. | |
167 | ||
168 | - Can either call us as a separate process, or as a library. | |
169 | ||
170 | - The standalone process needs to produce output in a form easily | |
171 | digestible by a calling program, like the --emacs feature that | |
172 | some programs have. | |
173 | ||
174 | - Yow! emacs support. (You could probably build that already, of | |
175 | course.) | |
176 | ||
177 | ||
178 | Pie-in-the-sky features: | |
179 | ||
180 | These might have a severe impact on the protocol, and are not | |
181 | clearly in our core requirements. It looks like in many of them | |
182 | having scripting hooks will allow us | |
183 | ||
184 | - Transport over UDP multicast. The hard part is handling multiple | |
185 | destinations which have different basis files. We can look at | |
186 | multicast-TFTP for inspiration. | |
187 | ||
188 | - Conflict resolution. Possibly general scripting support will be | |
189 | sufficient. | |
190 | ||
191 | - Integrate with locking. It's hard to see a good general solution, | |
192 | because Unix systems have several locking mechanisms, and grabbing | |
193 | the lock from programs that don't expect it could cause deadlocks, | |
194 | timeouts, or other problems. Scripting support might help. | |
195 | ||
196 | - Replicate in place, rather than to a temporary file. This is | |
197 | dangerous in the case of interruption, and it also means that the | |
198 | delta can't refer to blocks that have already been overwritten. | |
199 | On the other hand we could semi-trivially do this at first by | |
200 | simply generating a delta with no copy instructions. | |
201 | ||
202 | - Replicate block devices. Most of the difficulties here are to do | |
203 | with replication in place, though on some systems we will also | |
204 | have to do I/O on block boundaries. | |
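
The "delta with no copy instructions" idea for in-place replication might look something like this. It is a sketch under assumptions: the ("literal", bytes) instruction format and both function names are invented for illustration and are not rsync's or librsync's delta format.

```python
import io


def literal_only_delta(new_data, chunk_size=4096):
    """Build a delta made only of literal-data instructions."""
    return [("literal", new_data[i:i + chunk_size])
            for i in range(0, len(new_data), chunk_size)]


def apply_in_place(fileobj, delta):
    """Overwrite fileobj from the start using a literal-only delta.

    Copy instructions are rejected, because they could refer to
    blocks of the old file that have already been overwritten.
    """
    fileobj.seek(0)
    for op, payload in delta:
        if op != "literal":
            raise ValueError("copy instructions are unsafe in place")
        fileobj.write(payload)
    # Cut off any leftover tail of the old, longer contents.
    fileobj.truncate()
```

A delta like this sends the whole file, so it gives up rsync's bandwidth savings. It does not remove the interruption hazard noted above either.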
205 | ||
206 | ||
207 | In favour of evolving the protocol: | |
208 | ||
209 | - Keeping compatibility with existing rsync servers will help with | |
210 | adoption and testing. | |
211 | ||
212 | - We should at the very least be able to fall back to the old | |
213 | protocol. | |
214 | ||
215 | - Error handling is not so good. | |
216 | ||
217 | ||
218 | In favour of using a new protocol: | |
219 | ||
220 | - Maintaining compatibility might soak up development time that | |
221 | would better go into improving a new protocol. | |
222 | ||
223 | - If we start from scratch, it can be documented as we go, and we | |
224 | can avoid design decisions that make the protocol complex or | |
225 | implementation-bound. | |
226 | ||
227 | ||
228 | Error handling: | |
229 | ||
230 | - Errors should come back reliably, and be clearly associated with | |
231 | the particular file that caused the problem. | |
232 | ||
233 | - Some errors ought to cause the whole transfer to abort; some are | |
234 | just warnings. If any errors have occurred, then rsync ought to | |
235 | return an error. | |
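
One way to picture these two requirements is a small error log that ties every message to a file, aborts on fatal errors, and yields a nonzero exit status if anything was recorded. The class and method names here are hypothetical, chosen only for this sketch.

```python
from dataclasses import dataclass, field


class TransferAborted(Exception):
    """Raised when an error should stop the whole transfer."""


@dataclass
class ErrorLog:
    # Each record is (path, message, fatal).
    records: list = field(default_factory=list)

    def report(self, path, message, fatal=False):
        """Record an error against the file that caused it."""
        self.records.append((path, message, fatal))
        if fatal:
            raise TransferAborted(f"{path}: {message}")

    def exit_status(self):
        """Nonzero if any error, even a warning, has occurred."""
        return 1 if self.records else 0
```

A caller would catch TransferAborted at the top level, print the records, and exit with exit_status().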
236 | ||
237 | ||
238 | Concurrency: | |
239 | ||
240 | - We want to keep the CPU, filesystem, and network as full as | |
241 | possible as much of the time as possible. | |
242 | ||
243 | - We can do nonblocking network IO, but not nonblocking disk IO. | |
244 | ||
245 | - It makes sense for the destination to generate signatures and | |
246 | apply patches at the same time. | |
247 | ||
248 | - Can structure this with nonblocking, threads, separate processes, | |
249 | etc. | |
250 | ||
251 | ||
252 | Uses: | |
253 | ||
254 | - Mirroring software distributions | |
255 | ||
256 | - Synchronizing laptop and desktop | |
257 | ||
258 | - NFS filesystem migration/replication. See | |
259 | http://www.ietf.org/proceedings/00jul/00july-133.htm#P24510_1276764 | |
260 | ||
261 | - Sync with PDA | |
262 | ||
263 | - Network backup systems | |
264 | ||
265 | - CVS filemover | |
266 | ||
267 | ||
268 | Conflict resolution: | |
269 | ||
270 | - Requires application-specific knowledge. We want to provide | |
271 | mechanism, rather than policy. | |
272 | ||
273 | - Possibly allowing two-way migration across a single connection | |
274 | would be useful. | |
275 | ||
276 | ||
277 | Moved files: | |
278 | ||
279 | - There's no trivial way to detect renamed files, especially if they | |
280 | move between directories. | |
281 | ||
282 | - If we had a picture of the remote directory from last time on | |
283 | either machine, then the inode numbers might give us a hint about | |
284 | files which may have been renamed. | |
285 | ||
286 | - Files that are renamed and not modified can be detected by | |
287 | examining the directory listing, looking for files with the same | |
288 | size/date as the origin. | |
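
The size/date heuristic in the last point could be sketched as below. The listings are assumed to be plain dicts mapping path to (size, mtime), and guess_renames is a hypothetical helper name. A match is only a hint, and ambiguous matches are skipped.

```python
from collections import defaultdict


def guess_renames(source, dest):
    """Guess which destination files were renamed on the source.

    source, dest: dicts mapping path -> (size, mtime).
    Returns {new_path_on_source: old_path_on_dest}, including only
    pairs where the (size, mtime) match is unambiguous.
    """
    vanished = defaultdict(list)  # on dest, no longer on source
    for path, sig in dest.items():
        if path not in source:
            vanished[sig].append(path)
    appeared = defaultdict(list)  # on source, not yet on dest
    for path, sig in source.items():
        if path not in dest:
            appeared[sig].append(path)
    renames = {}
    for sig, new_paths in appeared.items():
        old_paths = vanished.get(sig, [])
        if len(new_paths) == 1 and len(old_paths) == 1:
            renames[new_paths[0]] = old_paths[0]
    return renames
```

Equal size and date do not prove equal content, so a guessed old file is probably best treated as a candidate basis file and run through the normal delta algorithm rather than trusted outright.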
289 | ||
290 | ||
291 | Filesystem migration: | |
292 | ||
293 | The NFSv4 working group wants atomic migration. Most of the | |
294 | responsibility for this lies on the NFS server or OS. | |
295 | ||
296 | If migrating a whole tree, then we could do a nearly-atomic rename | |
297 | at the end. This ties in to having separate basis and destination | |
298 | files. | |
299 | ||
300 | NFSv4 probably wants to migrate file locks, but that's not really | |
301 | our problem. | |
302 | ||
303 | ||
304 | Scalability: | |
305 | ||
306 | We should aim to work well on machines in use in a year or two. | |
307 | That probably means transfers of many millions of files in one | |
308 | batch, and gigabytes or terabytes of data. | |
309 | ||
310 | For argument's sake: at the low end, we want to sync ten files for a | |
311 | total of 10kB across a 1kB/s link. At the high end, we want to sync | |
312 | 1e9 files for 1TB of data across a 1GB/s link. | |
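
As a rough check on these two endpoints, the arithmetic can be worked through as follows. This reads the figures as decimal kilobytes, gigabytes, and terabytes. The 100 bytes of memory per file-list entry is purely an assumed figure for illustration, not a measurement of rsync.

```python
def transfer_seconds(total_bytes, bytes_per_second):
    """Ideal transfer time with no protocol overhead or savings."""
    return total_bytes / bytes_per_second


# Low end: ten files, 10 kB in total, over a 1 kB/s link.
low_end_seconds = transfer_seconds(10 * 10**3, 10**3)

# High end: 1e9 files, 1 TB in total, over a 1 GB/s link.
high_end_seconds = transfer_seconds(10**12, 10**9)
high_end_avg_file_bytes = 10**12 / 10**9

# Assumed cost of keeping the whole file list in memory.
ASSUMED_BYTES_PER_ENTRY = 100  # hypothetical figure
file_list_bytes = 10**9 * ASSUMED_BYTES_PER_ENTRY

print(low_end_seconds, high_end_seconds)
print(high_end_avg_file_bytes, file_list_bytes)
```

Under these assumptions the raw transfers take about 10 and 1000 seconds, the high-end files average 1000 bytes each, and an in-memory list of every file would need on the order of 100 GB.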
313 | ||
314 | On the whole, CPU usage is not normally a limiting factor, if only | |
315 | because running over SSH burns a lot of cycles on encryption. | |
316 | ||
317 | ||
318 | Streaming: | |
319 | ||
320 | A big attraction of rsync is that there are few round-trip delays: | |
321 | basically only one to get started, and then everything is | |
322 | pipelined. Round-trip delays are a problem with FTP and NFS (at least | |
323 | up to v3). NFSv4 can pipeline operations, but building on that is | |
324 | probably a bit complicated. |