Commit | Line | Data |
---|---|---|
57625988 DD |
1 | To: rsync@lists.samba.org |
2 | Subject: Rsync performance increase through buffering | |
3 | From: Craig Barratt <craig@atheros.com> | |
4 | Message-ID: <auto-000002369223@atheros.com> | |
5 | Date: Sun, 08 Dec 2002 23:48:57 -0800 | |
6 | ||
7 | I've been studying the read and write buffering in rsync and it turns | |
8 | out most I/O is done just a couple of bytes at a time. This means there | |
9 | are lots of system calls, and also most network traffic comprises lots | |
10 | of small packets. The behavior is most extreme when sending/receiving | |
11 | file deltas of identical files. | |
12 | ||
13 | The main case where I/O is buffered is writes from the server (when | |
14 | io multiplexing is on). These are usually buffered in 4092 byte | |
15 | chunks with a 4 byte header. However, reading of these packets is | |
16 | usually unbuffered, and writes from the client are generally not | |
17 | buffered. For example: when receiving 1st phase checksums (6 bytes | |
18 | per block), 2 reads are done: one of 4 bytes and one of 2 bytes, | |
19 | meaning there are 4 system calls (select/read/select/read) per 6 | |
20 | bytes of checksum data). | |
21 | ||
22 | One cost of this is some performance, but a significant issue is that | |
23 | unbuffered writes generate very short (and very many) ethernet packets, | |
24 | which means the overhead is quite large on slow network connections. | |
25 | ||
26 | The initial file_list writing is typically buffered, but reading it on | |
27 | the client is not. | |
28 | ||
29 | There are some other unneeded system calls: | |
30 | ||
31 | - One example is that show_progress() calls gettimeofday() even | |
32 | if do_progress is not set. show_progress() is called on every | |
33 | block, so there is an extra system call per (700 byte) block. | |
b10e9d48 | 34 | (NOTE: this is currently fixed in CVS.) |
57625988 DD |
35 | |
36 | - Another example is that file_write writes each matching (700 byte) | |
37 | block without buffering, so that's another system call per block. | |
38 | ||
39 | To study this behavior I used rsync-2.5.6cvs and had a benchmark area | |
40 | comprising around 7800 files of total size 530MB. | |
41 | ||
42 | Here are some results doing sends and receives via rsyncd, all on the | |
43 | same machine, with identical source and destination files. In each | |
44 | case --ignore-times (-I) is set, so that every file is processed: | |
45 | ||
46 | - Send test: | |
47 | ||
48 | strace -f rsync -Ir . localhost::test |& wc | |
49 | ||
50 | shows there are about 2,488,775 system calls. | |
51 | ||
52 | - Receive test: | |
53 | ||
54 | strace -f rsync -Ir localhost::test . |& wc | |
55 | ||
56 | shows there are about 1,615,931 system calls. | |
57 | ||
58 | - Rsyncd has a roughly similar numbers of system calls. | |
59 | ||
60 | - Send test from another machine (cygwin/WinXP laptop): | |
61 | ||
62 | tcpdump port 873 |& wc | |
63 | ||
64 | shows there are about 701,111 ethernet packets (many of them only | |
65 | have a 4 byte payload). | |
66 | ||
67 | Since the source and dest files are the same, the send test only | |
68 | wrote 1,738,797 bytes and read 2,139,848 bytes. | |
69 | ||
70 | These results are similar to rsync 2.5.5. | |
71 | ||
72 | Below is a patch to a few files that adds read and write buffering in | |
73 | the places where the I/O was unbuffered, adds buffering to write_file() | |
74 | and removes the unneeded gettimeofday() system call in show_progress(). | |
75 | ||
76 | The results with the patch are: | |
77 | ||
78 | - Send test: 46,835 system calls, versus 2,488,775. | |
79 | ||
80 | - Receive test: 138,367 system calls, versus 1,615,931. | |
81 | ||
82 | - Send test from another machine: 5,255 ethernet packets, versus 701,111. | |
83 | If the tcp/ip/udp/802.3 per-packet overhead is around 60 bytes, that | |
84 | means the base case transfers an extra 42MB of data, even though the | |
85 | useful data is only around 2MB. | |
86 | ||
87 | The absolute running time on the local rsyncd test isn't much different, | |
88 | probably because the test is really disk io limited and system calls on | |
89 | an unloaded linux system are pretty fast. | |
90 | ||
91 | However, on a network test doing a send from cygwin/WinXP to rsyncd | |
92 | on rh-linux the running time improves from about 700 seconds to 215 | |
93 | seconds (with a cpu load of around 17% versus 58%, if you believe | |
94 | cygwin's cpu stats). This is probably an extreme case since the system | |
95 | call penalty in cygwin is high. But I would suspect a significant | |
96 | improvement is possible with a slow network connection, since a lot | |
97 | less data is being sent. | |
98 | ||
99 | Note also that without -I rsync is already very fast, since it skips | |
100 | (most) files based on attributes. | |
101 | ||
102 | With or without this patch the test suite passes except for | |
103 | daemon-gzip-upload. One risk of buffering is the potential for | |
104 | a bug caused by a missing io_flush: deadlock is possible, so try | |
105 | the patch at your own risk... | |
106 | ||
107 | Craig | |
108 | ||
109 | ########################################################################### | |
110 | diff -bur rsync/fileio.c rsync-craig/fileio.c | |
111 | --- rsync/fileio.c Fri Jan 25 15:07:34 2002 | |
112 | +++ rsync-craig/fileio.c Sat Dec 7 22:21:10 2002 | |
113 | @@ -76,7 +76,35 @@ | |
114 | int ret = 0; | |
115 | ||
116 | if (!sparse_files) { | |
117 | - return write(f,buf,len); | |
118 | + static char *writeBuf; | |
119 | + static size_t writeBufSize; | |
120 | + static size_t writeBufCnt; | |
121 | + | |
122 | + if ( !writeBuf ) { | |
123 | + writeBufSize = MAX_MAP_SIZE; | |
124 | + writeBufCnt = 0; | |
125 | + writeBuf = (char*)malloc(MAX_MAP_SIZE); | |
126 | + if (!writeBuf) out_of_memory("write_file"); | |
127 | + } | |
128 | + ret = len; | |
129 | + do { | |
130 | + if ( buf && writeBufCnt < writeBufSize ) { | |
131 | + size_t copyLen = len; | |
132 | + if ( copyLen > writeBufSize - writeBufCnt ) { | |
133 | + copyLen = writeBufSize - writeBufCnt; | |
134 | + } | |
135 | + memcpy(writeBuf + writeBufCnt, buf, copyLen); | |
136 | + writeBufCnt += copyLen; | |
137 | + buf += copyLen; | |
138 | + len -= copyLen; | |
139 | + } | |
140 | + if ( !buf || writeBufCnt == writeBufSize ) { | |
141 | + int thisRet = write(f, writeBuf, writeBufCnt); | |
142 | + if ( thisRet < 0 ) return thisRet; | |
143 | + writeBufCnt = 0; | |
144 | + } | |
145 | + } while ( buf && len > 0 ); | |
146 | + return ret; | |
147 | } | |
148 | ||
149 | while (len>0) { | |
150 | diff -bur rsync/flist.c rsync-craig/flist.c | |
151 | --- rsync/flist.c Sat Jul 27 11:01:21 2002 | |
152 | +++ rsync-craig/flist.c Sun Dec 8 16:28:14 2002 | |
153 | @@ -889,7 +889,7 @@ | |
154 | flist = flist_new(); | |
155 | ||
156 | if (f != -1) { | |
157 | - io_start_buffering(f); | |
158 | + io_start_buffering_out(f); | |
159 | } | |
160 | ||
161 | for (i = 0; i < argc; i++) { | |
162 | diff -bur rsync/io.c rsync-craig/io.c | |
163 | --- rsync/io.c Wed Apr 10 19:11:50 2002 | |
164 | +++ rsync-craig/io.c Sun Dec 8 17:54:23 2002 | |
165 | @@ -41,8 +41,8 @@ | |
166 | ||
167 | static int io_multiplexing_out; | |
168 | static int io_multiplexing_in; | |
169 | -static int multiplex_in_fd; | |
170 | -static int multiplex_out_fd; | |
171 | +static int multiplex_in_fd = -1; | |
172 | +static int multiplex_out_fd = -1; | |
173 | static time_t last_io; | |
174 | static int no_flush; | |
175 | ||
176 | @@ -286,17 +286,31 @@ | |
177 | static size_t remaining; | |
178 | int tag, ret = 0; | |
179 | char line[1024]; | |
180 | + static char *buffer; | |
181 | + static size_t bufferIdx = 0; | |
182 | + static size_t bufferSz; | |
183 | ||
184 | - if (!io_multiplexing_in || fd != multiplex_in_fd) | |
185 | + if (fd != multiplex_in_fd) | |
186 | return read_timeout(fd, buf, len); | |
187 | ||
188 | + if (!io_multiplexing_in && remaining == 0) { | |
189 | + if (!buffer) { | |
190 | + bufferSz = 2 * IO_BUFFER_SIZE; | |
191 | + buffer = malloc(bufferSz); | |
192 | + if (!buffer) out_of_memory("read_unbuffered"); | |
193 | + } | |
194 | + remaining = read_timeout(fd, buffer, bufferSz); | |
195 | + bufferIdx = 0; | |
196 | + } | |
197 | + | |
198 | while (ret == 0) { | |
199 | if (remaining) { | |
200 | len = MIN(len, remaining); | |
201 | - read_loop(fd, buf, len); | |
202 | + memcpy(buf, buffer + bufferIdx, len); | |
203 | + bufferIdx += len; | |
204 | remaining -= len; | |
205 | ret = len; | |
206 | - continue; | |
207 | + break; | |
208 | } | |
209 | ||
210 | read_loop(fd, line, 4); | |
211 | @@ -305,8 +319,16 @@ | |
212 | remaining = tag & 0xFFFFFF; | |
213 | tag = tag >> 24; | |
214 | ||
215 | - if (tag == MPLEX_BASE) | |
216 | + if (tag == MPLEX_BASE) { | |
217 | + if (!buffer || remaining > bufferSz) { | |
218 | + buffer = Realloc(buffer, remaining); | |
219 | + if (!buffer) out_of_memory("read_unbuffered"); | |
220 | + bufferSz = remaining; | |
221 | + } | |
222 | + read_loop(fd, buffer, remaining); | |
223 | + bufferIdx = 0; | |
224 | continue; | |
225 | + } | |
226 | ||
227 | tag -= MPLEX_BASE; | |
228 | ||
229 | @@ -327,7 +349,9 @@ | |
230 | rprintf((enum logcode) tag, "%s", line); | |
231 | remaining = 0; | |
232 | } | |
233 | - | |
234 | + if (remaining == 0) { | |
235 | + io_flush(); | |
236 | + } | |
237 | return ret; | |
238 | } | |
239 | ||
240 | @@ -344,8 +368,6 @@ | |
241 | size_t total=0; | |
242 | ||
243 | while (total < N) { | |
244 | - io_flush(); | |
245 | - | |
246 | ret = read_unbuffered (fd, buffer + total, N-total); | |
247 | total += ret; | |
248 | } | |
249 | @@ -531,7 +553,7 @@ | |
250 | static char *io_buffer; | |
251 | static int io_buffer_count; | |
252 | ||
253 | -void io_start_buffering(int fd) | |
254 | +void io_start_buffering_out(int fd) | |
255 | { | |
256 | if (io_buffer) return; | |
257 | multiplex_out_fd = fd; | |
258 | @@ -540,6 +562,11 @@ | |
259 | io_buffer_count = 0; | |
260 | } | |
261 | ||
262 | +void io_start_buffering_in(int fd) | |
263 | +{ | |
264 | + multiplex_in_fd = fd; | |
265 | +} | |
266 | + | |
267 | /** | |
268 | * Write an message to a multiplexed stream. If this fails then rsync | |
269 | * exits. | |
270 | @@ -726,7 +753,7 @@ | |
271 | { | |
272 | multiplex_out_fd = fd; | |
273 | io_flush(); | |
274 | - io_start_buffering(fd); | |
275 | + io_start_buffering_out(fd); | |
276 | io_multiplexing_out = 1; | |
277 | } | |
278 | ||
279 | diff -bur rsync/main.c rsync-craig/main.c | |
280 | --- rsync/main.c Thu Aug 1 13:46:59 2002 | |
281 | +++ rsync-craig/main.c Sun Dec 8 17:39:07 2002 | |
282 | @@ -346,6 +346,8 @@ | |
283 | exit_cleanup(0); | |
284 | } | |
285 | ||
286 | + io_start_buffering_in(f_in); | |
287 | + io_start_buffering_out(f_out); | |
288 | send_files(flist,f_out,f_in); | |
289 | io_flush(); | |
290 | report(f_out); | |
291 | @@ -421,7 +423,7 @@ | |
292 | close(error_pipe[1]); | |
293 | if (f_in != f_out) close(f_in); | |
294 | ||
295 | - io_start_buffering(f_out); | |
296 | + io_start_buffering_out(f_out); | |
297 | ||
298 | io_set_error_fd(error_pipe[0]); | |
299 | ||
300 | @@ -476,6 +478,7 @@ | |
301 | } | |
302 | } | |
303 | ||
304 | + io_start_buffering_in(f_in); | |
305 | if (delete_mode && !delete_excluded) | |
306 | recv_exclude_list(f_in); | |
307 | ||
308 | @@ -569,6 +572,7 @@ | |
309 | extern int cvs_exclude; | |
310 | extern int delete_mode; | |
311 | extern int delete_excluded; | |
312 | + io_start_buffering_out(f_out); | |
313 | if (cvs_exclude) | |
314 | add_cvs_excludes(); | |
315 | if (delete_mode && !delete_excluded) | |
316 | @@ -578,7 +582,10 @@ | |
317 | if (verbose > 3) | |
318 | rprintf(FINFO,"file list sent\n"); | |
319 | ||
320 | + io_flush(); | |
321 | + io_start_buffering_out(f_out); | |
322 | send_files(flist,f_out,f_in); | |
323 | + io_flush(); | |
324 | if (remote_version >= 24) { | |
325 | /* final goodbye message */ | |
326 | read_int(f_in); | |
327 | @@ -590,6 +597,7 @@ | |
328 | wait_process(pid, &status); | |
329 | } | |
330 | report(-1); | |
331 | + io_flush(); | |
332 | exit_cleanup(status); | |
333 | } | |
334 | ||
57625988 DD |
335 | diff -bur rsync/proto.h rsync-craig/proto.h |
336 | --- rsync/proto.h Wed Jul 31 17:37:02 2002 | |
337 | +++ rsync-craig/proto.h Sun Dec 8 16:27:55 2002 | |
338 | @@ -102,7 +102,8 @@ | |
339 | void read_buf(int f,char *buf,size_t len); | |
340 | void read_sbuf(int f,char *buf,size_t len); | |
341 | unsigned char read_byte(int f); | |
342 | -void io_start_buffering(int fd); | |
343 | +void io_start_buffering_out(int fd); | |
344 | +void io_start_buffering_in(int fd); | |
345 | void io_flush(void); | |
346 | void io_end_buffering(void); | |
347 | void write_int(int f,int32 x); | |
348 | diff -bur rsync/receiver.c rsync-craig/receiver.c | |
349 | --- rsync/receiver.c Tue May 28 08:42:51 2002 | |
350 | +++ rsync-craig/receiver.c Sat Dec 7 22:09:04 2002 | |
351 | @@ -273,6 +273,11 @@ | |
352 | offset += len; | |
353 | } | |
354 | ||
355 | + /* | |
356 | + * do a write flush | |
357 | + */ | |
358 | + write_file(fd, NULL, 0); | |
359 | + | |
360 | end_progress(total_size); | |
361 | ||
362 | if (fd != -1 && offset > 0 && sparse_end(fd) != 0) { |