Got rid of the gettimeofday() hunk for progress.c since it's no longer
[rsync/rsync-patches.git] / craigb-perf.diff
CommitLineData
57625988
DD
1To: rsync@lists.samba.org
2Subject: Rsync performance increase through buffering
3From: Craig Barratt <craig@atheros.com>
4Message-ID: <auto-000002369223@atheros.com>
5Date: Sun, 08 Dec 2002 23:48:57 -0800
6
7I've been studying the read and write buffering in rsync and it turns
8out most I/O is done just a couple of bytes at a time. This means there
9are lots of system calls, and also most network traffic comprises lots
10of small packets. The behavior is most extreme when sending/receiving
11file deltas of identical files.
12
13The main case where I/O is buffered is writes from the server (when
14io multiplexing is on). These are usually buffered in 4092 byte
15chunks with a 4 byte header. However, reading of these packets is
16usually unbuffered, and writes from the client are generally not
17buffered. For example: when receiving 1st phase checksums (6 bytes
18per block), 2 reads are done: one of 4 bytes and one of 2 bytes,
19meaning there are 4 system calls (select/read/select/read) per 6
20bytes of checksum data).
21
22One cost of this is some performance, but a significant issue is that
23unbuffered writes generate very short (and very many) ethernet packets,
24which means the overhead is quite large on slow network connections.
25
26The initial file_list writing is typically buffered, but reading it on
27the client is not.
28
29There are some other unneeded system calls:
30
31 - One example is that show_progress() calls gettimeofday() even
32 if do_progress is not set. show_progress() is called on every
33 block, so there is an extra system call per (700 byte) block.
b10e9d48 34 (NOTE: this is currently fixed in CVS.)
57625988
DD
35
36 - Another example is that file_write writes each matching (700 byte)
37 block without buffering, so that's another system call per block.
38
39To study this behavior I used rsync-2.5.6cvs and had a benchmark area
40comprising around 7800 files of total size 530MB.
41
42Here are some results doing sends and receives via rsyncd, all on the
43same machine, with identical source and destination files. In each
44case --ignore-times (-I) is set, so that every file is processed:
45
46 - Send test:
47
48 strace -f rsync -Ir . localhost::test |& wc
49
50 shows there are about 2,488,775 system calls.
51
52 - Receive test:
53
54 strace -f rsync -Ir localhost::test . |& wc
55
56 shows there are about 1,615,931 system calls.
57
58 - Rsyncd has a roughly similar numbers of system calls.
59
60 - Send test from another machine (cygwin/WinXP laptop):
61
62 tcpdump port 873 |& wc
63
64 shows there are about 701,111 ethernet packets (many of them only
65 have a 4 byte payload).
66
67Since the source and dest files are the same, the send test only
68wrote 1,738,797 bytes and read 2,139,848 bytes.
69
70These results are similar to rsync 2.5.5.
71
72Below is a patch to a few files that adds read and write buffering in
73the places where the I/O was unbuffered, adds buffering to write_file()
74and removes the unneeded gettimeofday() system call in show_progress().
75
76The results with the patch are:
77
78 - Send test: 46,835 system calls, versus 2,488,775.
79
80 - Receive test: 138,367 system calls, versus 1,615,931.
81
82 - Send test from another machine: 5,255 ethernet packets, versus 701,111.
83 If the tcp/ip/udp/802.3 per-packet overhead is around 60 bytes, that
84 means the base case transfers an extra 42MB of data, even though the
85 useful data is only around 2MB.
86
87The absolute running time on the local rsyncd test isn't much different,
88probably because the test is really disk io limited and system calls on
89an unloaded linux system are pretty fast.
90
91However, on a network test doing a send from cygwin/WinXP to rsyncd
92on rh-linux the running time improves from about 700 seconds to 215
93seconds (with a cpu load of around 17% versus 58%, if you believe
94cygwin's cpu stats). This is probably an extreme case since the system
95call penalty in cygwin is high. But I would suspect a significant
96improvement is possible with a slow network connection, since a lot
97less data is being sent.
98
99Note also that without -I rsync is already very fast, since it skips
100(most) files based on attributes.
101
102With or without this patch the test suite passes except for
103daemon-gzip-upload. One risk of buffering is the potential for
104a bug caused by a missing io_flush: deadlock is possible, so try
105the patch at your own risk...
106
107Craig
108
109###########################################################################
110diff -bur rsync/fileio.c rsync-craig/fileio.c
111--- rsync/fileio.c Fri Jan 25 15:07:34 2002
112+++ rsync-craig/fileio.c Sat Dec 7 22:21:10 2002
113@@ -76,7 +76,35 @@
114 int ret = 0;
115
116 if (!sparse_files) {
117- return write(f,buf,len);
118+ static char *writeBuf;
119+ static size_t writeBufSize;
120+ static size_t writeBufCnt;
121+
122+ if ( !writeBuf ) {
123+ writeBufSize = MAX_MAP_SIZE;
124+ writeBufCnt = 0;
125+ writeBuf = (char*)malloc(MAX_MAP_SIZE);
126+ if (!writeBuf) out_of_memory("write_file");
127+ }
128+ ret = len;
129+ do {
130+ if ( buf && writeBufCnt < writeBufSize ) {
131+ size_t copyLen = len;
132+ if ( copyLen > writeBufSize - writeBufCnt ) {
133+ copyLen = writeBufSize - writeBufCnt;
134+ }
135+ memcpy(writeBuf + writeBufCnt, buf, copyLen);
136+ writeBufCnt += copyLen;
137+ buf += copyLen;
138+ len -= copyLen;
139+ }
140+ if ( !buf || writeBufCnt == writeBufSize ) {
141+ int thisRet = write(f, writeBuf, writeBufCnt);
142+ if ( thisRet < 0 ) return thisRet;
143+ writeBufCnt = 0;
144+ }
145+ } while ( buf && len > 0 );
146+ return ret;
147 }
148
149 while (len>0) {
150diff -bur rsync/flist.c rsync-craig/flist.c
151--- rsync/flist.c Sat Jul 27 11:01:21 2002
152+++ rsync-craig/flist.c Sun Dec 8 16:28:14 2002
153@@ -889,7 +889,7 @@
154 flist = flist_new();
155
156 if (f != -1) {
157- io_start_buffering(f);
158+ io_start_buffering_out(f);
159 }
160
161 for (i = 0; i < argc; i++) {
162diff -bur rsync/io.c rsync-craig/io.c
163--- rsync/io.c Wed Apr 10 19:11:50 2002
164+++ rsync-craig/io.c Sun Dec 8 17:54:23 2002
165@@ -41,8 +41,8 @@
166
167 static int io_multiplexing_out;
168 static int io_multiplexing_in;
169-static int multiplex_in_fd;
170-static int multiplex_out_fd;
171+static int multiplex_in_fd = -1;
172+static int multiplex_out_fd = -1;
173 static time_t last_io;
174 static int no_flush;
175
176@@ -286,17 +286,31 @@
177 static size_t remaining;
178 int tag, ret = 0;
179 char line[1024];
180+ static char *buffer;
181+ static size_t bufferIdx = 0;
182+ static size_t bufferSz;
183
184- if (!io_multiplexing_in || fd != multiplex_in_fd)
185+ if (fd != multiplex_in_fd)
186 return read_timeout(fd, buf, len);
187
188+ if (!io_multiplexing_in && remaining == 0) {
189+ if (!buffer) {
190+ bufferSz = 2 * IO_BUFFER_SIZE;
191+ buffer = malloc(bufferSz);
192+ if (!buffer) out_of_memory("read_unbuffered");
193+ }
194+ remaining = read_timeout(fd, buffer, bufferSz);
195+ bufferIdx = 0;
196+ }
197+
198 while (ret == 0) {
199 if (remaining) {
200 len = MIN(len, remaining);
201- read_loop(fd, buf, len);
202+ memcpy(buf, buffer + bufferIdx, len);
203+ bufferIdx += len;
204 remaining -= len;
205 ret = len;
206- continue;
207+ break;
208 }
209
210 read_loop(fd, line, 4);
211@@ -305,8 +319,16 @@
212 remaining = tag & 0xFFFFFF;
213 tag = tag >> 24;
214
215- if (tag == MPLEX_BASE)
216+ if (tag == MPLEX_BASE) {
217+ if (!buffer || remaining > bufferSz) {
218+ buffer = Realloc(buffer, remaining);
219+ if (!buffer) out_of_memory("read_unbuffered");
220+ bufferSz = remaining;
221+ }
222+ read_loop(fd, buffer, remaining);
223+ bufferIdx = 0;
224 continue;
225+ }
226
227 tag -= MPLEX_BASE;
228
229@@ -327,7 +349,9 @@
230 rprintf((enum logcode) tag, "%s", line);
231 remaining = 0;
232 }
233-
234+ if (remaining == 0) {
235+ io_flush();
236+ }
237 return ret;
238 }
239
240@@ -344,8 +368,6 @@
241 size_t total=0;
242
243 while (total < N) {
244- io_flush();
245-
246 ret = read_unbuffered (fd, buffer + total, N-total);
247 total += ret;
248 }
249@@ -531,7 +553,7 @@
250 static char *io_buffer;
251 static int io_buffer_count;
252
253-void io_start_buffering(int fd)
254+void io_start_buffering_out(int fd)
255 {
256 if (io_buffer) return;
257 multiplex_out_fd = fd;
258@@ -540,6 +562,11 @@
259 io_buffer_count = 0;
260 }
261
262+void io_start_buffering_in(int fd)
263+{
264+ multiplex_in_fd = fd;
265+}
266+
267 /**
268 * Write an message to a multiplexed stream. If this fails then rsync
269 * exits.
270@@ -726,7 +753,7 @@
271 {
272 multiplex_out_fd = fd;
273 io_flush();
274- io_start_buffering(fd);
275+ io_start_buffering_out(fd);
276 io_multiplexing_out = 1;
277 }
278
279diff -bur rsync/main.c rsync-craig/main.c
280--- rsync/main.c Thu Aug 1 13:46:59 2002
281+++ rsync-craig/main.c Sun Dec 8 17:39:07 2002
282@@ -346,6 +346,8 @@
283 exit_cleanup(0);
284 }
285
286+ io_start_buffering_in(f_in);
287+ io_start_buffering_out(f_out);
288 send_files(flist,f_out,f_in);
289 io_flush();
290 report(f_out);
291@@ -421,7 +423,7 @@
292 close(error_pipe[1]);
293 if (f_in != f_out) close(f_in);
294
295- io_start_buffering(f_out);
296+ io_start_buffering_out(f_out);
297
298 io_set_error_fd(error_pipe[0]);
299
300@@ -476,6 +478,7 @@
301 }
302 }
303
304+ io_start_buffering_in(f_in);
305 if (delete_mode && !delete_excluded)
306 recv_exclude_list(f_in);
307
308@@ -569,6 +572,7 @@
309 extern int cvs_exclude;
310 extern int delete_mode;
311 extern int delete_excluded;
312+ io_start_buffering_out(f_out);
313 if (cvs_exclude)
314 add_cvs_excludes();
315 if (delete_mode && !delete_excluded)
316@@ -578,7 +582,10 @@
317 if (verbose > 3)
318 rprintf(FINFO,"file list sent\n");
319
320+ io_flush();
321+ io_start_buffering_out(f_out);
322 send_files(flist,f_out,f_in);
323+ io_flush();
324 if (remote_version >= 24) {
325 /* final goodbye message */
326 read_int(f_in);
327@@ -590,6 +597,7 @@
328 wait_process(pid, &status);
329 }
330 report(-1);
331+ io_flush();
332 exit_cleanup(status);
333 }
334
57625988
DD
335diff -bur rsync/proto.h rsync-craig/proto.h
336--- rsync/proto.h Wed Jul 31 17:37:02 2002
337+++ rsync-craig/proto.h Sun Dec 8 16:27:55 2002
338@@ -102,7 +102,8 @@
339 void read_buf(int f,char *buf,size_t len);
340 void read_sbuf(int f,char *buf,size_t len);
341 unsigned char read_byte(int f);
342-void io_start_buffering(int fd);
343+void io_start_buffering_out(int fd);
344+void io_start_buffering_in(int fd);
345 void io_flush(void);
346 void io_end_buffering(void);
347 void write_int(int f,int32 x);
348diff -bur rsync/receiver.c rsync-craig/receiver.c
349--- rsync/receiver.c Tue May 28 08:42:51 2002
350+++ rsync-craig/receiver.c Sat Dec 7 22:09:04 2002
351@@ -273,6 +273,11 @@
352 offset += len;
353 }
354
355+ /*
356+ * do a write flush
357+ */
358+ write_file(fd, NULL, 0);
359+
360 end_progress(total_size);
361
362 if (fd != -1 && offset > 0 && sparse_end(fd) != 0) {