+++ /dev/null
-To: rsync@lists.samba.org
-Subject: Rsync performance increase through buffering
-From: Craig Barratt <craig@atheros.com>
-Message-ID: <auto-000002369223@atheros.com>
-Date: Sun, 08 Dec 2002 23:48:57 -0800
-
-I've been studying the read and write buffering in rsync and it turns
-out most I/O is done just a couple of bytes at a time. This means there
-are lots of system calls, and also most network traffic comprises lots
-of small packets. The behavior is most extreme when sending/receiving
-file deltas of identical files.
-
-The main case where I/O is buffered is writes from the server (when
-io multiplexing is on). These are usually buffered in 4092 byte
-chunks with a 4 byte header. However, reading of these packets is
-usually unbuffered, and writes from the client are generally not
-buffered. For example: when receiving 1st phase checksums (6 bytes
-per block), 2 reads are done: one of 4 bytes and one of 2 bytes,
-meaning there are 4 system calls (select/read/select/read) per 6
-bytes of checksum data).
-
-One cost of this is some performance, but a significant issue is that
-unbuffered writes generate very short (and very many) ethernet packets,
-which means the overhead is quite large on slow network connections.
-
-The initial file_list writing is typically buffered, but reading it on
-the client is not.
-
-There are some other unneeded system calls:
-
- - One example is that show_progress() calls gettimeofday() even
- if do_progress is not set. show_progress() is called on every
- block, so there is an extra system call per (700 byte) block.
- (NOTE: this was already fixed in 2.6.0, so it is no longer in
- this patch.)
-
- - Another example is that file_write writes each matching (700 byte)
- block without buffering, so that's another system call per block.
-
-To study this behavior I used rsync-2.5.6cvs and had a benchmark area
-comprising around 7800 files of total size 530MB.
-
-Here are some results doing sends and receives via rsyncd, all on the
-same machine, with identical source and destination files. In each
-case --ignore-times (-I) is set, so that every file is processed:
-
- - Send test:
-
- strace -f rsync -Ir . localhost::test |& wc
-
- shows there are about 2,488,775 system calls.
-
- - Receive test:
-
- strace -f rsync -Ir localhost::test . |& wc
-
- shows there are about 1,615,931 system calls.
-
- - Rsyncd has a roughly similar numbers of system calls.
-
- - Send test from another machine (cygwin/WinXP laptop):
-
- tcpdump port 873 |& wc
-
- shows there are about 701,111 ethernet packets (many of them only
- have a 4 byte payload).
-
-Since the source and dest files are the same, the send test only
-wrote 1,738,797 bytes and read 2,139,848 bytes.
-
-These results are similar to rsync 2.5.5.
-
-Below is a patch to a few files that adds read and write buffering in
-the places where the I/O was unbuffered, adds buffering to write_file()
-and removes the unneeded gettimeofday() system call in show_progress().
-
-The results with the patch are:
-
- - Send test: 46,835 system calls, versus 2,488,775.
-
- - Receive test: 138,367 system calls, versus 1,615,931.
-
- - Send test from another machine: 5,255 ethernet packets, versus 701,111.
- If the tcp/ip/udp/802.3 per-packet overhead is around 60 bytes, that
- means the base case transfers an extra 42MB of data, even though the
- useful data is only around 2MB.
-
-The absolute running time on the local rsyncd test isn't much different,
-probably because the test is really disk io limited and system calls on
-an unloaded linux system are pretty fast.
-
-However, on a network test doing a send from cygwin/WinXP to rsyncd
-on rh-linux the running time improves from about 700 seconds to 215
-seconds (with a cpu load of around 17% versus 58%, if you believe
-cygwin's cpu stats). This is probably an extreme case since the system
-call penalty in cygwin is high. But I would suspect a significant
-improvement is possible with a slow network connection, since a lot
-less data is being sent.
-
-Note also that without -I rsync is already very fast, since it skips
-(most) files based on attributes.
-
-With or without this patch the test suite passes except for
-daemon-gzip-upload. One risk of buffering is the potential for
-a bug caused by a missing io_flush: deadlock is possible, so try
-the patch at your own risk...
-
-Craig
-
-###########################################################################
-Note: This is an updated patch for 2.6.0, created by Wayne Davison.
-###########################################################################
---- orig/fileio.c 6 Dec 2003 21:07:27 -0000 1.8
-+++ rsync-2.6.0/fileio.c 24 Dec 2003 01:47:40 -0000
-@@ -69,6 +69,21 @@
- return len;
- }
-
-+
-+static char *wf_writeBuf;
-+static size_t wf_writeBufSize;
-+static size_t wf_writeBufCnt;
-+
-+int flush_write_file(int f)
-+{
-+ int ret = write(f, wf_writeBuf, wf_writeBufCnt);
-+ if (ret < 0)
-+ return ret;
-+ /* if (ret < wf_writeBufCnt) ??? */
-+ wf_writeBufCnt = 0;
-+ return ret;
-+}
-+
- /*
- * write_file does not allow incomplete writes. It loops internally
- * until len bytes are written or errno is set.
-@@ -83,7 +98,22 @@
- int len1 = MIN(len, SPARSE_WRITE_SIZE);
- r1 = write_sparse(f, buf, len1);
- } else {
-- r1 = write(f, buf, len);
-+ if (!wf_writeBuf) {
-+ wf_writeBufSize = MAX_MAP_SIZE;
-+ wf_writeBufCnt = 0;
-+ wf_writeBuf = new_array(char, MAX_MAP_SIZE);
-+ if (!wf_writeBuf) out_of_memory("write_file");
-+ }
-+ r1 = MIN(len, wf_writeBufSize - wf_writeBufCnt);
-+ if (r1) {
-+ memcpy(wf_writeBuf + wf_writeBufCnt, buf, r1);
-+ wf_writeBufCnt += r1;
-+ }
-+ if (wf_writeBufCnt == wf_writeBufSize) {
-+ if (flush_write_file(f) < 0) return -1;
-+ if (!r1 && len)
-+ continue;
-+ }
- }
- if (r1 <= 0) {
- if (ret > 0) return ret;
---- orig/flist.c 15 Dec 2003 08:10:31 -0000 1.144
-+++ rsync-2.6.0/flist.c 24 Dec 2003 01:47:40 -0000
-@@ -925,7 +925,7 @@
- flist = flist_new();
-
- if (f != -1) {
-- io_start_buffering(f);
-+ io_start_buffering_out(f);
- if (filesfrom_fd >= 0) {
- if (argv[0] && !push_dir(argv[0], 0)) {
- rprintf(FERROR, "push_dir %s failed: %s\n",
---- orig/io.c 24 Dec 2003 01:46:07 -0000 1.113
-+++ rsync-2.6.0/io.c 24 Dec 2003 01:47:41 -0000
-@@ -41,8 +41,8 @@
-
- static int io_multiplexing_out;
- static int io_multiplexing_in;
--static int multiplex_in_fd;
--static int multiplex_out_fd;
-+static int multiplex_in_fd = -1;
-+static int multiplex_out_fd = -1;
- static time_t last_io;
- static int no_flush;
-
-@@ -440,17 +440,31 @@
- static size_t remaining;
- int tag, ret = 0;
- char line[1024];
-+ static char *buffer;
-+ static size_t bufferIdx = 0;
-+ static size_t bufferSz;
-
-- if (!io_multiplexing_in || fd != multiplex_in_fd)
-+ if (fd != multiplex_in_fd)
- return read_timeout(fd, buf, len);
-
-+ if (!io_multiplexing_in && remaining == 0) {
-+ if (!buffer) {
-+ bufferSz = 2 * IO_BUFFER_SIZE;
-+ buffer = new_array(char, bufferSz);
-+ if (!buffer) out_of_memory("read_unbuffered");
-+ }
-+ remaining = read_timeout(fd, buffer, bufferSz);
-+ bufferIdx = 0;
-+ }
-+
- while (ret == 0) {
- if (remaining) {
- len = MIN(len, remaining);
-- read_loop(fd, buf, len);
-+ memcpy(buf, buffer + bufferIdx, len);
-+ bufferIdx += len;
- remaining -= len;
- ret = len;
-- continue;
-+ break;
- }
-
- read_loop(fd, line, 4);
-@@ -459,8 +473,16 @@
- remaining = tag & 0xFFFFFF;
- tag = tag >> 24;
-
-- if (tag == MPLEX_BASE)
-+ if (tag == MPLEX_BASE) {
-+ if (!buffer || remaining > bufferSz) {
-+ buffer = realloc_array(buffer, char, remaining);
-+ if (!buffer) out_of_memory("read_unbuffered");
-+ bufferSz = remaining;
-+ }
-+ read_loop(fd, buffer, remaining);
-+ bufferIdx = 0;
- continue;
-+ }
-
- tag -= MPLEX_BASE;
-
-@@ -482,6 +504,9 @@
- remaining = 0;
- }
-
-+ if (remaining == 0)
-+ io_flush();
-+
- return ret;
- }
-
-@@ -498,8 +523,6 @@
- size_t total=0;
-
- while (total < N) {
-- io_flush();
--
- ret = read_unbuffered(fd, buffer + total, N-total);
- total += ret;
- }
-@@ -682,7 +705,7 @@
- static char *io_buffer;
- static int io_buffer_count;
-
--void io_start_buffering(int fd)
-+void io_start_buffering_out(int fd)
- {
- if (io_buffer) return;
- multiplex_out_fd = fd;
-@@ -691,6 +714,11 @@
- io_buffer_count = 0;
- }
-
-+void io_start_buffering_in(int fd)
-+{
-+ multiplex_in_fd = fd;
-+}
-+
- /**
- * Write an message to a multiplexed stream. If this fails then rsync
- * exits.
-@@ -881,7 +909,7 @@
- {
- multiplex_out_fd = fd;
- io_flush();
-- io_start_buffering(fd);
-+ io_start_buffering_out(fd);
- io_multiplexing_out = 1;
- }
-
---- orig/main.c 20 Dec 2003 16:57:24 -0000 1.176
-+++ rsync-2.6.0/main.c 24 Dec 2003 01:47:41 -0000
-@@ -380,6 +380,8 @@
- exit_cleanup(0);
- }
-
-+ io_start_buffering_in(f_in);
-+ io_start_buffering_out(f_out);
- send_files(flist,f_out,f_in);
- io_flush();
- report(f_out);
-@@ -454,7 +456,7 @@
- close(error_pipe[1]);
- if (f_in != f_out) close(f_in);
-
-- io_start_buffering(f_out);
-+ io_start_buffering_out(f_out);
-
- io_set_error_fd(error_pipe[0]);
-
-@@ -508,6 +510,7 @@
- }
- }
-
-+ io_start_buffering_in(f_in);
- if (delete_mode && !delete_excluded)
- recv_exclude_list(f_in);
-
-@@ -606,6 +609,7 @@
- extern int cvs_exclude;
- extern int delete_mode;
- extern int delete_excluded;
-+ io_start_buffering_out(f_out);
- if (cvs_exclude)
- add_cvs_excludes();
- if (delete_mode && !delete_excluded)
-@@ -617,7 +621,10 @@
- if (verbose > 3)
- rprintf(FINFO,"file list sent\n");
-
-+ io_flush();
-+ io_start_buffering_out(f_out);
- send_files(flist,f_out,f_in);
-+ io_flush();
- if (protocol_version >= 24) {
- /* final goodbye message */
- read_int(f_in);
-@@ -629,6 +636,7 @@
- wait_process(pid, &status);
- }
- report(-1);
-+ io_flush();
- exit_cleanup(status);
- }
-
---- orig/proto.h 6 Dec 2003 21:07:27 -0000 1.161
-+++ rsync-2.6.0/proto.h 24 Dec 2003 01:47:41 -0000
-@@ -68,6 +68,7 @@
- const char *line, int include);
- void add_cvs_excludes(void);
- int sparse_end(int f);
-+int flush_write_file(int f);
- int write_file(int f,char *buf,size_t len);
- struct map_struct *map_file(int fd,OFF_T len);
- char *map_ptr(struct map_struct *map,OFF_T offset,int len);
-@@ -101,7 +102,8 @@
- void read_buf(int f,char *buf,size_t len);
- void read_sbuf(int f,char *buf,size_t len);
- unsigned char read_byte(int f);
--void io_start_buffering(int fd);
-+void io_start_buffering_out(int fd);
-+void io_start_buffering_in(int fd);
- void io_flush(void);
- void io_end_buffering(void);
- void write_int(int f,int32 x);
---- orig/receiver.c 15 Dec 2003 08:14:27 -0000 1.56
-+++ rsync-2.6.0/receiver.c 24 Dec 2003 01:47:41 -0000
-@@ -307,6 +307,8 @@
- offset += len;
- }
-
-+ flush_write_file(fd);
-+
- if (do_progress)
- end_progress(total_size);
-