且构网

Sharing the ins and outs of programming and software development...

rsync and MyISAM tables

Updated: 2023-01-31 09:58:47



I'm trying to use rsync to backup MySQL data. The tables use the MyISAM storage engine.

My expectation was that after the first rsync, subsequent rsyncs would be very fast. It turns out, if the table data was changed at all, the operation slows way down.

I did an experiment with a 989 MB MYD file containing real data:

Test 1 - recopying unmodified data

  • rsync -a orig.MYD copy.MYD
    • takes a while as expected
  • rsync -a orig.MYD copy.MYD
    • instantaneous - speedup is in the millions

Test 2 - recopying slightly modified data

  • rsync -a orig.MYD copy.MYD
    • takes a while as expected
  • UPDATE table SET counter = counter + 1 WHERE id = 12345
  • rsync -a orig.MYD copy.MYD
    • takes as long as the original copy!

What gives? Why is rsync taking forever just to copy a tiny change?

Edit: In fact, the second rsync in Test 2 takes as long as the first. rsync is apparently copying the whole file again.

Edit: Turns out when copying from local to local, --whole-file is implied. Even with --no-whole-file, the performance is still terrible.

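rsync still has to calculate block hashes to determine what's changed, and that means reading the whole file on both sides even when only a few bytes differ. The no-modification case is most likely a shortcut that only compares file mod time and size, so the file contents are never read.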
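To make the difference between the two cases concrete, here is a rough Python sketch. It is not rsync's actual implementation: rsync uses a rolling checksum so it can match blocks at arbitrary offsets, whereas this only compares fixed-offset blocks. The function names `quick_check_same` and `changed_blocks` and the 4096-byte block size are made up for illustration.

```python
import hashlib
import os


def quick_check_same(src, dst):
    """Metadata-only comparison: size and modification time (whole seconds).

    No file contents are read, so this is effectively instantaneous.
    """
    a, b = os.stat(src), os.stat(dst)
    return a.st_size == b.st_size and int(a.st_mtime) == int(b.st_mtime)


def changed_blocks(src, dst, block_size=4096):
    """Return the indices of fixed-size blocks whose hashes differ.

    Every byte of both files is read and hashed, however small the change.
    """
    changed = []
    with open(src, "rb") as f_src, open(dst, "rb") as f_dst:
        index = 0
        while True:
            a = f_src.read(block_size)
            b = f_dst.read(block_size)
            if not a and not b:
                break
            if hashlib.sha256(a).digest() != hashlib.sha256(b).digest():
                changed.append(index)
            index += 1
    return changed
```

For an unchanged `orig.MYD` and `copy.MYD`, `quick_check_same` returns immediately from two `stat` calls. Once a single row is updated, the mod time differs, and finding the one changed block requires reading both 989 MB files in full. On a local-to-local copy that costs about as much I/O as simply copying the file again, which would explain why `--no-whole-file` does not help here.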