synk

synchronize files between hosts

commit 671e5ee704165beccfd85e7dbb60740d0c703bc7
parent e91bc489c877001d81332b0f1d26c41a54c572d9
Author: Willy <willyatmailoodotorg>
Date:   Fri Aug 26 08:58:42 +0200

Provide an insight of the algorithm

Diffstat:
README | 99 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--------
1 file changed, 90 insertions(+), 9 deletions(-)
diff --git a/README b/README
@@ -4,8 +4,8 @@ synk
 Synchronize a bunch of files between different hosts.
 
 * active/passive replication
-* daemon mode using inotify(1)
-* one-shot mode for cron(1)
+* one-shot tool
+* uses timestamps for comparison
 * spawn rsync(1) processes
 
 usage
@@ -21,15 +21,96 @@ usage
     EOF
 
     $ synk -v $HOME/file
-    phobos.z3bra.org:/home/z3bra/file 1464274181
-    apophis.z3bra.org:/home/z3bra/file 1464260388
-    doom.z3bra.org:/home/z3bra/file 1464273098
+    localhost /home/z3bra/file 43b5c67 146426324
+    phobos.z3bra.org /home/z3bra/file 549fb41 1464274181
+    apophis.z3bra.org /home/z3bra/file 34fc2ae 1464260388
+    doom.z3bra.org /home/z3bra/file df3738b 1464273098
     LATEST: phobos.z3bra.org
     synk: rsync -azEq --delete phobos.z3bra.org:/home/z3bra/file apophis.z3bra.org:/home/z3bra/file
     synk: rsync -azEq --delete phobos.z3bra.org:/home/z3bra/file doom.z3bra.org:/home/z3bra/file
 
     $ synk -v $HOME/file
-    phobos.z3bra.org:/home/z3bra/file 1464274181
-    apophis.z3bra.org:/home/z3bra/file 1464274181
-    doom.z3bra.org:/home/z3bra/file 1464274181
-    SYNKED!
+    localhost /home/z3bra/file 549fb41 1464274168
+    phobos.z3bra.org /home/z3bra/file 549fb41 1464274181
+    apophis.z3bra.org /home/z3bra/file 549fb41 1464275472
+    doom.z3bra.org /home/z3bra/file 549fb41 1464275478
+    SYNKED: /home/z3bra/file
+
+how it works
+------------
+
+Local client spawns server-mode instances to peers via ssh:
+
+    synk file
+     \_ ssh phobos.z3bra.org 'synk -s'
+     \_ ssh apophis.z3bra.org 'synk -s'
+     \_ ssh doom.z3bra.org 'synk -s'
+
+Client sends metadata for "file" to each peer, using this structure:
+
+    struct metadata_t {
+        const char path[PATH_MAX];
+        unsigned char sha512[64];
+        long mtime;
+    };
+
+Each peer will the recreate this structure locally, using metadata.path,
+and send it back to the client.
+
+The client then has a linked-list of its peers containing the peer info
+and associated metadata:
+
+    struct node_t {
+        struct metadata_t meta;
+        struct sockaddr_in peer;
+        SLIST_ENTRY(node_t) entries;
+    };
+
+This list is then processed to figure out wether all peer.meta.hash
+match or not. In case a difference is found, the node.meta.mtime are
+evaluated to find the higher one (sort list by mtime?).
+
+In this case, we have two possibilities:
+
+### localhost is the most recent
+
+If localhost is the most recent, we just spawn `rsync(1)` processes
+locally to update the file with peer that have a different hash:
+
+    synk file
+     \_ rsync -azEq file phobos.z3bra.org:file
+     \_ rsync -azEq file apophis.z3bra.org:file
+     \_ rsync -azEq file doom.z3bra.org:file
+
+### remote peer X is the most recent
+
+We need to spawn `rsync(1)` processes remotely on the host, to sync it
+with all the peers (except for localhost).
+Either we do the same as we do locally, but prepending each command with `ssh $PEER`:
+
+    synk file
+     \_ rsync -azEq phobos.z3bra.org:file file
+     \_ ssh phobos.z3bra.org 'rsync -azEq file apophis.z3bra.org:file'
+     \_ ssh phobos.z3bra.org 'rsync -azEq file doom.z3bra.org:file'
+
+Or, we simply run a new `synk` process on the peer, so that he figure
+out that he has to sync the other peers (we still need to use `rsync(1)`
+for localhost, as we're unable to guess our external address):
+
+    synk file
+     \_ rsync -azEq phobos.z3bra.org:file file
+     \_ ssh phobos.z3bra.org 'synk -h apophis.z3bra.org -h doom.z3bra.org file'
+
+Which would result, on the remote peer:
+
+    /usr/sbin/sshd
+     \_ sshd: z3bra [priv]
+       \_ sshd: z3bra@pts/0
+         \_ synk -h apophis.z3bra.org -h doom.z3bra.org file
+           \_ rsync -azEq file apophis.z3bra.org:file
+           \_ rsync -azEq file doom.z3bra.org:file
+
+The only *real* issue with this solution is that the logic to discover the
+newest file will have to run twice, once on the initial calling process
+(locally), and once on the newest peer. This shouldn't be a huge issue,
+but it's still bothering to do the same thing twice.
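
The list-processing step described in the README text of this commit (compare the hashes, then pick the highest mtime) could be sketched roughly as follows. This is an illustrative sketch only, not code from the synk repository: the function names `peers_insync` and `peers_latest`, the list head type `peers_t`, the `HASH_LEN` macro and the `host` field (a plain-string stand-in for `struct sockaddr_in peer`) are all hypothetical, and the metadata structure is trimmed to a hash and an mtime.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/queue.h>

#define HASH_LEN 64 /* assumed: size of the sha512 field in the README */

/* Hypothetical, trimmed-down variant of the README's metadata_t:
 * the path member is left out to keep the sketch short. */
struct metadata_t {
	unsigned char hash[HASH_LEN];
	long mtime;
};

/* Hypothetical variant of node_t: a host name replaces sockaddr_in. */
struct node_t {
	const char *host;
	struct metadata_t meta;
	SLIST_ENTRY(node_t) entries;
};

SLIST_HEAD(peers_t, node_t);

/* Return 1 when every node carries the same hash, 0 otherwise.
 * An empty list counts as in sync. */
int
peers_insync(struct peers_t *head)
{
	struct node_t *first, *np;

	assert(head != NULL);
	first = SLIST_FIRST(head);
	SLIST_FOREACH(np, head, entries) {
		if (memcmp(first->meta.hash, np->meta.hash, HASH_LEN) != 0)
			return 0;
	}
	return 1;
}

/* Return the node with the highest mtime, or NULL for an empty list.
 * On equal mtimes the node encountered first is kept. */
struct node_t *
peers_latest(struct peers_t *head)
{
	struct node_t *np, *latest = NULL;

	assert(head != NULL);
	SLIST_FOREACH(np, head, entries) {
		if (latest == NULL || np->meta.mtime > latest->meta.mtime)
			latest = np;
	}
	return latest;
}
```

A caller would presumably run `peers_insync()` first and only look up `peers_latest()` when it returns 0. A single linear scan avoids the sort that the README text leaves as an open question ("sort list by mtime?"), since only the maximum is needed.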