# [Backup, someone?](#)
## — 24 September, 2014

**FRIENDLY REMINDER: Have you backed up your data today?**

If you've never seen this sentence, then write it down, and put it somewhere
visible.

<q>Why?</q> you ask? Because. Having multiple copies of your data is important
if you plan on keeping it for the long term.
You know, a hard drive will not tell you: <q>Hey! I'm gonna die in two days
around 2 am, please copy me somewhere else.</q> There are so many ways to lose
data... And you'll experience some of them, trust me!

Anyway, back to the topic! In this post, I'm gonna tell you a *simple* way to
back up your data. All you need is the following:

* An external storage medium (USB key, hard drive, tapes, ...)
* An archiver (cpio, tar, ar, ...)
* A compressor (gzip, bzip2, xz, ...)
* Some shell glue

### Preparation

First, you need to figure out what you want to back up: configs? multimedia?
code? For the purpose of this article, let's say I want to back up all my
images, located in `/data/img`. Let's figure out the size of this directory:

    ── du -sh /data/img
    5.5G    /data/img

This could fit on my USB key. Let's mount and prepare it. In the meantime, we
will create a user dedicated to the backup process:

    # useradd -M -g users backup
    # mount /dev/sdd1 /mnt
    # mkdir /mnt/backup
    # chown backup:users /mnt/backup

Now the drive is ready to accept backups. Let's see how to create them.
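
If you'd rather not eyeball the `du` output and trust that the key is really
mounted, both checks can be scripted. This is only a sketch: `is_mounted` and
`fits_on` are names I made up, the parsing of `df -P` assumes mount points
without whitespace, and compression is ignored when comparing sizes.

```shell
#!/bin/sh
# Hypothetical pre-flight helpers; names and paths are illustrative only.

# is_mounted DIR: succeeds if DIR is itself a mount point according to df.
# Pass the path without a trailing slash, exactly as df would print it.
is_mounted() {
    [ "$(df -P "$1" | awk 'NR == 2 { print $6 }')" = "$1" ]
}

# fits_on SRC DEST: succeeds if the files under SRC would fit in the free
# space of the filesystem holding DEST. Both sizes are in kilobytes.
fits_on() {
    need=$(du -sk "$1" | awk '{ print $1 }')
    free=$(df -Pk "$2" | awk 'NR == 2 { print $4 }')
    [ "$need" -le "$free" ]
}

# Example: refuse to go on unless the key is mounted and big enough
# is_mounted /mnt && fits_on /data/img /mnt/backup || echo "not ready" >&2
```

Without the mount check, a backup written while the key is unplugged would
quietly land on the root filesystem under `/mnt/backup`.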

### Backing up

What's a backup, again?

> In information technology, a backup, or the process of backing up, refers to
> the copying and archiving of computer data so it may be used to restore the
> original after a data loss event. The verb form is to back up in two words,
> whereas the noun is backup.

**RECOVER**, that's the only word that matters. A backup is useless if you can't
recover data from it. PERIOD.

In my case, I chose `cpio`, because I find it simple to recover data from a cpio
archive. We'll see later how to do so. If you find it [easier to do with
tar](http://xkcd.com/1168/), feel free to adapt the following to your liking.

So what's the plan? First, we'll create an archive containing all the files we
want. Then, compress said archive to save some space, and finally, manage
those backups to keep multiple copies.

#### Archiving

For this task, I chose `cpio`, which takes filenames on stdin, and writes an
archive to stdout. The fact that it outputs to stdout gives us the ability to
compress the archive while it's being created. A good thing about this is that
the uncompressed archive never has to exist on disk or sit whole in memory: a
pipe only holds a small, fixed-size buffer, and the writer waits whenever that
buffer is full until the reader has processed some data, and so on... You
can check the pipe size your shell reports with `ulimit -a` (bash expresses it
in 512-byte blocks). Anyways:

    ── find /data/img -type f | cpio -o | gzip -c > /mnt/backup/images.cpio.gz

And the archive is created and compressed! Pretty easy, isn't it? Let's see how
to manage them now.
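
Since a backup you can't read back is worthless, it may be worth checking the
archive right after creating it. Here is a minimal sketch, assuming a `cpio`
that supports `-t` (list members) and file names without newlines.
`verify_archive` is a hypothetical name, not part of any tool.

```shell
#!/bin/sh
# verify_archive SRC ARCHIVE
# Hypothetical helper: checks that ARCHIVE (a gzipped cpio archive of the
# regular files under SRC) decompresses cleanly and lists as many members
# as there are regular files under SRC.
verify_archive() {
    src=$1 archive=$2

    # gzip -t tests the integrity of the compressed stream
    gzip -t "$archive" || return 1

    expected=$(find "$src" -type f | wc -l | tr -d ' ')
    # cpio -it lists the archive members without extracting them
    archived=$(gzip -cd "$archive" | cpio -it 2>/dev/null | wc -l | tr -d ' ')

    [ "$expected" -eq "$archived" ]
}

# Example:
# verify_archive /data/img /mnt/backup/images.cpio.gz || echo "bad backup" >&2
```

Comparing counts is a cheap sanity check only. It will also complain if files
were added or removed between the backup and the check.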

#### Managing

Be creative for this part! You can either use `$(date +%Y-%m-%d)` as a name for
the backup, write a crawler to change names based on their timestamp, or maybe
use some rotating script, like the one written by
[ypnose](http://ywstd.fr/blog/2014/backup-snippet.html).
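
For the date-based option, a rough sketch could look like the following. The
helper names (`dated_name`, `prune_dated`) and the `PREFIX-YYYY-MM-DD.cpio.gz`
naming scheme are my own assumptions for the example. It also assumes the
prefix contains no regex metacharacters and file names contain no newlines.

```shell
#!/bin/sh
# Hypothetical helpers for date-stamped backups; names are illustrative.

# dated_name PREFIX: print e.g. "images-2014-09-24.cpio.gz" for today
dated_name() {
    printf '%s-%s.cpio.gz\n' "$1" "$(date +%Y-%m-%d)"
}

# prune_dated DIR PREFIX KEEP: delete all but the KEEP newest archives.
# ISO dates sort chronologically, so a plain `sort` puts the oldest first.
prune_dated() {
    dir=$1 prefix=$2 keep=$3
    pattern="^$prefix-[0-9-]*\.cpio\.gz\$"

    total=$(ls "$dir" | grep -c "$pattern")
    excess=$(( total - keep ))
    [ "$excess" -gt 0 ] || return 0

    # the first $excess names in sorted order are the oldest ones
    ls "$dir" | grep "$pattern" | sort | head -n "$excess" |
    while IFS= read -r old; do
        rm -- "$dir/$old"
    done
}

# Example:
# find /data/img -type f | cpio -o | gzip -c > "/mnt/backup/$(dated_name images)"
# prune_dated /mnt/backup images 7
```

The upside over numbered copies is that the name itself tells you when the
backup was made, whatever happens to the file's timestamp.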

I modified the script to allow an automatic rotation of files, in case the file
number limit is reached. Here it is:

    #!/bin/sh
    #
    # z3bra - (c) wtfpl 2014
    # Backup a file, and rotate backups: file.0.BAK, file.1.BAK, ...
    #
    # Based on an original idea from Ypnose. Thanks mate!
    # <http://ywstd.fr/blog/2014/bakup-snippet.html>

    EXT=${EXT:-BAK} # extension used for backup
    LIM=${LIM:-9}   # highest backup number to keep
    PAD=${PAD:-0}   # number to start with

    usage() {
        cat <<EOF
    usage: `basename $0` [-hrv] <file>
            -h  : print this help
            -r  : perform a rotation if \$LIM is reached
            -v  : verbose mode
    EOF
    }

    # report action performed in verbose mode
    log() {
        # do not log anything if not in $VERBOSE mode
        test -z "$VERBOSE" && return

        echo "[$(date +%Y-%m-%d)] - $*"
    }

    # rotate backups to leave moar room
    rotate() {
        # do not rotate if the rotate flag wasn't provided
        test -z "$ROTATE" && return

        # delete the oldest backup
        rm "${FILE}.${PAD}.${EXT}"

        # move every file down one place
        for N1 in `seq $PAD $LIM`; do
            N2=$(( N1 + ROTATE ))

            # don't go any further
            test -f "${FILE}.${N2}.${EXT}" || return

            # move file down $ROTATE place
            log "${FILE}.${N2}.${EXT} -> ${FILE}.${N1}.${EXT}"
            mv "${FILE}.${N2}.${EXT}" "${FILE}.${N1}.${EXT}"
        done
    }

    # actually archive files
    archive() {
        # test the presence of each version, and create the first missing one
        for N in `seq $PAD $LIM`; do
            if test ! -f "${FILE}.${N}.${EXT}"; then

                # copy the file under its new name
                log "Created: ${FILE}.${N}.${EXT}"
                cp "${FILE}" "${FILE}.${N}.${EXT}"

                exit 0
            fi
        done
    }

    while getopts "hrv" opt; do
        case $opt in
            h) usage; exit 0 ;;
            r) ROTATE=1 ;;
            v) VERBOSE=1 ;;
            *) usage; exit 1 ;;
        esac
    done

    shift $((OPTIND - 1))

    test $# -lt 1 && usage && exit 1

    FILE=$1

    # in case the limit is reached, remove the oldest backup
    test -f "${FILE}.${LIM}.${EXT}" && rotate

    # if the rotation wasn't performed, we'll not archive anything
    test -f "${FILE}.${LIM}.${EXT}" || archive

    echo "Limit of $LIM .$EXT files reached, run with -r to force rotation" >&2
    exit 1

Now, to "archive" a file, all you need to do is:

    ── cd /mnt/backup
    ── backup.sh -r images.cpio.gz

After enough runs, you end up with the following tree:

    ── ls /mnt/backup
    images.cpio.gz        images.cpio.gz.3.BAK images.cpio.gz.7.BAK
    images.cpio.gz.0.BAK  images.cpio.gz.4.BAK images.cpio.gz.8.BAK
    images.cpio.gz.1.BAK  images.cpio.gz.5.BAK images.cpio.gz.9.BAK
    images.cpio.gz.2.BAK  images.cpio.gz.6.BAK

Aaaaaand we're done! Wrap it all in a crontab, and the backup process will
start:

    # start a backup at 2 am, every day
    0 2 * * * find /data/img -type f |cpio -o |gzip > /mnt/backup/images.cpio.gz

    # rotate backups limiting their number to 7 (a whole week)
    0 3 * * * cd /mnt/backup && LIM=6 backup.sh -r images.cpio.gz
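
One thing to keep in mind with two separate cron entries: the 3 am rotation
copies whatever sits in `images.cpio.gz`, even if the 2 am job died halfway. A
possible way around it, sketched below, is to build the archive under a
temporary name and only move it into place once it can be read back.
`safe_archive` is a hypothetical helper, and the check (at least one member can
be listed) is deliberately crude.

```shell
#!/bin/sh
# safe_archive SRC DEST
# Hypothetical wrapper: archive the regular files under SRC into a temporary
# file next to DEST, and replace DEST only if the result can be read back.
safe_archive() {
    src=$1 dest=$2
    tmp="$dest.tmp.$$"

    find "$src" -type f | cpio -o 2>/dev/null | gzip -c > "$tmp"

    # the archive must decompress and list at least one member
    if gzip -cd "$tmp" 2>/dev/null | cpio -it 2>/dev/null | grep -q .; then
        mv "$tmp" "$dest"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Possible use from a single cron-driven script (assumes backup.sh is in PATH):
# safe_archive /data/img /mnt/backup/images.cpio.gz &&
#     cd /mnt/backup && LIM=6 backup.sh -r images.cpio.gz
```

Because the rotation only runs when `safe_archive` succeeds, a failed night
leaves the previous backups untouched instead of rotating a broken file in.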

Should be enough for now. But here comes the most important part...

### Restoring

This is the most important one, but not the trickiest, don't worry. It's
Friday, and your friends are arriving in a few minutes to see the photos from
your last trip. Before they arrive, you decide to clean up the directory, and
notice a `.filedb-47874947392` created by your camera in said directory.
Let's remove it:

    ── cd /data/img/2014/trip_to_sahara/
    ── ls -a .filedb-*
    .filedb-47874947392
    ── rm -f .filedb- *
    rm: can't remove '.filedb-': No such file or directory
    ── ls -la .
    total 0
    drwxr-xr-x    1 z3bra    users          402 Sep 24 00:41 .
    drwxr-xr-x    1 z3bra    users          402 Sep 24 00:41 ..
    -rw-r--r--    1 z3bra    users            0 Sep 24 00:58 .filedb-47874947392

<q>Oh god.. Why..?</q>
This shitty space between the '-' and the '\*' in your `rm` command is going to
fuck your presentation up!
Luckily, you made a backup this morning at 2 am... Let's restore your whole
directory from it:

    ── mount /dev/sdd1 /mnt
    ── cd /mnt/backup
    ── ls -la
    total 0
    drwxr-xr-x    1 z3bra    users      402 Sep 10 00:41 .
    drwxr-xr-x    1 z3bra    users      402 Sep 10 00:41 ..
    -rw-r--r--    1 z3bra    users        0 Sep 19 02:01 images.cpio.gz
    -rw-r--r--    1 z3bra    users        0 Sep 15 03:00 images.cpio.gz.0.BAK
    -rw-r--r--    1 z3bra    users        0 Sep 16 03:00 images.cpio.gz.1.BAK
    -rw-r--r--    1 z3bra    users        0 Sep 17 03:00 images.cpio.gz.2.BAK
    -rw-r--r--    1 z3bra    users        0 Sep 18 03:00 images.cpio.gz.3.BAK
    -rw-r--r--    1 z3bra    users        0 Sep 19 03:00 images.cpio.gz.4.BAK
    -rw-r--r--    1 z3bra    users        0 Sep 13 03:00 images.cpio.gz.5.BAK
    -rw-r--r--    1 z3bra    users        0 Sep 14 03:00 images.cpio.gz.6.BAK

Today is Friday, 19 September. As you can see from the timestamps, backups
number 5 and 6 are from last week. The backup from this morning is number 4,
and the latest is the one without any number.
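
Reading timestamps by eye works, but when you're in a hurry the shell can pick
the most recent numbered copy for you. A small sketch, assuming the
`FILE.<n>.BAK` naming used by the script above and file names without
newlines. `newest_backup` is a made-up name.

```shell
#!/bin/sh
# newest_backup DIR FILE
# Hypothetical helper: print the most recently modified rotated copy of FILE
# (FILE.<n>.BAK) found in DIR. `ls -t` sorts by modification time, newest
# first, so the first line is the one we want.
newest_backup() {
    ( cd "$1" && ls -t "$2".*.BAK 2>/dev/null | head -n 1 )
}

# Example:
# newest_backup /mnt/backup images.cpio.gz
```

This only looks at modification times, so it is as trustworthy as they are: a
copy made with `cp -p` or touched by hand would throw it off.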

`cpio` allows extracting files from an archive using the following syntax:

    ── cpio -i -d < archive.cpio

`-i` asks for an extraction, while `-d` tells `cpio` to recreate the directory
tree if it does not exist. Check the [wikipedia
article](http://wikipedia.org/cpio) for more explanations on how it works.
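
Before extracting over live data, it can be reassuring to look inside the
archive and unpack it somewhere harmless first. The helpers below are a sketch
with made-up names (`list_archive`, `extract_to`). They assume a
gzip-compressed archive whose member names are relative. If your archive holds
absolute names, GNU cpio may need `--no-absolute-filenames` to keep the files
inside the target directory.

```shell
#!/bin/sh
# Hypothetical helpers; they assume a gzip-compressed cpio archive.

# list_archive ARCHIVE: print the member names without extracting anything
list_archive() {
    gzip -cd "$1" | cpio -it 2>/dev/null
}

# extract_to DIR ARCHIVE [PATTERN...]: extract into DIR instead of the
# current directory, so files can be inspected before being put back.
# ARCHIVE must be an absolute path, since the function changes directory.
extract_to() {
    dir=$1 archive=$2
    shift 2
    mkdir -p "$dir" &&
    ( cd "$dir" && gzip -cd "$archive" | cpio -id "$@" 2>/dev/null )
}

# Example:
# list_archive /mnt/backup/images.cpio.gz | grep trip_to_sahara
# extract_to /tmp/restore /mnt/backup/images.cpio.gz
```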

So, to restore our lost directory you'd proceed like this:

    # the archive was created from absolute paths, and cpio restores files
    # relative to the current directory, so let's move to root to restore
    # them in place
    ── cd /

    # you can pass globbing patterns to cpio, so that it only restores what you
    # want (quote them so the shell leaves them alone). Don't forget to
    # decompress the archive first
    ── gzip -cd /mnt/backup/images.cpio.gz | cpio -ivd 'data/img/2014/trip_to_sahara/*'
    data/img/2014/trip_to_sahara/IMG-0001.JPG
    data/img/2014/trip_to_sahara/IMG-0002.JPG
    data/img/2014/trip_to_sahara/IMG-0003.JPG
    data/img/2014/trip_to_sahara/IMG-0004.JPG
    data/img/2014/trip_to_sahara/IMG-0005.JPG
    data/img/2014/trip_to_sahara/IMG-0006.JPG
    data/img/2014/trip_to_sahara/.filedb-47874947392
    23 blocks

    ── ls /data/img/2014/trip_to_sahara
    IMG-0001.JPG IMG-0003.JPG IMG-0005.JPG
    IMG-0002.JPG IMG-0004.JPG IMG-0006.JPG

    # be careful this time!
    ── rm /data/img/2014/trip_to_sahara/.filedb-47874947392
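
As for being careful: one way to make this kind of cleanup harder to get wrong
is to hand the pattern to `find` as a single quoted argument, so that a stray
space can never leave a bare `*` for the shell to expand. A minimal sketch,
where `remove_matching` is a name I made up. Note that `-maxdepth` is not in
POSIX, although GNU, BSD and busybox `find` all support it.

```shell
#!/bin/sh
# remove_matching DIR PATTERN
# Hypothetical helper: delete the regular files directly inside DIR whose
# name matches PATTERN. find does the matching itself, so the shell never
# expands the pattern.
remove_matching() {
    find "$1" -maxdepth 1 -type f -name "$2" -exec rm -- {} +
}

# Example: preview first, then delete
# find /data/img/2014/trip_to_sahara -maxdepth 1 -name '.filedb-*'
# remove_matching /data/img/2014/trip_to_sahara '.filedb-*'
```

With the typo from earlier, the pattern would have been `'.filedb- *'`, which
matches nothing at all, and the photos would still be there.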

And it's all good! Don't forget to keep your drive safe, and duplicate it if
you can, just in case.

Hope it will be useful to someone, cheers!

<!-- vim: set ft=markdown ts=4 et tw=80: -->