Dar
This is not a complete article: This is a draft, a work in progress that is intended to be published into an article, which may or may not be ready for inclusion in the main wiki. It should not necessarily be considered factual or authoritative.
The dar (Disk ARchiver) utility was written from the ground up as a modern replacement for the classical Unix tar tool. First released in 2002, dar is open source, actively maintained, and can be compiled on any Unix-like system. Like tar, dar supports full, differential, and incremental backups. Unlike tar, each dar archive includes a file index for fast file access and restore, which is especially useful for large archives. dar has built-in compression on a file-by-file basis, making it more resilient against data corruption, and you can optionally tell it not to compress already highly compressed files such as mp4 and gz. dar supports strong encryption, can split archives at 1-byte resolution, supports extended file attributes, sparse files, and hard and symbolic (soft) links, can detect data corruption in both headers and saved data and recover with minimal data loss, and has many other desirable features. On the dar page you can find a detailed feature-by-feature tar-to-dar comparison.
Where to find dar
Since dar can be compiled on the command line, you can install it easily on Linux and macOS. On Compute Canada clusters a slightly out-of-date version can be found in /cvmfs:
[user_name@localhost]$ which dar
/cvmfs/soft.computecanada.ca/nix/var/nix/profiles/16.09/bin/dar
[user_name@localhost]$ dar --version
dar version 2.5.3, Copyright (C) 2002-2052 Denis Corbin
...
If you want a newer version, you can compile it from source (replace 2.6.3 with the latest version number):
[user_name@localhost]$ wget https://sourceforge.net/projects/dar/files/dar/2.6.3/dar-2.6.3.tar.gz
[user_name@localhost]$ tar xvfz dar-*.gz && /bin/rm -f dar-*.gz
[user_name@localhost]$ cd dar-*
[user_name@localhost]$ ./configure --prefix=$HOME/dar --disable-shared
[user_name@localhost]$ make
[user_name@localhost]$ make install-strip
[user_name@localhost]$ $HOME/dar/bin/dar --version
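To call the freshly built binary simply as dar, you can put its directory ahead of the system copy in your PATH. The following is a minimal sketch, assuming the --prefix=$HOME/dar install location used above; the function name use_local_dar is hypothetical and not part of dar.

```shell
# Hypothetical helper (not part of dar): use_local_dar [PREFIX]
# Prepends PREFIX/bin to PATH so that a locally built dar is found first.
# PREFIX defaults to $HOME/dar, the install prefix assumed in the build steps.
use_local_dar() {
    _ud_prefix=${1:-$HOME/dar}
    if [ -x "$_ud_prefix/bin/dar" ]; then
        case ":$PATH:" in
            *":$_ud_prefix/bin:"*) ;;              # already on PATH, nothing to do
            *) PATH="$_ud_prefix/bin:$PATH" ;;
        esac
        export PATH
    else
        echo "use_local_dar: no dar executable under $_ud_prefix/bin" >&2
        return 1
    fi
}
```

After defining the function (for example in your ~/.bashrc), run use_local_dar and then check the result with dar --version.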
Using dar manually
Basic archiving and extracting
Suppose the current directory contains a subdirectory test. To pack it into an archive, run the following from the current directory:
[user_name@localhost]$ dar -w -c all -g test
This will create an archive file all.1.dar, where all is the base name and 1 is the slice number. You can break a single archive into multiple slices (see below). You can include multiple directories and files in an archive, e.g.
[user_name@localhost]$ dar -w -c all -g testDir1 -g testDir2 -g file1 -g file2
Please note that all paths should be relative to the current directory.
To list the archive's contents, use only the base name:
[user_name@localhost]$ dar -l all
To extract a single file into a subdirectory restore, use the base name and the file path:
[user_name@localhost]$ dar -R restore/ -O -w -x all -v -g test/filename
The flag -O tells dar to ignore file ownership. Wrong ownership would be a problem if you are restoring someone else's files and you are not root. However, even if you are restoring your own files, dar will warn you that you are doing this as non-root and ask you to confirm. To disable this warning, use -O. The flag -w disables the warning issued if restore/test already exists.
To extract an entire directory, type:
[user_name@localhost]$ dar -R restore/ -O -w -x all -v -g test
As when creating an archive, you can pass multiple directories and files by using multiple -g flags. Note that dar does not accept Unix wildcards after -g.
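When creating an archive, the files already exist on disk, so one workaround is to let the shell expand the wildcard and then hand each match to dar with its own -g. The sketch below assumes a hypothetical helper name, dar_pack_matching, which is not part of dar. It uses only the -w, -c and -g flags shown above and skips any wildcard that matched nothing.

```shell
# Hypothetical helper (not part of dar): dar_pack_matching BASENAME PATH...
# Call it with unquoted wildcards, e.g.
#     dar_pack_matching texts test/*.txt
# so that the shell expands them before the function runs.
dar_pack_matching() {
    if [ "$#" -lt 2 ]; then
        echo "usage: dar_pack_matching BASENAME PATH..." >&2
        return 2
    fi
    _dm_base=$1
    shift
    _dm_given=$#
    for _dm_path in "$@"; do
        # A wildcard that matched nothing is left as literal text; skip it.
        [ -e "$_dm_path" ] || continue
        # Append "-g PATH" after the original arguments.
        set -- "$@" -g "$_dm_path"
    done
    # Drop the original arguments, keeping only the -g PATH pairs.
    shift "$_dm_given"
    if [ "$#" -eq 0 ]; then
        echo "dar_pack_matching: no existing paths to archive" >&2
        return 1
    fi
    dar -w -c "$_dm_base" "$@"
}
```

This does not help when extracting, because the files to be restored are inside the archive and the shell has nothing to match the wildcard against.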
Incremental backups
You can create differential and incremental backups with dar by passing the base name of the reference archive with -A. For example, let's say on Monday you create a full backup named monday:
[user_name@localhost]$ dar -w -c monday -g test
On Tuesday you modify some of the files and then include only these files in a new, incremental backup named tuesday, using the monday archive as a reference:
[user_name@localhost]$ dar -w -A monday -c tuesday -g test
On Wednesday you modify more files, and at the end of the day you create a new backup named wednesday, now using the tuesday archive as a reference:
[user_name@localhost]$ dar -w -A tuesday -c wednesday -g test
Now you have three files:
[user_name@localhost]$ ls *.dar
monday.1.dar tuesday.1.dar wednesday.1.dar
The file wednesday.1.dar contains only the files that you modified on Wednesday, but not the files from Monday or Tuesday. Therefore, the command
[user_name@localhost]$ dar -R restore -O -x wednesday
will only restore files that were modified on Wednesday. To restore everything, you have to go through all backups in chronological order:
[user_name@localhost]$ dar -R restore -O -w -x monday # restore the full backup
[user_name@localhost]$ dar -R restore -O -w -x tuesday # restore the first incremental backup
[user_name@localhost]$ dar -R restore -O -w -x wednesday # restore the second incremental backup
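The chronological restore can be wrapped in a small loop so that the order is stated once. This is a minimal sketch, assuming a hypothetical helper name, dar_restore_chain (not part of dar), and assuming you list the base names oldest first. It runs the same dar -R ... -O -w -x command as above for each archive. Creating the target directory beforehand is a convenience added by this sketch.

```shell
# Hypothetical helper (not part of dar): dar_restore_chain DIR BASENAME...
# Restores each archive into DIR in the order given, oldest first:
#     dar_restore_chain restore monday tuesday wednesday
dar_restore_chain() {
    if [ "$#" -lt 2 ]; then
        echo "usage: dar_restore_chain DIR BASENAME..." >&2
        return 2
    fi
    _dr_dir=$1
    shift
    # Assumption of this sketch: create the target directory if it is missing.
    mkdir -p "$_dr_dir" || return 1
    for _dr_base in "$@"; do
        echo "restoring $_dr_base into $_dr_dir" >&2
        # Stop at the first failure: later increments build on earlier ones.
        dar -R "$_dr_dir" -O -w -x "$_dr_base" || return 1
    done
}
```

Stopping at the first failed step is a deliberate choice: restoring a later increment on top of an incomplete earlier one would leave the directory in a state that matches none of the backups.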
Limiting the size of each slice
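dar's -s option sets the maximum size of each slice; the size is a number optionally followed by a unit suffix such as k, M or G. The following is a minimal sketch, assuming a hypothetical helper name, dar_pack_sliced (not part of dar). For simplicity it accepts only the suffixes k, M, G and T, and combines -s with the -w, -c and -g flags used earlier.

```shell
# Hypothetical helper (not part of dar): dar_pack_sliced SIZE BASENAME PATH
# Runs: dar -w -s SIZE -c BASENAME -g PATH
# The result is BASENAME.1.dar, BASENAME.2.dar, ... each at most SIZE large.
dar_pack_sliced() {
    if [ "$#" -ne 3 ]; then
        echo "usage: dar_pack_sliced SIZE BASENAME PATH" >&2
        return 2
    fi
    # Accept a plain byte count or a count followed by one of k, M, G, T
    # (a restriction of this sketch, not of dar itself).
    case $1 in
        ''|*[!0-9kMGT]*|[kMGT]*|*[kMGT]?*)
            echo "dar_pack_sliced: bad slice size '$1'" >&2
            return 2 ;;
    esac
    dar -w -s "$1" -c "$2" -g "$3"
}
```

For example, dar_pack_sliced 100M monday test writes slices of at most 100M each. To restore from a sliced archive you still refer to it by its base name only, exactly as for a single-slice archive.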
Using dar via functions
Using dar would be much easier if you did not have to memorize and specify all the flags on the command line. Here we provide several bash functions for easy backup. Please note that these functions make the usual common-sense assumptions: that you are below your quota (so you can write files) and that you have read and write permissions. It is your job to ensure that this is the case, and that dar archived and restored your files correctly before you delete the originals. In other words, please test everything before including these functions in your workflow.