Introduction
The objective of this document is to compare common backup tools for Unix-like systems (Linux, FreeBSD, macOS...), among the most commonly available today.
- The first target we want to address is the ability to copy a directory tree and its files with the best fidelity.
- The second target is the ability to back up and restore a whole system from a minimal environment, without the assistance of an already existing local server (disaster context).
- The third target is the ability to securely keep archived data for the long term. Securely here means having the ability to detect data corruption and limit its impact on the rest of the archive.
Depending on the targets we may need compression and/or ciphering inside the backup, and also, depending on the context (public cloud storage, removable media, ...), have to deal with limited storage space.
Backup software that requires servers already running on the local network (for example Bacula, Amanda, Bareos, UrBackup, Burp...) cannot address our second target, as we would first have to rebuild such a server in case of disaster (from what, then?) in order to be able to restore our system and its data. It is overly complex for the first target and not suitable for the third.
Partition cloning systems (Clonezilla, MondoRescue, Rescuezilla, partclone, dump and the like) are targeted at block copy and as such cannot back up a live system: you have to shut down and boot from a CD/USB key, or run in single-user mode, in order to "back up". This cannot be automated and has a strong impact on the user, who has to interrupt their work during the whole backup operation.
Looking at the remaining backup tools, with or without a Graphical User Interface, most of them rely on one of three backend tools: tar, rsync or dar:
- Software based on dar: gdar, DarGUI, Baras, Darbup, Darbrrd, HUbackup, SaraB...
- Software based on rsync: TimeShift, rsnapshot...
- Software based on tar: BackupPC, Duplicity, fwbackups...
We will thus compare these three tools for the different test families described below.
Test Families
Several aspects are to be considered: completeness of backup and restoration, feature set, robustness against data corruption, and performance.
Benchmark Results
The results presented here are a synthesis of the test logs. This synthesis is in turn summarized one step further in the conclusion of this document.
Completeness of backup and restoration
Software | plain file | symlink | hardlinked files | hardlinked sockets | hardlinked pipes | user | group | perm. | ACL | Extended Attributes | FS Attributes | atime | mtime | ctime | btime | Sparse files | Disk usage optimization |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Dar | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | - | yes(1) | yes | yes |
Rsync | yes | yes | yes | yes | yes | yes | yes | yes | yes(4) | yes(5) | - | - | yes | - | yes(1) | yes(6) | yes(6) |
Tar | yes | yes | yes | - (2) | - | yes | yes | yes | yes(7) | yes(8) | - | - | yes(3) | - | yes(1) | yes(6) | - |
- (1) "Yes" under MACoS X, FreeBSD and BSD systems. As of today (year 2020), Linux has no way to set the btime aka birthtime or yet creation time
- (2) tar does even not save and restore plain normal sockets, but that's not a big issue in fact as Unix sockets should be recreated by the applications that provide the corresponding service
- (3) unless
--xattrs
is provided, mtime is saved by tar but with an accuracy of only 1 second, while today's systems provide nanosecond precision - (4) needs -A option
- (5) needs -X option
- (6) needs -S option
- (7) needs --acl option
- (8) needs --xattrs option
See the test logs for all the details.
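As an illustration of the notes above, here is a minimal sketch of the options involved, assuming GNU tar and rsync/dar builds with ACL and extended attribute support; the paths are placeholders:

```sh
# rsync: archive mode plus hard links (-H), ACLs (-A),
# extended attributes (-X) and sparse file handling (-S)
rsync -aHAXS /source/ /destination/

# GNU tar: ACLs, extended attributes and sparse files must be requested explicitly
tar --create --acls --xattrs --sparse --file /backup/full.tar /source

# dar: most of these attributes are saved by default,
# -R designates the root of the tree to save
dar -c /backup/full -R /source
```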
Feature set
In addition to the completeness of the restored data (seen above), several features are must-haves when creating backups. Their description and what they bring to a backup process are given below, followed by a table showing how they are supported by the different software under test:
- Historization
- Historization is the ability to restore a deleted file even long after the mistake was made, by rotating backups over an arbitrarily large number of backup sets. Having tools to quickly locate the backup where a particular file's version resides becomes important when the history grows. Historization can be done with only full backups, but of course it leverages differential and incremental backups much better.
- Data filtering
- Not all files need to be saved (see the filtering sketch after this feature list):
  - some directories (like /tmp, /proc, /sys, /dev, /home/*/.cache) are useless to save
  - some files need not be saved either, based on their name or part of their name, their extension for example (like emacs's backup files *~, or your music files *.mp3 that you already have archived somewhere, and so on)
  - you may wish to ignore files located on one or more particular mounted filesystems, or at the opposite only consider certain volumes/disks/mounted filesystems and ignore all others, and have different backup rotation cycles for those
  - you may also find it better to tag files one by one (manually or by means of an automated process of your own) to be excluded from or included in the backup
  - instead of tagging, you could also let a process build a listing of files to back up and/or to ignore
  - last, you may well need a mix of several of these mechanisms at the same time
- Slicing (or multi-volume)
- Having a backup split into several files of a given maximum size can address several needs:
  - hold the backup on several removable media (CD, DVD, USB keys...) smaller than the backup itself
  - transfer the backup from a large storage space to another by means of a smaller removable medium
  - transfer the backup over the network and resume at the last transmitted slice, rather than restarting the whole transfer, in case of a network issue
  - store the backup in the cloud when the provider limits the file size
  - be able to restore a backup on a system whose storage space cannot hold both the backup and the restored system
  - transfer back from the cloud only the few slices needed to restore some files, when the cloud provider does not provide ad hoc protocols (sftp, ftp, ...) but only a web-based user interface

  Last, since the previously identified use cases for backup slicing revolve around limited storage space, having compression available when multi-volume is used is a key point here.
- Symmetric strong encryption
- Symmetric strong encryption is the ability to cipher a backup with a password or passphrase and use that same key to decipher it. Some well-known algorithms in this area are AES, Blowfish, Camellia... Symmetric strong encryption is interesting in the following cases:
  - if your disk is ciphered, would you store your backup in clear on the cloud?
  - you do not trust your cloud provider not to inspect your data and build a marketing profile of you with it
  - you want to prevent your patented data or industrial secret recipes from falling into the hands of competitors or of government agencies that could clone them without fear of being prosecuted; this use case applies whether your backup is stored on local disk, removable media or public cloud
  - simply because, in your country, you have the right and the freedom to have privacy
  - because your currently democratic country could tomorrow turn into a dictatorship and, based on some arbitrary criteria (belief, political opinion, sexual orientation...), you could then suffer from this information having been accessible to the authorities, or even publicly released, while you still need backups on arbitrary storage media
- Asymmetric strong encryption
- Asymmetric strong encryption is the ability to cipher a backup with a public key and use the corresponding private key to decipher it (PGP, GnuPG...). Asymmetric encryption is mainly interesting when exchanging data over the Internet between different persons, or possibly for archiving data in the public cloud. Having it for backups seems less appropriate and is more complex than symmetric strong encryption, as restoration requires the private key, which must therefore be stored outside the backup itself while still being protected from unauthorized access. The use of the private key can itself be protected with a password or passphrase, but this gives the same feature level as symmetric encryption with a more complex process and not much more security.
- Protection against plain-text attack
- Ciphering data must be done with a minimum level of security, in particular when the ciphered data has a well-defined structure and patterns, as a backup file format is expected to have. Knowing the expected structure of the clear data may allow an attacker to recover the whole ciphered data. This is known as a plain-text attack.
- Key derivation function
- Using the same password/passphrase for different backups is convenient but not secure. A key derivation function using a salt lets you reuse the same password/passphrase while the data is encrypted with a different key each time: this is the role of the Key Derivation Function (KDF) (PKCS5/PBKDF2, Argon2...).
- Another need for a KDF is that human-provided passwords/passphrases are usually weak: even when we use letters, digits and some special characters, passwords and passphrases are still located in a small area of the possible key space that a dictionary attack can leverage. As the KDF is by design CPU intensive, it costs an attacker a lot of effort and time to derive each word of a dictionary into its resulting KDF-transformed key. The time required to perform a dictionary attack can thus be multiplied by several hundred thousand, leading to an effective time of tens of years or even centuries rather than hours or days.
- File change detection
- When backing up a live system, it is important to detect, retry saving, or at least flag files that changed while they were being read for backup. In such a situation, the saved file could be recorded in a state it never had: as the backup process reads the file sequentially from beginning to end, if a modification A is made at the end of the file and then a modification B is made at its beginning during this file's backup, the backup may contain B but not A, while at no time did the file contain B without A. Given the short time needed to read a file, a time accuracy of micro- or nanoseconds is required to detect such a change during the backup; otherwise you end up with corrupted data in the backup and nothing to rely on in case of a file deleted by mistake, a disk crash or a disaster. At restoration time, if the file has been saved anyway, it is good to know that it was not saved properly: restoring an older but sane version may be better, something the user/sysadmin cannot decide if the backup does not hold that kind of information.
- Multi-level backup
- Multi-level backup is the ability to make use of full backups, differential backups and possibly incremental backups. The advantage of differential and incremental backups compared to full ones is the much shorter time they take to complete and the reduced storage space and/or bandwidth they require when transferred over the network.
- Binary delta
- Without binary delta, when performing a differential or incremental backup, a file that has changed since the previous backup is resaved entirely. Some huge files produced by well-known applications (mailboxes for example) would consume a lot of storage space and lead to long backup times even when performing incremental or differential backups. Binary delta is the ability to store only the part of a file that changed since a reference state, which leads to important space savings and a reduction of the backup duration.
- Detecting suspicious modifications
- When performing a backup based on a previous one (differential, incremental or decremental backups), it is possible to check the way the metadata of saved files has changed since then and warn the user when some uncommon patterns are met. These may be the trace of a rootkit, virus, ransomware or trojan trying to hide its presence and activities.
- Snapshot
- A snapshot is like a differential backup made right after the full backup (when no file has changed): it is a minimal set of information that can be used to:
  - create an incremental or differential backup without having the full backup, or more generally the backup of reference, around: when backups are stored remotely, a snapshot is a must
  - compare the current live filesystem with the state it had at the time the snapshot was made
  - bring some metadata redundancy and a repair mechanism in case of a corrupted backup
- On-fly hashing
- On-fly hashing is the ability to generate a hash of the backup at the same time it is generated, before it is written to storage. Such a hash can be used to:
  - validate that a backup has been properly transferred to public cloud storage, the hash computation being done in parallel with the backup generation
  - check that no data corruption has occurred (doubt about disk or memory) even when the backup is written to a local disk
- Run custom command during operation
- For an automated backup process, it is often necessary to run commands before and after the backup operation itself, but also during the backup process. For example, when entering a directory, one could need to run an arbitrary command generating a file that will be included in the backup, or, when leaving that directory, perform some cleanup in it. Another use case arises when slicing the backup: the ability to perform a custom operation after each slice is generated, like uploading the slice to the cloud, burning it to DVD-/+RW, loading a tape from a tape library...
- Dry-run execution
- When tuning a backup process, it is often necessary to quickly verify that everything will work flawlessly, without having to wait for a backup to complete and consume storage resources and network bandwidth.
- User message within backup
- Allowing the user to add an arbitrary message within the backup may be useful when the filename is too short to hold the needed information (like the context in which the backup or archive was made, a hint for the passphrase, and so on).
- Backup sanity test
- It is crucial in a backup process to validate that the generated backup is usable. There are many reasons why it might not be, from data corruption in memory, on disk or over the network, to disk space saturation leading to a truncated backup, down to a software bug.
- Comparing with original data
- One step further for backup and archive validation is comparing file contents and metadata with what the live system holds.
- Tunable verbosity
- When a backup process is in production and works nicely, it is usually desirable to have the most minimal output possible while still being able to log any error. On the other hand, when setting up a backup process, more detailed information is required to understand and validate that the backup process follows the expected path.
- Modify the backup's content
- Once a backup has been completed, you might notice that you have saved extra files you ought not to have saved. Being able to drop them from the backup to reclaim some space, without having to restart the whole backup, may save a huge amount of time. You might also need to add some extra files that were outside the backup scope; being able to add them without restarting the whole backup process may also save a huge amount of time.
- Stdin/stdout backup read/write
- Having the ability to pipe the generated backup to an arbitrary command is one of the ultimate keys to backup software flexibility.
- Remote network storage
- This is the ability to write a backup directly to network storage without using a local disk, and to restore by reading the backup directly from that remote storage, still without using local storage. Network/remote storage is to be understood as remote storage (public cloud, private cloud, personal NAS...) accessible over the network by means of a file transfer protocol (scp, sftp, ftp, rcp, http, https...).
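To make the data filtering feature more concrete, here is a minimal sketch of directory and filename exclusion with the three tools; the masks and paths are illustrative only:

```sh
# tar: exclude cache directories and file patterns at creation time
tar --create --file /backup/home.tar \
    --exclude='.cache' --exclude='*~' --exclude='*.mp3' /home

# rsync: same idea with exclude patterns (a --files-from list is also possible)
rsync -a --exclude='.cache/' --exclude='*~' --exclude='*.mp3' /home/ /destination/

# dar: -P prunes a sub-directory (relative to the -R root),
# -X excludes a filename mask
dar -c /backup/home -R /home -P 'joe/.cache' -X '*~' -X '*.mp3'
```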
Feature | Dar | Rsync | Tar |
---|---|---|---|
Historization | Yes | - | Yes |
Data filtering by directory | Yes | Yes | Yes |
Data filtering by filename | Yes | Yes | limited |
Data filtering by filesystem | Yes | limited | limited |
Data filtering by tag | limited | - | - |
Data filtering by files listing | Yes | Yes | limited |
Slicing/multi-volume | Yes | - | limited |
Symmetric encryption | Yes | - | Yes |
Asymmetric encryption | Yes | - | Yes |
Plain-text attack protection | Yes | - | - |
PBKDF2 Key Derivation Function | Yes | - | - |
ARGON2 Key Derivation Function | Yes | - | - |
File change detection | Yes | - | limited |
Multi-level backup | Yes | - | Yes |
Binary delta | Yes | Yes | - |
Detecting suspicious modifications | Yes | - | - |
Snapshot for diff/incr. backup | Yes | - | Yes |
Snapshot for comparing | Yes | - | - |
Snapshot for redundancy | Yes | - | - |
On-fly hashing | Yes | - | - |
Run custom command during operation | Yes | - | limited |
Dry-run execution | Yes | Yes | - |
User message within backup | Yes | - | - |
Backup sanity test | Yes | - | Yes |
Comparing with original data | Yes | - | Yes |
Tunable verbosity | Yes | Yes | limited |
Modify the backup's content | Yes | Yes | limited |
Stdin/stdout backup read/write | Yes | - | Yes |
Remote network storage | Yes | limited | Yes |
The results presented above are a synthesis of the test logs.
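As an illustration of how several of the features above combine in practice, here is a hedged dar sketch (assuming a recent dar release; the slice size, compression level, hash algorithm and upload command are arbitrary examples, and %p, %b, %n and %e are dar's macros for the slice path, basename, number and extension):

```sh
# Create a gzip-compressed backup cut into 1 GiB slices, compute a SHA-512
# hash of each slice on the fly, and run a command after each slice is completed
dar -c /backup/full -R /source -zgzip:6 -s 1G --hash sha512 \
    -E "scp %p/%b.%n.%e user@nas:/backups/"
```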
Robustness
The objective here is to see how a minor data corruption impacts the backup. This type of corruption (a single bit inversion) can be caused by a network transfer, a cosmic particle hitting a memory bank, or simply by time passing while the data sits on a particular medium. In real life, data corruption may well impact more than one bit. But while the ability to work around a single corrupted bit says nothing about the ability to recover from larger data corruption, the inability to recover from a single corrupted bit is enough to know that the same software will behave even worse when a larger portion of corrupted data is met.
Behavior | Dar | Rsync | Tar alone | Tar + gzip |
---|---|---|---|---|
Detects backup corruption | Yes | - | - | Yes |
Warn or avoid restoring corrupted data | Yes | - | - | Yes |
Able to restore all files not concerned by the corruption | Yes | Yes | Yes | - |
To protect your data, you can go one step further by computing redundancy data with Parchive on top of your backups or archives. This will allow you to repair them in case of corruption.
- First, rsync is not well adapted to that process, as creating global redundancy data over a directory tree is much more complex and error-prone. On the other hand, tar and dar are well suited, as a backup is a single file, or a few big files when using slicing or multi-volume backups.
- Second, whatever redundancy level you select, if the data corruption exceeds that level you will not be able to repair your backups and archives. It is thus better to also rely on a robust and redundant backup file structure, and here dar has some big advantages.
- Last, if execution time is important to you, having a sliced backup with a slice size smaller than the available RAM and running Parchive right after each slice is created will save a lot of disk I/O and can speed up the overall process by more than 40%. Here too, only dar provides this possibility (see the sketch below).
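As a sketch of this approach (assuming the par2 command-line tool; the 10% redundancy level and file names are arbitrary):

```sh
# Create 10% redundancy data for an existing backup slice, verify and repair it later
par2 create -r10 /backup/full.1.dar
par2 verify /backup/full.1.dar.par2
par2 repair /backup/full.1.dar.par2

# With dar, the same command can be run automatically right after each slice
# is written, while the slice is most likely still in the filesystem cache
dar -c /backup/full -R /source -s 1G -E "par2 create -r10 %p/%b.%n.%e"
```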
The results presented above are a synthesis of the test logs.
Performance
In the following, we distinguish two purposes of backup tools: the "identical" copy of a set of files and directories (a short-term operation) and the usual backup operation (long-term storage and historization).
Performance of file copy operation
The only performance aspect to consider for this target is execution speed; this may imply data reduction on the wire, but only if the bandwidth is low enough that the added compression time does not ruin the gain in transfer time. Compression time does not depend on the backup tool but on the data, and we will see in the backup performance tests how the different backup tools reduce data on the wire. For the execution time we get the following results:
Single huge file
The copied data was a Linux distro installation ISO file
Linux system
The copied data was a fresh fully featured Linux installed system
Conclusion
For local copy, cp is the fastest but totally unusable for remote copy. At first sight one could think tar would be the best alternative for remote copy, but that would not take into account the fact that you will probably want to use a secured connection (unless all segments of the underlying network are physically yours, end to end). Thus, once the backup has been generated, using tar requires an extra user operation, extra computing time to cipher/decipher and extra time to transfer the data, while both alternatives, rsync and dar, have this integrated: they can copy and transfer at the same time, with both a gain of time and no added operations for the user.
Consequently, for a unique/single remote copy, dar will be faster than rsync most of the time (even when using compression to cope with low bandwidth; see the backup test results below). For recurring remote copies, even if rsync is not faster than dar, it has the advantage of being designed especially for this task, as in that context we do not need to store the data compressed nor ciphered. This can be summarized as follows:
Operation | Best Choice | Alternative |
---|---|---|
Local copy | cp | tar |
One-time remote copy | dar | rsync |
Recurrent remote copy | rsync | dar |
See the corresponding test logs for more details.
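For reference, a minimal sketch of the remote copy alternatives discussed above (host names and paths are placeholders; the dar form assumes a build with sftp support, available since release 2.6.0):

```sh
# Recurrent remote copy: rsync only transfers what changed since the last run
rsync -aHAX --delete /source/ user@nas:/copy/

# One-time remote copy: dar can write its backup directly to an sftp server
dar -c sftp://user@nas/copy/one-shot -R /source -zgzip:6

# tar has no network protocol embedded: it has to be piped through ssh
tar --create --file - /source | ssh user@nas "cat > /copy/one-shot.tar"
```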
Performance of backup operation
For backup we consider the following criteria, in order of importance:
- data reduction on backup storage
- data reduction when transmitted over the network
- execution time to restore a few files
- execution time to restore full and differential backups
- execution time to create full and differential backups
Why this order?
- Because backup creation is usually done at low priority, in the background and on a day-to-day basis, execution time is less important than reducing storage usage: lower storage usage allows a longer backup history and increases the ability to recover accidentally deleted files long after the mistake was made (which may be detected weeks or months afterward).
- Next, while your backup storage can be anything, from low-cost to high-end dedicated storage, backups are more and more frequently externalized, most often to public cloud storage, which provides a relatively cheap disaster recovery solution. However, your WAN/Internet access will be drained by the backup volumes flying away, and you probably do not want them to consume so much of this bandwidth that your business or Internet access slows down. As a workaround, one could rate-limit the bandwidth for backup transfers only. But doing so extends the backup transfer time, possibly so much that you may have to reduce the backup frequency in order not to have two backups transferred at the same time. This would make you lose accuracy on the saved data: a too low backup frequency will only allow you to restore your systems in the state they had several days, instead of several hours or a few tens of minutes, before the disaster occurred. For that reason data reduction on the wire is the second criterion. Note that data reduction on storage usually implies data reduction on the wire, but the opposite is not always true, depending on the backup tool used.
- Next, it is much more frequent to have to restore a few files (corrupted or deleted by mistake), and we need this to be quick because it is an interactive operation and the missing data may be required for someone's work to go forward, which may in turn impact several other persons' workflow.
- The least frequent operation (hopefully) is the restoration of a whole system after a disaster. Having it perform quickly is of course important, but less so than having a complete, robust, accurate and recent backup somewhere that you can count on to restore your systems to the most recent possible state.
Note that the following results do not take into account the performance penalty implied by network latency, for several reasons:
- it would not measure the software's performance but the network bandwidth and latency, which are not the object of this benchmark and vary with distance, link-layer technology and the number of devices crossed,
- we can assume the network penalty to be proportional to the amount of data processed by each software, as all the protocols used are usually TCP based (ftp, sftp, scp, ssh, ...) and their performance is related to operating system parameters (window size, MTU, etc.), not to the backup software itself. As we only rely on tmpfs filesystems for this benchmark, to avoid measuring disk I/O performance, we may approximate that an increase in network latency or a reduction of network bandwidth would just inflate the relative execution times of the different tested tools in a linear manner. In other words, adding network between the system and the backup storage should not modify the relative performance of the software under test.
For all the backup performance tests that follow (but not for the file copy performance tests seen above), compression has been activated using the same, most commonly supported algorithm: gzip at level 6. Other algorithms may complete faster or provide a better compression ratio, but this depends on the chosen compression algorithm and the data to compress, not on the backup tools tested here.
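For the record, the gzip level 6 setting translates roughly to the following commands; this is an illustrative sketch, not the exact benchmark invocations:

```sh
# dar: gzip compression at level 6
dar -c /backup/full -R /source -zgzip:6

# GNU tar: pipe through gzip to control the compression level explicitly
tar --create --file - /source | gzip -6 > /backup/full.tar.gz

# rsync: compression only applies to the data sent over the wire
rsync -a --compress --compress-level=6 /source/ user@nas:/backup/
```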
Data reduction on backup storage
Full backup
Differential backup
Full + Differential backup
This is an extrapolation of the volume required for backups after one week of daily backups of the Linux system under test, assuming the daily activity is as minimal as it was here between the initial day of the full backup and the day of the first differential backup (a few package upgrades and no user activity).
The previous results concern the backup of a steady Linux system; the relative difference in data reduction might favor both rsync and dar+binary delta when the proportion of large files being slightly modified increases (like mailbox files).
Data reduction over network
Full backup
Differential backup
Full + Differential backup
This is the same extrapolation as above (one week of daily backups), but for the volume of data transmitted over the network instead of the backup volume on storage.
Execution time to restore a few files
Here the phenomenon is even more pronounced when the file to restore is located near the end of the tar backup, as tar sequentially reads the whole backup up to the requested file.
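For illustration, restoring a single file typically looks as follows (paths are placeholders); dar can jump directly to the saved data thanks to its catalogue, while tar reads the archive sequentially:

```sh
# dar: restore only the requested file under /restore
dar -x /backup/full -R /restore -g etc/fstab

# GNU tar: the archive is read from the beginning until the member is found
tar --extract --file /backup/full.tar -C /restore etc/fstab
```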
Execution time to restore a whole system - full backup
Execution time to restore a single differential backup
Execution time to restore a whole system - full + differential backup
We use here the same extrapolation of a week of daily backups as above: the first backup is a full backup, and differential/incremental backups are made on the following days.
Clarifying the terms used: a differential backup saves only what has changed since the full backup was made. The consequence is that each day the backup to process gets slightly bigger, depending on the way data changes: if the same files change every day (like mailboxes, user files, ...), each new differential backup will have the same size and take the same processing time to complete. On the contrary, if new data is added each day, the differential backup size will each day be the sum of the incremental backups that could have been made instead since the full backup.
Unlike the differential backup, the incremental backup saves only what has changed since the last backup (full or incremental). For constant activity, like the steady Linux system we used here, the incremental backup size should stay the same over time (and be equivalent to the size of the first differential backup), so the extrapolation is easy and not questionable: the restoration time is the time to restore the full backup plus the time to restore the first differential backup multiplied by the number of days that have passed.
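To make the distinction concrete, here is a hedged sketch of how such multi-level backups are typically produced with dar (-A designates the backup of reference) and with GNU tar's listed-incremental mode; all names are illustrative:

```sh
# dar: full backup, then a differential against the full,
# then an incremental against the previous backup
dar -c /backup/full  -R /source -zgzip:6
dar -c /backup/diff1 -R /source -zgzip:6 -A /backup/full
dar -c /backup/incr2 -R /source -zgzip:6 -A /backup/diff1

# GNU tar: the snapshot file plays the role of the backup of reference
tar --create --listed-incremental=/backup/state.snar --file /backup/full.tar  /source
tar --create --listed-incremental=/backup/state.snar --file /backup/incr1.tar /source
```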
Execution time to restore a whole system - lower bound
The lower bound is the sum of the execution times of the restoration of the full backup and of one differential backup, seen just above. It corresponds to the minimum execution time for restoring a whole system from a full+differential backup.
Execution time to restore a whole system - higher bound
The higher bound is the execution time of the restoration of the full backup plus seven times the execution time of the restoration of the differential backup. It corresponds to the worst-case scenario where new data is added each day (still using a steady Linux system with constant activity). It also corresponds to the scenario of restoring a whole system from full+incremental backups (7 incremental backups have to be restored in that one-week scenario):
Execution time to create a backup
Ciphering/deciphering performance
There are several reasons that imply the need to cipher data:
- if your disk is ciphered, would you store your backup in clear on the cloud?
- do you trust your cloud provider not to inspect your data for marketing profiling?
- are you sure your patented data and secret industrial recipes will not be used by the competition?
- and so on
The ciphering execution time is independent of the nature of the backup: full or differential, compressed or not. To evaluate the ciphering performance we will use the same data sets as previously, both compressed and uncompressed. However, not all software under test is able to cipher the resulting backup: rsync cannot do so.
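As an illustration (not the exact benchmark commands), symmetric ciphering can be requested as follows; the dar form assumes AES with an interactively prompted passphrase, and the tar pipeline assumes OpenSSL 1.1.1 or later for the -pbkdf2 flag:

```sh
# dar: AES symmetric encryption, the passphrase is asked interactively
dar -c /backup/full -R /source -zgzip:6 -K aes:

# tar has no embedded ciphering: the archive must be piped through an external tool
tar --create --file - /source | gzip -6 | \
    openssl enc -aes-256-cbc -pbkdf2 -out /backup/full.tar.gz.enc
```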
Full backup+restoration execution time
Execution time for the restoration of a single file
Storage requirement ciphered without compression
See the corresponding test logs for more details.
Conclusion
So far we have measured different performance aspects, evaluated the available features, tested backup robustness and observed backup completeness for the different backup software under test. This gives a lot of information, already summarized above, but it would still not be of great use to anyone reading this document (especially those jumping straight to its conclusion ;^) ), so we have to get back to the use cases and their respective requirements to obtain the essential oil drop anyone can use immediately:
Criteria for the different use cases
Use Cases | Key Point | Optional interesting features |
---|---|---|
Local directory copy | | |
Remote directory copy - wide network | | |
Remote directory copy - narrow network | | |
Full backups only | | |
Full+diff/incr. backup | | |
Archiving of private data | | |
Archiving of public data | | |
Private data exchange over Internet | | |
Public data exchange over Internet | | |
Complementary criteria depending on the storage type
Depending on the target storage, the following criteria come on top:
Storage type | Key Point | Optional interesting features |
---|---|---|
Data stored on local disk | | |
Data stored on private NAS | | |
Data stored on public cloud | | |
Data stored on removable media (incl. tapes) | | |
Essential oil drop
In summary, putting in front of these requirements the different measurements we made:
- completeness of the backed-up data
- available features around backup
- backup robustness in the face of media corruption
- overall performance

we can identify the best software for each particular use case:
- Local directory copy
  - Local disk storage: cp; dar (not the fastest); rsync (not the fastest); tar (not the fastest)
  - Private NAS: -
  - Public Cloud: -
  - Removable media: -
- One time remote directory copy
  - Local disk storage: -
  - Private NAS: dar; rsync (not the fastest); tar (no network protocol embedded)
  - Public Cloud: dar; rsync (not the fastest); tar (no network protocol embedded)
  - Removable media: dar; rsync (not the fastest); tar (no network protocol embedded)
- Recurrent remote directory copy
  - Local disk storage: -
  - Private NAS: dar (fastest, but automation is a bit less straightforward than with rsync); rsync; tar (no network protocol embedded)
  - Public Cloud: dar (fastest, but automation is a bit less straightforward than with rsync); rsync; tar (no network protocol embedded)
  - Removable media: dar (fastest, but automation is a bit less straightforward than with rsync); rsync; tar (no network protocol embedded)
- Full backups only (private data)
  - Local disk storage: dar (has the advantage of providing long historization of backups); rsync (no data reduction on storage, slow to restore a whole filesystem); tar (not saving all file attributes and inode types, slow to restore a few files)
  - Private NAS: dar; rsync (no data reduction on storage); tar (not saving all file attributes and inode types, slow to restore a few files, no network protocol embedded)
  - Public Cloud: dar; rsync (no data ciphering and no data reduction on storage); tar (no embedded ciphering, not the strongest data encryption, not saving all file attributes and inode types, slow to restore a few files, no network protocol embedded)
  - Removable media: dar; rsync (no multi-volume support, no data ciphering and no data reduction on storage); tar (compression and multi-volume are not supported at the same time, not saving all file attributes and inode types, no embedded ciphering, not the strongest data encryption)
- Full+diff/incr. backups (private data)
  - Local disk storage: dar; rsync (differential backup not supported, the full backup is overwritten); tar (not saving all file attributes and inode types, slow to restore a few files)
  - Private NAS: dar; rsync (differential backup not supported, the full backup is overwritten); tar (not saving all file attributes and inode types, slow to restore a few files, no network protocol embedded)
  - Public Cloud: dar; rsync (differential backup not supported, the full backup is overwritten); tar (no embedded ciphering, not the strongest data encryption, not saving all file attributes and inode types, slow to restore a few files, no network protocol embedded)
  - Removable media: dar; rsync (differential backup not supported, the full backup is overwritten, no multi-volume support, no data reduction, no ciphering); tar (compression and multi-volume are not supported at the same time, not saving all file attributes and inode types, no embedded ciphering, not the strongest data encryption)
- Archiving of private data
  - Local disk storage: dar; rsync (no data reduction on storage, no detection of data corruption, complex parity data addition); tar (no detection of data corruption, or loss of all data after the first corruption met)
  - Private NAS: dar; rsync (no data reduction, no detection of data corruption, complex parity data addition); tar (no detection of data corruption, or loss of all data after the first corruption met)
  - Public Cloud: dar; rsync (no ciphering, no data reduction, no detection of data corruption, complex parity data addition); tar (no detection of data corruption, or loss of all data after the first corruption met, no embedded ciphering, no protection against plain-text attack)
  - Removable media: dar; rsync (no data reduction, no multi-volume, no ciphering, no detection of data corruption, complex parity data addition); tar (compression and multi-volume are not supported at the same time, no detection of data corruption, or loss of all data after the first corruption met, no ciphering)
- Archiving of public data
  - Local disk storage: dar (most robust format but not as standard as tar's); rsync (no data reduction on storage); tar
  - Private NAS: dar (most robust archive format but not as standard as tar's); rsync (no data reduction on storage, complicated to download a directory tree and files with other protocols than rsync); tar
  - Public Cloud: dar (most robust archive format but not as standard as tar's); rsync (no data reduction on storage, complicated to download a directory tree and files with other protocols than rsync); tar
  - Removable media: dar; rsync (no data reduction on storage, no multi-volume, no detection of data corruption, complex parity data addition); tar (compression and multi-volume are not supported at the same time)
- Private data exchange over Internet
  - Local disk storage: dar; rsync (not the best data reduction over the network); tar (best data reduction over the network but no embedded ciphering, no integrated network protocols)
  - Private NAS: dar; rsync (no data reduction on storage, not the best data reduction over the network); tar (best data reduction over the network but lack of embedded ciphering and of integrated network protocols)
  - Public Cloud: dar; rsync (no ciphering and no data reduction on storage); tar (no embedded ciphering, no integrated network protocols, no protection against plain-text attack, only old KDF functions supported, complex and error-prone use of openssl to cipher the archive)
  - Removable media: -
- Public data exchange over Internet
  - Local disk storage: dar (not the best data reduction over the network); rsync (not the best data reduction over the network); tar
  - Private NAS: dar (not the best data reduction over the network); rsync (no data reduction on storage, not the best data reduction over the network); tar
  - Public Cloud: dar (not the best data reduction over the network); rsync (no data reduction on storage, not the best data reduction over the network); tar
  - Removable media: -
For each use case and storage type above, the different software is listed in alphabetical order; the note in parentheses after a software name gives the reason it was not selected as the best solution for that particular need.