out of date, confusing
- Ars Technica Forums: The state of opensource backups
- Using a USB external hard disk for backups with Linux
"Delta based incrementals make sense for tape drives. You run a full backup once, then incremental deltas for every day. When enough time has passed since the full backup, you do a new full backup, and then future incrementals are based on that. Repeat forever."
Versioning with rsync
- The rsync algorithm - an algorithm for updating a file on one machine to be identical to a file on another machine. We assume that the two machines are connected by a low-bandwidth high-latency bi-directional communications link. The algorithm identifies parts of the source file which are identical to some part of the destination file, and only sends those parts which cannot be matched in this way. Effectively, the algorithm computes a set of differences without having both files on the same machine. The algorithm works best when the files are similar, but will also function correctly and reasonably efficiently when the files are quite different.
- rsnapshot - Local filesystem snapshots are handled with rsync. Secure remote connections are handled with rsync over ssh, while anonymous rsync connections simply use an rsync server. Both remote and local transfers depend on rsync. rsnapshot saves much more disk space than you might imagine. The amount of space required is roughly the size of one full backup, plus a copy of each additional file that is changed. rsnapshot makes extensive use of hard links, so if the file doesn't change, the next snapshot is simply a hard link to the exact same file.
- Grsync - an rsync GUI (Graphical User Interface). Rsync is the well-known and powerful command line directory and file synchronization tool. Grsync makes use of the GTK libraries and is released under the GPL license, so it is opensource. It doesn't need the gnome libraries to run, but can of course run under gnome, kde or unity pretty fine. It can be effectively used to synchronize local directories and it supports remote targets as well (even though it doesn't support browsing the remote folder). Sample uses of grsync include: synchronize a music collection with removable devices, backup personal files to a networked drive, replication of a partition to another one, mirroring of files, etc.
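The block-matching idea in the rsync algorithm entry above can be sketched in a few lines of Python. This is an illustrative toy, not rsync's code: the 4-byte block size, the function names, and the use of SHA-256 as the strong hash are assumptions made for the example. The receiver publishes a weak rolling checksum plus a strong hash for each block of its old file. The sender slides a window over the new file and emits either a reference to a matching block or literal bytes.

```python
import hashlib

BLOCK = 4          # tiny block size so the demo is easy to follow (assumption)
MOD = 1 << 16      # both halves of the weak checksum are kept modulo 2^16


def weak_sum(block):
    """Weak checksum of one block, returned as the pair (a, b)."""
    n = len(block)
    a = sum(block) % MOD
    b = sum((n - i) * x for i, x in enumerate(block)) % MOD
    return a, b


def roll(a, b, out_byte, in_byte, n):
    """Slide the window one byte to the right without rescanning it."""
    a = (a - out_byte + in_byte) % MOD
    b = (b - n * out_byte + a) % MOD
    return a, b


def signatures(old):
    """Receiver side: index every full block of the old file."""
    table = {}
    for start in range(0, len(old) - BLOCK + 1, BLOCK):
        block = old[start:start + BLOCK]
        strong = hashlib.sha256(block).digest()
        table.setdefault(weak_sum(block), []).append((strong, start // BLOCK))
    return table


def delta(new, table):
    """Sender side: describe `new` as block copies plus literal bytes."""
    ops, literal, i, state = [], bytearray(), 0, None
    while i + BLOCK <= len(new):
        window = new[i:i + BLOCK]
        if state is None:
            state = weak_sum(window)
        match = None
        for strong, index in table.get(state, ()):
            # The weak checksum can collide, so confirm with the strong hash.
            if hashlib.sha256(window).digest() == strong:
                match = index
                break
        if match is not None:
            if literal:
                ops.append(("data", bytes(literal)))
                literal = bytearray()
            ops.append(("copy", match))
            i += BLOCK
            state = None
        else:
            literal.append(new[i])
            if i + BLOCK < len(new):
                state = roll(state[0], state[1], new[i], new[i + BLOCK], BLOCK)
            i += 1
    literal.extend(new[i:])
    if literal:
        ops.append(("data", bytes(literal)))
    return ops


def patch(old, ops):
    """Receiver side: rebuild the new file from the old file and the delta."""
    out = bytearray()
    for kind, arg in ops:
        out += old[arg * BLOCK:(arg + 1) * BLOCK] if kind == "copy" else arg
    return bytes(out)
```

For example, `patch(old, delta(new, signatures(old)))` returns `new`, while only the `"data"` entries of the delta would have to cross the network. Real rsync uses much larger blocks and also handles the trailing partial block, compression, and file metadata.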
Arno's SmartBackup Script
- Arno's SmartBackup Script - 'intelligent' version of rsync
- luckyBackup is an application that backs up and/or synchronizes any directories with the power of rsync. It is simple to use, fast (transfers only the changes made, not all data), safe (checks all declared directories before proceeding with any data manipulation), reliable and fully customizable.
- zsync is a file transfer program. It allows you to download a file from a remote server, where you have a copy of an older version of the file on your computer already. zsync downloads only the new parts of the file. It uses the same algorithm as rsync. However, where rsync is designed for synchronising data from one computer to another within an organisation, zsync is designed for file distribution, with one file on a server to be distributed to thousands of downloaders. zsync requires no special server software — just a web server to host the files — and imposes no extra load on the server, making it ideal for large scale file distribution.
- https://github.com/ryt/psync Psync (inspired by grsync) makes it easy to use rsync with multiple apps/sites.
- Rclone - a command-line program to manage files on cloud storage. It is a feature-rich alternative to cloud vendors' web storage interfaces. Over 70 cloud storage products support rclone including S3 object stores, business & consumer file storage services, as well as standard transfer protocols. Rclone has powerful cloud equivalents to the unix commands rsync, cp, mv, mount, ls, ncdu, tree, rm, and cat. Rclone's familiar syntax includes shell pipeline support and --dry-run protection. It is used at the command line, in scripts or via its API. Users call rclone "The Swiss army knife of cloud storage" and "Technology indistinguishable from magic". Rclone really looks after your data. It preserves timestamps and verifies checksums at all times. Transfers over limited bandwidth, over intermittent connections, or subject to quota can be restarted from the last good file transferred. You can check the integrity of your files. Where possible, rclone employs server-side transfers to minimise local bandwidth use and transfers from one provider to another without using local disk. Virtual backends wrap local and cloud file systems to apply encryption, compression, chunking, hashing and joining. Rclone mounts any local, cloud or virtual filesystem as a disk on Windows, macOS, Linux and FreeBSD, and also serves these over SFTP, HTTP, WebDAV, FTP and DLNA. Rclone is mature, open-source software originally inspired by rsync and written in Go. The friendly support community is familiar with varied use cases. Official Ubuntu, Debian, Fedora, Brew and Chocolatey repos include rclone. For the latest version, downloading from rclone.org is recommended. Rclone is widely used on Linux, Windows and Mac. Third-party developers create innovative backup, restore, GUI and business process solutions using the rclone command line or API.
- https://github.com/rclone/rclone - "rsync for cloud storage" - Google Drive, S3, Dropbox, Backblaze B2, One Drive, Swift, Hubic, Wasabi, Google Cloud Storage, Yandex Files
- https://en.wikipedia.org/wiki/Rclone - an open source, multi threaded, command line computer program to manage or migrate content on cloud and other high latency storage. Its capabilities include sync, transfer, crypt, cache, union, compress and mount. The rclone website lists supported backends including S3 and Google Drive.
- synk - a multi-host synchronisation tool backed by rsync(1) and ssh(1).
- rdiff-backup backs up one directory to another, possibly over a network. The target directory ends up a copy of the source directory, but extra reverse diffs are stored in a special subdirectory of that target directory, so you can still recover files lost some time ago. The idea is to combine the best features of a mirror and an incremental backup. rdiff-backup also preserves subdirectories, hard links, dev files, permissions, uid/gid ownership, modification times, extended attributes, ACLs, and resource forks. Also, rdiff-backup can operate in a bandwidth-efficient manner over a pipe, like rsync. Thus you can use rdiff-backup and ssh to securely back a hard drive up to a remote location, and only the differences will be transmitted. Finally, rdiff-backup is easy to use and its settings have sensible defaults.
- man, readme, examples, wiki
- rsync and rdiff-backup do not share any code, but rdiff-backup uses the rsync algorithm
- rdiffWeb is a web interface for browsing and restoring from rdiff-backup repositories. It is written in Python and is distributed under the GPL license.
duply (simple duplicity)
- Duplicity backs up directories by producing encrypted tar-format volumes and uploading them to a remote or local file server. Because duplicity uses librsync, the incremental archives are space efficient and only record the parts of files that have changed since the last backup. Because duplicity uses GnuPG to encrypt and/or sign these archives, they will be safe from spying and/or modification by the server.
- Déjà Dup (day-ja-doop) is a simple backup tool. It hides the complexity of doing backups the Right Way (encrypted, off-site, and regular) and uses duplicity as the backend.
- Duplicati is a free backup client that securely stores encrypted, incremental, compressed backups on cloud storage services and remote file servers. It works with Amazon S3, Windows Live SkyDrive, Google Drive (Google Docs), Rackspace Cloud Files or WebDAV, SSH, FTP (and many more). Duplicati has built-in AES-256 encryption and backups can be signed using GNU Privacy Guard. A built-in scheduler makes sure that backups are always up-to-date. Last but not least, Duplicati provides various options and tweaks like filters, deletion rules, transfer and bandwidth options to run backups for specific purposes.
- Duplicity redone in C#
- not that CLI ready?
- Areca Backup is a personal file backup software developed in Java.
- zip/zip64 only
- No deduplication
- BackupPC is a high-performance, enterprise-grade system for backing up Linux and WinXX PCs and laptops to a server's disk. BackupPC is highly configurable and easy to install and maintain. Given the ever decreasing cost of disks and raid systems, it is now practical and cost effective to backup a large number of machines onto a server's local disk or network storage. This is what BackupPC does. For some sites, this might be the complete backup solution. For other sites, additional permanent archives could be created by periodically backing up the server to tape. A variety of Open Source systems are available for doing backup to tape. BackupPC is written in Perl and extracts backup data via SMB using Samba, tar over ssh/rsh/nfs, or rsync. It is robust, reliable, well documented and freely available as Open Source on SourceForge.
- Supports NFS, SSH, SMB and rsync
- Perl with web interface
- Deduplication via hardlinks
- AMANDA, the Advanced Maryland Automatic Network Disk Archiver, is a backup solution that allows the IT administrator to set up a single master backup server to back up multiple hosts over network to tape drives/changers or disks or optical media. Amanda uses native utilities and formats (e.g. dump and/or GNU tar) and can back up a large number of servers and workstations running multiple versions of Linux or Unix. Amanda uses a native Windows client to back up Microsoft Windows desktops and servers.
"the Amanda planner runs on the server to decide exactly how to go about backing things up. It, too, contacts each Amanda client and requests an estimate of the size of full and incremental dumps for each DLE. It then does some complex planning based on the history of each DLE, the estimated sizes, the available storage space, and a number of tweakable parameters to decide what to back up. This often confuses newcomers, who have control issues and want to tell Amanda when to do full backups and when to do incrementals. The planner is one of Amanda's strengths! Don't fight it!"
- unique backup format
- config over inc/diff/full 'interesting'
- no web interface?
- "a beast to get up and running. But it is lightning fast"
- calendar based
- Backupninja allows you to coordinate system backup by dropping a few simple configuration files into /etc/backup.d/. Most programs you might use for making backups don't have their own configuration file format. Backupninja provides a centralized way to configure and schedule many different backup utilities. It allows for secure, remote, incremental filesystem backup (via rdiff-backup), compressed incremental data, backup system and hardware info, encrypted remote backups (via duplicity), safe backup of MySQL/PostgreSQL databases, subversion or trac repositories, burn CD/DVDs or create ISOs, incremental rsync with hardlinking.
- Disk ARchive is a shell command that backs up directory trees and files, taking care of hard links, Extended Attributes, sparse files, MacOS's file forks, any inode type (including Solaris Door inodes), etc.
- backup2l - low-maintenance backup/restore tool. backup2l is a lightweight command line tool for generating, maintaining and restoring backups on a mountable file system (e.g. hard disk). The main design goals are low maintenance effort, efficiency, transparency and robustness. In a default installation, backups are created autonomously by a cron script. It supports hierarchical differential backups with a user-specified number of levels and backups per level. With this scheme, the total number of archives that have to be stored only increases logarithmically with the number of differential backups since the last full backup. Hence, small incremental backups can be generated at short intervals while time- and space-consuming full backups are only sparsely needed.
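One way to see why a hierarchical differential scheme grows only logarithmically is to number the backups and read the level off the number. The sketch below is a simplified model, not backup2l's actual bookkeeping: `base` (one more than the number of backups per level) and `max_level` are hypothetical parameters, and backup 0 stands for the full backup. Each backup is a differential against the most recent backup of a lower level, so the chain needed to rebuild the latest state has at most `max_level + 1` entries.

```python
def level(n, base, max_level):
    """Level of backup number n: 0 is the full backup, max_level the smallest differential."""
    if n == 0:
        return 0
    trailing_zeros = 0
    while n % base == 0:          # count trailing zero digits of n in the given base
        n //= base
        trailing_zeros += 1
    return max_level - trailing_zeros


def parent(n, base):
    """Backup that backup n (n > 0) is a differential against: clear n's lowest non-zero digit."""
    weight = 1
    while (n // weight) % base == 0:
        weight *= base
    return n - ((n // weight) % base) * weight


def restore_chain(n, base):
    """Backups needed to rebuild the state at backup n, oldest first."""
    chain = [n]
    while n:
        n = parent(n, base)
        chain.append(n)
    return chain[::-1]


if __name__ == "__main__":
    # Two backups per level (base 3) and three differential levels.
    for n in (1, 9, 24, 26):
        print(n, "level", level(n, 3, 3), "chain", restore_chain(n, 3))
```

With these numbers, 26 differential backups after the full backup never need more than four archives for a restore, because each step in the chain removes one non-zero digit of the backup number.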
- Obnam is an easy, secure backup program. Snapshot backups. Every generation looks like a complete snapshot, so you don't need to care about full versus incremental backups, or rotate real or virtual tapes. Data de-duplication, across files, and backup generations. If the backup repository already contains a particular chunk of data, it will be re-used, even if it was in another file in an older backup generation. This way, you don't need to worry about moving around large files, or modifying them. Encrypted backups, using GnuPG.
- sounds well thought out, slow with sftp?
- no web interface
- unique format
- author suffers from NIH ;)
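The chunk-level de-duplication described for Obnam can be illustrated with a small content-addressed store. This is a minimal sketch under stated assumptions, not Obnam's repository format: the `Repo` class, the 4-byte fixed-size chunks, and the SHA-256 chunk ids are all invented for the example. Every generation records a complete file-to-chunks map, so each one reads like a full snapshot, while a chunk that is already present is never stored twice.

```python
import hashlib

CHUNK = 4  # unrealistically small chunk size, chosen for the demo (assumption)


class Repo:
    """Toy backup repository with de-duplication across files and generations."""

    def __init__(self):
        self.chunks = {}        # chunk id (SHA-256 hex digest) -> chunk bytes
        self.generations = []   # one {filename: [chunk ids]} map per backup run

    def backup(self, files):
        """Store a new generation and return the number of bytes actually added."""
        generation, added = {}, 0
        for name, data in files.items():
            ids = []
            for start in range(0, len(data), CHUNK):
                piece = data[start:start + CHUNK]
                chunk_id = hashlib.sha256(piece).hexdigest()
                if chunk_id not in self.chunks:   # only unseen chunks cost space
                    self.chunks[chunk_id] = piece
                    added += len(piece)
                ids.append(chunk_id)
            generation[name] = ids
        self.generations.append(generation)
        return added

    def restore(self, generation, name):
        """Reassemble one file from any generation."""
        ids = self.generations[generation][name]
        return b"".join(self.chunks[chunk_id] for chunk_id in ids)


if __name__ == "__main__":
    repo = Repo()
    print(repo.backup({"a.txt": b"AAAABBBBCCCC"}))
    # The file was renamed and grew by one chunk; only that chunk is new.
    print(repo.backup({"b.txt": b"AAAABBBBCCCCDDDD"}))
```

Fixed-size chunks keep the sketch short, but an insertion near the start of a file shifts every later chunk boundary. Tools that want to survive that typically pick chunk boundaries from the content with a rolling hash instead.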
- network backup
- constant backup, snapshots available
- client side encryption
- ZBackup - a globally-deduplicating backup tool, based on the ideas found in rsync. Feed a large .tar into it, and it will store duplicate regions of it only once, then compress and optionally encrypt the result. Feed another .tar file, and it will also re-use any data found in any previous backups. This way only new changes are stored, and as long as the files are not very different, the amount of storage required is very low. Any of the backup files stored previously can be read back in full at any time. The program is format-agnostic, so you can feed virtually any files to it (any types of archives, proprietary formats, even raw disk images -- but see Caveats). This is achieved by sliding a window with a rolling hash over the input at a byte granularity and checking whether the block in focus was ever met already. If a rolling hash matches, an additional full cryptographic hash is calculated to ensure the block is indeed the same. The deduplication happens then.
The program has the following features:
- Parallel LZMA or LZO compression of the stored data
- Built-in AES encryption of the stored data
- Possibility to delete old backup data
- Use of a 64-bit rolling hash, keeping the amount of soft collisions to zero
- Repository consists of immutable files. No existing files are ever modified
- Written in C++ only with only modest library dependencies
- Safe to use in production (see below)
- Possibility to exchange data between repos without recompression
- UrBackup - Client/Server Open Source Network Backup for Windows and Linux. An easy to setup Open Source client/server backup system, that through a combination of image and file backups, accomplishes both data safety and a fast restoration time. File and image backups are made while the system is running without interrupting current processes. UrBackup also continuously watches folders you want backed up in order to quickly find differences to previous backups. Because of that, incremental file backups are really fast. Your files can be restored through the web interface, via the client or the Windows Explorer while the backups of drive volumes can be restored with a bootable CD or USB-Stick (bare metal restore).
- https://github.com/bup/bup - Very efficient backup system based on the git packfile format, providing fast incremental saves and global deduplication (among and within files, including virtual machine images). Current release is 0.29, and the development branch is master.
Doesn't remove large deleted files from archive?
- Burp is a network backup and restore program. It uses librsync in order to save network traffic and to save on the amount of space that is used by each backup. It also uses VSS (Volume Shadow Copy Service) to make snapshots when backing up Windows computers.
- https://github.com/borgbackup/borg - a fork of Attic, a deduplicating backup program. Optionally, it supports compression and authenticated encryption. The main goal of Borg is to provide an efficient and secure way to back up data. The data deduplication technique used makes Borg suitable for daily backups since only changes are stored. The authenticated encryption technique makes it suitable for backups to not fully trusted targets.
usage: borg create [-h] [-v] [--debug] [--lock-wait N] [--show-rc]
                   [--no-files-cache] [--umask M] [--remote-path PATH] [-s] [-p]
                   [--filter STATUSCHARS] [-e PATTERN]
                   [--exclude-from EXCLUDEFILE] [--exclude-caches]
                   [--exclude-if-present FILENAME] [--keep-tag-files]
                   [-c SECONDS] [-x] [--numeric-owner]
                   [--timestamp yyyy-mm-ddThh:mm:ss]
                   [--chunker-params CHUNK_MIN_EXP,CHUNK_MAX_EXP,HASH_MASK_BITS,HASH_WINDOW_SIZE]
                   [-C COMPRESSION] [--read-special] [-n]
                   ARCHIVE PATH [PATH ...]
# Create new archive
# Backup ~/Documents into an archive named "my-documents"
$ borg create /mnt/backup::my-documents ~/Documents

# Backup ~/Documents and ~/src but exclude pyc files
$ borg create /mnt/backup::my-files \
    ~/Documents \
    ~/src \
    --exclude '*.pyc'

# Backup the root filesystem into an archive named "root-YYYY-MM-DD"
# use zlib compression (good, but slow) - default is no compression
# -x / --one-file-system keeps the backup from crossing mount points
$ NAME="root-$(date +%Y-%m-%d)"
$ borg create -C zlib,6 --one-file-system "/mnt/backup::$NAME" /
  -e PATTERN, --exclude PATTERN
                        exclude paths matching PATTERN
  --exclude-from EXCLUDEFILE
                        read exclude patterns from EXCLUDEFILE, one per line
  --exclude-caches      exclude directories that contain a CACHEDIR.TAG file
                        (http://www.brynosaurus.com/cachedir/spec.html)
  --exclude-if-present FILENAME
                        exclude directories that contain the specified file
  -x, --one-file-system
                        stay in same file system, do not cross mount points
- Holland - an Open Source backup framework originally developed by Rackspace and written in Python. Its goal is to help facilitate backing up databases with greater configurability, consistency, and ease. Holland currently focuses on MySQL, however future development will include other database platforms and even non-database related applications. Because of its plugin structure, Holland can be used to back up anything you want by whatever means you want.
- Kopia - a fast and secure open-source backup/restore tool that allows you to create encrypted snapshots of your data and save the snapshots to remote or cloud storage of your choice, to network-attached storage or server, or locally on your machine. Kopia does not 'image' your whole machine. Rather, Kopia allows you to backup/restore any and all files/directories that you deem are important or critical. Kopia has both CLI (command-line interface) and GUI (graphical user interface) versions, making it the perfect tool for both advanced and regular users. You can read more about Kopia's unique features -- which include compression, deduplication, end-to-end 'zero knowledge' encryption, and error correction -- to get a better understanding of how Kopia works.
- Link-Backup - a backup utility that creates hard links between a series of backed-up trees, and intelligently handles renames, moves, and duplicate files without additional storage or transfer. Transfer occurs over standard I/O locally or remotely between a client and server instance of this script. Remote backups rely on the secure remote shell program ssh. Link-Backup comes with a web-based viewer of the backups it makes.
See also *nix#Btrfs.
Using btrfs snapshots instead of cp -al has two major advantages. First, creating a snapshot is much faster than creating hard links. Second, each snapshot preserves the file metadata (ownership, access and modification times, and file attributes) as it was when the snapshot was taken. With hard links, all copies of an unchanged file share one inode, so older backups show the metadata left by the most recent backup run. Finally, recent versions of btrfs can also lock snapshots down to be read-only.
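The metadata caveat about hard-link backups is easy to reproduce. The sketch below is a hedged illustration rather than any tool's implementation: `hardlink_snapshot` is a hypothetical stand-in for `cp -al`, and the directory and file names are invented. It snapshots a directory with hard links, changes the permissions and modification time of the live file, then inspects the copy in the older snapshot.

```python
import os
import stat
import tempfile


def hardlink_snapshot(src_dir, dst_dir):
    """Recreate src_dir's tree under dst_dir, hard-linking every file (like cp -al)."""
    for root, _dirs, files in os.walk(src_dir):
        rel = os.path.relpath(root, src_dir)
        target = os.path.normpath(os.path.join(dst_dir, rel))
        os.makedirs(target, exist_ok=True)
        for name in files:
            os.link(os.path.join(root, name), os.path.join(target, name))


def demo():
    """Return (same inode?, snapshot mode, snapshot mtime) after changing the live file."""
    with tempfile.TemporaryDirectory() as tmp:
        live = os.path.join(tmp, "live")
        snap = os.path.join(tmp, "snap.0")
        os.makedirs(live)
        path = os.path.join(live, "notes.txt")
        with open(path, "w") as handle:
            handle.write("v1\n")
        os.chmod(path, 0o644)

        hardlink_snapshot(live, snap)

        # Metadata changes made after the snapshot was taken...
        os.chmod(path, 0o600)
        os.utime(path, (0, 0))

        # ...are visible through the snapshot, because both names share one inode.
        old = os.stat(os.path.join(snap, "notes.txt"))
        new = os.stat(path)
        return old.st_ino == new.st_ino, stat.S_IMODE(old.st_mode), old.st_mtime


if __name__ == "__main__":
    print(demo())
```

On a filesystem that supports hard links, the snapshot copy reports the new mode and timestamp even though it was created earlier. A btrfs snapshot avoids this because it captures the metadata together with the data.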
- Do-It-Yourself Backup System Using Rsync and Btrfs - April 6th, 2011
- basic idea, no actual code
- migrate rsnapshot-based backup to btrfs-snapshots - 23 Oct, 2011
- serverfault: btrfs-enabled backup solution - Feb 3, 2012
- Full System Backup (and restore) Feb 8th, 2012
- btrfs max number of hardlinks gotcha - May 28th, 2012, sorted
- Arch Forum: Manage btrfs snapshots
- Incremental backups with btrfs - Sep 7th, 2011
"Of course, the problem with this is that snapshots are, essentially, COW hard links; this means that if there's a corruption on the disk for a file, it'll affect all child snapshots."
"Rsync integration. Now that we have code to efficiently find newly updated files, we need to tie it into tools such as rsync and dirvish. (For bonus points, we can even tell rsync _which blocks_ inside a file have changed. Would need to work with the rsync developers on that one.)"
- python, year ago
- 2 years old
- python gui
- https://github.com/jcrd/snapback - snapback snapshots and backs up btrfs subvolume daily using a systemd timer.
- https://github.com/TestudoAquatilis/btrbackup - Simple backup solution for local backups from btrfs to btrfs filesystems using snapshots.
- Quick local backup with rsync & btrfs - Mar 25, 2010. basic, old command.
- clairvoyant backup - aug 2010, bash, somewhat complex
- btrfs-time-machine - ruby, year ago
- rsyncbtrfs - bash, 11 months ago
- btrfs-backup - ruby, 8 months ago, basic, no docs
- btr-backup - ruby/bash, 7 months ago, rotation
- clockfort/btr-backup - very basic bash, 4 months
- snap - bash, 3 months ago
- btrbackup - bash, moderately complex, 3 months ago
- butterbackup - python, web interface, recent
- https://github.com/moviuro/butter - butter is a btrfs snapshot manager.
- Mondo - backs up your GNU/Linux server or workstation to tape, CD-R, CD-RW, DVD-R[W], DVD+R[W], NFS or hard disk partition. In the event of catastrophic data loss, you will be able to restore all of your data [or as much as you want], from bare metal if necessary. Mondo is in use by Lockheed-Martin, Nortel Networks, Siemens, HP, IBM, NASA's JPL, the US Dept of Agriculture, dozens of smaller companies, and tens of thousands of users. Mondo is comprehensive. Mondo supports LVM 1/2, RAID, ext2, ext3, ext4, JFS, XFS, ReiserFS, VFAT, and can support additional filesystems easily: just e-mail the mailing list with your request. It supports software RAID as well as most hardware RAID controllers. It supports adjustments in disk geometry, including migration from non-RAID to RAID. It supports BIOS and UEFI boot modes. Mondo runs on all major Linux distributions (Fedora, RHEL, OpenSUSE, SLES, Mageia, Debian, Ubuntu, Gentoo) and is getting better all the time. You may even use it to back up non-Linux partitions, such as NTFS. Mondo is free! It has been published under the GPL v2 (GNU General Public License), partly to expose it to thousands of potential beta-testers but mostly as a contribution to the Linux community.
- Relax-and-Recover - a setup-and-forget Linux bare metal disaster recovery solution. It is easy to set up and requires no maintenance so there is no excuse for not using it. Home user: recover from a broken hard disk using a bootable USB stick, recover a broken system from your bootloader. Enterprise: collect small ISO images on a central server, integrate with your backup solution, integrate with your monitoring solution
- https://github.com/nethappen/blocksync-fast - a program written in C that clones and synchronizes any block devices (entire disks, partitions) or files (disk images) using fast and efficient methods. It uses buffered reads and writes to combine adjacent blocks, reducing the number of I/O operations. During synchronization the program overwrites only changed blocks, which reduces data transfer and preserves block deduplication on copy-on-write file systems.
- https://github.com/tasket/wyng-backup - able to deliver faster incremental backups for logical volumes and disk images. It accesses copy-on-write metadata (instead of comparing all data for each backup) to instantly find changes since the last backup. Combined with its efficient archive format, Wyng can also very quickly reclaim space from older backup sessions.
- https://github.com/kimono-koans/httm - Interactive, file-level Time Machine-like tool for ZFS/btrfs/nilfs
- https://github.com/rust-util-collections/btm - an incremental data backup mechanism that does not require downtime.
- btrbk - a backup tool for btrfs subvolumes, taking advantage of btrfs-specific capabilities to create atomic snapshots and transfer them incrementally to your backup locations. The source and target locations are specified in a config file, which makes it easy to configure simple scenarios like "laptop with locally attached backup disks", as well as more complex ones, e.g. "server receiving backups from several hosts via ssh, with different retention policy".
btrbk -n list all