Storage

From Things and Stuff Wiki
Revision as of 08:46, 16 November 2023 by Milk (talk | contribs) (→‎Btrfs)
Jump to navigation Jump to search


Media

See also *nix#Devices, Media#CD / DVD, Sharing


  • https://en.wikipedia.org/wiki/Create,_read,_update_and_delete - the four basic functions of persistent storage. Alternate words are sometimes used when defining the four basic functions of CRUD, such as retrieve instead of read, modify instead of update, or destroy instead of delete. CRUD is also sometimes used to describe user interface conventions that facilitate viewing, searching, and changing information; often using computer-based forms and reports. The term was likely first popularized by James Martin in his 1983 book Managing the Data-base Environment. The acronym may be extended to CRUDL to cover listing of large data sets which bring additional complexity such as pagination when the data sets are too large to be easily held in memory.



  • https://en.wikipedia.org/wiki/Nearline_storage - a portmanteau of "near" and "online storage") is a term used in computer science to describe an intermediate type of data storage that represents a compromise between online storage (supporting frequent, very rapid access to data) and offline storage/archiving (used for backups or long-term storage, with infrequent access to data). Nearline storage dates back to the IBM 3850 Mass Storage System (MSS) tape library, which was announced in 1974.


Tape

  • UberCassette - solution for the archival and restoration of programs on cassette tapes for 8-bit machines from the 80's - I want it to archive just about any format's tapes.There are many programs out there for doing this, some of them work, some don't so well. Many of them work as long as you have an old enough machine to run them on. No other program does many different types, though.




Floppy disk



  • https://github.com/davidgiven/fluxengine - a very cheap USB floppy disk interface capable of reading and writing exotic non-PC floppy disk formats. It allows you to use a conventional PC drive to accept Amiga disks, CLV Macintosh disks, bizarre 128-sector CP/M disks, and other weird and bizarre formats. (Although not all of these are supported yet. I could really use samples.)The hardware consists of a single, commodity part with a floppy drive connector soldered onto it. No ordering custom boards, no fiddly surface mount assembly, and no fuss: nineteen simpler solder joints and you're done. You can make one for $15 (plus shipping).

Optical disc

CD

See also Rip / Tag, Playback#CD


cd-drive
  # drive information, provided by libcdio
  • Digital Audio Extraction - Each CD drive reads audio discs slightly out (a number of samples), if your CD drive supports 'Accurate Stream' it will be a constant value, this value tends to be the same for each particular make and model of CD Drive. A small number of drives have [Purged] as the offset, these drives were found not to have a constant drive offset (perhaps different manufacturing batches, or firmwares), as such they have been removed from AccurateRip's drive database (should you have one of these drives, 3 matching key disks will be required to configure AccurateRip).


  • https://github.com/cmcginty/mktoc - simplifies the steps needed to create audio CD TOC files for the cdrdao CD burning program. For users familiar with ExactAudioCopy or CdrWin, TOC files are synonymous with CUE sheets. The primary goal of mktoc is to create TOC files using a previously generated CUE sheet.

CD ISO

  • http://wiki.osdev.org/ISO_9660 - the standard file system for CD-ROMs. It is also widely used on DVD and BD media and may as well be present on USB sticks or hard disks. Its specifications are available for free under the name ECMA-119.


  • http://wiki.osdev.org/Mkisofs - a utility that creates an ISO 9660 image from files on disk. "mkisofs is effectively a pre-mastering program to generate the iso9660 filesystem - it takes a snapshot of a given directory tree, and generates a binary image which will correspond to an iso9660 filesystem when written to a block device." Developers of operating systems will mainly be interested in creating ISO filesystems for bootable CD, DVD, or BD via El-Torito. Nevertheless, ISO filesystems may also be booted from hard disk or USB stick.


  • Libburnia - a project for reading, mastering and writing optical discs. Currently it is comprised of libraries named libisofs, libburn, libisoburn, a cdrecord emulator named cdrskin, and an integrated multi-session tool named xorriso. The software runs on GNU/Linux, FreeBSD, Solaris, NetBSD, OpenBSD. It is base of the GNU xorriso package.


  • http://libburnia-project.org/wiki/Xorriso - xorriso is a command line and dialog application, which creates, loads, manipulates and writes ISO 9660 filesystem images with Rock Ridge extensions. It is part of the libisoburn release tarball. It copies file objects from POSIX compliant filesystems into Rock Ridge enhanced ISO 9660 filesystems and performs session-wise manipulation of such filesystems. It can load the management information of existing ISO images and it writes the session results to optical media or to filesystem objects. If linked with zlib then it is able to produce the zisofs compression format. Directory tree, whole session, and single data files may be equipped with MD5 checksums.

Burning

  • Xfburn - a simple CD/DVD burning tool based on libburnia libraries. It can blank CD/DVD(-RW)s, burn and create iso images, audio CDs, as well as burn personal compositions of data to either CD or DVD. It Is stable, and under ongoing development.


  • QPxTool - the linux way to get full control over your CD/DVD drives. It is the Open Source Solution which intends to give you access to all available Quality Checks (Q-Checks) on written and blank media, that are available for your drive. This will help you to find the right media and the optimized writing speed for your hardware, which will increase the chance for a long data lifetime.


  • https://github.com/sonejostudios/CDMasterTool - a tool for audio CD creation, TOC and CUE files manipulation, CD burning with CD-TEXT, drive(s) and CD analysis and Commandline launcher. It is mainly based of Cdrdao and libcdio, as well as a couple of other GNU/Linux tools. The main goal is to burn Audio CDs with CD-TEXT, out of a DAW's Red Book export WAV/TOC/CUE combination.

DVD

  • dvdisaster - a tool for creating error correction data (“ecc data”) for optical media such as CD, DVD and BD discs. Use cases for creating ecc data; recovering defective media using ecc data, and for general maintenanance of optical media.


  • https://github.com/ldo/dvdauthor - a program that will generate a DVD-Video movie from a valid MPEG-2 stream that should play when you put it in a DVD player. To start you need MPEG-2 files that contain the necessary DVD-Video VOB packets. These can be generated with FFmpeg, or by by passing `-f 8` to `mplex`.



Blu-Ray




OCD

  • https://en.wikipedia.org/wiki/Optical_Disc_Archive - a storage technology that was introduced by the Sony Corporation. It uses removable cartridges, where each cartridge holds 11 optical discs. Each of the internal optical discs is similar to, but not compatible with, a Blu-ray disc. The latest version of the cartridge, that has a total capacity of about 5.5TB, uses discs that hold about 500GB each. The technology was publicly announced on 16 April 2012 during the NAB Show with the first units shipping in February 2013.





M-DISC

  • https://en.wikipedia.org/wiki/M-DISC - a write-once optical disc technology introduced in 2009 by Millenniata, Inc. and available as DVD and Blu-ray discs. M-DISC's design is intended to provide greater archival media longevity. Millenniata claims that properly stored M-DISC DVD recordings will last 1000 years. The M-DISC DVD looks like a standard disc, except it is slightly thicker and almost transparent.

HDD



Failure



Flash

  • https://en.wikipedia.org/wiki/Flash_memory - an electronic (solid-state) non-volatile computer storage medium that can be electrically erased and reprogrammed.Toshiba developed flash memory from EEPROM (electrically erasable programmable read-only memory) in the early 1980s and introduced it to the market in 1984.[citation needed] The two main types of flash memory are named after the NAND and NOR logic gates. The individual flash memory cells exhibit internal characteristics similar to those of the corresponding gates.While EPROMs had to be completely erased before being rewritten, NAND-type flash memory may be written and read in blocks (or pages) which are generally much smaller than the entire device. NOR-type flash allows a single machine word (byte) to be written – to an erased location – or read independently.

The NAND type is found primarily in memory cards, USB flash drives, solid-state drives (those produced in 2009 or later), and similar products, for general storage and transfer of data. NAND or NOR flash memory is also often used to store configuration data in numerous digital products, a task previously made possible by EEPROM or battery-powered static RAM. One key disadvantage of flash memory is that it can only endure a relatively small number of write cycles in a specific block.

Example applications of both types of flash memory include personal computers, PDAs, digital audio players, digital cameras, mobile phones, synthesizers, video games, scientific instrumentation, industrial robotics, and medical electronics. In addition to being non-volatile, flash memory offers fast read access times, although not as fast as static RAM or ROM. Its mechanical shock resistance helps explain its popularity over hard disks in portable devices, as does its high durability, ability to withstand high pressure, temperature and immersion in water, etc.

Although flash memory is technically a type of EEPROM, the term "EEPROM" is generally used to refer specifically to non-flash EEPROM which is erasable in small blocks, typically bytes. Because erase cycles are slow, the large block sizes used in flash memory erasing give it a significant speed advantage over non-flash EEPROM when writing large amounts of data. As of 2013, flash memory costs much less than byte-programmable EEPROM and had become the dominant memory type wherever a system required a significant amount of non-volatile solid-state storage.


See also Distros##Live USB


  • https://en.wikipedia.org/wiki/Solid-state_storage - (sometimes abbreviated as SSS) is a type of non-volatile computer storage that stores and retrieves digital information using only electronic circuits, without any involvement of moving mechanical parts. This differs fundamentally from the traditional electromechanical storage paradigm, which accesses data using rotating or linearly moving media coated with magnetic material. Solid-state storage devices typically store data using electrically-programmable non-volatile flash memory, although some devices use battery-backed volatile random-access memory (RAM). Lacking any moving mechanical parts, solid-state storage operates much faster than traditional electromechanical storage; as a downside, solid-state storage is significantly more expensive and suffers from the write amplification phenomenon. To satisfy the requirements of applications in various types of computer systems and appliances,[citation needed] solid-state storage devices come in various types, form factors, storage-space sizes, and interfacing options.


  • Memory Technology Device (MTD) Subsystem for Linux. - MTD subsystem (stands for Memory Technology Devices) provides an abstraction layer for raw flash devices. It makes it possible to use the same API when working with different flash types and technologies, e.g. NAND, OneNAND, NOR, AG-AND, ECC'd NOR, etc.MTD subsystem does not deal with block devices like MMC, eMMC, SD, CompactFlash, etc. These devices are not raw flashes but they have a Flash Translation layer inside, which makes them look like block devices. These devices are the subject of the Linux block subsystem, not MTD. Please, refer to this FAQ section for a short list of the main differences between block and MTD devices. And the raw flash vs. FTL devices UBIFS section discusses this in more details.


  • H2test - Data integrity test for USB sticks and other media. An advantage over other tools is, that it does not need administrative rights to access the device - but it should not be NTFS formatted, as it does not allow to test everything.

SSD


You can FAT32 format hard drive partitions larger than 32 GB with one of the tools from http://www.uwe-sieber.de/usbtrouble_e.html#format


  • F3 by Digirati - GPLv3 implementation of the algorithm of H2testw, and other tools that I have been implementing to speed up the identification of fake drives as well as making them usable: f3probe, f3fix, and f3brew. My implementation of H2testw, which I've broken into two applications named f3write and f3read, runs on Linux, Macs, Windows/Cygwin, and FreeBSD. f3probe is the fastest way to identify fake drives and their real sizes. f3fix enables users to use the real capacity of fake drives without losing data. f3brew helps developers to infer how fake drives work. f3probe, f3fix, and f3brew currently runs only on Linux.

SD/MicroSD cards


To sort


  • https://en.wikipedia.org/wiki/Disk_sector - a subdivision of a track on a magnetic disk or optical disc. Each sector stores a fixed amount of user-accessible data, traditionally 512 bytes for hard disk drives (HDDs) and 2048 bytes for CD-ROMs and DVD-ROMs. Newer HDDs use 4096-byte (4 KiB) sectors, which are known as the Advanced Format (AF). The sector is the minimum storage unit of a hard drive. Most disk partitioning schemes are designed to have files occupy an integral number of sectors regardless of the file's actual size. Files that do not fill a whole sector will have the remainder of their last sector filled with zeroes. In practice, operating systems typically operate on blocks of data, which may span multiple sectors.


  • https://en.wikipedia.org/wiki/Advanced_Format - any disk sector format used to store data on magnetic disks in hard disk drives (HDDs) that exceeds 512, 520, or 528 bytes per sector, such as the 4096, 4112, 4160, and 4224-byte (4 KB) sectors of an Advanced Format Drive (AFD). Larger sectors enable the integration of stronger error correction algorithms to maintain data integrity at higher storage densities. Advanced Format is also considered a milestone technology in the history of HDD storage, where data has been generally processed in 512-byte segments since at least the introduction of consumer-grade HDDs in the early 1980s, and in similar or smaller chunks in the professional field since the invention of HDDs in 1956.


tmpfs

  • https://en.wikipedia.org/wiki/tmpfs - a temporary file storage paradigm implemented in many Unix-like operating systems. It is intended to appear as a mounted file system, but data is stored in volatile memory instead of a persistent storage device. A similar construction is a RAM disk, which appears as a virtual disk drive and hosts a disk file system.


  • https://wiki.archlinux.org/index.php/tmpfs - a temporary filesystem that resides in memory and/or swap partition(s). Mounting directories as tmpfs can be an effective way of speeding up accesses to their files, or to ensure that their contents are automatically cleared upon reboot.

Storage devices

Block and partition names;

sd[a,b,etc]
  drive

sda[1,2,etc]
  partition of drive


cat /proc/partitions
blkid
  # command-line utility to locate/print block device attributes 

File system mounts

lsblk
  # list information about block devices.
findmnt
df -a
mounts
mount -l
mount
mount /dev/sdxY /some/directory

umount /some/directory

mount -o remount /
  remount partition after /etc/fstab change
mount -o loop example.img /home/you/dir
pmount
  • https://linux.die.net/man/1/pmount - ("policy mount") is a wrapper around the standard mount program which permits normal users to mount removable devices without a matching /etc/fstab entry.
systemd-mount
systemd-mount /dev/sdb1
  # mounts to automatic mount point based on label


autofs




PySDM
  • PySDM - a Storage Device Manager that allows full customization of hard disk mountpoints without manually access to fstab.It also allows the creation of udev rules for dynamic configuration of storage devices


Automated

  • systemd-fstab-generator - a generator that translates /etc/fstab (see fstab(5) for details) into native systemd units early at boot and when configuration of the system manager is reloaded. This will instantiate mount and swap units as necessary.







Partitions


  • https://en.wikipedia.org/wiki/Partition_table - a table maintained on disk by the operating system describing the partitions on that disk. The terms partition table and partition map are most commonly associated with the MBR partition table of a Master Boot Record (MBR) in IBM PC compatibles, but it may be used generically to refer to other "formats" that divide a disk drive into partitions, such as: GUID Partition Table (GPT), Apple partition map (APM), or BSD disklabel.



  • GNU Parted manipulates partition tables. This is useful for creating space for new operating systems, reorganizing disk usage, copying data on hard disks and disk imaging. The package contains a library, libparted, as well as well as a command-line frontend, parted, which can also be used in scripts.
  • gnome-disks


MBR

  • http://en.wikipedia.org/wiki/Master_boot_record - a special type of boot sector at the very beginning of partitioned computer mass storage devices like fixed disks or removable drives intended for use with IBM PC-compatible systems and beyond. The concept of MBRs was publicly introduced in 1983 with PC DOS 2.0.

The MBR holds the information on how the partitions, containing file systems, are organized on that medium. The MBR also contains executable code to function as a loader for the installed operating system—usually by passing control over to the loader's second stage, or in conjunction with each partition's volume boot record (VBR). This MBR code is usually referred to as a boot loader.


  • https://en.wikipedia.org/wiki/Partition_type - a partition's entry in the partition table inside a master boot record (MBR) is a byte value intended to specify the file system the partition contains and/or to flag special access methods used to access these partitions (f.e. special CHS mappings, LBA access, logical mapped geometries, special driver access, hidden partitions, secured or encrypted file systems, etc.).

GPT



Cache

Loopback


  • https://www.mankier.com/4/loop - The loop device is a block device that maps its data blocks not to a physical device such as a hard disk or optical disk drive, but to the blocks of a regular file in a filesystem or to another block device. This can be useful for example to provide a block device for a filesystem image stored in a file, so that it can be mounted with the mount(8) command.
  • losetup - set up and control loop devices


Utils

Repair

SMART


  • smartmontools - contains two utility programs (smartctl and smartd) to control and monitor storage systems using the Self-Monitoring, Analysis and Reporting Technology System (SMART) built into most modern ATA and SCSI harddisks. In many cases, these utilities will provide advanced warning of disk degradation and failure. It is derived from smartsuite.
  • GSmartControl - a graphical user interface for smartctl (from smartmontools package), which is a tool for querying and controlling SMART (Self-Monitoring, Analysis, and Reporting Technology) data on modern hard disk and solid-state drives. It allows you to inspect the drive's SMART data to determine its health, as well as run various tests on it.



Virtualization

  • https://en.wikipedia.org/wiki/Logical_disk - a device that provides an area of usable storage capacity on one or more physical disk drive components in a computer system. Other terms that are used to mean the same thing are partition, logical volume, and in some cases a virtual disk (vdisk).

LVM

  • https://en.wikipedia.org/wiki/Logical_volume_management - provides a method of allocating space on mass-storage devices that is more flexible than conventional partitioning schemes. In particular, a volume manager can concatenate, stripe together or otherwise combine partitions into larger virtual ones that administrators can re-size or move, potentially without interrupting system use. Volume management represents just one of many forms of storage virtualization; its implementation takes place in a layer in the device-driver stack of an OS (as opposed to within storage devices or in a network).
  • LVM2 - refers to the userspace toolset that provide logical volume management facilities on linux. It is reasonably backwards-compatible with the original LVM toolset.




  • Volume group - the highest level abstraction used within the Logical Volume Manager (LVM). It gathers together a collection of Logical Volumes (LV) and Physical Volumes (PV) into one administrative unit.


Snapshots


RAID


GUI

  • Kvpm - KDE Volume and Partition Manager is a GUI front end for Linux LVM and Gnu parted. LVM2 groups and volumes can be created, removed and manipulated using most of the options supported by the standard LVM2 tools. Some support for creating and operating on partitions is also provided. It also handles creating and mounting file systems.




Caching

  • LVM SSD Caching - LVM has a cool feature that allows you to cache slow LVs on faster media. It has some limitations like not being able to cache an entire volume group but it does work. I had an extra Intel 520 240GB SSD that I was not using so I decided to add it to one of my servers to speed up an mdadm array.I can probably say that using Gitlab is a bit snappier through the web interface after implementing the SSD cache.

RAID

See also Hardware#RAID, Distros#Network attached storage




  • dmsetup - low level logical volume management



Management

System Storage Manager

  • System Storage Manager - provides easy to use command line interface to manage your storage using various technologies like lvm, btrfs, encrypted volumes and possibly more. In more sophisticated enterprise storage environments, management with Device Mapper (dm), Logical Volume Manager (LVM), or Multiple Devices (md) is becoming increasingly more difficult. With file systems added to the mix, the number of tools needed to configure and manage storage has grown so large that it is simply not user friendly. With so many options for a system administrator to consider, the opportunity for errors and problems is large. The btrfs administration tools have shown us that storage management can be simplified, and we are working to bring that ease of use to Linux filesystems in general.


Snapper

  • Snapper - a tool for Linux filesystem snapshot management. Apart from the obvious creation and deletion of snapshots, it can compare snapshots and revert differences between snapshots. In simple terms, this allows root and non-root users to view older versions of files and revert changes. The features include: Manually create snapshots, Automatically create snapshots, e.g. with YaST and zypp, Automatically create timeline of snapshots, Show and revert changes between snapshots, Works with btrfs, ext4 and thin-provisioned LVM volumes, Supports Access Control Lists and Extended Attributes, Automatic cleanup of old snapshots, Command line interface, D-Bus interface, PAM module to create snapshots during login and logout.


  • https://github.com/ricardomv/snapper-gui - a graphical user interface for the tool snapper for Linux filesystem snapshot management. It can compare snapshots and revert differences between snapshots. In simple terms, this allows root and non-root users to view older versions of files and revert changes. Currently works with btrfs, ext4 and thin-provisioned LVM volumes


Blivet

ButterManager

File systems

See also Files



  • https://en.wikipedia.org/wiki/Copy-on-write#In_computer_storage - COW may also be used as the underlying mechanism for snapshots, such as those provided by logical volume management, file systems such as Btrfs and ZFS, and database servers such as Microsoft SQL Server. Typically, the snapshots store only the modified data, and are stored close to the original, so they are only a weak form of incremental backup and cannot substitute for a full backup.
  • https://github.com/CyberShadow/thincow - enables non-destructive, copy-on-write edits of block devices involving copying/moving a lot of data around, using deduplication for minimal additional disk usage. Use cases include:



  • http://en.wikipedia.org/wiki/Virtual_file_system - or virtual filesystem switch is an abstraction layer on top of a more concrete file system. The purpose of a VFS is to allow client applications to access different types of concrete file systems in a uniform way. A VFS can, for example, be used to access local and network storage devices transparently without the client application noticing the difference. It can be used to bridge the differences in Windows, Mac OS and Unix filesystems, so that applications can access files on local file systems of those types without having to know what type of file system they are accessing.



  • wipefs - wipe a filesystem signature from a device




/etc/fstab

  • http://en.wikipedia.org/wiki/fstab - or file systems table, a system configuration file commonly found on Unix systems. On Linux, it is part of the util-linux package. The fstab file typically lists all available disks and disk partitions, and indicates how they are to be initialized or otherwise integrated into the overall system's file system. fstab is still used for basic system configuration, notably of a system's main hard drive and startup file system, but for other uses has been superseded in recent years by automatic mounting. The fstab file is most commonly used by the mount command, which reads the fstab file to determine which options should be used when mounting the specified device.


Formatting

sudo fdisk -l

sudo umount /dev/sdc1

# FAT
sudo mkfs.vfat -n 'device name' -I /dev/sdc1

# NTFS
sudo mkfs.ntfs -I /dev/sdc1

# EXT4
sudo mkfs.ext4 -n  -I /dev/sdc1

Swap


swapon -s
  # equivelant to cat /proc/swaps
free -m
  # shows memory used



  • https://en.wikipedia.org/wiki/Zswap - a Linux kernel feature that provides a compressed write-back cache for swapped pages, as a form of virtual memory compression. Instead of moving memory pages to a swap device when they are to be swapped out, zswap performs their compression and then stores them into a memory pool dynamically allocated in the system RAM. Later, writeback to the actual swap device is deferred or even completely avoided, resulting in a significantly reduced I/O for Linux systems that require swapping; the tradeoff is the need for additional CPU cycles to perform the compression.

Ext2/3/4

kjournald is responsible for the journal of ext3 [16]



  • https://github.com/gerard/ext4fuse - a read-only implementation of ext4 for FUSE. The main reason this exists is to be able to read linux partitions from OSX. However, it should work on top of any FUSE implementation. Linux and FreeBSD have been tested to some point and I've heard that OpenSolaris should also work.

ZFS




"FreeBSD ZFS tuning guide wiki indicates you'll need about 5GB of ram per 1TB of saved disk space"






Btrfs

See also Backup#Btrfs





Subvolumes appear like directories. inode is different.

"Btrfs support is included in the linux package (as a module). Needs a reboot after installing before btrfs recognised. User space utilities are available in btrfs-progs. For multi-devices support (RAID like feature of btrfs) aka btrfs volume in early boot, you have to enable btrfs mkinitcpio hook (provided by mkinitcpio package) to be able to use, for example, a root btrfs volume. If the btrfs volume is a non-system volume, one only needs to set USEBTRFS="yes" in /etc/rc.conf. However, if you only use bare btrfs partition, such options are not needed."

"The btrfs scrub command reads redundant data and validates all the checksums, correcting any errors it finds along the way, using the checksum to determine which copy is the valid one. But with a single drive, how can it correct anything? The metadata - the file system overhead that is used to manage your data - is always stored in a redundant manner by default, even on a single drive. As a result, any corrupted metadata can be corrected, on the fly."

"EXT4 checksums its journal, which AFAIK will protect against errors caused by sync failures (ie. power failure during disk I/O). But it’s not going to protect against latent sector errors. To do that, you need checksumming on all the file data, along the lines of what ZFS or BTRFS provides."

A cross-subvolume copy patch has made it into 3.6_rc. This patch will allow cp --reflink across subvolumes, as long as the copy does not cross mount points.

  • copy-on-write, without the ram requirement of zsf
  • snapshots every 30 seconds, ability to mount from previous gen



Commands

mkfs.btrfs -L [label] /dev/[device]
mount -t btrfs /dev/sdg /mnt/drivename


btrfs device add /dev/sdc /mnt/btrfs


btrfs filesystem df /media/drivename

btrfs filesystem show
btrfs filesystem defragment /


btrfs-debug-tree -R /dev/sdg
  show drive/subvolume infos, unmounted
btrfs subvolume create [<dest>/]

btrfs subvolume snapshot /mnt/btrfs /mnt/btrfs/snapshot_of_root

btrfs subvolume delete [<dest>/]


btrfs property set -ts /path/to/snapshot ro false
 # Change that to true to set it to read-only

btrfs property list -ts /path/to/snapshot
 # You can also use list to see the available properties


Cloning a file between subvolumes;

cp --reflink /mnt/MYFILES/myfile1 /mnt/MYFILES/myfile3


mount -t btrfs -o compress=lzo /dev/sdg /mnt/drivename


In a nutshell, you should look at:

  • btrfs scrub to detect issues on live filesystems
  • look at btrfs detected errors in syslog
  • mount -o ro,recovery to mount a filesystem with issues
  • btrfs-zero-log might help in specific cases.
  • btrfs restore will help you copy data off a broken btrfs filesystem.
  • btrfs check --repair, aka btrfsck is your last option if the ones above have not worked.
btrfs-scrub [options] <device>
  # scrub btrfs filesystem, verify block checksums


Tools




  • https://github.com/rkapl/btsdu - can show you what folders have the most changed data between snapshots. Simply run: btsdu -p /snapshots/old_snapshot /snapshots/new_snapshot


  • btrfs-gui - a graphical user interface tool for inspecting and managing btrfs filesystems. It is capable of managing filesystems on the local machine, and filesystems on remote network-accessible machines. It requires root access to the machine to perform most of its tasks (but separates the root-access part from the GUI).







  • https://github.com/Zygo/bees - a block-oriented userspace deduplication agent designed for large btrfs filesystems. It is an offline dedupe combined with an incremental data scan capability to minimize time data spends on disk from write to dedupe.
  • https://github.com/Lakshmipathi/dduper - a block-level out-of-band BTRFS dedupe tool. This works by fetching built-in checksum from BTRFS csum-tree, instead of reading file blocks and computing checksum itself. This hugely improves the performance. Please be aware that dduper is beta quality tool, so validate it, before running it on your critical data.


  • https://github.com/anybox/buttervolume - We believe BTRFS subvolumes are a powerful and lightweight storage solution for Docker volumes, allowing fast and easy replication (and backup, across several nodes of a small cluster.


Articles

XFS

bcachefs

ZUFS


WAFL

  • https://en.wikipedia.org/wiki/Write_Anywhere_File_Layout - a file layout[clarification needed] that supports large, high-performance RAID arrays, quick restarts without lengthy consistency checks in the event of a crash or power failure, and growing the filesystems size quickly. It was designed by NetApp for use in its storage appliances like NetApp FAS, AFF, Cloud Volumes ONTAP and ONTAP Select.Its author claims that WAFL is not a file system, although it includes one. It tracks changes similarly to journaling file systems as logs (known as NVLOGs) in dedicated memory storage device non-volatile random access memory, referred to as NVRAM or NVMEM. WAFL provides mechanisms that enable a variety of file systems and technologies that want to access disk blocks.


NILFS

  • https://en.wikipedia.org/wiki/NILFS - a log-structured file system implementation for the Linux kernel. It is being developed by Nippon Telegraph and Telephone Corporation (NTT) CyberSpace Laboratories and a community from all over the world. NILFS was released under the terms of the GNU General Public License (GPL).


DOS / Windows

FAT

FAT is a family of filesystems, comprising at least, in chronological order:

  • FAT12, a filesystem used on floppies since the late 1980s, in particular by MS-DOS;
  • FAT16, a small modification of FAT12 supporting larger media, introduced to support hard disks;
  • vFAT, which is backward compatible with FAT, but allows files to have longer names which only vFAT-aware applications running on vFAT-aware operating systems can see;
  • FAT32, another modification of FAT16 designed to support larger disk sizes. In practice FAT32 is almost always used with vFAT long file name support, but technically 16/32 and long-file-names-yes/no are independent.

Because those filesystems are very similar, they're usually handled by the same drivers and tools. mkfs.vfat and mkfs.fat are the same tool; an empty * FAT16 filesystem and an empty vFAT filesystem look exactly the same, so mkfs doesn't need to distinguish between them. (You can think of FAT16 and vFAT as two different ways of seeing the same filesystem rather than two separate filesystem formats.) [26]


Petit FAT

  • Petit FAT File System Module - a sub-set of FatFs module for tiny 8-bit microcontrollers. It is written in compliance with ANSI C and completely separated from the disk I/O layer. It can be incorporated into the tiny microcontrollers with limited memory even if the RAM size is less than sector size. Also full featured FAT file system module is available


NTFS

ReFS

  • https://en.wikipedia.org/wiki/ReFS - a Microsoft proprietary file system introduced with Windows Server 2012 with the intent of becoming the "next generation" file system after NTFS.ReFS was designed to overcome problems that had become significant over the years since NTFS was conceived, which are related to how data storage requirements had changed. The key design advantages of ReFS include automatic integrity checking and data scrubbing, removal of the need for running chkdsk, protection against data degradation, built-in handling of hard disk drive failure and redundancy, integration of RAID functionality, a switch to copy/allocate on write for data and metadata updates, handling of very long paths and filenames, and storage virtualization and pooling, including almost arbitrarily sized logical volumes (unrelated to the physical sizes of the used drives).


Apple

HFS

  • https://en.wikipedia.org/wiki/Hierarchical_File_System - a proprietary file system developed by Apple Inc. for use in computer systems running Mac OS. Originally designed for use on floppy and hard disks, it can also be found on read-only media such as CD-ROMs. HFS is also referred to as Mac OS Standard (or, erroneously, "HFS Standard"), while its successor, HFS Plus, is also called Mac OS Extended (or, erroneously, "HFS Extended"). With the introduction of Mac OS X 10.6, Apple dropped support for formatting or writing HFS disks and images, which remain supported as read-only volumes.

HFS+

  • https://en.wikipedia.org/wiki/HFS_Plus - or HFS+ is a file system developed by Apple Inc. It replaced the Hierarchical File System (HFS) as the primary file system of Apple computers with the 1998 release of Mac OS 8.1. HFS+ continued as the primary Mac OS X file system until it was itself replaced with the release of the Apple File System (APFS) with macOS High Sierra in 2017. HFS+ is also one of the formats used by the iPod digital music player. It is also referred to as Mac OS Extended or HFS Extended, where its predecessor, HFS, is also referred to as Mac OS Standard or HFS Standard. During development, Apple referred to this file system with the codename Sequoia.[5]HFS Plus is an improved version of HFS, supporting much larger files (block addresses are 32-bit length instead of 16-bit) and using Unicode (instead of Mac OS Roman or any of several other character sets) for naming items. Like HFS, HFS Plus uses B-trees to store most volume metadata, but unlike most other file systems, HFS Plus supports hard links to directories. HFS Plus permits filenames up to 255 characters in length, and n-forked files similar to NTFS, though until 2005 almost no system software took advantage of forks other than the data fork and resource fork. HFS Plus also uses a full 32-bit allocation mapping table rather than HFS's 16 bits, significantly improving space utilization with large disks.

APFS

  • https://en.wikipedia.org/wiki/Apple_File_System - a proprietary file system for macOS High Sierra and later, iOS 10.3 and later, tvOS 10.2 and later,[6] and watchOS 3.2 and later, developed and deployed by Apple Inc.[8][9] It aims to fix core problems of HFS+ (also called Mac OS Extended), APFS's predecessor on these operating systems. Apple File System is optimized for flash and solid-state drive storage, with a primary focus on encryption.


Flash

  • https://en.wikipedia.org/wiki/Flash_file_system - a file system designed for storing files on flash memory–based storage devices. While the flash file systems are closely related to file systems in general, they are optimized for the nature and characteristics of flash memory (such as to avoid write amplification), and for use in particular operating systems.


exFAT

  • https://en.wikipedia.org/wiki/exFAT - a Microsoft file system introduced in 2006 optimized for flash memory such as USB flash drives and SD cards. It is proprietary and Microsoft owns patents on several elements of its design. exFAT can be used where the NTFS file system is not a feasible solution (due to data structure overhead), yet the file size limit of the standard FAT32 file system (i.e. 4 GiB) remains in those scenarios.m exFAT has been adopted by the SD Card Association as the default file system for SDXC cards larger than 32 GiB.

SFFS

  • SFFS Flash File System - a Safe Flash File System that can support almost any NOR or NAND flash device. It provides a high degree of reliability and complete protection against unexpected power failure or reset events. SFFS provides wear leveling, bad block handling and ECC algorithms to ensure you get optimal use out of a flash device. SFFS is pre-integrated with the MQX RTOS and allows you to quickly create a a robust file system for an embedded device using on-chip or on-board flash devices. The SFFS Flash File System was specifically designed for embedded systems.

F2FS

  • F2FS Wiki - a file system that, from the start, takes into account the characteristics of NAND flash memory-based storage devices (such as solid-state disks, eMMC, and SD cards), which are widely used in computer systems ranging from mobile devices to servers. F2FS was designed on a basis of a log-structured file system approach, which it adapted to newer forms of storage. Jaegeuk Kim, the principal F2FS author, has stated that it remedies some known issues of the older log-structured file systems, such as the snowball effect of wandering trees and high cleaning overhead. In addition, since a NAND-based storage device shows different characteristics according to its internal geometry or flash memory management scheme (such as the Flash Translation Layer or FTL), it supports various parameters not only for configuring on-disk layout, but also for selecting allocation and cleaning algorithms.

YAFFS

  • A Robust Flash File System Since 2002 | Yaffs - A Flash File System for embedded use - (Yet Another Flash File System) is an open-source file system specifically designed to be fast, robust and suitable for embedded use with NAND and NOR Flash. It is widely used with Linux, RTOSs, or no OS at all, in consumer devices, avionics, and critical infrastructure. It is available under GNU Public License, GPL, or on commercial terms from Aleph One.

Tess satelliteNASA TESS Mission This Transiting Exoplanet Survey Satellite (TESS) will discover thousands of exoplanets in orbit around the brightest stars in the sky.

spiffs

log_fs.h

AXFS

  • https://github.com/jaredeh/axfs - The Advanced XIP File System is a Linux kernel filesystem driver that enables files to be executed directly from flash or ROM memory rather than being copied into RAM.

Networked




  • Bazil is a distributed file system designed for single-person disconnected operation. It lets you share your files across all your computers, with or without cloud services. FUSE is a programming library for writing file systems in userspace, in Go.



NFS


nfs - fstab format and options for the nfs file systems
mount.nfs


showmount -e server-Ip-address




SMB / CIFS

  • http://en.wikipedia.org/wiki/Server_Message_Block - SMB, one version of which was also known as Common Internet File System (CIFS), operates as an application-layer network protocol mainly used for providing shared access to files, printers, and serial ports and miscellaneous communications between nodes on a network. It also provides an authenticated inter-process communication mechanism. Most usage of SMB involves computers running Microsoft Windows, where it was known as "Microsoft Windows Network" before the introduction of Active Directory. Corresponding Windows services are LAN Manager Server (for the server component) and LAN Manager Workstation (for the client component).


  • Samba - opening windows to a wider world - the standard Windows interoperability suite of programs for Linux and Unix.Samba is Free Software licensed under the GNU General Public License, the Samba project is a member of the Software Freedom Conservancy.Since 1992, Samba has provided secure, stable and fast file and print services for all clients using the SMB/CIFS protocol, such as all versions of DOS and Windows, OS/2, Linux and many others.Samba is an important component to seamlessly integrate Linux/Unix Servers and Desktops into Active Directory environments. It can function both as a domain controller or as a regular domain member.



NBD

  • Network Block Device - What is it: With this compiled into your kernel, Linux can use a remote server as one of its block devices. Every time the client computer wants to read /dev/nbd0, it will send a request to the server via TCP, which will reply with the data requested. This can be used for stations with low disk space (or even diskless - if you use an initrd) to borrow disk space from other computers. Unlike NFS, it is possible to put any file system on it. But (also unlike NFS), if someone has mounted NBD read/write, you must assure that no one else will have it mounted.
  • https://en.wikipedia.org/wiki/Network_block_device - a device node whose content is provided by a remote machine. Typically, network block devices are used to access a storage device that does not physically reside in the local machine but on a remote one. As an example, a local machine can access a hard disk drive that is attached to another computer.


  • xNBD - yet another NBD (Network Block Device) server program, which works with the NBD client driver of Linux Kernel.

9P

  • https://en.wikipedia.org/wiki/9P_(protocol) - a network protocol developed for the Plan 9 from Bell Labs distributed operating system as the means of connecting the components of a Plan 9 system. Files are key objects in Plan 9. They represent windows, network connections, processes, and almost anything else available in the operating system. 9P was revised for the 4th edition of Plan 9 under the name 9P2000, containing various improvements. Some of the improvements made are, the removal of certain filename restrictions, the addition of a 'last modifier' metadata field for directories, and authentication files. The latest version of the Inferno operating system also uses 9P2000. The Inferno file protocol was originally called Styx, but technically it has always been a variant of 9P.


Union mount

  • https://en.wikipedia.org/wiki/Union_mount - a way of combining multiple directories into one that appears to contain their combined contents. Union mounting is supported in Linux, BSD and several of its successors, and Plan 9, with similar but subtly different behavior. As an example application of union mounting, consider the need to update the information contained on a CD-ROM or DVD. While a CD-ROM is not writable, one can overlay the CD's mount point with a writable directory in a union mount. Then, updating files in the union directory will cause them to end up in the writable directory, giving the illusion that the CD-ROM's contents have been updated.

Union mounting was implemented for Linux 0.99 in 1993; this initial implementation was called the Inheriting File System, but was abandoned by its developer because of its complexity. The next major implementation was UnionFS, which grew out of the FiST project at Stony Brook University. An attempt to replace UnionFS, aufs, was released in 2006, followed in 2009 by OverlayFS. Only in 2014 was this last union mount implementation added to the standard Linux kernel source code. Similarly, GlusterFS offers a possibility to mount different filesystems distributed across a network, rather than being located on the same machine.

UnionFS

aufs

Overlayfs

mhddfs

mergerfs

  • https://github.com/trapexit/mergerfs - a union filesystem geared towards simplifying storage and management of files across numerous commodity storage devices. It is similar to mhddfs, unionfs, and aufs.

Distributed

  • https://en.wikipedia.org/wiki/Comparison_of_distributed_file_systems - a distributed file system (DFS) or network file system is any file system that allows access to files from multiple hosts sharing via a computer network. This makes it possible for multiple users on multiple machines to share files and storage resources. Distributed file systems differ in their performance, mutability of content, handling of concurrent writes, handling of permanent or temporary loss of nodes or storage, and their policy of storing content.


Ceph

  • Ceph - a distributed object store and file system designed to provide excellent performance, reliability and scalability. Ceph's main goals are to be POSIX-compatible, and completely distributed without a single point of failure. The data is seamlessly replicated, making it fault tolerant. Clients mount the file system using a Linux kernel client. On March 19, 2010, Linus Torvalds merged the Ceph client for Linux kernel 2.6.34 which was released on May 16, 2010. An older FUSE-based client is also available. The servers run as regular Unix daemons.


seaweedfs

  • https://github.com/chrislusf/seaweedfs - a simple and highly scalable distributed file system. There are two objectives: to store billions of files! to serve the files fast! SeaweedFS implements an object store with O(1) disk seek and an optional Filer with POSIX interface, supporting S3 API, FUSE mount, Hadoop compatible.


LizardFS

Clustered / parallel

  • https://en.wikipedia.org/wiki/Clustered_file_system - a file system which is shared by being simultaneously mounted on multiple servers. There are several approaches to clustering, most of which do not employ a clustered file system (only direct attached storage for each node). Clustered file systems can provide features like location-independent addressing and redundancy which improve reliability or reduce the complexity of the other parts of the cluster. Parallel file systems are a type of clustered file system that spread data across multiple storage nodes, usually for redundancy or performance.

GlusterFS

OrangeFS

Greyhole

  • Greyhole - An application that uses Samba to create a storage pool of all your available hard drives, and allows you to create redundant copies of the files you store, in order to prevent data loss when part of your hardware fails.

OpenDedup


to sort


  • https://github.com/Overv/vramfs - Unused RAM is wasted RAM, so why not put some of that VRAM in your graphics card to work?vramfs is a utility that uses the FUSE library to create a file system in VRAM. The idea is pretty much the same as a ramdisk, except that it uses the video RAM of a discrete graphics card to store files. It is not intented for serious use, but it does actually work fairly well, especially since consumer GPUs with 4GB or more VRAM are now available.










FUSE

  • - Mount a directory elsewhere with changed permissions. - a FUSE filesystem for mirroring a directory to another directory, similarly to mount --bind. The permissions of the mirrored directory can be altered in various ways.Some things bindfs can be used for: Making a directory read-only. Making all executables non-executable. Sharing a directory with a list of users (or groups). Modifying permission bits using rules with chmod-like syntax. Changing the permissions with which files are created.


GFS / GFS2

  • http://www.sourceware.org/cluster/gfs/ - a cluster file system. It allows a cluster of computers to simultaneously use a block device that is shared between them (with FC, iSCSI, NBD, etc...). GFS reads and writes to the block device like a local filesystem, but also uses a lock module to allow the computers coordinate their I/O so filesystem consistency is maintained. One of the nifty features of GFS is perfect consistency -- changes made to the filesystem on one machine show up immediately on all other machines in the cluster.


  • https://en.wikipedia.org/wiki/GFS2 - a shared-disk file system for Linux computer clusters. GFS2 differs from distributed file systems (such as AFS, Coda, InterMezzo, or GlusterFS) because GFS2 allows all nodes to have direct concurrent access to the same shared block storage. In addition, GFS or GFS2 can also be used as a local filesystem.

Embedded

Database


  • LibreCat/Catmandu - data processing toolkit, a new institutional repository system developed by LibreCat Group. LibreCat was development in 2013 in Bielefeld and was made available on GitHub from the start. Since 2015 the code is in production at Bielefeld. In 2016 Ghent University joined the development and is using the cataloging backend in production.

Compressed filesystems

zram

  • https://en.wikipedia.org/wiki/Zram -formerly called compcache, is a Linux kernel module for creating a compressed block device in RAM, in other words a RAM disk, but with on-the-fly disk compression. The block device created with zram can then be used for swap or as general-purpose RAM disk. The two most common uses for zram are for the storage of temporary files (/tmp) and as a swap device. Initially, zram had only the latter function, hence the original name "compcache" ("compressed cache").


zisofs

Squashfs

  • Squashfs - a compressed read-only filesystem for Linux. Squashfs is intended for general read-only filesystem use, for archival use (i.e. in cases where a .tar.gz file may be used), and in constrained block device/memory systems (e.g. embedded systems) where low overhead is needed.


DwarFS

  • https://github.com/mhx/dwarfs - The Deduplicating Warp-speed Advanced Read-only File System. A fast high compression read-only file system for Linux and Windows. A fast high compression read-only file system for Linux and Windows.

Image files

Tools




Quotas

Encryption


LUKS

EncFS

securefs

  • https://github.com/netheril96/securefs - securefs is a filesystem in userspace (FUSE) that transparently encrypts and authenticates data stored. It is particularly designed to secure data stored in the cloud. securefs mounts a regular directory onto a mount point. The mount point appears as a regular filesystem, where one can read/write/create files, directories and symbolic links. The underlying directory will be automatically updated to contain the encrypted and authenticated contents.

eCrypt

TrueCrypt / CipherShed


  • CipherShed - free (as in free-of-charge and free-speech) encryption software for keeping your data secure and private. It started as a fork of the now-discontinued TrueCrypt Project. Learn more about how CipherShed works and the project behind it.

Rubberhose


Tang / Clevis

Emulation

Repair / recovery


TestDisk / photorec

ddrescue

  • ddrescue - a data recovery tool. It copies data from one file or block device (hard disc, cdrom, etc) to another, trying to rescue the good parts first in case of read errors.

recoverdm

  • recoverdm - recover files/disks with damaged sectors. You can recover files as well complete devices.In case if finds sectors which simply cannot be recoverd, it writes an empty sector to the outputfile and continues. If you're recovering a CD or a DVD and the program cannot read the sector in "normal mode", then the program will try to read the sector in "RAW mode" (without error-checking etc.). This toolkit also has a utility called 'mergebad': mergebad merges multiple images into one. This can be usefull when you have, for example, multiple CD's with the same data which are all damaged. In such case, you can then first use recoverdm to retrieve the data from the damaged CD's into image-files and then combine them into one image with mergebad.


  • https://en.wikipedia.org/wiki/Data_remanence - the residual representation of digital data that remains even after attempts have been made to remove or erase the data. This residue may result from data being left intact by a nominal file deletion operation, by reformatting of storage media that does not remove data previously written to the media, or through physical properties of the storage media that allow previously written data to be recovered. Data remanence may make inadvertent disclosure of sensitive information possible should the storage media be released into an uncontrolled environment (e.g., thrown in the bin (trash) or lost). Various techniques have been developed to counter data remanence. These techniques are classified as clearing, purging/sanitizing, or destruction. Specific methods include overwriting, degaussing, encryption, and media destruction.


  • https://en.wikipedia.org/wiki/Data_erasure - sometimes referred to as data clearing, data wiping, or data destruction) is a software-based method of data sanitization that aims to completely destroy all electronic data residing on a hard disk drive or other digital media by overwriting data onto all sectors of the device in an irreversible process. By overwriting the data on the storage device, the data is rendered irrecoverable.