Files
General
- https://en.wikipedia.org/wiki/File_format - a standard way that information is encoded for storage in a computer file. It specifies how bits are used to encode information in a digital storage medium. File formats may be either proprietary or free and may be either unpublished or open.Some file formats are designed for very particular types of data: PNG files, for example, store bitmapped images using lossless data compression. Other file formats, however, are designed for storage of several different types of data: the Ogg format can act as a container for different types of multimedia including any combination of audio and video, with or without text (such as subtitles), and metadata. A text file can contain any stream of characters, including possible control characters, and is encoded in one of various character encoding schemes. Some file formats, such as HTML, scalable vector graphics, and the source code of computer software are text files with defined syntaxes that allow them to be used for specific purposes.
Types
- https://en.wikipedia.org/wiki/Media_type - formerly known as MIME type is a two-part identifier for file formats and format contents transmitted on the Internet. The Internet Assigned Numbers Authority (IANA) is the official authority for the standardization and publication of these classifications. Media types were originally defined in Request for Comments 2045 in November 1996 as a part of MIME (Multipurpose Internet Mail Extensions) specification, for denoting type of email message content and attachments; hence the original name, MIME type. Media types are also used by other internet protocols such as HTTP and document file formats such as HTML, for similar purpose.
- https://wiki.archlinux.org/index.php/Default_Applications - Programs implement default application associations in different ways, while command-line programs traditionally use environment variables, graphical applications tend to use XDG MIME Applications through either the GIO API, the Qt API or by executing /usr/bin/xdg-open, which is part of xdg-utils. Because xdg-open and XDG MIME Applications are quite complex various alternative resource openers were developed. These alternatives replace /usr/bin/xdg-open, which obviously only affects applications that actually use the executable. The following table lists some example applications for each method.
- https://en.wikipedia.org/wiki/File_association - associates a file with an application capable of opening that file. More commonly, a file association associates a class of files (usually determined by their filename extension, such as .txt) with a corresponding application (such as a text editor).
- Association between MIME types and applications - freedesktop.org specs
- https://github.com/mime-types/mime-types-data - provides a registry for information about MIME media type definitions. It can be used with the Ruby mime-types library or other software to determine defined filename extensions for MIME types, or to use filename extensions to look up the likely MIME type definitions.
- desktop-entry-spec - The desktop entry specification describes desktop entries: files describing information about an application such as the name, icon, and description. These files are used for application launchers and for creating menus of applications that can be launched.
Configuration files:
/usr/share/applications/defaults.list # global ~/.local/share/applications/defaults.list # per user, overrides global ~/.local/share/applications/mimeapps.list ~/.local/share/applications/mimeinfo.cache
[Default Applications] mimetype=desktopfile1;desktopfile2;...;desktopfileN
xdg-mime default firefox.desktop x-scheme-handler/https x-scheme-handler/http # add two lines to ~/.config/mimeapps.list
CLI
xdg-mime default Thunar.desktop inode/directory # make Thunar the default file-browser xdg-mime default xpdf.desktop application/pdf # make xpdf the default PDF viewer xdg-settings get default-web-browser # returns what app opens http/https
/usr/bin/vendor_perl/mimeopen -d $file.pdf
- xdg-settings - get and set various desktop environment settings
- mimeo - uses MIME-type file associations to determine which application should be used to open a file. It can launch files or print information such as the command that it would use, the detected MIME-type, etc. It is also possible to use regular expressions to associate arguments with applications. The most common example is to open URLs in browsers or associate file extensions with applications irrespective of their MIME-type.Mimeo tries to adhere to the relevant standards on freedesktop.org and should therefore be compatible with other applications that set or read MIME-type associations, e.g. PCManFM.
- https://github.com/AndyCrowd/fbrokendesktop - scan and check desktop files for broken exec lines in .desktop tiles
GUI
kcmshell5 filetypes
Metadata
Container
- https://en.wikipedia.org/wiki/Digital_container_format - or wrapper format is a metafile format whose specification describes how different elements of data and metadata coexist in a computer file. Among the earliest cross-platform container formats were Distinguished Encoding Rules and the 1985 Interchange File Format. Containers are frequently used in multimedia applications.
IFF / RIFF
Other
- Kaitai Struct - declarative binary format parsing language, a declarative language used to describe various binary data structures, laid out in files or in memory: i.e. binary file formats, network stream packet formats, etc.The main idea is that a particular format is described in Kaitai Struct language (.ksy file) and then can be compiled with ksc into source files in one of the supported programming languages. These modules will include a generated code for a parser that can read the described data structure from a file or stream and give access to it in a nice, easy-to-comprehend API.
Files & directories
- https://en.wikipedia.org/wiki/Directory_(computing) - a file system cataloging structure which contains references to other computer files, and possibly other directories. On many computers, directories are known as folders, or drawers to provide some relevancy to a workbench or the traditional office file cabinet. Files are organized by storing related files in the same directory. In a hierarchical filesystem (that is, one in which files and directories are organized in a manner that resembles a tree), a directory contained inside another directory is called a subdirectory. The terms parent and child are often used to describe the relationship between a subdirectory and the directory in which it is cataloged, the latter being the parent. The top-most directory in such a filesystem, which does not have a parent of its own, is called the root directory.
- https://en.wikipedia.org/wiki/Unix_file_types - For normal files in the file system, Unix does not impose or provide any internal file structure. This implies that from the point of view of the operating system, there is only one file type. The structure and interpretation thereof is entirely dependent on how the file is interpreted by software. Unix does however have some special files. These special files can be identified by the ls -l command which displays the type of the file in the first alphabetic letter of the file system permissions field. A normal (regular) file is indicated by a hyphen-minus '-'.
- https://en.wikipedia.org/wiki/File_descriptor - In the traditional implementation of Unix, file descriptors index into a per-process file descriptor table maintained by the kernel, that in turn indexes into a system-wide table of files opened by all processes, called the file table. This table records the mode with which the file (or other resource) has been opened: for reading, writing, appending, reading and writing, and possibly other modes. It also indexes into a third table called the inode table that describes the actual underlying files. To perform input or output, the process passes the file descriptor to the kernel through a system call, and the kernel will access the file on behalf of the process. The process does not have direct access to the file or inode tables.
On Linux, the set of file descriptors open in a process can be accessed under the path /proc/PID/fd/, where PID is the process identifier. In Unix-like systems, file descriptors can refer to any Unix file type named in a file system. As well as regular files, this includes directories, block and character devices (also called "special files"), Unix domain sockets, and named pipes. File descriptors can also refer to other objects that do not normally exist in the file system, such as anonymous pipes and network sockets.
The FILE data structure in the C standard I/O library usually includes a low level file descriptor for the object in question on Unix-like systems. The overall data structure provides additional abstraction and is instead known as a file handle.
- https://www.halolinux.us/kernel-reference/dentry-objects.html
- https://www.halolinux.us/kernel-reference/the-dentry-cache.html
Storage devices
Block and partition names;
sd[a,b,etc] drive sda[1,2,etc] partition of drive
cat /proc/partitions
blkid # command-line utility to locate/print block device attributes
File system mounts
lsblk # list information about block devices.
findmnt
df -a
mounts
mount -l
mount
mount /dev/sdxY /some/directory umount /some/directory mount -o remount / remount partition after /etc/fstab change
mount -o loop example.img /home/you/dir
pmount
- https://linux.die.net/man/1/pmount - ("policy mount") is a wrapper around the standard mount program which permits normal users to mount removable devices without a matching /etc/fstab entry.
systemd-mount
- https://www.freedesktop.org/software/systemd/man/systemd-mount.html - may be used to create and start a transient .mount or .automount unit of the file system WHAT on the mount point WHERE.
systemd-mount /dev/sdb1 # mounts to automatic mount point based on label
autofs
PySDM
- PySDM - a Storage Device Manager that allows full customization of hard disk mountpoints without manually access to fstab.It also allows the creation of udev rules for dynamic configuration of storage devices
Automated
- https://github.com/coldfix/udiskie - udisks automount script with tray icon
- https://github.com/h4tr3d/mount-tray - Application for mounting and unmounting removable storages via system tray using udisks
- http://ignorantguru.github.io/udevil
- http://igurublog.wordpress.com/downloads/script-devmon - now in udevil
- https://github.com/tom5760/usermount - A simple C program to automatically mount removable drives using UDisks2 and D-Bus.
Directory structure
See LSB, etc.
- https://en.wikipedia.org/wiki/stat_(system_call) - a Unix system call that returns file attributes about an inode. The semantics of stat() vary between operating systems. stat appeared in Version 1 Unix. It is among the few original Unix system calls to change, with Version 4's addition of group permissions and larger file size. As an example, Unix command ls uses this system call to retrieve information on files that includes:
- atime: time of last access (ls -lu)
- mtime: time of last modification (ls -l)
- ctime: time of last status change (ls -lc)
- https://en.wikipedia.org/wiki/Filesystem_Hierarchy_Standard - defines the directory structure and directory contents in Unix[citation needed] and Unix-like operating systems. It is maintained by the Linux Foundation. The latest version is 3.0, released on 3 June 2015.[1] Currently it is only used by Linux distributions.
- http://hivelogic.com/articles/using_usr_local/
- Point/Counterpoint - /opt vs. /usr/local - March 2010
- Understanding the bin, sbin, usr/bin , usr/sbin split [2] - Dec 9, 2010
- -arch-dev-public- -RFC- merge /bin, /sbin, /lib into /usr/bin and /usr/lib - Mar 2nd, 2012
- on ., .., .dotfiles - Rob Pike, Aug 3rd, 2012
- rmshit - Keep $HOME or other dir clean from unwanted tempfiles, configs and other crap you'll never use that's autocreated upon execution of bad behaving applications
- BleachBit quickly frees disk space and tirelessly guards your privacy. Free cache, delete cookies, clear Internet history, shred temporary files, delete logs, and discard junk you didn't know was there. Designed for Linux and Windows systems, it wipes clean a thousand applications including Firefox, Internet Explorer, Adobe Flash, Google Chrome, Opera, Safari,and more. Beyond simply deleting files, BleachBit includes advanced features such as shredding files to prevent recovery, wiping free disk space to hide traces of files deleted by other applications, and vacuuming Firefox to make it faster.
bleachbit --clean system.cache system.localizations system.trash system.tmp
File permissions
- http://www.tldp.org/LDP/intro-linux/html/sect_03_04.html
- http://www.zzee.com/solutions/unix-permissions.shtml#setuid
- http://www.unix.com/tips-tutorials/19060-unix-file-permissions.html
stat -c '%A %a %n' * list permissions in octal
ulimit
- ulimit - provides control over the resources available to the shell and to processes started by it, on systems that allow such control.
- limits.conf - configuration file for the pam_limits module, /etc/security/limits.conf
chmod
chmod # change file mode bits
- Chmod, Umask, Stat, Fileperms, and File Permissions - 0000 to 0777 list, etc.
# Before: drwxr-xr-x 6 archie users 4096 Jul 5 17:37 Documents chmod g= Documents chmod o= Documents # After: drwx------ 6 archie users 4096 Jul 6 17:32 Documents
Users;
- u - the user who owns the file.
- g - other users who are in the file's group.
- o - all other users.
- a - all users; the same as ugo.
Operation;
- + - to add the permissions to whatever permissions the users already have for the file.
- - - to remove the permissions from whatever permissions the users already have for the file.
- = - to make the permissions the only permissions that the users have for the file.
Permissions; r - the permission the users have to read the file. w - the permission the users have to write to the file. x - the permission the users have to execute the file, or search it if it is a directory.
chown
- https://linux.die.net/man/1/chown - change file owner and group
chown -R user:group . change all files and directories to a specific use and group [7]
Access Control Lists
Partition must be mounted with acl option. "Operation not supported" from setfacl means this is not so.
mount -o remount,acl /
Set in /etc/fstab for persistence across boots.
Opening a directory requires execute permission.
setfacl -m "u:username:permissions" setfacl -m "u:uid:permissions" add permissions for user setfacl -m "g:groupname:permissions" setfacl -m "g:gid:permissions" add permissions for group setfacl -m "u:user:rwx" file add read, write, execure perms for user for file setfacl -Rm "u:user:rw" /dir add recursive read, write perms for user for dir setfacl -Rdm "u:user:rw" /dir add recursive read, write perms for user for dir and make them default for future changes
getfacl ./ shows access control list for directory or file
Copying
- linux - Is it better to use cat, dd, pv or another procedure to copy a CD/DVD? - Unix & Linux Stack Exchange - [8]
cp
cp - copy files and directories
cp [option]… [-T] source dest
cp target --parents a/b/c existing_dir # copies the file `a/b/c' to `existing_dir/a/b/c', creating any missing intermediate directories. [9]
cp ubuntu.img /dev/sdb && sync
scp
scp -P 2264 foobar.txt your_username@remotehost.edu:/some/remote/directory scp -rP 2264 folder your_username@remotehost.edu:/some/remote/directory
- dcp is a distributed file copy program that automatically distributes and dynamically balances work equally across nodes in a large distributed system without centralized state. http://filecopy.org/
rsync
- rsync(1) - a fast, versatile, remote (and local) file-copying tool.
- rsync is a software application and network protocol for Unix-like systems with ports to Windows that synchronizes files and directories from one location to another while minimizing data transfer by using delta encoding when appropriate. Quoting the official website: "rsync is a file transfer program for Unix systems. rsync uses the 'rsync algorithm' which provides a very fast method for bringing remote files into sync." An important feature of rsync not found in most similar programs/protocols is that the mirroring takes place with only one transmission in each direction. rsync can copy or display directory contents and copy files, optionally using compression and recursion.
- Easy Automated Snapshot-Style Backups with Rsync - This document describes a method for generating automatic rotating "snapshot"-style backups on a Unix-based system, with specific examples drawn from the author's GNU/Linux experience. Snapshot backups are a feature of some high-end industrial file servers; they create the illusion of multiple, full backups per day without the space or processing overhead. All of the snapshots are read-only, and are accessible directly by users as special system directories. It is often possible to store several hours, days, and even weeks' worth of snapshots with slightly more than 2x storage. This method, while not as space-efficient as some of the proprietary technologies (which, using special copy-on-write filesystems, can operate on slightly more than 1x storage), makes use of only standard file utilities and the common rsync program, which is installed by default on most Linux distributions. Properly configured, the method can also protect against hard disk failure, root compromises, or even back up a network of heterogeneous desktops automatically.
"Unfortunately “--sparse” and “--inplace” cannot be used together. Solution: When copying the file the first time, which means it does not exist on the target server use “rsync --sparse“. This will create a sparse file on the target server and copies only the used data of the sparse file. When the file already exists on the target server and you only want to update it use “rsync --inplace“. This will only transmit the changed blocks and can also append to the existing sparse file."
rsync [OPTION...] SRC... [DEST] # copy source file or directory to destination
rsync local-file user@remote-host:remote-file rsync -e='ssh -p8023' file remotehost:~/ # non-standard remote shell command, copy to remote users home directory
rsync -r --partial --progress srcdirectory destdirectory # recursive # --partial - resume partial files, # --progress - show progress during transfer, equiv to --info=flist2,name,progress # -P same as --partial --progress rsync -rP srcdirectory/ destdirectory # recursive, resume partial files with progress bar, don't copy root source folder
rsync -avh --inplace --no-whole-file SRC DEST -a, --archive # archive mode; equals -rlptgoD (no -A,-X,-H) # recursive, links, preserve permissions/times/groups/owner/device files. -r, --recursive recurse into directories -l, --links copy symlinks as symlinks -p, --perms preserve permissions -t, --times preserve modification times -g, --group preserve group -o, --owner preserve owner (super-user only) -D --devices --specials. --devices preserve device files (super-user only) --specials preserve special files -v, --verbose list files transfered -vv -h, --human-readable output numbers in a human-readable format --inplace This option is useful for transferring large files with block-based changes or appended data, and also on systems that are disk bound, not network bound. It can also help keep a copy-on-write filesystem snapshot from diverging the entire contents of a file that only has minor changes. --no-whole-file incremental delta-xfer
-A, --acls preserve ACLs (implies -p, save permissions) -X, --xattrs preserve extended attributes -H, --hard-links preserve hard links
--exclude-from=FILE read exclude patterns from FILE --sparse handle sparse files efficiently -W, --whole-file copy files whole (w/o delta-xfer algorithm) -d, --dirs transfer directories without recursing --numeric-ids transfer numeric group and user IDs rather than mapping user and group name -x, --one-file-system don't cross filesystem boundaries (mount points). "/*" as dest would bypass -x -xx as above, but don't create empty directories for mount points --delete delete extraneous files from dest dirs --delete-after receiver deletes after transfer, not during --delete-excluded also delete excluded files from dest dirs --ignore-errors go ahead even when there are IO errors instead of regarding as fatal --stats give some file-transfer stats -z compress files during transfer
rsync -aAXv --exclude={/dev/*,/proc/*,/sys/*,/tmp/*,/run/*,/mnt/*,/media/*,/lost+found} /* /path/to/backup/directory # archive, with permissions/ACL, and attributes, exclude directories
rsync -ahrHP --info=progress2 --no-i-r --partial-dir=.rsync-partial # archive, human readable, recursive, don't flatten hard links, progress bar, no incremental recursion, store partial transfers in temp dir # alias ccp
#!/bin/sh if [ $# -lt 1 ]; then echo "No destination defined. Usage: $0 destination" >&2 exit 1 elif [ $# -gt 1 ]; then echo "Too many arguments. Usage: $0 destination" >&2 exit 1 fi if [ $2 = "new" ]; then SPARCEINPLACE="--sparse" elif [ $2 = "rerun" ]; then SPARCEINPLACE="--inplace" fi RSYNC="ionice -c3 rsync" RSYNC_ARGS="-aAXv --no-whole-file --stats --ignore-errors --human-readable --delete" # -a = --recursive, --links, --perms, --times, --group, --owner, --devices, --specials # -A = --acls, -X = --xattrs, -v = --verbose # --no-whole-file = delta-xfer algorithm # --delete = delete extraneous destination files START=$(date +%s) # SOURCES="/dir1 /dir2 /file3" TARGET=$1 echo "Executing dry-run to see how many files must be transferred..." TODO=$(${RSYNC} --dry-run ${RSYNC_ARGS} ${SOURCES} ${TARGET}|grep "^Number of files transferred"|awk '{print $5}') ${RSYNC} ${RSYNC_ARGS} ${SPARCEINPLACE} --exclude={/dev/*,/proc/*,/sys/*,/tmp/*,/run/*,/mnt/*,/media/*,/lost+found} \ --exclude={/home/*/.gvfs,/home/*/.thumbnails,/home/*/.cache,/home/*/.cache/mozilla/*,/home/*/.cache/chromium/*} \ --exclude={/home/*/.local/share/Trash/,/home/*/.macromedia,/var/lib/mpd/*,/var/lib/pacman/sync/*, /var/tmp*} \ / $TARGET | pv -l -e -p -s "$TODO" FINISH=$(date +%s) echo "-- Total backup time: $(( ($FINISH-$START) / 60 ))m, $(( ($FINISH-$START) % 60 ))s" touch $1/backup-from-$(date '+%a_%d_%m_%Y_%H%M%S') # after refining excluded, --delete-excluded
0 5 * * * rsync-backup.sh /path/to/backup/directory rerun
- https://github.com/kristapsdz/openrsync - a clean-room implementation of rsync with a BSD (ISC) license. It's compatible with a modern rsync (3.1.3 is used for testing, but any supporting protocol 27 will do), but accepts only a subset of rsync's command-line arguments. At this time, openrsync runs only on OpenBSD. If you want to port to your system (e.g. Linux, FreeBSD), read the Portability section first. [12]
Daemon
rsync --daemon # run as a daemon
lsyncd
Services
dd
- dd - Copy a file, converting and formatting according to the options.
- dd is a common Unix program whose primary purpose is the low-level copying and conversion of raw data.
if stands for input file and of for output file.
dd if=/dev/sda of=/mnt/sdb1/backup.img create a backup dd if=/mnt/sdb1/backup.img of=/dev/sda restore a backup dd if=/dev/sdb of=/dev/sdc clone a drive dd if=/dev/sdb | ssh root@target "(cat >backup.img)" backup over network dd if=/dev/cdrom of=cdimage.iso backup a cd
dd if=/dev/sr0 of=myCD.iso bs=2048 conv=noerror,sync create an ISO disk image from a CD-ROM. dd if=/dev/sda2 of=/dev/sdb2 bs=4096 conv=noerror Clone one partition to another dd if=/dev/ad0 of=/dev/ad1 bs=1M conv=noerror Clone a hard disk "ad0" to "ad1". dd if=/dev/zero bs=1024 count=1000000 of=file_1GB dd if=file_1GB of=/dev/null bs=64k drive benchmark test and analyze the sequential read and write performance for 1024 byte blocks
dd if=local-file | ssh <host> "dd of=/path/to/remote-file"
hot-clone
tar pipe
- The Tar Pipe - It basically means "copy the src directory to dst, preserving permissions and other special stuff." It does this by firing up two tars – one tarring up src, the other untarring to dst, and wiring them together. [14]
File information
ls
ls list files in current directory ls -l long list, each file on a new line with information ls * files in directory and immediate subdiretories ls -a # show hidden files ls -A # show hidden files, exclude . and ..
ls -m1 -m fill width with a comma separated list of entries ?? ls --format single-column column of names only ls -l | grep - | awk '{print $9}' using awk to show the 9th word (name). strips colour. ls -l | cut -f9 -s -d" " using cut to cut from the 9th word, using space as a delimiter. strips colour.
ls | cat # separate words into new lines
- Greg's Wiki: ParsingLs - Why you shouldn't parse the output of ls(1)
- https://github.com/hackerb9/lsix - Like "ls", but for images. Shows thumbnails in terminal using sixel graphics.
exa
even-better-ls
- https://github.com/mnurzia/even-better-ls - LS + Icons + Formatting
stat
stat . display file or file system status stat -c "%n %a" * | column -t directory files + octal
chattr/lsattr
lsattr .
append only (a), compressed (c), no dump (d), immutable (i), data journaling (j), secure deletion (s), no tail-merging (t), undeletable (u), no atime updates (A), synchronous directory updates (D), synchronous updates (S), and top of directory hierarchy (T).
file
file [filename]
binwalk
- https://github.com/ReFirmLabs/binwalk - a fast, easy to use tool for analyzing, reverse engineering, and extracting firmware images.
pwd
cd
cd change/directory/path
Autojump
z
- https://github.com/rupa/z - jump around, Bash. [16]
- https://github.com/agkozak/zsh-z - Jump quickly to directories that you have visted "frecently." A native ZSH port of z.sh.
fasd
- https://github.com/clvv/fasd - Command-line productivity booster, offers quick access to files and directories, inspired by autojump, z and v. Bash. [17]
v def conf => vim /some/awkward/path/to/type/default.conf j abc => cd /hell/of/a/awkward/path/to/get/to/abcdef m movie => mplayer /whatever/whatever/whatever/awesome_movie.mp4 o eng paper => xdg-open /you/dont/remember/where/english_paper.pdf vim `f rc lo` => vim /etc/rc.local vim `f rc conf` => vim /etc/rc.conf
alias defaults;
alias a='fasd -a' # any alias s='fasd -si' # show / search / select alias d='fasd -d' # directory alias f='fasd -f' # file alias sd='fasd -sid' # interactive directory selection alias sf='fasd -sif' # interactive file selection alias z='fasd_cd -d' # cd, same functionality as j in autojump alias zz='fasd_cd -d -i' # cd with interactive selection
autojump
- https://github.com/joelthelion/autojump - very handy for navigation, Python.
pazi
- https://github.com/euank/pazi - An autojump "zap to directory" helper, Rust.
zoxide
- https://github.com/ajeetdsouza/zoxide - A fast cd command that learns your habits
z.lua
- https://github.com/skywind3000/z.lua - zap A new cd command that helps you navigate faster by learning your habits.
zfm
- https://github.com/pabloariasal/zfm - a minimal command line bookmark manager for zsh built on top of fzf. It lets you bookmark files and directories in your system and rapidly access them.It's intended to be a less intrusive alternative to z, autojump or fasd that doesn't pollute your prompt command or create bookmarks behind the scenes: you have full control over what gets bookmarked and when, like bookmarks on a web browser.
zz
cx
- https://github.com/colfrog/cx - a very simple directory history utility written in C, a shell-agnostic clone of z [19]
to sort
- http://jeroenjanssens.com/2013/08/16/quickly-navigate-your-filesystem-from-the-command-line.html [20]
- https://aur.archlinux.org/packages/z.go-git - baskerville
- https://github.com/skywind3000/z.lua - A new cd command that helps you navigate faster by learning your habits zap
- https://github.com/jamesob/desk - Lightweight workspace manager for the shell. Desk makes it easy to flip back and forth between different project contexts in your favorite shell. [23]
- https://github.com/Peltoche/lsd - next gen ls command
- https://gitlab.com/chanceboudreaux/rx - Open files and surf through directories with ease.
Creating files
touch
touch filename create a file or update timestamp
mkdir
mkdir directory mkdir directory -p no error if existing, make parent directories as needed
ln
symlink
ln -s {target-filename} ln -s {target-filename} {symbolic-filename} create soft link
fallocate
- fallocate - used to manipulate the allocated disk space for a file, either to deallocate or preallocate it. For filesystems which support the fallocate system call, preallocation is done quickly by allocating blocks and marking them as uninitialized, requiring no IO to the data blocks. This is much faster than creating a file by filling it with zeroes. The exit code returned by fallocate is 0 on success and 1 on failure.
Test files
yes abcdefghijklmnopqrstuvwxyz0123456789 > largefile # about 10 times faster than running dd if=/dev/urandom of=largefile, about as fast as using dd if=/dev/zero of=filename bs=1M. [24]
Editing files
echo "hello" >> greetings.txt cat temp.txt >> data.txt To append the contents of the file temp.txt to file data.txt date >> dates.txt To append the current date/time timestamp to the file dates.txt
Patching files
- patch(1) - apply a diff file to an original
- https://github.com/loyso/Scarab - A system to patch your content files. Basically, Scarab is a C++ library to build and apply binary diff package for directories. Scarab is built to be embeddable into your application or service.
Viewing files
<filename
cat
cat filename output file to screen cat -n filename output file to screen w/ line numbers cat filename1 filename2 output two files (concatinate) cat filename1 > filename2 overwrite filename2 with filename1 cat filename1 >> filename2 append filename1 to filename2 cat filename{1,2} > filename2 add filename1 and filename2 together into filename3
head
head filename top 10 lines of file head -23 filename top 23 lines of file
tail
tail filename bottom 10 lines of file tail -23 filename bottom 23 lines of file
Other
sed -n 20,30p filename print lines 20..30 of file [26]
bat
- https://github.com/sharkdp/bat - A cat(1) clone with syntax highlighting and Git integration.
look
- look(1) - displays any lines in file which contain string as a prefix. As look performs a binary search, the lines in file must be sorted. If file is not specified, the file /usr/share/dict/words is used, only alphanumeric characters are compared and the case of alphabetic charac- ters is ignored.
twf
- https://github.com/wvanlint/twf - a standalone tree view explorer inspired by fzf.
File pagers
more
more is a filter for paging through text one screenful at a time. This version is especially primitive. Users should realize that less(1) provides more(1) emulation plus extensive enhancements.
less
less is an improvement on more and a funny name.
- lesspipe.sh - a preprocessor for less
most
vimpager
other
Moving files
mv
mv position1 ~/position2 basic move
- http://superuser.com/questions/187866/unix-shell-scripting-how-to-recursively-move-files-up-one-directory
- http://serverfault.com/questions/122233/how-to-recursively-move-all-files-including-hidden-in-a-subfolder-into-a-paren
- https://github.com/jarun/advcpmv - A patch for GNU Core Utilities cp, mv to add progress bars
renameutils
- renameutils - a set of programs designed to make renaming of files faster and less cumbersome. The file renaming utilities consists of five programs - qmv, qcp, imv, icp and deurlname.
The qmv ("quick move") program allows file names to be edited in a text editor. The names of all files in a directory are written to a text file, which is then edited by the user. The text file is read and parsed, and the changes are applied to the files.
The qcp ("quick cp") program works like qmv, but copies files instead of moving them.
The imv ("interactive move") program, is trivial but useful when you are too lazy to type (or even complete) the name of the file to rename twice. It allows a file name to be edited in the terminal using the GNU Readline library. icp copies files.
The deurlname program removes URL encoded characters (such as %20 representing space) from file names. Some programs such as w3m tend to keep those characters encoded in saved files.
Métamorphose
- Métamorphose - a batch renamer, a program to rename large sets of files and folders quickly and easily.With its extensive feature set, flexibility and powerful interface, Métamorphose is a profesional's tool. A must-have for those that need to rename many files and/or folders on a regular basis.In addition to general usage renaming, it is very useful for photo and music collections, webmasters, programmers, legal and clerical, etc.
nmly
- https://github.com/Usbac/nmly - Mass file rename utility with useful functions and written in C
Detox
- Detox - a utility designed to clean up filenames. It replaces difficult to work with characters, such as spaces, with standard equivalents. It will also clean up filenames with UTF-8 or Latin-1 (or CP-1252) characters in them.
Removing files
rm
rm file rm -rf directory
find * -maxdepth 0 -name 'keepthis' -prune -o -exec rm -rf '{}' ';' # remove all but keepthis [30]
fd somefiletype | xargs rm # remove all files returned by fd
scrub
- https://github.com/romanhargrave/scrub - a utility that behaves like rm -r, except that rather than simply removing everything it will only remove what you tell it to.
Managing files
See also File managers
mc
ranger
- http://ranger.nongnu.org/ - ranger is a console file manager with VI key bindings. It provides a minimalistic and nice curses interface with a view on the directory hierarchy. It ships with "rifle", a file launcher that is good at automatically finding out which program to use for what file type.
setup ~/.config/ranger/ with defaults;
ranger --copy-config=all
Vifm
- Vifm - an ncurses based file manager with vi like keybindings/modes/options/commands/configuration, which also borrows some useful ideas from mutt.
If you use vi, Vifm gives you complete keyboard control over your files without having to learn a new set of commands.
- https://github.com/cirala/vifmimg - Image previews using Überzug for Vifm (vi file manager)
deer
- https://github.com/Vifon/deer - a file navigator for zsh heavily inspired by ranger.
lscd
- https://github.com/hut/lscd - ranger-clone in POSIX shell
nnn
noice
- noice - small file browser
hunter
- https://github.com/rabite0/hunter - a fast and lag-free file browser/manager for the terminal. It features a heavily asynchronous and multi-threaded design and all disk IO happens off the main thread in a non-blocking fashion, so that hunter will always stay responsive, even under heavy load on a slow spinning rust disk, even with all the previews enabled.
cfm
- https://github.com/WillEccles/cfm - a TUI file manager with some features. It's quick and light. Whether or not you should use it depends on whether or not you like the name and/or the dev, like with all software.
enhancd
- https://github.com/b4b4r07/enhancd - next-generation cd command with an interactive filter
filet
- https://github.com/buffet/filet - a blazingly fast, lightweight file manager, with a focus on a clear and easy to understand code base.
lf
- https://github.com/gokcehan/lf - a terminal file manager written in Go. It is heavily inspired by ranger with some missing and extra features. Some of the missing features are deliberately omitted since they are better handled by external tools. See faq for more information and tutorial for a gentle introduction with screencasts.
Other
- Worker file manager - MC clone for X11
- https://github.com/D630/fzf-fs - acts like a very simple and configurable file browser/navigator for the command line by taking advantage of the general-purpose fuzzy finder fzf. Although coming without Miller columns, fzf-fs is inspired by tools like lscd and deer, which both follow the example set by ranger. [31]
- https://github.com/psprint/zsh-navigation-tools - Curses-based tools for ZSH
Finding files
ls | sort -f | uniq -i -d # list duplicate files taking upper/lower case into account [32]
whereis
GNU findutils
GNU Find Utilities are the basic directory searching utilities of the GNU operating system. These programs are typically used in conjunction with other programs to provide modular and powerful directory search and file locating capabilities to other commands.
The tools supplied with this package are:
- find - search for files in a directory hierarchy
- locate - list files in databases that match a pattern
- updatedb - update a file name database
- xargs - build and execute command lines from standard input
find
find /usr/share -name README find ~/Journalism -name '*.txt' find ~/Programming -path '*/src/*.c'
find ~/Journalism -name '*.txt' -exec cat {} ; exec command on result path (aliases don't work in exec argument)
find ~/Images/Screenshots -size +500k -iname '*.jpg' find ~/Journalism -name '*.txt' -print0 | xargs -0 cat (faster than above) find / -group [group] find / -user [user]
find . -mtime -[n] File's data was last modified n*24 hours ago find . -mtime +5 -exec rm {} \; remove files older than 5 days
find . -type f -links +1 list hard links
xargs
xargs reads items from the standard input, delimited by blanks (which can be protected with double or single quotes or a backslash) or newlines, and executes the command (default is /bin/echo) one or more times with any initial-arguments followed by items read from standard input. Blank lines on the standard input are ignored.
Because Unix filenames can contain blanks and newlines, this default behaviour is often problematic; filenames containing blanks and/or newlines are in‐correctly processed by xargs. In these situations it is better to use the -0 option, which prevents such problems. When using this option you will need to ensure that the program which produces the input for xargs also uses a null character as a separator. If that program is GNU find for example, the -print0 option does this for you.
If any invocation of the command exits with a status of 255, xargs will stop immediately without reading any further input. An error message is issued on stderr when this happens.
echo 'one two three' | xargs mkdir ls $ one two three echo 'one two three' | xargs -t rm $ rm one two three
find . -name '*.py' | xargs wc -l # Recursively find all Python files and count the number of lines find . -name '*~' | xargs -0 rm # Recursively find all Emacs backup files and remove them, alloging for filenames with whitespace find . -name '*.py' | xargs grep 'import' # Recursively find all Python files and search them for the word ‘import’
find . -type f -print0 | xargs -0 stat -c "%y %s %n" # prints permissions in octal (0775, etc.)
find . -name "*.ext" -print0 | xargs -n 1000 -I '{}' mv '{}' ../.. # for a number of files to high for the mv command (move to two directories up)
bfs
- https://github.com/tavianator/bfs - a variant of the UNIX find command that operates breadth-first rather than depth-first. It is otherwise intended to be compatible with many versions of find, including POSIX find, GNU find, {Free,Open,Net}BSD find, macOS find
fd
- https://github.com/sharkdp/fd - a simple, fast and user-friendly alternative to find. While it does not seek to mirror all of find's powerful functionality, it provides sensible (opinionated) defaults for 80% of the use cases.
hf
- https://github.com/hugows/hf - a command line utility to quickly find files and execute a command - something like Helm/Anything/CtrlP for the terminal. It tries to find the best match, like other fuzzy finders (Sublime, ido, Helm). [34]
locate
locate fileordirectory locate / locate / | xargs -i echo 'test -f "{}" && echo "{}"' | sh # only files locate / | xargs -i echo 'test -f "{}" && echo "{}"' | sh # only directories [35]
"Although in other distros locate and updatedb are in the findutils package, they are no longer present in Arch's package. To use it, install the mlocate package. mlocate is a newer implementation of the tool, but is used in exactly the same way."
mlocate is a locate/updatedb implementation. The 'm' stands for "merging": updatedb reuses the existing database to avoid rereading most of the file system, which makes updatedb faster and does not trash the system caches as much. The locate(1) utility is intended to be completely compatible to slocate. It also attempts to be compatible to GNU locate, when it does not conflict with slocate compatibility.
Before locate can be used, the database will need to be created. To do this, simply run updatedb as root.
sudo updatedb # creates/updates a db file of paths that is queried by locate
/etc/updatedb.conf
PRUNE_BIND_MOUNTS = "yes" PRUNEFS = "9p afs anon_inodefs auto autofs bdev binfmt_misc cgroup cifs coda configfs cpuset cramfs debugfs devpts devtmpfs ecryptfs exofs ftpfs fuse fuse.encfs fuse.sshfs fusectl gfs gfs2 hugetlbfs inotifyfs iso9660 jffs2 lustre mqueue ncpfs nfs nfs4 nfsd pipefs proc ramfs rootfs rpc_pipefs securityfs selinuxfs sfs shfs smbfs sockfs sshfs sysfs tmpfs ubifs udf usbfs vboxsf" PRUNENAMES = ".git .hg .svn" PRUNEPATHS = "/afs /media /mnt /net /sfs /tmp /udev /var/cache /var/lib/pacman/local /var/lock /var/run /var/spool /var/tmp"
/var/lib/mlocate/mlocate.db
strings /var/lib/mlocate/mlocate.db | grep -E '^/.*config' # query db directly, needs sudo or sudoers or acl
fselect
- https://github.com/jhspetersson/fselect - Find files with SQL-like queries
to sort
Diffing files
File and directory comparison.
- https://github.com/spcau/godiff - A File/Directory diff-like comparison tool with HTML output.
- diffoscope - in-depth comparison of files, archives, and directories. diffoscope will try to get to the bottom of what makes files or directories different. It will recursively unpack archives of many kinds and transform various binary formats into more human readable form to compare them. It can compare two tarballs, ISO images, or PDF just as easily.
- https://github.com/renardchien/Differnator - a tool for comparing two similar codebases leveraging diff command operations and other comparative operations. Differnator can do large scale comparisons across several directories or numerous pairs of similar codebases.
GUI
- Krusader - Twin panel file management for your desktop - an advanced twin panel (commander style) file manager for KDE Plasma and other desktops in the *nix world, similar to Midnight or Total Commander.
- xxdiff - a graphical file and directories comparator and merge tool, provided under the GNU GPL open source license. It has reached stable state, and is known to run on many popular unices, including IRIX, Linux, Solaris, HP/UX, DEC Tru64. It has been deployed inside many large organizations and is being actively maintained by its author (Martin Blais).
- Diffuse - graphical tool for merging and comparing text files
Finding duplicate files
- md5deep and hashdeep - md5deep is a set of programs to compute MD5, SHA-1, SHA-256, Tiger, or Whirlpool message digests on an arbitrary number of files. md5deep is similar to the md5sum program found in the GNU Coreutils package, but has the following additional features: Recursive operation - md5deep is able to recursive examine an entire directory tree. That is, compute the MD5 for every file in a directory and for every file in every subdirectory. Comparison mode - md5deep can accept a list of known hashes and compare them to a set of input files. The program can display either those input files that match the list of known hashes or those that do not match. Hashes sets can be drawn from Encase, the National Software Reference Library, iLook Investigator, Hashkeeper, md5sum, BSD md5, and other generic hash generating programs. Users are welcome to add functionality to read other formats too! Time estimation - md5deep can produce a time estimate when it's processing very large files. Piecewise hashing - Hash input files in arbitrary sized blocks File type mode - md5deep can process only files of a certain type, such as regular files, block devices, etc. hashdeep is a program to compute, match, and audit hashsets. With traditional matching, programs report if an input file matched one in a set of knows or if the input file did not match. It's hard to get a complete sense of the state of the input files compared to the set of knowns. It's possible to have matched files, missing files, files that have moved in the set, and to find new files not in the set. Hashdeep can report all of these conditions. It can even spot hash collisions, when an input file matches a known file in one hash algorithm but not in others. The results are displayed in an audit report.
- FreeFileSync - a folder comparison and synchronization software that creates and manages backup copies of all your important files. Instead of copying every file every time, FreeFileSync determines the differences between a source and a target folder and transfers only the minimum amount of data needed. FreeFileSync is Open Source software, available for Windows, Linux and macOS.
- dupeGuru - a cross-platform (Linux, OS X, Windows) GUI tool to find duplicate files in a system. It’s written mostly in Python 3 and has the peculiarity of using multiple GUI toolkits, all using the same core Python code. On OS X, the UI layer is written in Objective-C and uses Cocoa. On Linux 7 Windows, it’s written in Python and uses Qt5. dupeGuru is a tool to find duplicate files on your computer. It can scan either filenames or contents. The filename scan features a fuzzy matching algorithm that can find duplicate filenames even when they are not exactly the same. dupeGuru runs on Mac OS X and Linux. dupeGuru is efficient. Find your duplicate files in minutes, thanks to its quick fuzzy matching algorithm. dupeGuru not only finds filenames that are the same, but it also finds similar filenames.
- https://github.com/adrianlopezroche/fdupes - a program for identifying or deleting duplicate files residing within specified directories.
- https://github.com/jbruchon/jdupes - A powerful duplicate file finder and an enhanced fork of 'fdupes'. What jdupes is not: a similar (but not identical) file finding tool.
- https://github.com/sahib/rmlint - finds space waste and other broken things on your filesystem and offers to remove it.
- Duff - a Unix command-line utility for quickly finding duplicates in a given set of files.Duff is written in C, uses gettext where available, is licensed under the zlib/libpng license and should compile on most modern Unices.
- https://freshmeat.sourceforge.net/projects/ftwin - a tool useful to find duplicate files or pictures according to their content on your file system.
- dupd - A fast and convenient CLI tool for finding duplicate files
- https://github.com/IgnorantGuru/rmdupe - uses standard linux commands to search within specified folders for duplicate files, regardless of filename or extension. Before duplicate candidates are removed they are compared byte-for-byte. rmdupe can also check duplicates against one or more reference folders, can trash files instead of removing them, allows for a custom removal command, and can limit its search to files of specified size. rmdupe includes a simulation mode which reports what will be done for a given command without actually removing any files.
- https://github.com/sebastien/sink - Swiss army knife for directory comparison and synchronization. Python.
- https://github.com/tbores/fiche - Directory comparison tools. Fiche uses hashes based comparison. Python.
- https://github.com/stephen322/dircmp - directory comparison utility. D.
- https://github.com/l00g33k/dirtree - This is a command line directory utility that works on both Windows and Linux. It scans the directory information and store in a file. This allows two computers to be scanned at a different time and even in two different country and allows the comparison to be carried out at a different time and place. This allows terabytes of disk with millions…
- https://github.com/bmaia/binwally - Binary and Directory tree comparison tool using Fuzzy Hashing
- https://github.com/drrlvn/fscmp - Directory/file comparison utility. Rust.
- https://github.com/lordmulder/DoubleFilerScanner - Scan for duplicate files.
Archiving
ar
- https://en.wikipedia.org/wiki/ar_(Unix) - a Unix utility that maintains groups of files as a single archive file. Today, ar is generally used only to create and update static library files that the link editor or linker uses and for generating .deb packages for the Debian family; it can be used to create archives for any purpose, but has been largely replaced by tar for purposes other than static libraries. An implementation of ar is included as one of the GNU Binutils.
shar
- http://en.wikipedia.org/wiki/shar - an abbreviation of shell archive, is an archive format. A shar file is a shell script, and executing it will recreate the files. This is a type of self-extracting archive file. It can be created with the Unix shar utility. To extract the files, only the standard Unix Bourne shell sh is usually required. Note that shar is not specified by the Single Unix Specification, so it is not formally a component of Unix, but a legacy utility.
unshar programs have been written for other operating systems but are not always reliable; shar files are shell scripts and can theoretically do anything that a shell script can do (including using incompatible features of enhanced or workalike shells), limiting their utility outside the Unix world.
The drawback of self-extracting shell scripts (any kind, not just shar) is that they rely on a particular implementation of programs; shell archives created with older versions of makeself
GNU paxutils
cpio
- https://en.wikipedia.org/wiki/cpio - The cpio archive format has several basic limitations: It does not store user and group names, only numbers. As a result, it cannot be reliably used to transfer files between systems with dissimilar user and group numbering.
cd /old-dir find . -name '*.mov' -print | cpio -pvdumB /new-dir [36]
tar
-z: Compress archive using gzip program -c: Create archive -v: Verbose i.e display progress while creating archive -f: Archive File name
tar -zcvf archive-name.tar.gz directory-name tar -cjf foo.tar.bz2 bar/ create bzipped tar archive of the directory bar called foo.tar.bz2
tar -xvf foo.tar verbosely extract foo.tar tar -xzf foo.tar.gz extract gzipped foo.tar.gz tar -xjf foo.tar.bz2 -C bar/ extract bzipped foo.tar.bz2 after changing directory to bar tar -xzf foo.tar.gz blah.txt extract the file blah.txt from foo.tar.gz
- https://github.com/rxi/microtar - A lightweight tar library written in ANSI C
pax
- pax will read, write, and list the members of an archive file, and will copy directory hierarchies. pax operation is independent of the specific archive format, and supports a wide variety of different archive formats. A list of supported archive formats can be found under the description of the -x option. [38]
mkdir newdir cd olddir pax -rw . newdir
libarchive
- libarchive - Multi-format archive and compression library
Afio
- Afio - makes cpio-format archives. It deals somewhat gracefully with input data corruption, supports multi-volume archives during interactive operation, and can make compressed archives that are much safer than compressed tar or cpio archives. Afio is best used as an `archive engine' in a backup script.
Trash can
Caches
- https://github.com/xtaran/unburden-home-dir - Automatically unburden $HOME from caches, etc. Useful for $HOME on SSDs, small disks or slow NFS homes. Can be triggered via an hook in /etc/X11/Xsession.d/.
Storage space
CLI tools
du (disk usage)
du -sh size of a folder du -S size of files in a folder du -aB1m|awk '$1 >= 100' everything over 100Mb
cd / | sudo du -khs * show root folder size sudo du -a --max-depth=1 /usr/lib | sort -n -r | head -n 20 size of program folders /usr/lib du -sk ./* | sort -nr | awk 'BEGIN{ pref[1]="K"; pref[2]="M"; pref[3]="G";} { total = total + $1; x = $1; y = 1; while( x > 1024 ) { x = (x + 1023)/1024; y++; } printf("%g%s\t%s\n",int(x*10)/10,pref[y],$2); } END { y = 1; while( total > 1024 ) { total = (total + 1023)/1024; y++; } printf("Total: %g%s\n",int(total*10)/10,pref[y]); }'
diskus
- https://github.com/sharkdp/diskus - a very simple program that computes the total size of the current directory. It is a parallelized version of du -sh. On my 8-core laptop, it is about ten times faster than du with a cold disk cache and more than three times faster with a warm disk cache. [39]
dust
- https://github.com/bootandy/dust - A more intuitive version of du in rust [40]
df
- df - report file system disk space usage
- http://en.wikipedia.org/wiki/Df_(Unix) - a standard Unix command used to display the amount of available disk space for file systems on which the invoking user has appropriate read access. df is typically implemented using the statfs or statvfs system calls.
df -h human readable
di
- di - a disk information utility, displaying everything (and more) that your 'df' command does. It features the ability to display your disk usage in whatever format you prefer. It also checks the user and group quotas, so that the user sees the space available for their use, not the system wide disk space.
dfc
- https://github.com/Rolinh/dfc - a tool to report file system space usage information. When the output is a terminal, it uses color and graphs by default. It has a lot of features such as HTML, JSON and CSV export, multiple filtering options, the ability to show mount options and so on.
dv
- https://github.com/ARM-DOE/dv - a command line tool for visualizing disk data usage on systems that don't have access to a graphical environment. After scanning a directory, dv generates an interactive webpage that displays useful information about the disk's contents. This webpage is a sunburst partition layout, where each subdirectory is displayed as a portion of the scanned directory. Hovering over a directory shows its size and percentage of the whole. dv is multiprocessed, and can be used to quickly visualize what is taking up space on a system.
TUI
tdu
- tdu - a text-mode disk-usage utility
cdu
- cdu - Color du, a perl script which calls du and display a pretty histogram with optional colors which allow to imediatly see the directories which take disk space.
ncdu
- ncdu - ncurses disk usage
ncdu / --exclude /home --exclude /media --exclude /run/media check everything apart from home and external drives ncdu / --exclude /home --exclude /media --exclude /run/media check everything apart from external drives
ncdu / --exclude /home --exclude /media --exclude /run/media --exclude /boot --exclude /tmp --exclude /dev --exclude /proc just the root partition
godu
- https://github.com/viktomas/godu - Simple golang utility helping to discover large files/folders.
gdu
- https://github.com/dundee/gdu - Disk usage analyzer with console interface written in Go
Broot
- https://github.com/Canop/broot - An interactive tree view, a fuzzy search, a balanced BFS descent and customizable commands. [41]
ohmu
- https://gitlab.com/paul-nechifor/ohmu - View space usage in your terminal.
duff
- https://github.com/muesli/duf - Disk Usage/Free Utility (Linux, BSD & macOS) [42]
GUI
Baobab
- Baobab - aka Disk Usage Analyzer, is a graphical, menu-driven viewer that you can use to view and monitor your disk usage and folder structure. It is part of every GNOME desktop.
QDirStat
- https://github.com/shundhammer/qdirstat - a graphical application to show where your disk space has gone and to help you to clean it up. This is a Qt-only port of the old Qt3/KDE3-based KDirStat, now based on the latest Qt 5. It does not need any KDE libs or infrastructure. It runs on every X11-based desktop on Linux, BSD and other Unix-like systems.
Filelight
- Filelight - an application to visualize the disk usage on your computerFeatures: Scan local, remote or removable disks; Configurable color schemes; File system navigation by mouse clicks; Information about files and directories on hovering; Files and directories can be copied or removed directly from the context menu Integration into Konqueror and Krusader
xdiskusage
- xdiskusage - a user-friendly program to show you what is using up all your disk space. It is based on the design of xdu written by Phillip C. Dykstra <dykstra at ieee dot org>. Changes have been made so it runs "du" for you, and can display the free space left on the disk, and produce a PostScript version of the display.
Filelight
- Filelight - creates an interactive map of concentric, segmented rings that help visualise disk usage on your computer.
Multi
Duc
- Duc - a collection of tools for indexing, inspecting and visualizing disk usage. Duc maintains a database of accumulated sizes of directories of the file system, and allows you to query this database with some tools, or create fancy graphs showing you where your bytes are.
Web
diskover
- diskover - File system crawler, storage search engine and storage analytics software powered by Elasticsearch to help visualize and manage your disk space usage.
Tagging
- TMSU - a tool for tagging your files. It provides a simple command-line tool for applying tags and a virtual filesystem so that you can get a tag-based view of your files from within any other program.TMSU does not alter your files in any way: they remain unchanged on disk, or on the network, wherever you put them. TMSU maintains its own database and you simply gain an additional view, which you can mount, based upon the tags you set up. The only commitment required is your time and there's absolutely no lock-in.
- https://bitbucket.org/majo/oyepa/src/default/oyepa - a way to organize (and locate) your documents through ----- the use of tags