Compression
General
See also Storage/Files#Archiving
- https://en.wikipedia.org/wiki/pack_(compression) - a (now deprecated) Unix shell compression program based on Huffman coding. The unpack utility will restore files to their original state after they have been compressed using the pack utility. If no files are specified, the standard input will be uncompressed to the standard output.
- http://tukaani.org/lzma/benchmarks.html - use xz
- https://news.ycombinator.com/item?id=6973501 - Linux now uses xz
- http://imoverclocked.blogspot.nl/2015/12/for-love-of-bits-stop-using-gzip.html
- https://github.com/temisu/ancient - Decompression routines for ancient formats
LZ
- https://en.wikipedia.org/wiki/LZ77_and_LZ78 - the two lossless data compression algorithms published in papers by Abraham Lempel and Jacob Ziv in 1977 and 1978. They are also known as LZ1 and LZ2 respectively. These two algorithms form the basis for many variations including LZW, LZSS, LZMA and others. Besides their academic influence, these algorithms formed the basis of several ubiquitous compression schemes, including GIF and the DEFLATE algorithm used in PNG and ZIP.
- https://en.wikipedia.org/wiki/Lempel–Ziv–Welch - (LZW) is a universal lossless data compression algorithm created by Abraham Lempel, Jacob Ziv, and Terry Welch. It was published by Welch in 1984 as an improved implementation of the LZ78 algorithm published by Lempel and Ziv in 1978. The algorithm is simple to implement and has the potential for very high throughput in hardware implementations. It is the algorithm of the widely used Unix file compression utility compress and is used in the GIF image format.
- https://en.wikipedia.org/wiki/compress - a Unix shell compression program based on the LZW compression algorithm. Compared to more modern compression utilities such as gzip and bzip2, compress runs faster and uses less memory, at the cost of a significantly lower compression ratio. The uncompress utility will restore files to their original state after they have been compressed using the compress utility. If no files are specified, the standard input will be uncompressed to the standard output.
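As a rough illustration of why the LZ family works, the sketch below compresses two 10,000-byte inputs with gzip (whose DEFLATE format is LZ77-based, as noted above): one made of a single repeated byte and one of random bytes. Repeated data can be replaced by short back-references, random data cannot. The file names and sizes are arbitrary choices for the demo, and it assumes gzip, head, tr and /dev/urandom are available.

```shell
# Two inputs of identical length: one highly repetitive, one random.
workdir=$(mktemp -d)
head -c 10000 /dev/zero | tr '\0' 'a' > "$workdir/repetitive.bin"
head -c 10000 /dev/urandom > "$workdir/random.bin"

# wc -c counts the compressed bytes; tr strips the padding some wc builds add.
repetitive_size=$(gzip -c "$workdir/repetitive.bin" | wc -c | tr -d ' ')
random_size=$(gzip -c "$workdir/random.bin" | wc -c | tr -d ' ')

echo "repetitive: $repetitive_size bytes, random: $random_size bytes"
rm -rf "$workdir"
```

The repetitive file should shrink to a small fraction of its size, while the random file stays at roughly its original size.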
ZIP
unzip -l archive.zip                      # list files in archive
for z in *.zip; do unzip -o "$z"; done    # unzip all in folder, overwrite existing files without prompting
- http://stackoverflow.com/questions/20762094/how-are-zlib-gzip-and-zip-related-what-do-they-have-in-common-and-how-are-they/20765054#20765054
- https://github.com/ckolivas/lrzip - A compression utility that excels at compressing large files (usually > 10-50 MB). Larger files and/or more free RAM mean that the utility can compress your files more effectively (i.e. faster / smaller size), especially if the file sizes exceed 100 MB. You can choose to optimise either for speed (fast compression / decompression) or for size, but not both.
gzip
- gzip, gunzip, zcat - compress or expand files
- http://www.infinitepartitions.com/art001.html
- http://jvns.ca/blog/2013/10/24/day-16-gzip-plus-poetry-equals-awesome/
- https://news.ycombinator.com/item?id=11944525
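A minimal round trip with the three commands listed above, as a sketch rather than a complete reference. The sample file name and its contents are made up for the demo; `gunzip -c` is used for decompression to stdout, which is what zcat does.

```shell
workdir=$(mktemp -d)
printf 'the quick brown fox\n' > "$workdir/sample.txt"

# -c writes the compressed data to stdout and leaves the input file in place.
gzip -c "$workdir/sample.txt" > "$workdir/sample.txt.gz"

# -t checks the integrity of the compressed file without writing any output.
gzip -t "$workdir/sample.txt.gz" && echo "archive OK"

# gunzip -c decompresses to stdout instead of replacing the .gz file.
restored=$(gunzip -c "$workdir/sample.txt.gz")
echo "$restored"

rm -rf "$workdir"
```

Without `-c`, gzip replaces `sample.txt` with `sample.txt.gz`, and gunzip does the reverse.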
bzip2
LZ4
LZMA
7z
- 7-Zip - a file archiver with the highest compression ratio. The program supports 7z (that implements LZMA compression algorithm), ZIP, CAB, ARJ, GZIP, BZIP2, TAR, CPIO, RPM and DEB formats. Compression ratio in the new 7z format is 30-50% better than ratio in ZIP format.
- p7zip is a port of 7za.exe for POSIX systems like Unix (Linux, Solaris, OpenBSD, FreeBSD, Cygwin, AIX, ...), MacOS X and also for BeOS and Amiga. 7za.exe is the command line version of 7-zip, see http://www.7-zip.org/. 7-Zip is a file archiver with highest compression ratio.
- man 7z (p7zip)
- p7zip-light in AUR
7z x filename              # extract archive with directories
7z a myzip ./MyFolder/*    # add a folder to an archive
- https://github.com/jinfeihan57/p7zip - a new p7zip fork with additional codecs and improvements (forked from https://sourceforge.net/projects/p7zip/).
LZMA2
xz
tar -cvJf filename.tar.xz directory/* morefiles..    # create an xz-compressed tar archive, listing files as they are added
- https://github.com/conor42/fxz - a fork of XZ Utils. It adds a multi-threaded radix match finder and optimized encoder
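The tar command above only creates an archive. The sketch below adds the matching list and extract steps with the same `-J` flag. It assumes a tar with xz support and an installed `xz` binary (the guard skips the demo otherwise); the directory and file names are placeholders.

```shell
workdir=$(mktemp -d)
mkdir "$workdir/project"
printf 'notes\n' > "$workdir/project/readme.txt"

# Skip quietly on systems where xz is not installed.
if command -v xz >/dev/null 2>&1; then
  # -C changes directory first, so the archive stores relative paths.
  tar -cJf "$workdir/project.tar.xz" -C "$workdir" project

  # -t lists the members without extracting anything.
  listing=$(tar -tJf "$workdir/project.tar.xz")
  echo "$listing"

  # Extract into a separate directory so the original tree is untouched.
  mkdir "$workdir/out"
  tar -xJf "$workdir/project.tar.xz" -C "$workdir/out"
  extracted=$(cat "$workdir/out/project/readme.txt")
fi

rm -rf "$workdir"
```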
Brotli
- https://tools.ietf.org/html/draft-alakuijala-brotli-07
- http://calendar.perfplanet.com/2015/new-years-diet-brotli-compression/
- http://caniuse.com/#search=brotli
- https://en.wikipedia.org/wiki/Brotli - a data format specification for data streams compressed with a specific combination of the general-purpose LZ77 lossless compression algorithm, Huffman coding and 2nd order context modelling. Brotli was initially developed to decrease the size of transmissions of WOFF2 web fonts, and in that context was a continuation of the development of zopfli, which is a zlib-compatible implementation of the standard gzip and deflate specifications.
Zopfli
- https://github.com/google/zopfli - a compression library programmed in C to perform very good, but slow, deflate or zlib compression.
- https://en.wikipedia.org/wiki/Zopfli - data compression software that encodes data into DEFLATE, gzip and zlib formats. It achieves higher compression than other DEFLATE/zlib implementations, but takes much longer to perform the compression. It was first released in February 2013 by Google as a free software programming library under the Apache License, Version 2.0. The name Zöpfli is the Swiss German diminutive of “Zopf”, an unsweetened type of Hefezopf.
Zstandard
- https://en.wikipedia.org/wiki/Zstandard - or zstd, is a lossless data compression algorithm developed by Yann Collet at Facebook. Zstd is the reference implementation in C. Version 1 of this implementation was released as free software on 31 August 2016.
Snappy
- Snappy - a compression/decompression library. It does not aim for maximum compression, or compatibility with any other compression library; instead, it aims for very high speeds and reasonable compression. For instance, compared to the fastest mode of zlib, Snappy is an order of magnitude faster for most inputs, but the resulting compressed files are anywhere from 20% to 100% bigger. On a single core of a Core i7 processor in 64-bit mode, Snappy compresses at about 250 MB/sec or more and decompresses at about 500 MB/sec or more.
cmix
- https://github.com/byronknoll/cmix - a lossless data compression program aimed at optimizing compression ratio at the cost of high CPU/memory usage.
Helpers
# Extract Files
# Usage: extract <archive>; picks the unpacker from the file extension.
extract() {
  if [ -f "$1" ] ; then
    case "$1" in
      *.tar.bz2) tar xvjf "$1" ;;
      *.tar.gz)  tar xvzf "$1" ;;
      *.tar.xz)  tar xvJf "$1" ;;
      *.bz2)     bunzip2 "$1" ;;
      *.rar)     unrar x "$1" ;;
      *.gz)      gunzip "$1" ;;
      *.tar)     tar xvf "$1" ;;
      *.tbz2)    tar xvjf "$1" ;;
      *.tgz)     tar xvzf "$1" ;;
      *.zip)     unzip "$1" ;;
      *.Z)       uncompress "$1" ;;
      *.7z)      7z x "$1" ;;
      *.xz)      unxz "$1" ;;
      *.exe)     cabextract "$1" ;;
      *)         echo "\`$1': unrecognized file compression" ;;
    esac
  else
    echo "\`$1' is not a valid file"
  fi
}
atool
- atool - a script for managing file archives of various types (tar, tar+gzip, zip etc). The main command is aunpack which extracts files from an archive. Did you ever extract files from an archive, not checking whether the files were located in a subdirectory or in the top directory of the archive, resulting in files scattered all over the place? aunpack overcomes this problem by first extracting to a new directory. If there was only a single file in the archive, that file is moved to the original directory. aunpack also prevents local files from being overwritten by mistake.
aunpack archive.tar.gz       # extract archive to subdir, or current dir if only one file
aunpack -D archive.tar.gz    # always extract archive to subdir
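The core idea behind aunpack, extracting into a fresh directory so that loose files cannot scatter or overwrite anything, can be approximated with plain tar. This is a minimal sketch and not how atool is implemented: the function name `extract_to_dir` is hypothetical, it handles only `.tar.gz`, and it does not move a lone file back up the way aunpack does.

```shell
# Hypothetical helper: unpack a .tar.gz into a new directory named after it.
extract_to_dir() {
  archive=$1
  dest=${archive%.tar.gz}    # strip the extension to get the directory name
  if [ -e "$dest" ]; then
    echo "refusing to overwrite existing '$dest'" >&2
    return 1
  fi
  mkdir "$dest" && tar -xzf "$archive" -C "$dest"
}

# Demo: build an archive whose files sit at the top level, then unpack it.
workdir=$(mktemp -d)
( cd "$workdir" \
    && printf 'a\n' > a.txt && printf 'b\n' > b.txt \
    && tar -czf loose.tar.gz a.txt b.txt \
    && rm a.txt b.txt )

extract_to_dir "$workdir/loose.tar.gz"
unpacked=$(ls "$workdir/loose")
echo "$unpacked"

rm -rf "$workdir"
```

Both files end up inside `loose/` rather than next to the archive, and a second run against the same archive is refused instead of overwriting the first extraction.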
dtrx
- dtrx - extracts archives in a number of different formats; it currently supports tar, zip (including self-extracting .exe files), cpio, rpm, deb, gem, 7z, cab, rar, lzh, and InstallShield files. It can also decompress files compressed with gzip, bzip2, lzma, xz, or compress. In addition to providing one command to handle many different archive types, dtrx also aids the user by extracting contents consistently. By default, everything will be written to a dedicated directory that's named after the archive. dtrx will also change the permissions to ensure that the owner can read and write all those files.
patool
- patool - a portable archive file manager
unp
- https://github.com/mitsuhiko/unp - a command line tool that can unpack archives easily. It mainly acts as a wrapper around other shell tools that you can find on various POSIX systems. It figures out how to invoke an unpacker to achieve the desired result. In addition to that it will safely unpack files when an archive contains more than one top level item. In those cases it will wrap the resulting file in a folder so that your working directory does not get messed up.
archiver
- https://github.com/mholt/archiver - Easily create and extract .zip, .tar, .tar.gz, .tar.bz2, .tar.xz, .tar.lz4, .tar.sz, and .rar (extract-only) files with Go
Pagers
Access
- https://github.com/cybernoid/archivemount - A fuse filesystem for mounting archives in formats supported by libarchive.
- fuse-zip - a FUSE file system to navigate, extract, create and modify ZIP and ZIP64 archives based on libzip implemented in C++.
- AVFS - A Virtual File System - a system that enables all programs to look inside archived or compressed files, or access remote files, without recompiling the programs or changing the kernel. At the moment it supports floppies, tar and gzip files, zip, bzip2, ar and rar files, ftp sessions, http, webdav, rsh/rcp, ssh/scp. Quite a few other handlers are implemented with the Midnight Commander's external FS.
- https://github.com/mxmlnkn/ratarmount - Random Access Read-Only Tar Mount