Compression

From Things and Stuff Wiki
Revision as of 03:22, 13 January 2020 by Milk (talk | contribs) (→‎zopfli)
Jump to navigation Jump to search


General

See also Storage/Files#Archiving



  • https://en.wikipedia.org/wiki/pack_(compression) - a (now deprecated) Unix shell compression program based on Huffman coding. The unpack utility will restore files to their original state after they have been compressed using the pack utility. If no files are specified, the standard input will be uncompressed to the standard output.






LZ

  • https://en.wikipedia.org/wiki/LZ77_and_LZ78 - the two lossless data compression algorithms published in papers by Abraham Lempel and Jacob Ziv in 1977 and 1978. They are also known as LZ1 and LZ2 respectively. These two algorithms form the basis for many variations including LZW, LZSS, LZMA and others. Besides their academic influence, these algorithms formed the basis of several ubiquitous compression schemes, including GIF and the DEFLATE algorithm used in PNG and ZIP.


  • https://en.wikipedia.org/wiki/Lempel–Ziv–Welch - (LZW) is a universal lossless data compression algorithm created by Abraham Lempel, Jacob Ziv, and Terry Welch. It was published by Welch in 1984 as an improved implementation of the LZ78 algorithm published by Lempel and Ziv in 1978. The algorithm is simple to implement and has the potential for very high throughput in hardware implementations. It is the algorithm of the widely used Unix file compression utility compress and is used in the GIF image format.




  • https://en.wikipedia.org/wiki/compress a Unix shell compression program based on the LZW compression algorithm. Compared to more modern compression utilities such as gzip and bzip2, compress performs faster and with less memory usage, at the cost of a significantly lower compression ratio. The uncompress utility will restore files to their original state after they have been compressed using the compress utility. If no files are specified, the standard input will be uncompressed to the standard output.


ZIP



unzip -l archive.zip
  # list files in archive

for z in *.zip; do unzip $z; done
  # unzip all in folder, overwrite files automatically






gzip

  • gzip, gunzip, zcat - compress or expand files

bzip2



LZ4

LZMA

7z

  • 7-Zip - a file archiver with the highest compression ratio. The program supports 7z (that implements LZMA compression algorithm), ZIP, CAB, ARJ, GZIP, BZIP2, TAR, CPIO, RPM and DEB formats. Compression ratio in the new 7z format is 30-50% better than ratio in ZIP format.
    • p7zip is a port of 7za.exe for POSIX systems like Unix (Linux, Solaris, OpenBSD, FreeBSD, Cygwin, AIX, ...), MacOS X and also for BeOS and Amiga. 7za.exe is the command line version of 7-zip, see http://www.7-zip.org/. 7-Zip is a file archiver with highest compression ratio.
    • man z7 (p7zip)
    • p7zip-light in AUR
7z x filename
  extract archive with directories

7z a myzip ./MyFolder/*
  add a folder to an archive

LZMA2

xz

tar -cvJf filename.tar.xz directory/* morefiles..
  # create verbose xz filearchive



Brotli

  • https://en.wikipedia.org/wiki/Brotli - a data format specification for data streams compressed with a specific combination of the general-purpose LZ77 lossless compression algorithm, Huffman coding and 2nd order context modelling. Brotli was initially developed to decrease the size of transmissions of WOFF2 web fonts, and in that context was a continuation of the development of zopfli, which is a zlib-compatible implementation of the standard gzip and deflate specifications.

zopfli

ZSTD


Snappy

  • Snappy - a compression/decompression library. It does not aim for maximum compression, or compatibility with any other compression library; instead, it aims for very high speeds and reasonable compression. For instance, compared to the fastest mode of zlib, Snappy is an order of magnitude faster for most inputs, but the resulting compressed files are anywhere from 20% to 100% bigger. On a single core of a Core i7 processor in 64-bit mode, Snappy compresses at about 250 MB/sec or more and decompresses at about 500 MB/sec or more.

Helpers

# Extract Files
extract() {
 if [ -f $1 ] ; then
     case $1 in
         *.tar.bz2)   tar xvjf $1    ;;
         *.tar.gz)    tar xvzf $1    ;;
         *.tar.xz)    tar xvJf $1    ;;
         *.bz2)       bunzip2 $1     ;;
         *.rar)       unrar x $1     ;;
         *.gz)        gunzip $1      ;;
         *.tar)       tar xvf $1     ;;
         *.tbz2)      tar xvjf $1    ;;
         *.tgz)       tar xvzf $1    ;;
         *.zip)       unzip $1       ;;
         *.Z)         uncompress $1  ;;
         *.7z)        7z x $1        ;;
         *.xz)        unxz $1        ;;
         *.exe)       cabextract $1  ;;
         *)           echo "\`$1': unrecognized file compression" ;;
     esac
 else
     echo "\`$1' is not a valid file"
 fi
}

atool

  • atool - a script for managing file archives of various types (tar, tar+gzip, zip etc). The main command is aunpack which extracts files from an archive. Did you ever extract files from an archive, not checking whether the files were located in a subdirectory or in the top directory of the archive, resulting in files scattered all over the place? aunpack overcomes this problem by first extracting to a new directory. If there was only a single file in the archive, that file is moved to the original directory. aunpack also prevents local files from being overwritten by mistake.
atool archive.tar.gz
  # extract archive to subdir, or current dir if only one file

atool -D archive.tar.gz
  # extract archive to subdir


dtrx

  • dtrx - extracts archives in a number of different formats; it currently supports tar, zip (including self-extracting .exe files), cpio, rpm, deb, gem, 7z, cab, rar, lzh, and InstallShield files. It can also decompress files compressed with gzip, bzip2, lzma, xz, or compress. In addition to providing one command to handle many different archive types, dtrx also aids the user by extracting contents consistently. By default, everything will be written to a dedicated directory that's named after the archive. dtrx will also change the permissions to ensure that the owner can read and write all those files.

patool

  • patool - a portable archive file manager

unp

  • https://github.com/mitsuhiko/unp - a command line tool that can unpack archives easily. It mainly acts as a wrapper around other shell tools that you can find on various POSIX systems. It figures out how to invoke an unpacker to achieve the desired result. In addition to that it will safely unpack files when an archive contains more than one top level item. In those cases it will wrap the resulting file in a folder so that your working directory does not get messed up.

archiver

Pagers