From Things and Stuff Wiki
Jump to: navigation, search

Things and Stuff Wiki - An organically evolving personal wiki knowledge base with an on-the-fly taxonomy containing topic outlines, descriptions, notes and breadcrumbs, with links to sites, systems, software, manuals, organisations, people, articles, guides, slides, papers, books, comments, videos, screencasts, webcasts, scratchpads and more. Quality varies drastically. Use the Table of Contents to navigate long pages, use the Small-ToC and Tiny-ToC header links on longer pages. Not that mobile friendly atm. #tnswiki on freenode IRC for feedback chat, or see About for login and further information. / et / em



to finish rearranging.

Mostly Linux, mostly free software.

See also the related articles in the menu, plus Playback, Dataflow, Pure Data, Distros#Media


mess, to merge with parts of Music

  • - a vibration that propagates as a typically audible mechanical wave of pressure and displacement, through a medium such as air or water. In physiology and psychology, sound is the reception of such waves and their perception by the brain.

  • - the interdisciplinary science that deals with the study of all mechanical waves in gases, liquids, and solids including topics such as vibration, sound, ultrasound and infrasound. A scientist who works in the field of acoustics is an acoustician while someone working in the field of acoustics technology may be called an acoustical engineer. The application of acoustics is present in almost all aspects of modern society with the most obvious being the audio and noise control industries.

  • - a form of energy associated with the vibration of matter. The SI unit of sound energy is the joule (J). Sound is a mechanical wave and as such consists physically in oscillatory elastic compression and in oscillatory displacement of a fluid. Therefore, the medium acts as storage for both potential and kinetic energy as well.
  • - the distance travelled per unit time by a sound wave as it propagates through an elastic medium. In dry air at 20 °C (68 °F), the speed of sound is 343.2 metres per second (1,126 ft/s; 1,236 km/h; 768 mph; 667 kn), or a kilometre in 2.914 s or a mile in 4.689 s.

  • - or acoustic power is the rate at which sound energy is emitted, reflected, transmitted or received, per unit time. The SI unit of sound power is the watt (W). It is the power of the sound force on a surface of the medium of propagation of the sound wave. For a sound source, unlike sound pressure, sound power is neither room-dependent nor distance-dependent. Sound pressure is a measurement at a point in space near the source, while the sound power of a source is the total power emitted by that source in all directions. Sound power passing through an area is sometimes called sound flux or acoustic flux through that area.

  • - also known as acoustic intensity is defined as the sound power per unit area. The SI unit of sound intensity is the watt per square meter (W/m2). The usual context is the noise measurement of sound intensity in the air at a listener's location as a sound energy quantity. Sound intensity is not the same physical quantity as sound pressure. Hearing is directly sensitive to sound pressure which is related to sound intensity. In consumer audio electronics, the level differences are called "intensity" differences, but sound intensity is a specifically defined quantity and cannot be sensed by a simple microphone. Sound energy passing per second through a unit area held perpendicular to the direction of propagation of sound waves is called intensity of sound.

  • - or acoustic pressure is the local pressure deviation from the ambient (average, or equilibrium) atmospheric pressure, caused by a sound wave. In air, sound pressure can be measured using a microphone, and in water with a hydrophone. The SI unit of sound pressure is the pascal (Pa).
  • - a logarithmic unit used to express the ratio of two values of a physical quantity, often power or intensity. One of these values is often a standard reference value, in which case the decibel is used to express the level of the other value relative to this reference. The number of decibels is ten times the logarithm to base 10 of the ratio of two power quantities, or of the ratio of the squares of two field amplitude quantities.

The decibel is commonly used in acoustics as a unit of sound pressure level. The reference pressure in air is set at the typical threshold of perception of an average human and there are common comparisons used to illustrate different levels of sound pressure.

  • - the beginning of a musical note or other sound, in which the amplitude rises from zero to an initial peak. It is related to (but different from) the concept of a transient: all musical notes have an onset, but do not necessarily include an initial transient.
  • - a high amplitude, short-duration sound at the beginning of a waveform that occurs in phenomena such as musical sounds, noises or speech. It can sometimes contain a high degree of non-periodic components and a higher magnitude of high frequencies than the harmonic content of that sound. Transients do not necessarily directly depend on the frequency of the tone they initiate. Transients are more difficult to encode with many audio compression algorithms, causing pre-echo.

  • - a measure of the energy loss of sound propagation in media. Most media have viscosity, and are therefore not ideal media. When sound propagates in such media, there is always thermal consumption of energy caused by viscosity. For inhomogeneous media, besides media viscosity, acoustic scattering is another main reason for removal of acoustic energy. Acoustic attenuation in a lossy medium plays an important role in many scientific researches and engineering fields, such as medical ultrasonography, vibration and noise reduction.
  • - refers to the process by which a material, structure, or object takes in sound energy when sound waves are encountered, as opposed to reflecting the energy. Part of the absorbed energy is transformed into heat and part is transmitted through the absorbing body. The energy transformed into heat is said to have been 'lost'.

When sound from a loudspeaker collides with the walls of a room part of the sound's energy is reflected, part is transmitted, and part is absorbed into the walls. As the waves travel through the wall they deform the material thereof (just like they deformed the air before). This deformation causes mechanical losses via conversion of part of the sound energy into heat, resulting in acoustic attenuation, mostly due to the wall's viscosity. Similar attenuation mechanisms apply for the air and any other medium through which sound travels.

The fraction of sound absorbed is governed by the acoustic impedances of both media and is a function of frequency and the incident angle. Size and shape can influence the sound wave's behavior if they interact with its wavelength, giving rise to wave phenomena such as standing waves and diffraction. Acoustic absorption is of particular interest in soundproofing. Soundproofing aims to absorb as much sound energy (often in particular frequencies) as possible converting it into heat or transmitting it away from a certain location. In general, soft, pliable, or porous materials (like cloths) serve as good acoustic insulators - absorbing most sound, whereas dense, hard, impenetrable materials (such as metals) reflect most.

  • - AF or audible frequency, is characterized as a periodic vibration whose frequency is audible to the average human. The SI unit of audio frequency is the hertz (Hz). It is the property of sound that most determines pitch. The generally accepted standard range of audible frequencies is 20 to 20,000 Hz, although the range of frequencies individuals hear is greatly influenced by environmental factors. Frequencies below 20 Hz are generally felt rather than heard, assuming the amplitude of the vibration is great enough. Frequencies above 20,000 Hz can sometimes be sensed by young people. High frequencies are the first to be affected by hearing loss due to age and/or prolonged exposure to very loud noises.

See Music

S.S. Stevens, "The relation of pitch to intensity", Vol. 6, 1935, pp. 150-154.

W.B. Snow, "Changes of pitch with loudness at low frequencies", Vol. 8, 1936, pp. 14-19.

Performing Musician Magazine:


  • - the subjective perception of sound pressure. More formally, it is defined as, "That attribute of auditory sensation in terms of which sounds can be ordered on a scale extending from quiet to loud." The relation of physical attributes of sound to perceived loudness consists of physical, physiological and psychological components. The study of apparent loudness is included in the topic of psychoacoustics and employs methods of psychophysics.

In different industries, loudness may have different meanings and different measurement standards. Some definitions such as LKFS refer to relative loudness of different segments of electronically reproduced sounds such as for broadcasting and cinema. Others, such as ISO 532A (Stevens loudness, measured in sones), ISO 532B (Zwicker loudness), DIN 45631 and ASA/ANSI S3.4, have a more general scope and are often used to characterize loudness of environmental noise.

Loudness, a subjective measure, often confused with physical measures of sound strength such as sound pressure, sound pressure level (in decibels), sound intensity or sound power. Filters such as A-weighting and ITU-R BS.1770 attempt to compensate measurements to correspond to loudness as perceived by the typical human.

  • - a unit of loudness level for pure tones. Its purpose is to compensate for the effect of frequency on the perceived loudness of tones. By definition, the number of phon of a sound is the dB SPL of a sound at a frequency of 1 kHz that sounds just as loud. This implies that 0 phon is the limit of perception, and inaudible sounds have negative phon levels. The equal-loudness contours are a way of mapping the dB SPL of a pure tone to the perceived loudness level (LN) in phons. These are now defined in the international standard ISO 226:2003, and the research on which this document is based concluded that earlier Fletcher–Munson curves and Robinson–Dadson curves were in error. The phon unit is not an SI unit in metrology. It is used as a unit of loudness level by the American National Standards Institute.

  • - a measure of sound pressure (dB SPL), over the frequency spectrum, for which a listener perceives a constant loudness when presented with pure steady tones. The unit of measurement for loudness levels is the phon, and is arrived at by reference to equal-loudness contours. By definition, two sine waves of differing frequencies are said to have equal-loudness level measured in phons if they are perceived as equally loud by the average young person without significant hearing impairment. Equal-loudness contours are often referred to as "Fletcher-Munson" curves, after the earliest researchers, but those studies have been superseded and incorporated into newer standards. The definitive curves are those defined in the international standard ISO 226:2003, which are based on a review of modern determinations made in various countries.

  •–Munson_curves - ne of many sets of equal-loudness contours for the human ear, determined experimentally by Harvey Fletcher and Wilden A. Munson, and reported in a 1933 paper entitled "Loudness, its definition, measurement and calculation" in the Journal of the Acoustical Society of America.

  • - a loudness standard designed to enable normalization of audio levels for delivery of broadcast TV and other video. Loudness units relative to full scale (LUFS) is a synonym for LKFS that was introduced in EBU R128. Loudness units (LU) is an additional unit used in EBU R128. It describes Lk without direct absolute reference and therefore describes loudness level differences. LKFS is standardized in ITU-R BS.1770.
  • YouTube: EBU R128 Introduction - Florian Camerer - an introduction to the European Broadcasting Union's R128 Broadcast Standard and speaks in general about perceived loudness, peak normalization, loudness normalization, etc.

  • - a unit of how loud a sound is perceived. The sone scale is linear. Doubling the perceived loudness doubles the sone value. Proposed by Stanley Smith Stevens in 1936, it is a non-SI unit. In acoustics, loudness is the subjective perception of sound pressure. The study of apparent loudness is included in the topic of psychoacoustics and employs methods of psychophysics.


  • - the scientific study of sound perception. More specifically, it is the branch of science studying the psychological and physiological responses associated with sound (including speech and music). It can be further categorized as a branch of psychophysics.


* I-Simpa: I-Simpa - an open software dedicated to the modelling of sound propagation in 3D complex domains. It is a perfect tool for experts (i.e. acousticians), for teachers and students, as well as for researchers, in their projects (room acoustics, urban acoustics, industrial spaces, acoustic courses...).


See Music#Training

  • RealSimple Project - musical acoustics laboratory exercises integrating both hands-on laboratory experience and computer-based simulation.




  • - a representation of sound, typically as an electrical voltage. Audio signals have frequencies in the audio frequency range of roughly 20 to 20,000 Hz (the limits of human hearing). Audio signals may be synthesized directly, or may originate at a transducer such as a microphone, musical instrument pickup, phonograph cartridge, or tape head. Loudspeakers or headphones convert an electrical audio signal into sound. Digital representations of audio signals exist in a variety of formats. An audio channel or audio track is an audio signal communications channel in a storage device, used in operations such as multi-track recording and sound reinforcement.

  • - the electrical power transferred from an audio amplifier to a loudspeaker, measured in watts. The electrical power delivered to the loudspeaker, together with its efficiency, determines the sound power generated (with the rest of the electrical power being converted to heat). Amplifiers are limited in the electrical energy they can output, while loudspeakers are limited in the electrical energy they can convert to sound energy without being damaged or distorting the audio signal. These limits, or power ratings, are important to consumers finding compatible products and comparing competitors.


  • SRC Comparisons - "We have organized the testing of some of the objective parameters of SRC algorithms in the 96 kHz - 44.1 kHz conversion mode. This mode is considered "hard" because of its fractional resampling ratio. The set of test signals has been discussed among engineers from Weiss Engineering, Alexey Lukin and members of Glenn Meadows' Mastering Web-Board. The test files were available in a variety of resolutions (32-bit int, 32-bit float, 24-bit), and the best supported resolution has been used for each of the SRC algorithms tested. The resulting graphs have been drawn by a modified version of the RightMark Audio Analyzer (RMAA) and some specially developed analysis software."


See also Music

Synth programming, sequencer programming, etc.







see sound on sound, etc.

  • Schematic Vault - This collection of pro audio schematics and reference materials has been amassed both from my private stock and from various internet resources. All materials have been formatted as multi-page pdf files for ease of use. Please feel free to email me (address on home page) with any material you'd care to add.



  • - also known as phone jack, audio jack, headphone jack or quarter inch jack plug, is a family of electrical connectors typically used for analog audio signals. The phone connector was invented for use in telephone switchboards in the 19th century and is still widely used. The phone connector is cylindrical in shape, with a grooved tip to retain it. In its original audio configuration, it typically has two, three, four and, occasionally, five contacts. Three-contact versions are known as TRS connectors, where T stands for "tip", R stands for "ring" and S stands for "sleeve". Ring contacts are typically the same diameter as the sleeve, the long shank. Similarly, two-, four- and five- contact versions are called TS, TRRS and TRRRS connectors respectively. The outside diameter of the "sleeve" conductor is 1⁄4 inch (6.35 millimetres).

The "mini" connector has a diameter of 3.5 mm (0.14 in) and the "sub-mini" connector has a diameter of 2.5 mm (0.098 in).

  • - a style of electrical connector, primarily found on professional audio, video, and stage lighting equipment. The connectors are circular in design and have between 3 and 7 pins. They are most commonly associated with balanced audio interconnection, including AES3 digital audio, but are also used for lighting control, low-voltage power supplies, and other applications. XLR connectors are available from a number of manufacturers and are covered by an international standard for dimensions, IEC 61076-2-103.[1] They are superficially similar to the older and smaller DIN connector range, but are not physically compatible with them.

Patch bay

"Full-Normal : Each jack on the top-row is connected to the jack under it on the bottom-row. This allows the audio or video signal to “pass-through” the patchbay without using a patch cable. When we want to change the “normal” signal path we can use a patch cable to change the destination of the signal. Placing a patch cable into the either row breaks the signal path. The signal follows the patch cable to where it is patched.

"Half-Normal: ...Placing a patch cable into the bottom-row breaks the signal path. Placing a patch cable into the top-row allows the signal to still go to the jack under it on the bottom-row (without breaking the normal) and also follows the patch cable."




  • RD/MPCTools - Extension Toolkit for Martin M-Series Software


For amp to be twice as loud as the 10 watt RMS amp you need a 100 watt RMS amp, and then for a amp to be twice as loud as the 100 watt RMS amp you need 1,000 watt RMS amp.

  • Class A - 100% of the input signal is used (conduction angle Θ = 360°). The active element remains conducting all of the time.
  • Class B - 50% of the input signal is used (Θ = 180°); the active element carries current half of each cycle, and is turned off for the other half.
  • Class AB - Class AB is intermediate between class A and B, the two active elements conduct more than half of the time.
  • Class C - Less than 50% of the input signal is used (conduction angle Θ < 180°).
  • Class D - uses some form of pulse-width modulation to control the output devices; the conduction angle of each device is no longer related directly to the input signal but instead varies in pulse width. These are sometimes called "digital" amplifiers because the output device is switched fully on or off, and not carrying current proportional to the signal amplitude.

  • YouTube: Is equipment burn in real? - "all capacitors have to form, you put voltage on them and little microscopic holes are filled and it changes the equivalent series resistance and a number of other characteristics"



See Speaker



See also #Synthesis, #Audio programming


  • Elektor 10 Channel Vocoder - "I'm a synth & electronics passionate fan and live near Antwerp, Belgium. In the mid 1990's, really young and inexperienced, I decided to build the Elektor 10 channel vocoder as described in the Dutch Elektor magazine from the early 1980's."

Drum machine

  • LXR Drum Synthesizer - The LXR is a full fledged digital drum machine with integrated sequencer. Its sound engine provides 6 different instruments, each with over 30 parameters to tweak. It can produce a wide variety of sounds, ranging from classic analogue emulations to crunchy digital mayhem.

  • eXaDrums - Electronic drums for Linux. The goal project is to use a Raspberry Pi to make a drum module. As far as the software goes, it is written in C++ and uses Gtkmm to display a nice graphical user interface (GUI) on a Raspberry Pi official 7" touchscreen. The hardware consists of some accelerometers or piezos connected to an analog to digital converter (ADC).


  • - An embedded linux system that plays back audio samples for musical performance. The development board used was a Texas Instruments (TI) Beagle Bone Black based on an AM335x 1GHz ARM® Cortex-A8 processer. For high-quality audio, a TI PCM5102A DAC was used. The program leveraged the c++ library: JUCE and Linux sound server: JACK.

Sound module

  • - an electronic musical instrument without a human-playable interface such as a piano-style musical keyboard. Sound modules have to be operated using an externally connected device, which is often a MIDI controller, of which the most common type is the musical keyboard (although wind controllers, guitar controllers and electronic drum pads are also used). Controllers are devices that provide the human-playable interface and which may or may not produce sounds of its own. Another common way of controlling a sound module is through a sequencer, which is computer hardware or software designed to record and play back control information for sound-generating hardware (e.g., a DJ may program a bassline and use the sound module to produce the sound). Connections between sound modules, controllers, and sequencers are generally made with MIDI (Musical Instrument Digital Interface), which is a standardized protocol designed for this purpose, which includes special ports (jacks) and cables.


Sound chip/card

  •'97 - Audio Codec '97; also MC'97 for Modem Codec '97, is an audio codec standard developed by Intel Architecture Labs in 1997. The standard was used in motherboards, modems, and sound cards. Audio components integrated into chipsets consist of two component classes: an AC'97 digital controller (DC97), which is built into the southbridge of the chipset, and AC'97 audio and modem codecs, which are the analog components of the architecture. AC'97 defines a high-quality, 16- or 20-bit audio architecture with surround sound support for the PC. AC'97 supports a 96 kHz sampling rate at 20-bit stereo resolution and a 48 kHz sampling rate at 20-bit stereo resolution for multichannel recording and playback. AC97 defines a maximum of 6 channels of analog audio output.



  • envy24control - alsa-utils


  • - was an add-on MIDI-synthesizer for Creative Sound Blaster 16 and Sound Blaster AWE32 family of PC soundcards. It was a sample-based synthesis General MIDI compliant synthesizer. For General MIDI scores, the Wave Blaster's wavetable-engine produced more realistic instrumental music than the SB16's onboard Yamaha-OPL3.

Music workstaion

Digital Audio Workstation


  • - an interface standard for a serial bus for high-speed communications and isochronous real-time data transfer. It was developed in the late 1980s and early 1990s by Apple, which called it FireWire. The 1394 interface is also known by the brands i.LINK (Sony), and Lynx (Texas Instruments). The copper cable it uses in its most common implementation can be up to 4.5 metres (15 ft) long. Power is also carried over this cable allowing devices with moderate power requirements to operate without a separate power supply. FireWire is also available in Cat 5 and optical fiber versions. The 1394 interface is comparable to USB, though USB requires a master controller and has greater market share.

  • FFADO project aims to provide a generic, open-source solution for the support of FireWire based audio devices for the Linux platform. It is the successor of the FreeBoB project.


  • - or Multichannel Audio Digital Interface or AES10, is an Audio Engineering Society (AES) standard electronic communications protocol that defines the data format and electrical characteristics of an interface that carries multiple channels of digital audio. The AES first documented the MADI standard in AES10-1991, and updated it in AES10-2003 and AES10-2008. The MADI standard includes a bit-level description and has features in common with the two-channel format of AES3. It supports serial digital transmission over coaxial cable or fibre-optic lines of 28, 56, or 64 channels; and sampling rates of up to 96 kHz with resolution of up to 24 bits per channel. Like AES3 or ADAT it is a Uni-directional interface (one sender and one receiver).




  • Jesusonic - A dynamic text mode live FX processor
  • - a working proof-of-concept modular IDE for generating DSP code for the JesuSonic platform, written using the Flex and AIR SDKs. JesuSonic has a standalone version and is also part of Reaper as a plugin.


  • The OWL - an open source, programmable audio platform made for musicians, hackers and programmers alike. Users can program their own effects, or download ready-made patches from our growing online patch library. It is available both as a guitar fx pedal and a Eurorack synthesizer module. OWL stands for Open Ware Laboratory which refers to the fact that the entire project is open source in both hardware and software. Being open source is an important issue for us in terms of making all of the technology completely accessible to the end user.

to sort

Game controller

Eye tracking

  • - an open-source eye-controlled music composition environment. This was developed during for my summer internship with the Enable Group at Microsoft Research. The source code can be found on github under the official project name Microsoft Hands-Free Sound Jam. EyeJam is cross-platform, with suport for Windows, Mac, and Linux. Eye-control is only available on Windows. On the other platforms, eye control is simulated using the mouse cursor.

Raspberry PI


  • Chimaera - a poly-magneto-phonic-theremin (we had to come up with this new subcategory in the domain of electronic instruments, as the Chimaera did not fit anywhere else). Other terms that would describe it well could be: a general-purpose-continuous-music-controller, a multi-touch-less-ribbon-controller or a possible offspring of a mating experiment of a keyboard and violin. Think of it as an invisible string that is excitable by an arbitrary number of magnetic sources. Depending on where the magnetic sources are located on the string and depending on how strong (or how near) they are, the device outputs different event signals. These general-purpose event signals then can be used to e.g. drive a synthesizer, an effects processing unit or some other hardware.

  • - making digital instruments with analog interfaces. Our mission is to make sounds tangible. That's why we are developing instruments with haptic interfaces for electronic sound - both analog and software synthesis. Our Instruments have excitable surfaces that you can scratch, hit or bow. A very limited run of our developer edition will soon be available here.


Music roll

  • - a storage medium used to operate a mechanical musical instrument. They are used for the player piano, mechanical organ, electronic carillon and various types of orchestrion. The vast majority of music rolls are made of paper. Other materials that have been utilized include thin card (Imhof-system), thin sheet brass (Telektra-system), composite multi-layered electro-conductive aluminium and paper roll (Triste-system) and, in the modern era, thin plastic or PET film. The music data is stored by means of perforations. The mechanism of the instrument reads these as the roll unwinds, using a pneumatic, mechanical or electrical sensing device called a tracker bar, and the mechanism subsequently plays the instrument. After a roll is played, it is necessary for it to be rewound before it can be played again. This necessitates a break in a musical performance. To overcome this problem, some instruments were built with two player mechanisms allowing one roll to play while the other rewinds. A piano roll is a specific type of music roll, and is designed to operate an automatic piano like the player piano or the reproducing piano.

  • - a music storage medium used to operate a player piano, piano player or reproducing piano. A piano roll is a continuous roll of paper with perforations (holes) punched into it. The perforations represent note control data. The roll moves over a reading system known as a 'tracker bar' and the playing cycle for each musical note is triggered when a perforation crosses the bar and is read. A rollography is a listing of piano rolls, especially made by a single performer, analogous to a discography.

Piano rolls were in continuous mass production from around 1896 to 2008, and are still available today, with QRS Music claiming to have 45,000 titles available with "new titles being added on a regular basis". Largely replacing piano rolls, which are no longer mass-produced today, MIDI files represent a modern way in which musical performance data can be stored. MIDI files accomplish digitally and electronically what piano rolls do mechanically. Software for editing a performance stored as MIDI data often has a feature to show the music in a piano roll representation.

  • Midimusic eplayWin32 - Estey and Wurlitzer e-roll player for Hauptwerk, Miditzer, GrandOrgue & eplayOrgan. This graphical player will play Estey e-rolls on any Hauptwerk or Miditzer organ and Wurlitzer Band Organ e-rolls on eplayOrgan (Windows, iMac and Linux) It will automatically operate the manuals, pedals, stops, couplers and swell. As supplied this version plays the Hauptwerk St. Annes Moseley and Paramount 310 plus the Miditzer 160, 216 or 260 organs. It also plays Wurlitzer 125, 150 and 165 organs. Other Hauptwerk or Miditzer organs can be played by adding their data via the menus. It also plays my new eplayOrgan and most other organs which can be played from midi keyboards, including GrandOrgue, Viscount and jOrgan.


PA system

  • - used in popular music and sound reinforcement system contexts to refer to electronic audio amplification equipment and speaker enclosures that are placed behind the band or the rhythm section on stage, including amplifiers and speaker cabinets for guitars, bass guitars and keyboards. In the US and Canada, the term has expanded to include many of the musical instruments that the rhythm section musicians play, including pianos, Hammond organs, drum kits and various percussion instruments such as congas and bongos.

Sound system

  • - the combination of microphones, signal processors, amplifiers, and loudspeakers in enclosures all controlled by a mixing console that makes live or pre-recorded sounds louder and may also distribute those sounds to a larger or more distant audience. In many situations, a sound reinforcement system is also used to enhance or alter the sound of the sources on the stage, typically by using electronic effects, such as reverb, as opposed to simply amplifying the sources unaltered.

A sound reinforcement system for a rock concert in a stadium may be very complex, including hundreds of microphones, complex live sound mixing and signal processing systems, tens of thousands of watts of amplifier power, and multiple loudspeaker arrays, all overseen by a team of audio engineers and technicians. On the other hand, a sound reinforcement system can be as simple as a small public address (PA) system, consisting of, for example, a single microphone connected to a 100 watt amplified loudspeaker for a singer-guitarist playing in a small coffeehouse. In both cases, these systems reinforce sound to make it louder or distribute it to a wider audience.

Some audio engineers and others in the professional audio industry disagree over whether these audio systems should be called sound reinforcement (SR) systems or PA systems. Distinguishing between the two terms by technology and capability is common, while others distinguish by intended use (e.g., SR systems are for live event support and PA systems are for reproduction of speech and recorded music in buildings and institutions). In some regions or markets, the distinction between the two terms is important, though the terms are considered interchangeable in many professional circles.

  • - a group of DJs and audio engineers contributing and working together as one, playing and producing music over a large PA system or sound reinforcement system, typically for a dance event or party.


"there are many good reasons to consider Linux [for pro audio]: low- or no-cost software; a high-performance sound system; great audio routing; powerful sound synthesis; high-quality score notation; it's extremely customisable; more of your CPU power may be used for audio processing; it avoids many patent/license restriction pitfalls; it avoids costly tie-ins to specific product ranges; software may be (legally) modified to suit your individual needs; modular software allows you to configure your software studio the way that you want; and it's driven by passion before profit." [6]

News and communities

  • LinuxMusicians forum - mission: to facilitate discussion, learning, and discovery of music making on the Linux platform.

  • Libre Music Production is a community-driven online resource, focused on promoting musical creation and composition using free and open source (FLOSS) software. By providing hands-on material submitted by the community, such as guides, tutorials, articles and news updates, we want to show not only that there is great FLOSS audio software out there, but also how to practically use that software to make music.

Mailing lists



  • #lau
  • #lad
  • #opensourcemusicians
  • #linuxmusicians
  • #linuxmao - francophone
  • - germanophone
  • #jack
  • #pulseaudio
  • #ardour
  • #ingen
  • #non
  • #lmms
  • ##zynsubaddfx
  • #dataflow - pure data
  • #lilypond
  • #lv2
  • #Juce
  • #kxstudio
  • #archaudio
  • #archlinux-aur
  • ##dsp
  • #kvr
  • #RedditAudio
  • ##audio
  • ##music
  • ##


  • #debian-multimedia



See Distros#Audio/visual, Playback#Operating System

Software lists

An in-depth look at programming sound using Linux — jannewmarch

Real time

See also *nix#linux-rt

Audio systems

In a Unix-like operating system, a sound server mixes different data streams and sends out a single unified audio to an output device. The mixing is usually done by software, or by hardware if there is a supported sound card.

The "sound stack" can be visualized as follows, with programs in the upper layers calling elements in the lower layers:

  • Applications (e.g. mp3 player, web video)
  • Sound server (e.g. aRts, ESD, JACK, PulseAudio)
  • Sound subsystem (described as kernel modules or drivers; e.g. OSS, ALSA)
  • Operating system kernel (e.g. Linux, Unix)

"If we were drawing the [internet] OSI model used to describe the networking framework that connects your machine to every other machine on the network, we'd find clear strata, each with its own domain of processes and functionality. There's very little overlap in layers, and you certainly don't find end-user processes in layer seven messing with the electrical impulses of the raw bitstreams in layer one.

"Yet this is exactly what can happen with the Linux audio framework. There isn't even a clearly defined bottom level, with several audio technologies messing around with the kernel and your hardware independently. Linux's audio architecture is more like the layers of the Earth's crust than the network model, with lower levels occasionally erupting on to the surface, causing confusion and distress, and upper layers moving to displace the underlying technology that was originally hidden."

"ALSA itself has a kernel level stack and a higher API for programmers to use, mixing drivers and hardware properties with the ability to play back surround sound or an MP3 codec. Most distributions stick PulseAudio and GStreamer on top[,] ... The deeper the layer, the closer to the hardware it is." [7]



The API is designed to use the traditional Unix framework of open(), read(), write(), and ioctl(), via special devices. For instance, the default device for sound input and output is /dev/dsp. Examples using the shell:

cat /dev/random > /dev/dsp
  # plays white noise through the speaker

cat /dev/dsp > a.a
  # reads data from the microphone and copies it to file a.a

  • - OSS Proxy uses CUSE (extension of FUSE allowing character devices to be implemented in userspace) to implement OSS interface - /dev/dsp, /dev/adsp and /dev/mixer. From the POV of the applications, these devices are proper character devices and behave exactly the same way so it can be made quite versatile.


"ALSA is responsible for translating your audio hardware's capabilities into a software API that the rest of your system uses to manipulate sound. It was designed to tackle many of the shortcomings of OSS (and most other sound drivers at the time), the most notable of which was that only one application could access the hardware at a time. This is why a software component in ALSA needs to manages audio requests and understand your hardware's capabilities.

"ALSA was designed to replace OSS. However, OSS isn't really dead, thanks to a compatibility layer in ALSA designed to enable older, OSS-only applications to run. It's easiest to think of ALSA as the device driver layer of the Linux sound system. Your audio hardware needs a corresponding kernel module, prefixed with snd_, and this needs to be loaded and running for anything to happen. This is why you need an ALSA kernel driver for any sound to be heard on your system, and why your laptop was mute for so long before someone thought of creating a driver for it. Fortunately, most distros will configure your devices and modules automatically. [8]

re official wiki


less /proc/asound/card0/pcm0p/sub0/hw_params
  # current hardware info

cat /proc/asound/cards
  # List audio hardware

cat /proc/asound/card0
  # List card info

cat /proc/asound/devices
  # List audio hardware

aplay -L
  # List all PCMs defined

modinfo soundcore
  # Kernel sound module info

lsmod | grep snd

lspci -v | grep -i audio
  # show some kernel info

  • alsacap - ALSA device capability lister. alsacap - ALSA device capability lister. scans soundcards known to ALSA for devices and subdevices. displays ranges of configuration parameters for the given ALSA device.
  • QasConfig is a graphical browser for the configuration tree and can help to analyze and debug an ALSA setup.


ALSA settings are stored in file 'asound.state', location can vary depending on distribution

  # advanced controls for ALSA soundcard driver

alsactl init
  # initiate basic configure

alsactl store
  # storae configuration

  • - Neither the user-side .asoundrc nor the asound.conf configuration files are required for ALSA to work properly. Most applications will work without them. These files are used to allow extra functionality, such as routing and sample-rate conversion, through the alsa-lib layer.

The keyword default is defined in the ALSA lib API and will always access hw:0,0 — the default device on the default soundcard. Specifying the !default name supersedes the one defined in the ALSA lib API.

pcm.NAME { 
	type hw               # Kernel PCM 
	card INT/STR          # Card name or number
  	[device] INT          # Device number (default 0)     
	[subdevice] INT       # Subdevice number, -1 first available (default -1)
	mmap_emulation BOOL   # enable mmap emulation for ro/wo devices

  • PCM (digital audio) plugins - these extend functionality and features of PCM devices. The plugins take care about various sample conversions, sample copying among channels and so on.

Plugin: hw

This plugin communicates directly with the ALSA kernel driver. It is a raw communication without any conversions. The emulation of mmap access can be optionally enabled, but expect worse latency in the case.

The nonblock option specifies whether the device is opened in a non-blocking manner. Note that the blocking behavior for read/write access won't be changed by this option. This influences only on the blocking behavior at opening the device. If you would like to keep the compatibility with the older ALSA stuff, turn this option off.

Plugin: file

This plugin stores contents of a PCM stream to file or pipes the stream to a command, and optionally uses an existing file as an input data source (i.e., "virtual mic")

  • - Mixing enables multiple applications to output sound at the same time. Most discrete sound cards support hardware mixing, which is enabled by default if available. Integrated motherboard sound cards (such as Intel HD Audio), usually do not support hardware mixing. On such cards, software mixing is done by an ALSA plugin called dmix. This feature is enabled automatically if hardware mixing is unavailable.

  • - the equivalent of the dmix plugin, but for recording sound. The dsnoop plugin allows several applications to record from the same device simultaneously.

  • - allows create a PCM loopback between a PCM capture device and a PCM playback device, supports multiple soundcards, adaptive clock synchronization, adaptive rate resampling using the samplerate library (if available in the system). Also, mixer controls can be redirected from one card to another (for example Master and PCM).

speaker-test -c 2
  # Using 16 octaves of pink noise, alsa-utils


  • PulseAudio is a sound system for POSIX OSes, meaning that it is a proxy for your sound applications. It allows you to do advanced operations on your sound data as it passes between your application and your hardware. Things like transferring the audio to a different machine, changing the sample format or channel count and mixing several sounds into one are easily achieved using a sound server.


user specific pulseaudio config;

  # to load modules and define defaults
  # to configure a client for the sound server
  # to define sample rates and buffers

To avoid .pulse-cookie in home folder, set the following in /etc/pulse/client.conf [10]

cookie-file = /tmp/pulse-cookie

By default, pulseaudio changes master and application volume at the same time. To disable this, edit /etc/pulse/daemon.conf or ~/.config/pulse/daemon.conf with:

flat-volumes = no


man pulse-cli-syntax
  # pulseaudio commandline help

  # control a running PulseAudio sound server

  # reconfigure a PulseAudio sound server during runtime

pacmd list-cards

pacmd dump


 h/j/k/l, arrows               navigation, volume change
 H/L, Shift+Left/Shift+Right   change volume by 10
 1/2/3/4/5/6/7/8/9/0           set volume to 10%-100%
 m                             mute/unmute
 Space                         lock/unlock channels together
 Enter                         context menu
 F1/F2/F3                      change modes
 Tab                           go to next mode
 Mouse left click              select device or mode
 Mouse wheel                   volume change
 q/Esc/^C                      quit

  • - Interactive python/ncurses UI to control volume of pulse streams, kinda like alsamixer, focused not on sink volume levels (which can actually be controlled via alsamixer, with alsa-pulse plugin), but rather on volume of individual streams, so you can tune down the music to hear the stuff from games, mumble, skype or browser.



pasuspender -- audacity
  # temporaraly suspend pulseaudio and launch audacity, for when PA gets in the way and config yak shaving isn't an option
  • - PulseAudio OSS Wrapper. starts the specified program and redirects its access to OSS compatible audio devices (/dev/dsp and auxiliary devices) to a PulseAudio sound server. padsp uses the $LD_PRELOAD environment variable that is interpreted by and thus does not work for SUID binaries and statically built executables. Equivalent to using padsp is starting an application with $LD_PRELOAD set to
cat /dev/urandom | padsp tee /dev/audio > /dev/null


See also #Jack configuration

  • JACK is system for handling real-time, low latency audio (and MIDI). It runs on GNU/Linux, Solaris, FreeBSD, OS X and Windows (and can be ported to other POSIX-conformant platforms). It can connect a number of different applications to an audio device, as well as allowing them to share audio between themselves. Its clients can run in their own processes (ie. as normal applications), or can they can run within the JACK server (ie. as a "plugin"). JACK also has support for distributing audio processing across a network, both fast & reliable LANs as well as slower, less reliable WANs.

  • Sound Engineers Guide to Jackd - This page attempts to collect in one place various issues surrounding the design and abilities of Paul Davis's wonderful Jackd from the point of view of a technical user. It is not an introduction or a usage howto.

  • q=0: SRC_LINEAR



  • sndio is a small audio and MIDI framework part of the OpenBSD project. It provides an lightweight audio & MIDI server and a fully documented user-space API to access either the server or directly the hardware in a uniform way. Sndio is designed to work for desktop applications, but pays special attention to synchronization mechanisms and reliability required by music applications. Reliability through simplicity are part of the project goals.


  • CRAS: ChromeOS Audio Server - allows for sound to be routed dynamically to newly attached audio-capable monitors (DisplayPort and HDMI), USB webcam, USB speakers, bluetooth headsets, etc., and in a way that requires as little CPU as possible and that adds little or no latency.



See #aRts_2


Enlightened Sound Daemon



  • Setting up Jack Audio for GStreamer, Flash, and VLC - First of all I need to say that I won't be mentioning Pulseaudio so if that is what you're here for then you are at the wrong place because I don't use Pulseaudio at all, Pulseaudio can be run ontop of Jack but doing so will increase CPU load (a very tiny amount on modern systems).

  • - Make JACK Work With PulseAudio. This script is intended to be invoked via QjackCtl to start up and shut down JACK on a system running PulseAudio. It handles the necessary setup to make the two work together, so PulseAudio clients get transparently routed through JACK while the latter is running, or if pulseaudio is suspend by pasuspender, do nothing [12]


  • - is Microsoft's most modern method for talking with audio devices. It is available in Windows Vista, Windows 7, and later versions of Windows. It allows delivering an unmodified bitstream to a sound device, and provides benefits similar to those provided by ASIO drivers. One of the other main benefits of WASAPI is that it provides applications with exclusive access to audio devices, bypassing the system mixer, default settings, and any typically any effects provided by the audio driver. WASAPI is the recommended Audio Output Mode for Windows unless your audio device has a well-behaved ASIO driver, and it effectively replaces all legacy output modes including Kernel Streaming and Direct Sound.

  • - a deprecated software component of the Microsoft DirectX library for the Windows operating system. DirectSound provides a low-latency interface to sound card drivers written for Windows 95 through Windows XP and can handle the mixing and recording of multiple audio streams.
  • - a deprecated component of the Microsoft DirectX API that allows music and sound effects to be composed and played and provides flexible interactive control over the way they are played. Architecturally, DirectMusic is a high-level set of objects, built on top of DirectSound, that allow the programmer to play sound and music without needing to get quite as low-level as DirectSound. DirectSound allows for the capture and playback of digital sound samples, whereas DirectMusic works with message-based musical data. Music can be synthesized either in hardware, in the Microsoft GS Wavetable SW Synth, or in a custom synthesizer.
  • - a lower-level audio API for Microsoft Windows, Xbox 360 and Windows Phone 8, the successor to DirectSound on Windows and a supplement to the original XAudio on the Xbox 360. XAudio2 operates through the XAudio API on the Xbox 360, through DirectSound on Windows XP, and through the low-level audio mixer WASAPI on Windows Vista and higher.

  • ASIO4ALL - Universal ASIO Driver For WDM Audio

Jack configuration

pasuspender -- jackd
  # temporaraly suspend pulseaudio and start jack (needed for jack1 without PA patch)
 jackd -R -P89 -s -dalsa -dhw:0 -r48000 -p256 -njack-server
  # start jackd, realtime priority 89, ALSA engine soundcard hw:0, sample rate of 48k, 256 max ports, instancename

  • jack_iodelay - will create one input and one output port, and then measures the latency (signal delay) between them. For this to work, the output port must be connected to its input port. The measurement is accurate to a resolution of greater than 1 sample.
  • - heavily based on powernowd. Instead of taking CPU load as parameter for deciding on the CPU frequency jackfreqd uses JACK DSP-load and jackfreqd only supports the powernowd's aggressive mode 1). Optionally jackfreqd can also take CPU load into account which comes in handy when the JACK-daemon is temporarily unavailable or if frequency-scaling should also be done for on non-audio processes.

# jack2 package commands

jack_connect fluidsynth:l_00 system:playback_3 jack_connect fluidsynth:r_00 system:playback_4

  # list jack ports

jack_lsp -c
  # list jack port connections (sinks indented)


D-Bus control via python2-dbus

jack_control start
  # starts the jack server

jack_control stop
  # stops the jack server

jack_control status
  # check whether jack server is started, return value is 0 if running and 1 otherwise
jack_control dg
  # current driver

jack_control dp
  # current driver paramaters
jack_control dl
  # drivers list

jack_control ds alsa
  # selects alsa as the driver (backend)
jack_control sm
  # switch master to currently selected driver
jack_control eps realtime True
  # set engine parameters, such as realtime

jack_control dps period 256
  # set the driver parameter period to 256
  help                       - print this help text
  dpd <param>                - get long description for driver parameter
  dps <param> <value>        - set driver parameter
  dpr <param>                - reset driver parameter to its default value
  asd <driver>               - add slave driver
  rsd <driver>               - remove slave driver
  il                         - get list of available internals
  ip <name>                  - get parameters of given internal
  ipd <name> <param>         - get long description for internal parameter
  ips <name> <param> <value> - set internal parameter
  ipr <name> <param>         - reset internal parameter to its default value
  iload <name>               - load internal
  iunload <name>             - unload internal
  ep                         - get engine parameters
  epd <param>                - get long description for engine parameter
  eps <param> <value>        - set engine parameter
  epr <param>                - reset engine parameter to its default value



Session/config management

LASH Audio Session Handler

  • LASH - a session management system for GNU/Linux audio applications. It allows you to save and restore audio sessions consisting of multiple interconneced applications, restoring program state (ie loaded patches) and the connections between them.

Dead. Inflexible and underused.

  • GLASHCtl - a simple applet for controlling the LASH Audio Session Handler. When you run it it will appear as a small LASH icon in your "notification area" or "system tray".


  • ladish - LADI Session Handler or simply ladish is a session management system for JACK applications on GNU/Linux using Dbus. Its aim is to allow you to have many different audio programs running at once, to save their setup, close them down and then easily reload the setup at some other time. ladish doesn't deal with any kind of audio or MIDI data itself; it just runs programs, deals with saving/loading (arbitrary) data and connects JACK ports together. It can also be used to move entire sessions between computers, or post sessions on the Internet for download.
    • - LADI Session Handler, a rewrite of LASH. a session management system for JACK applications on GNU/Linux. Its aim is to allow you to have many different audio programs running at once, to save their setup, close them down and then easily reload the setup at some other time. ladish doesn't deal with any kind of audio or MIDI data itself; it just runs programs, deals with saving/loading (arbitrary) data and connects JACK ports together. It can also be used to move entire sessions between computers, or post sessions on the Internet for download. ladish has GUI frontend, gladish, based on lpatchage (LADI Patchage). and the ladish_control command line app for headless operation. LADI Tools is set of apps that interface with ladish, JACK server and a2jmidid
LADI Tools
  • - LADI Tools, forked from LADI/laditools, is a set of tools aiming to achieve the goals of the LADI project to improve desktop integration and user workflow of Linux audio system based on JACK and LADISH. Those tools take advantage of the DBus interfaces of JACK2 and LADISH to ease the configuration and use of your software studio.

In a near future, it should also be possible to use laditools to control JACK through an OSC interface.

You will find in this suite:

  • laditools - python module
  • ladi-system-tray - a system tray icon that allows you to start, stop and monitor JACK, as well as start some JACK related apps (log viewer, connections...)
  • wmladi - a controller as a Window Maker dockapp which uses a menu similar to ladi-system-tray's
  • ladi-system-log - a JACK, LADISH and a2jmidid log viewer
  • ladi-control-center - a GUI to setup JACK's and laditools' configuration
  • ladi-player - compact front-end to that allows users to start, stop and monitor a LADI system.
  • g15ladi - a JACK monitor for g15 keyboards

JACK Session

  • originally jack 1 (not D-Bus)
  • Saving a session will save the state of all 'JACK Session'-supported apps plus their JACK connections
  • Opening a session will automatically launch those apps, restoring their state and JACK connections
  • Supported apps can be told to save/load their state to/from a specific location


  • QjackCtl - JACK Audio Connection Kit - Qt GUI Interface

QjackCtl holds its settings and configuration state per user, in a file located as $HOME/.config/ Normally, there's no need to edit this file, as it is recreated and rewritten everytime qjackctl is run.

  • D-Bus control and Jack 2 D-Bus control
  • Connection and JACK Session manager


  • Cadence - a set of tools useful for audio production. Cadence itself is also an application (the main one), which this page will document. There are other applications that are part of the Cadence suite, they are usually named as the "Cadence tools". They are: Catarina (simple patching), Catia (patching), Claudia (LADISH)
    • Cadence - controls and monitors various Linux sound systems as well as audio-related system settings
cadence --minimized &
  • Claudia - a LADISH frontend; it's just like Catia, but focused at session management through LADISH.
  • jack2 (dbus)
  • Claudia-Launcher is a multimedia application launcher with LADISH support. It searches for installed packages (not binaries), and displays the respective content as a launcher. The content is got through an hardcoded database, created and/or modified to suit the target distribution.

Non Session Manager

  • Non Session Manager is a graphical interface to the NSM Daemon (nsmd). By default, running the command non-session-manager will start both the GUI and an instance of the daemon. NSM manages clients together in a session. That's it. NSM doesn't know or care what Window Manager or audio subsystem those clients use--nor should it. Specific clients must be written to persist these environmental factors, and added to sessions when required.

For saving and restoring the JACK connection graph, a simple headless client named jackpatch has been developed and included in the NSM distribution. Simply add jackpatch do your basic template session and all the sessions you base on it will have their JACK connection graphs automatically saved and restored.

non-session-manager -- --session-root path

  • - a simple NSM client for wrapping non-NSM capable programs. It enables the use of programs supporting LADISH Level 0 and 1, and programs which accept their configuration via command-line arguments.

  • - bridges Non Session Manager and JACK Session. This allows programs that support JACK Session (say, jalv) to run inside nsm-proxy and have their data saved using njsm.

  • - makes git a little easier to use with non session manager sessions. creates a git repository in the current session and commits all untracked and unstaged files to it whenever save is pressed. nsm-git also reads the session.nsm file and deletes any saved applications that are not listed in the session. This program is meant to be executed within NSM.

  • - a GNU/Linux session manager for audio programs as Ardour, Carla, QTractor, Non-Timeline, etc... It uses the same API as Non Session Manager, so programs compatible with NSM are also compatible with Ray Session. As Non Session Manager, the principle is to load together audio programs, then be able to save or close all documents together.


  • MonoMultiJack - a program for managing, starting and stopping Jackd and music programs. Another feature is connecting and disconnecting Jack audio and MIDI ports, as well as ALSA MIDI ports. It is programmed in Mono using GTK# as GUI toolkit.


Preselected applications.


  • chino - a 'special-purpose session manager', requiring customisation to cover one or more similar setups. Once customised, using it is dead simple. Perhaps it is best to not overstress the term "session management", instead describing chino as a framework and toolset to build and manage a meta-application consisting of the user's favorite modular Jack audio and Midi tools, each started and interconnected in predefined ways.
chino -n newproject
  # start newproject in current directory

chino -o existingproject
  # open existingproject

Broken? Depends on listlib which depends on anch which depends on tml which depends on flex. None of these were in the AUR. Then tml didn't build.


  • QJackConnect - a QT based patchbay for the JACK Audio Connection Kit.


  • Catia is a JACK Patchbay, with some neat features like A2J bridge support and JACK Transport.


  • Patchage is a modular patch bay for audio and MIDI systems based on Jack and Alsa.


  • Patchmatrix - PatchMatrix gives the best user experience with JACK1, as it makes intensive use of JACK's metadata API, which JACK2 still lacks an implementation of.


  • jsweeper will be a programmable port connection manager for ALSA sequencer and JACK audio and midi ports. Ports are laid out in a matrix so that connecting or disconnecting a port or a group of ports is just one mouse click or keypress.


  • - Elm signal routing matrix, mainly for JACK audio connections but possibly generalizable to other usecases. frontend sort of works, JACK backend definitely WIP and is therefore not yet pushed




Routing snapshots


  • aj-snapshot - a small program that can be used to make snapshots of the connections made between JACK and/or ALSA clients. Because JACK can provide both audio and MIDI support to programs, aj-snapshot can store both types of connections for JACK. ALSA, on the other hand, only provides routing facilities for MIDI clients. You can also run aj-snapshot in daemon mode if you want to have your connections continually restored.
aj-snapshot filename
  # make a snapshot

aj-snapshot -r filename
  # restore a snapshot

aj-snapshot -d filename &
  # run in daemon mode

  • Robust Session Management - QJackCTL and Patchage for setup, diagnostics, and testing, aj-snapshot for management of Jack and ALSA MIDI connections, The DBus version of Jack2


echo "connect system:capture_1 system:playback_1

> disconnect system:capture_2 system:playback_2" | ./autocable

./autocable yourdirectory/


  • - JMess - A utility to save your audio connections (mess). JMess can save an XML file with all the current Jack Audio connections. This same file can be loaded to connect everything again. The XML file can also be edited. It also also has the option to disconnect all the clients.
jmess -s filename.xml
  # save

jmess -c filename.xml
  # load

jmess -d -c filename.xml
  # disconnect all then load

jmess -d
  # disconnect all


  • jack_snapshot - a little tool for storing/restoring jack connection states. it does this by writing/reading the names of the connected ports into/from a simple textfile. and here is also one weakness: some jack clients don't use the same jack name on each run, but dynamically assign one [like meterbridge] but most of them can be told to use a specific name, so this isn't really a problem. at least not for me. some pattern matching might be added in the future..


  • jack-plumbing maintains a set of port connection rules and manages these as clients register ports with JACK- Port names are implicitly bounded regular expressions and support sub-expression patterns.


  • jack-matchmaker - a small command line utility that listens to JACK port registrations by clients and connects them when they match one of the port pattern pairs given on the command line at startup. jack-matchmaker never disconnects any ports. The port name patterns are specified as pairs of positional arguments or read from a file (see below) and are interpreted as Python regular expressions


  • - Tiny application that reacts on port registrations by clients and connects them. The port names are interpreted as regular expressions and more than one pair can be defined upon calling.

Jack Sanity

jacklistener etc.


  • jackstdio - jack-stdout writes JACK audio-sample data to buffered standard output. jack-stdin reads raw audio data from standard-input and writes it to a JACK audio port.


alsa_in / alsa_out

alsa_in -j "Description" -d prefix:Name -q 1 2>&1 1> /dev/null &
  # used to send ALSA microphone input to an JACK input device
  # -d = device name, hw:2
  # -q = quality of resampler, 1-4
  # -c = channels, automatic default
  # -r 48000 = sample rate, automatic default
  # can automatically detect and open an available soundcard (what type? doesn't work for usb mic)
arecord -l
  card 2: AK5370 [AK5370], device 0: USB Audio [USB Audio]
  Subdevices: 0/1
  Subdevice #0: subdevice #0

alsa_in  -dhw:2 -jusb-mic
  # or
alsa_in -dhw:AK5370 -j "USB Mic" 

alsa_out -j "Description" -d prefix:Name -q 1 2>&1 1> /dev/null &
  # used to send JACK output to an ALSA device, like a speaker or headphones

If you get "Capture open error: Device or resource busy", some other program has control of the playback interface.

To see what application has control of the interface:

fuser -u /dev/snd/pcmC0D0p
  # this is card 0, device 0, pcm playback

If it's pulseaudio, launch pavucontrol, go to the Configuration tab and select Off for the device(s).


  • Zita-ajbridge - provides two applications, zita-a2j and zita-j2a. They allow to use an ALSA device as a Jack client, to provide additional capture (a2j) or playback (j2a) channels. Functionally these are equivalent to the alsa_in and alsa_out clients that come with Jack, but they provide much better audio quality. The resampling ratio will typically be stable within 1 PPM and change only very smoothly. Delay will be stable as well even under worse case conditions, e.g. the Jack client running near the end of the cycle.
cat /proc/asound/cards

zita-a2j -dhw:3,0 -jwebcam


JACK Transport

  • JACK Transport Design - The JACK Audio Connection Kit provides simple transport interfaces for starting, stopping and repositioning a set of clients. This document describes the overall design of these interfaces, their detailed specifications are in <jack/transport.h>

jack_transport> ?
  activate	Call jack_activate().
  exit		Exit transport program.
  deactivate	Call jack_deactivate().
  help		Display help text [<command>].
  locate	Locate to frame <position>.
  master	Become timebase master [<conditionally>].
  play		Start transport rolling.
  quit		Synonym for `exit'.
  release	Release timebase.
  stop		Stop transport.
  tempo         Set beat tempo <beats_per_min>.
  timeout	Set sync timeout in <seconds>.
  ?  		Synonym for `help'.
echo play |jack_transport
  # pass command to execute
  # tempo change doesn't work via this method

  • JackDirector is a Linux app that lets you control Jack Audio Connection Kit's transport play/pause using midi commands (noteon) and let you assign bpm changes and other commands to midi program changes. This program plays a metronome thru 2 audio outputs exposed in Jack.

  • gjacktransport - a standalone application that provides access to the jack audio connection kit‘s, JACK transport mechanism via a dynamic graphical slider. in other words: this software allows to seek Audio/Video media files when they are played along jack transport. Intended for audio-engineers or A/V editors that work with arodour, ecasound, hydrogen and/or xjadeo. Additionally it provides 'gjackclock'. A "Big Clock" display for jack-transport.
  • cabestan is a small GTK+ program that interfaces with the jack audio connection kit to play, rewind, or fast forward the stream via the jack transport interface.
  • jack-transport is a minimalist Jack transport control interface using ncurses. It displays the transport state and current time, and provides standard operating keys.

  • QJackMMC - a Qt based program that can connect to a device or program that emits MIDI Machine Control (MMC) and allow it to drive JACK transport, which in turn can control other programs. JackCtlMMC is a slightly simpler command-line version of QJackMMC.

  • jack-osc - publishes the transport state of the local JACK server as OSC packets over a UDP connection. jack-osc allows any OSC enabled application to act as a JACK transport client, receiving sample accurate pulse stream timing data, and monitoring and initiating transport state change.

  • InConcert - a MIDI-controlled application that allows a musician to control the tempo and synchronization of a MIDI sequence. It features a tap tempo to adjust the beat (and synchronize the beat) and the ability to skip beats or insert beats. It works by controlling the Jack Audio Connection Kit's transport. InConcert depends on Jack and ALSA, and therefore only runs on Linux.

Doesn't work??

  • TapStart - measures a tempo you tap. But: It sends OSC-messages with the tempo or delay to customizable hosts and paths. It updates the Jack tempo on each click (=new averaged tempo). It can start the Jack transport after tapping a defined number of beats.
  • jack-trans2midi - a utility that converts jack transport into midi clock messages

Ableton Link



  • Drumstick Metronome (kmetronome) is a MIDI based metronome using the ALSA sequencer. Intended for musicians and music students, it is a tool to keep the rhythm while playing musical instruments.
    • No decimal BPM, not MIDI driven



  • klick is an advanced command-line based metronome for JACK. It allows you to define complex tempo maps for entire songs or performances.

JACK transport connect but not driven by it? BPM argument required, doesn't change when transport master runs.

  • gtklick - a GTK frontend to klick. It's written in Python and communicates with klick via OSC.
klick -o 12345 60 &
gtklick -q osc.udp://localhost:12345



  • GTick - an audio metronome application written for GNU/Linux and other UN*X-like operting systems supporting different meters (Even, 2/4, 3/4, 4/4 and more) and speeds ranging from 10 to 1000 bpm. It utilizes GTK+ and OSS (ALSA compatible).


  • Hubcap - a fairly simple metronome *nix app with a tempo fader and both auditory and visual feedback on a beat.
    • Audio only, no MIDI


  • Accelerando - a musical metronome that can speed up, allowing you to practice your music at progressively faster tempos. For example, you could set it to play 60 beats per minute for 4 bars, then automatically speed up by 10 beats per minute, and so on. It runs on Unix.



  • ctronome - a very simple yet powerful ;) programmable console metronome software.
    • OSS Audio only, no MIDI

Click Tracker

  • Click Tracker - a program designed for composers, conductors and instrumentalists working with modern music. The main goal of the software is to prepare a click track of any score, no matter how complex it is. This software runs in Windows, OSX and Linux under the open source program Pure Data, and can be used either by conductors in concert, by musicians for practice purposes, by composers while composing.

Graphical metronomes

  • - graphical metronome and instructions display that interfaces with JACK-midi applications. The display can be completely driven by MIDI events and it can also send MIDI events. It can also be self-driven and hence run without jackd although this is somewhat less interesting since it becomes just a visual metronome.

  • - BeatViz shows a 4x4 grid that represents additive groupings of beats. (Beat here meaning a single atomic tick, equal to a 16th note within the author's Csound Live Code system. UDP controlled.

Web metronomes

  • Chrome Web Store: Dr. Beat - developed as a part of the HackTime ( project from GDG Chrome Korea. It's a metro style metronome app. It helps you to keep the beats.


  • Open Metronome - Windows only. User definable BPM; Measure can be set to any length, with emphasis on any beat(s); Each beat can be one or more of over forty voices, with the supplied Samples covering the complete General MIDI percussion set, or custom samples; Visual indicator as well as audible output;


  • Qrest is a musician toolkit aimed at helping composers, performers, recordists and mixers : Find out the tempo of a musical piece, Calculate delay times, Calculate LFO frequencies (i.e., timing conversions)

  • RASP - "RASP Aids Song Production" is a set of utilities for song production, supplementing functions missing in some DAWs. Features: Tap Tempo, Delay/Hz Calculator, Song Time Calculator, Note-to-Frequency Conversion, Simple Frequency Generator (v2), Metronome (v2)


  • - the use of an Ethernet-based network to distribute real-time digital audio. AoE replaces bulky snake cables or audio-specific installed low-voltage wiring with standard network structured cabling in a facility. AoE provides a reliable backbone for any audio application, such as for large-scale sound reinforcement in stadiums, airports and convention centers, multiple studios or stages.

While AoE bears a resemblance to voice over IP (VoIP) and audio over IP (AoIP), AoE is intended for high-fidelity, low-latency professional audio. Because of the fidelity and latency constraints, AoE systems generally do not utilize audio data compression. AoE systems use a much higher bit rate (typically 1 Mbit/s per channel) and much lower latency (typically less than 10 milliseconds) than VoIP. AoE requires a high-performance network. Performance requirements may be met through use of a dedicated local area network (LAN) or virtual LAN (VLAN), overprovisioning or quality of service features. Some AoE systems use proprietary protocols (at the higher OSI layers) which create Ethernet frames that are transmitted directly onto the Ethernet (layer 2) for efficiency and reduced overhead. The word clock may be provided by broadcast packets.

See also Networking#ISDN

  • - short for Music Local Area Network, is a transport level protocol for synchronized transmission and management of multi-channel digital audio, video, control signals and multi-port MIDI over a network. The mLAN protocol was originally developed by Yamaha Corporation, and publicly introduced in January 2000. It was available under a royalty-free license to anyone interested in utilizing the technology. mLAN exploits several features of the IEEE 1394 (FireWire) standard such as isochronous transfer and intelligent connection management. There are two versions of the mLAN protocol. Version 1 requires S200 rate, while Version 2 requires S400 rate and supports synchronized streaming of digital audio at up to 24 bit word length and 192 kHz sample rate, MIDI and wordclock at a bitrate up to 400 Megabits per second. As of early 2008, mLAN appeared to have reached the end of its product life.


  • Netjack - a Realtime Audio Transport over a generic IP Network. It is fully integrated into JACK. Syncs all Clients to one Soundcard so no resampling or glitches in the whole network. Packet loss is now also handled gracefully. By using the celt codec, its even possible, that single packet losses get masked by the Packet Loss Concealment Code.

  • is a Linux and Mac OS X-based system used for multi-machine network performance over the Internet. It supports any number of channels (as many as the computer/network can handle) of bidirectional, high quality, uncompressed audio signal streaming.


  • Zita-njbridge Command line Jack clients to transmit full quality multichannel audio over a local IP network, with adaptive resampling by the receiver(s). Zita-njbridge can be used for a one-to-one connection (using UDP) or in a one-to-many system (using multicast). Sender and receiver(s) can each have their own sample rate and period size, and no word clock sync between them is assumed. Up 64 channels can be transmitted, receivers can select any combination of these. On a lightly loaded or dedicated network zita-njbridge can provide low latency (same as for an analog connection). Additional buffering can be specified in case there is significant network delay jitter. IPv6 is fully supported.

jack_audio_send / jack_audio_receive

native JACK 32 bit float audio data on the network using UDP OSC messages.


  • - a fully operational demo of a framework to increase available audio DSP power available to JACK within a single multicore motherboard, using multiple JACK processes in concert, connected via IP transport.

Compared to jack2??

Audio programming

Less GUI, more code.

See also Dataflow, Computing#Programming, Notation

  • - typically consist of an audio programming language (which may be graphical) and a user environment to design/run the language in. Although many of these environments are comparable in their abilities to produce high-quality audio, their differences and specialties are what draw users to a particular platform. This article compares noteworthy audio synthesis environments, and enumerates basic issues associated with their use.

by Allen B. Downey

  • Fiview - freeware application for Windows, Linux and Mac OSX that can be used to design and view digital filters. It also makes it very easy to compare different filters by allowing you to switch between them using the digit keys, and it generates efficient and readable public domain example code that can be used directly in an application. It is released under the GNU GPL. Much of the underlying filter design code was based on mkfilter from Tony Fisher -- see the source code for details. The resulting filters were improved by splitting them into separate stages, which improves the accuracy and stability of them enormously, especially for higher-order Bessel and Butterworth filters. The source also includes a library, fidlib (now with its own page), which can be used to design filters at run-time. The fiview utility generates fast generic compiler-optimisable example C code both using the frequencies provided, and also in a form that allows the frequencies to be provided at run-time via a call to fidlib. This permits applications the flexibility to do things like generating banks of similar filters at run-time according to run-time parameters.

  • FFmpegSource (usually known as FFMS or FFMS2) is a cross-platform wrapper library around FFmpeg/libav. It gives you an easy, convenient way to say "open and decompress this media file for me, I don't care how you do it" and get frame- and sample-accurate access (usually), without having to bother with the sometimes less than straightforward and less than perfectly documented libav API.

  • Echo Nest Remix is the Internet Synthesizer. Make amazing things from music, automatically. Turn any music or video into Python or JavaScript code.

  • Polarbear - a tool for designing filters in the complex domain. Filters can be designed by placing any number of poles and zeros on the z plane. From this the filter coefficients are calculated, and the filter can be applied in real time on an audio stream.

  • Designing Sound by Andy Farnel
  • Welsh's Synthesizer Cookbook: Synthesizer Programming, Sound Analysis, and Universal Patch Book

  • Making Computers Sing - When most people think of computer music or sounds, “blips” and “bleeps” often are the first things to come to mind. Let’s dive in and look and at some of the ways a computer musician would make this sound Comparing Max MSP and PureData, Csound, Supercollider and ChucK, Faust.


cat /dev/urandom | hexdump -v -e '/1 "%u\n"' | awk '{ split("0,2,4,5,7,9,11,12",a,","); for (i = 0; i < 1; i+= 0.0001) printf("%08X\n", 100*sin(1382*exp((a[$1 % 8]/12)*log(2))*i)) }' | xxd -r -p | aplay -c 2 -f S32_LE -r 16000




  • - refers to a family of computer music programs and programming languages descended from or influenced by MUSIC, a program written by Max Mathews in 1957 at Bell Labs. MUSIC was the first computer program for generating digital audio waveforms through direct synthesis. It was one of the first programs for making music (in actuality, sound) on a digital computer, and was certainly the first program to gain wide acceptance in the music research community as viable for that task.

The world's first computer-controlled music was generated in Australia by programmer Geoff Hill on the CSIRAC computer which was designed and built by Trevor Pearcey and Maston Beard. However, CSIRAC produced sound by sending raw pulses to the speaker, it did not produce standard digital audio with PCM samples, like the MUSIC-series of programs.

Less obviously, MUSIC can be seen as the parent program for: RTSKED (a later RealTime Scheduling language by Max Mathews), Max/MSP Pure Data, AudioMulch, SuperCollider, JSyn, Common Lisp Music, ChucK, or any other computer synthesis language that relies on a modular system (e.g. Reaktor).


Ported to a PDP10, Music V became the Mus10 music compiler system and played scores composed in Leland Smith's SCORE language.

  • SCORE-11 was originally designed for Vercoe’s MUSIC11 system. MUSIC11, which was written in PDP-11 assembly language, was replaced in 1986 by CSOUND, a version of the program written in the programming language C, and which runs on many different computer systems. SCORE-11 works well with either version of Vercoe’s program, which will be referred to as CSOUND in this manual except where the distinction is important.


In designing the system, we decided early on to adopt a highly interactive approach to the design of the human interface. Batch processing as in Music V (Mathews: 1969) is an alternative, but one which widely separates the composer and the program, causing serious delays in the feedback loop. We feel a score editor must be interactive because there are facets of the task which demand control and aesthetic judgment by the composer in an interactive and exploratory manner. Several modes of interaction have previously been used in music systems, such as alphanumeric text as in MUSlC10 (Smith: 1978), voice recognition (Tucker, Bates, Frykberg, Howrath Kennedy, Lamb, Vaughan: 1977), and piano -type keyboard (New England Digital Corp.: 1978). In our work we have adopted a bias towards graphics -based interaction (Baecker: 1979; Newman and Sproull: 1979) in the belief that this approach can make a significant contribution towards an effective human interface. First, music lends itself well to representations in the visual domain. Second, the task of editing music is complex in the sense that there are many parameters and commands to be manipulated and controlled this complexity can be reduced by the graphic representation of information. Third, previous work (Pulfer: 1972; Tanner: 1972; Vercoe: 1975) indicates that more congenial interfaces can be constructed using dynamic graphics techniques.



The compiler was replaced in 1977 with dedicated synthesis hardware in the form of the Systems Concepts Digital Synthesizer (built by Peter Samson and known as the ``Samson Box). The Samson Box was capable of utilizing many types of synthesis techniques such as additive synthesis, frequency modulation, digital filtering and some analysis-based synthesis methods. The PLA language, written by Bill Schottstaedt, allowed composers to specify parametric data for the Samson Box as well as for other sound processing procedures on the PDP10 mainframe (and on its eventual replacement, a Foonly F4). On April 3, 1992, the Foonly and Samson Box were officially retired.


  • - a music programming language written in the 1980s by Larry Polansky, Phil Burk, and David Rosenboom at Mills College. Written on top of Forth, it allowed for the creation of real-time interactive music performance systems, algorithmic composition software, and any other kind of program that requires a high degree of musical informatics. It was distributed by Frog Peak Music, and runs with a very light memory footprint (~1 megabyte) on Macintosh and Amiga systems.

Unlike CSound and other languages for audio synthesis, HMSL is primarily a language for making music. As such, it interfaces with sound-making devices through built-in MIDI classes. However, it has a high degree of built-in understanding of music performance practice, tuning systems, and score reading. Its main interface for the manipulation of musical parameters is through the metaphor of shapes, which can be created, altered, and combined to create a musical texture, either by themselves or in response to real-time or scheduled events in a score.

HMSL has been widely used by composers working in algorithmic composition for over twenty years. In addition to the authors (who are also composers), HMSL has been used in pieces by Nick Didkovsky, The Hub, James Tenney, Tom Erbe, and Pauline Oliveros. A Java port of HMSL was developed by Nick Didkovsky under the name JMSL, and is designed to interface to the JSyn API.

Music Mouse

  • - an algorithmic musical composition software developed by Laurie Spiegel. Spiegel's best known and most widely used software, "Music Mouse - An Intelligent Instrument" (1986) is for Macintosh, Amiga and Atari computers. The "intelligent instrument" name refers to the program's built-in knowledge of chord and scale convention and stylistic constraints. Automating these processes allows the user to focus on other aspects of the music in real time. In addition to improvisations using this software, Spiegel composed several works for "Music Mouse", including Cavis muris in 1986, Three Sonic Spaces in 1989, and Sound Zones in 1990. She continued to update the program through Macintosh OS 9, and as of 2012, it remained available for purchase or demo download from her website.

Common Music


  • Common Music (CM) - a music composition system that transforms high-level algorithmic representations of musical processes and structure into a variety of control protocols for sound synthesis and display. Its main user application is Grace (Graphical Realtime Algorithmic Composition Environment) a drag-and-drop, cross-platform app implemented in JUCE (C++) and S7 Scheme. In Grace musical algorithms can run in real time, or faster-than-real time when doing file-based composition. Grace provides two coding languages for designing musical algorithms: S7 Scheme, and SAL, an easy-to-learn but expressive algol-like language.

Common Music can write scores in several different syntaxes (currently CLM, CMN, Music Kit, MIDI, CSound and Paul Lansky's real-time mixing program, RT). The scores can then be rendered on workstations using any of the target synthesis programs. For example, CLM (Common Lisp Music, written by Bill Schottstaedt) is a widely used and fast software synthesis and signal processing package that can make use of multiple Motorola 56001 DSPs.

Pla is the intellectual ancestor of CM.

Common Lisp Music


  • CLM (originally an acronym for Common Lisp Music) is a sound synthesis package in the Music V family. It provides much the same functionality as Stk, Csound, SuperCollider, PD, CMix, cmusic, and Arctic — a collection of functions that create and manipulate sounds, aimed primarily at composers (in CLM's case anyway). The instrument builder plugs together these functions (called generators here), along with general programming glue to make computer instruments. These are then called in a note list or through some user interface (provided by Snd, for example).


  • Csound - a sound and music computing system which was originally developed by Barry Vercoe in 1985 at MIT Media Lab. Since the 90s, it has been developed by a group of core developers. Although Csound has a strong tradition as a tool for composing electro-acoustic pieces, it is used by composers and musicians for any kind of music that can be made with the help of the computer. Csound has tradtionally being used in a non-interactive score driven context, but nowadays it is mostly used in in a real-time context. Csound can run on a host of different platforms incuding all major operating systems as well as Android and iOS. Csound can also be called through other programming languages such as Python, Lua, C/C++, Java, etc.
  • - Csound was originally written at MIT by Barry Vercoe, based on his earlier system called Music 11, which in its turn followed the MUSIC-N model initiated by Max Mathews at the Bell Labs. Its development continued throughout the 1990s and 2000s, led by John ffitch at the University of Bath. The first documented version 5 release is version 5.01 on March 18, 2006.


  • CsoundQt - a frontend for Csound featuring a highlighting editor with autocomplete, interactive widgets and integrated help. It is a cross-platform and aims to be a simple yet powerful and complete development environment for Csound. It can open files created by MacCsound. Csound is a musical programming language with a very long history, with roots in the origins of computer music. It is still being maintained by an active community and despite its age, is still one of the most powerful tools for sound processing and synthesis. CsoundQt hopes to bring the power of Csound to a larger group of people, by reducing Csound's intial learning curve, and by giving users more immediate control of their sound. It hopes to be both a simple tool for the beginner, as well as a powerful tool for experienced users.

Cabbage Studio

  • Cabbage Studio - a Csound based DAW with a fully functional patching interface and development environment. Develop, prototype and test Csound based audio instruments on the fly using an integrated development solution that includes an embedded source code editor and rapid GUI designer. Cabbage Studio isn't just for users familiar with Csound, it can load a number of different plugin formats including VST, AU, and LADSPA and comes with over 100 high end audio plugins ready to use out of the box.


  • PWCsound is a tool for software synthesis control implemented in Pwgl (a visual programming environment based on Lisp and Clos), provides a graphical interface to Csound6 programming language. PWCsound is, of course, inspired by several historical Patchwork libraries (in particular by Csound/Edit-sco for Patchwork, better know as "PW-Csound", 1993) and Open Music. The connection between the techniques of computer aided composition and sound synthesis has been explored within these software: Csound/Edit-sco, Pwcollider, Om2Csound, OmChroma.



  • CMask is a score file generator for Csound. Its main purpose is the generation of events to create textures of granular sounds. Versions for MacOS9, Win, Linux.

  • Hadron Particle Synthesizer - The Hadron particle synthesizer is the ultimate creative tool for granular synthesis. It is available in different plugin formats (VST, AU and Max For Live), even though the graphic appearance is slightly different for the different plugins wrappers, the audio functionality of Hadron is the same in all formats.


  • Cecilia - a graphic user interface for the sound synthesis and sound processing package Csound. Cecilia enables the user to build very quickly graphic interfaces with sliders and curves to control Csound intruments. It is also an editor to Csound with syntax highlighting and a built-in reference. Cecilia is also a great tool to explore the parameters of a new opcode in an interactive and intuitive way.


  • Blue - An Integrated Music Environment, powered by Csound. An open-source, cross-platform desktop application for composing music. Use visual tools together with text and code to create the music of your dreams.


  • - The goal of this project is to (re)implement the Clavia Nord Modular G2 sound engine in Csound, a well-known sound and music computing system.

Cmix / RTcmix

  • RTcmix - An Open-Source, Digital Signal Processing and Sound Synthesis Language, one of the MUSIC-N family of computer music programming languages. RTcmix is descended from the MIX program developed by Paul Lansky at Princeton University in 1978 to perform algorithmic composition using digital audio soundfiles on a IBM 3031 mainframe computer. After synthesis functions were added, the program was renamed Cmix in the 1980s. Real-time capability was added by Brad Garton and David Topper in the mid-1990s, with support for TCP socket connectivity, interactive control of the scheduler, and object-oriented embedding of the synthesis engine into fully featured applications.

RTcmix has a number of unique (or highly unusual) features when compared with other synthesis and signal processing languages. For one, it has a built-in MINC parser, which enables the user to write C-style code within the score file, extending its innate capability for algorithmic composition and making it closer in some respects to later music software such as SuperCollider and Max/MSP. It uses a single-script instruction file (the score file), and synthesis and signal processing routines (called instruments) exist as compile shared libraries. This is different from MUSIC-N languages such as Csound where the instruments exist in a second file written in a specification language that builds the routines out of simple building blocks (organized as opcodes or unit generators). RTcmix has similar functionality to Csound and other computer music languages, however, and their shared lineage means that scripts written for one language will be extremely familiar-looking (if not immediately comprehensible) to users of the other language.


  • Playrec - a Matlab and Octave utility (MEX file) that provides simple yet versatile access to soundcards using PortAudio, a free, open-source audio I/O library. It can be used on different platforms (Windows, Macintosh, Unix) and access the soundcard via different host API including ASIO, WMME and DirectSound under Windows.


  • Microtone - How tiny you ask? 128 bytes. The sound generation itself takes up just 14 lines of assembly, generating 31 bytes of machine code. The rest is taken up the ELF header and functions needed to open the sound device and outputting samples
  • Coding an Equalizer - This article expands the section on page 20 of my book The Audio Expert that describes briefly how digital equalizers work. As explained in the book, all equalizers are based on filters of various types. Computer code that implements a filter is called Digital Signal Processing, or DSP for short. Most digital filters emulate equivalent analog filters, and the common language for all filters is mathematics. Therefore, several trigonometry formulas are shown below, and there's no escaping this! But the basic operation of the computer code that implements an equalizer is not too difficult to follow, even if you don't understand the formulas. To keep this example as brief as possible, the code implements a simple high-pass filter having one pole (6 dB per octave). Formulas to implement other filter types including those used in parametric equalizers are shown on the Cookbook Formulae web page by Robert Bristow-Johnson.

ld_preload sounds

  • - Generates WAV output by hooking malloc() and read(). Adding support for other calls should be pretty easy, pull-requests are much welcomed! Also, it should go without saying... but I will say it anyway... this is experimental.


  • io.c (read and write sound file data)
  • headers.c (read and write sound file headers)
  • audio.c (read and write sound hardware ports)
  • sound.c (provide slightly higher level access to the preceding files)
  • sndlib.h (header for the preceding files)
  • sndlib2xen.c and sndlib-strings.h (tie preceding into s7, Ruby, or Forth)
  • clm.c and clm.h (Music V implementation)
  • clm2xen.c, vct.c and vct.h (tie clm.c into s7, Ruby, or Forth)
  • xen.h, xen.c (the embedded language support)


  • - a C library for reading and writing files containing sampled sound (such as MS Windows WAV and the Apple/SGI AIFF format) through one standard library interface. It is released in source code format under the Gnu Lesser General Public License.


  • TSP Lab - Reports / Software - a library of routines for signal processing. It also includes a number of general purpose routines useful for program development. Programs using this library for filtering, LPC analysis/synthesis and resampling are available as part of the AFsp package.


Audio File Library

  • Audio File Library is a C-based library for reading and writing audio files in many common formats. The Audio File Library provides a uniform API which abstracts away details of file formats and data formats. The same calls for opening a file, accessing and manipulating audio metadata (e.g. sample rate, sample format, textual information, MIDI parameters), and reading and writing sample data will work with any supported audio file format. The Audio File Library lets you work with audio data in whatever format is most convenient for you.



  • Lyd - an embeddable signal processing language and engine. Suitable for among other things realtime audio effect and instrument synthesis and mixing. It can form the audio core of games, virtual instruments, real-time audio mixing and editing experiments. Lyd is currently mostly an application programmers toy, out of the box at the moment lyd contains a mediocre autogenerated approximation of an OPL2 FM synthesizer in its patch set. When you launch lyd you can use it as a synthesizer with an attached MIDI keyboard or with MIDI sequencers supporting ALSA midi. Lyd should ble able to enable processing plug-ins for various audio synthesis standards exposing the lyd language and efficiency in interaction to allow live tweaking of the lyd code when composing.


  • Pyo - a Python module written in C to help DSP script creation. Pyo contains classes for a wide variety of audio signal processing. With pyo, the user will be able to include signal processing chains directly in Python scripts or projects, and to manipulate them in real time through the interpreter. Tools in the pyo module offer primitives, like mathematical operations on audio signals, basic signal processing (filters, delays, synthesis generators, etc.), but also complex algorithms to create sound granulation and other creative audio manipulations. pyo supports the OSC protocol (Open Sound Control) to ease communications between softwares, and the MIDI protocol for generating sound events and controlling process parameters. pyo allows the creation of sophisticated signal processing chains with all the benefits of a mature and widely used general programming language.


  • Allegro - a cross-platform library mainly aimed at video game and multimedia programming. It handles common, low-level tasks such as creating windows, accepting user input, loading data, drawing images, playing sounds, etc. and generally abstracting away the underlying platform. However, Allegro is not a game engine: you are free to design and structure your program as you like. Allegro 5 has the following additional features: Supported on Windows, Linux, Mac OSX, iPhone and Android, User-friendly, intuitive C API usable from C++ and many other languages, Hardware accelerated bitmap and graphical primitive drawing support (via OpenGL or Direct3D), Audio recording support, Font loading and drawing, Video playback, Abstractions over shaders and low-level polygon drawing


  • Soundpipe - a lightweight music DSP library written in C. It aims to provide a set of high-quality DSP modules for composers, sound designers, and creative coders. Soundpipe supports a wide range of synthesis and audio DSP techniques which include: Classic Filters (Moog, Butterworth, etc), High-precision and linearly interpolated wavetable oscillators, Bandlimited oscillators (square, saw, triangle), FM synthesis, Karplus-strong instruments, Variable delay lines, String resonators, Spectral Resynthesis, Partitioned Convolution, Physical modeling, Pitch tracking, Distortion, Reverberation, Samplers and sample playback, Padsynth algorithm [18]
  • Sporth - SoundPipe fORTH, is a small stack-based audio programming language. For composers, Sporth is a different approach to making sound. Stack based languages are somewhat novel in the world of musical languages, and lend themselves well to modular sound design scenarios. Sporth syntax is simple to learn, and rewarding to master. Sound designers and composers fluent in languages like Csound, ChucK, and Supercollider will find Sporth a new and refreshing take on the same basic concepts. For developers, Sporth has a simple API that allows it to be used inside of other applications. In addition to compiling Sporth code, the API has access to other features of Sporth that would otherwise be unavailable, such as audio-rate software channels, and user defined function callbacks.

  • - an ANSI C library for generating audio-rate line segments and curves for computer-based music. Libline can easily interface with sample-accurate audio libraries like Soundpipe, and a local copy is used inside of Sporth via Polysporth.

  • - Moons is an isorhythmic circular sequencer. It is written using a combination of C and C++. All visuals are created using OpenGl; All the sounds are snythesized in realtime using Sporth, Soundpipe, and RTaudio.

Sound Open Firmware

Kiss FFT

  • - A Fast Fourier Transform based up on the principle, "Keep It Simple, Stupid." Kiss FFT is a very small, reasonably efficient, mixed radix FFT library that can use either fixed or floating point data types.


  • - a simple library for playing and recording audio. It's focused on simplicity and has a very small number of APIs. C/C++, single file, public domain.


  • - Realtime Audio Graph Engine, provides an audio processing graph implementation tailored towards distributed audio workstations. The audio graph within RAGE is composed of processing elements and connections between elements which are managed by the RAGE host. Rage processing elements can be both traditional data transforms (effects, mixers, etc) as well as data producers (samplers, synthesizers, etc) and data consumers (recorders, analysis plugins). The RAGE host manages the scheduling of both the low-latency audio commutations and the higher latency subtasks such as loading a file from disk for sample replay. Additionally, RAGE provides session time interpolated parameters, freeing individual elements from this responsibility.


  • Suil is a lightweight C library for loading and wrapping LV2 plugin UIs.
  • Lilv - a C library to make the use of LV2 plugins as simple as possible for applications. Lilv is the successor to SLV2, rewritten to be significantly faster and have minimal dependencies. It is stable, well-tested software (the included test suite covers over 90% of the code) in use by several applications.

  • Ficus - Realtime audio sampler api for linux devs. This development library provides your application with both multichannel playback/capture of mono wav audio files and all the flexibility and routing possibilities of a JACK client.


  • YouTube: Ian Hobson - The use of std variant in realtime DSP (ADC'17) - Application Developer & Software Engineer, Ableton .C++17 introduces std::variant, a type-safe union class. A variant's value represents one of a fixed set of possible types, with C++'s type system ensuring that correct code paths are executed for the active type. This talk will explore the pros and cons of working with variants, with a special focus on DSP. Variants allow for well defined interfaces and minimal memory footprints, but what are they like to use in practice, and are they performant enough for realtime use?

Synthesis ToolKit in C++

  • Synthesis ToolKit in C++ (STK) is a set of open source audio signal processing and algorithmic synthesis classes written in the C++ programming language. STK was designed to facilitate rapid development of music synthesis and audio processing software, with an emphasis on cross-platform functionality, realtime control, ease of use, and educational example code. The Synthesis ToolKit is extremely portable (it's mostly platform-independent C and C++ code), and it's completely user-extensible (all source included, no unusual libraries, and no hidden drivers). We like to think that this increases the chances that our programs will still work in another 5-10 years. In fact, the ToolKit has been working continuously for nearly 20 years now. STK currently runs with realtime support (audio and MIDI) on Linux, Macintosh OS X, and Windows computer platforms. Generic, non-realtime support has been tested under NeXTStep, Sun, and other platforms and should work with any standard C++ compiler.


  • RtAudio - a set of C++ classes that provide a common API for realtime audio input/output across Linux (native ALSA, JACK, PulseAudio and OSS), Macintosh OS X (CoreAudio and JACK), and Windows (DirectSound, ASIO and WASAPI) operating systems.




  • - Fast and easy audio synthesis in C++. Prefer coding to patching? Love clean syntax? Care about performance? That's how we feel too, and why we made Tonic.
  • - Open Frameworks Addon for the Tonic audio synthesis Library. Tonic is an efficient, pure C++ patching tool with a refreshingly crisp and simple syntax.


  • CLAM - (C++ Library for Audio and Music) is a software framework for research and application development on the audio and music domain. It provides means to perform complex audio signal analysis, transformations and synthesis. It also provides a uniform interface to common tasks on audio applications such as accessing audio devices and audio files, thread safe communication with the user interface and DSP algorithms recombination and scaling.

You can use CLAM as a library to program your applications in C++ but you can also use graphical tools to build full applications without coding.

  • - spectral; analyzes, transforms and synthesizes back a given sound. For doing so, it uses the Sinusoidal plus Residual model (sometimes referred to as SMS but also known as HILN in the context of MPEG4).

DISTRHO Plugin Framework

  • - designed to make development of new plugins an easy and enjoyable task. It allows developers to create plugins with custom UIs using a simple C++ API. The framework facilitates exporting various different plugin formats from the same code-base. DPF can build for LADSPA, DSSI, LV2 and VST formats. All current plugin format implementations are complete. A JACK/Standalone mode is also available, allowing you to quickly test plugins. Plugin DSP and UI communication is done via key-value string pairs.

JUCE C++ Library

  • - Like many other frameworks (e.g., Qt, wxWidgets, GTK+, etc.), JUCE contains classes providing a range of functionality that covers user-interface elements, graphics, audio, XML and JSON parsing, networking, cryptography, multi-threading, an integrated interpreter that mimics ECMAScript's syntax, and various other commonly used features. Application developers needing several third-party libraries may thus be able to consolidate and use only the JUCE library, or at least reduce the number of third-party libraries they use. In this, the original inspiration was Java's JDK, and JUCE was intended to be "something similar for C++".

A notable feature of JUCE when compared to other similar frameworks is its large set of audio functionality; this is because JUCE was originally developed as a framework for Tracktion, an audio sequencer, before being split off into a standalone product. JUCE has support for audio devices (such as CoreAudio, ASIO, ALSA, JACK, WASAPI, DirectSound) and MIDI playback, polyphonic synthesizers, built-in readers for common audio file formats (such as WAV, AIFF, FLAC, MP3 and Vorbis), as well as wrappers for building various types of audio plugin, such as VST effects and instruments. This has led to its widespread use in the audio development community.

JUCE comes with wrapper classes for building audio and browser plugins. When building an audio plugin, a single binary is produced that supports multiple plugin formats (VST & VST3, RTAS, AAX, Audio Units). Since all the platform and format-specific code is contained in the wrapper, a user can build Mac and Windows VST/VST3/RTAS/AAX/AUs from a single codebase.

  • - a 3rd party JUCE module designed for rapid audio application development. It contains classes for audio processing and gui elements. Additionally there are several wrappers around 3rd party libraries including cURL, FFTReal and SoundTouch. dRowAudio is written in the strict JUCE style, closely following the style guide set out at JUCE Coding Standards.


  • Jamoma - a C++ platform for building dynamic and reflexive systems with an emphasis on audio and media. Platform is composed of a layered framework architecture that creates an object model and then specializes that object model for audio and matrix processing, and system automation and management. Jamoma makes use of polymorphic typing, dynamic binding, and introspection to create a cross-platform API pulling ideas from languages such as Smalltalk and Objective-C while remaining within the bounds of the portable and cross-platform C++ context. The implementations include modular environments for Max by Cycling '74 and Pd by Miller Puckette.



  • - A Collection of Useful C++ Classes for Digital Signal Processing. "Techniques for digital signal processing are well guarded and held close to the chest, as they have valuable applications for multimedia content. The black art of Infinite Impulse Response ("IIR") filtering has remained veiled in secrecy with little publicly available source code...until now."

Building on the work of cherished luminaries such as Sophocles Orfanidis, Andreas Antoniou, Martin Holters, and Udo Zolzer, this library harnesses the power of C++ templates to solve a useful problem in Digital Signal Processing: the realization of multichannel IIR filters of arbitrary order and prescribed specifications with various properties such as Butterworth, Chebyshev, Elliptic, and Optimum-L (Legendre) responses. The library is provided under the MIT license and is therefore fully compatible with proprietary usage.

Classes are designed as independent re-usable building blocks. Use some or all of the provided features, or extend the functionality by writing your own objects that plug into the robust framework. Only the code that you need will get linked into your application. Here's a list of features: Exclusive focus on IIR filters instead of boring FIR filters, Complete implementation of all "RBJ Biquad" Cookbook filter formulas, Butterworth, Chebyshev, Elliptic, Bessel, Legendre designs, Low Pass, High Pass, Band Pass, Band Stop transformations, Low, High, and Band Shelf filter implementations for most types, Smooth interpolation of filter settings, pole/zeros, and biquad coefficients to achieve seamless parameter changes, Representation of digital filters using poles and zeros, Realization using Direct Form I, Direct Form II, or user provided class, Fully factored to minimize template instantiations, "Design" layer provides runtime introspection into a filter, Utility template functions for manipulating buffers of sample data, No calls to malloc or new, great for embedded systems, No external dependencies, just the standard C++ library!



  • - Digital audio processing and synthesis library. The following operations are implemented: Amplification, Mixing, Panning, Bit depth conversion, Sample rate conversion, Pitch Shifting. The following oscillators are implemented: Sine wave, Pulse wave, Triangle wave, Sawtooth wave, White noise. The following sample formats are implemented out of the box (you can always add yours): 16 bit signed int, 24 bit signed int, 32 bit signed int, 32 bit float, 64 bit float. The following file formats are supported: Microsoft RIFF Wave. The following audio interfaces available for playback: ALSA.

  • - This library provides a convenient interface for manipulation of MIDI data as well as providing higher level abstractions for I/O.


  • - a set of audio filters. It helps assembling workflows for specific audio processing workloads. The audio workflow is split in independent components (without feedback loops) that consist of filters. Each filter has a set of synchronized input and output ports that can be connected together. All input ports must be connected, but not all output ports need to be. Sampling rate can be independent between input and output ports, but input sampling rates are identical, and output sampling rates are also identical.


  • - Realtime Audio Utility Library, is a C++ utility library primarily aimed at audio/musical applications. It is used by Ingen, Patchage, and Machina.



  • JackCpp - c++ classes which wrap the Jack audio io api and lock-free ring buffer. Works with Linux and OSX (thanks to Will Wolcott for OSX testing and example/test file comments).


  • Maximilian - an open source, MIT licensed C++ audio synthesis library. It’s designed to be cross platform and simple to use. The syntax and program structure are based on the popular ‘Processing’ environment. Maximilian provides standard waveforms, envelopes, sample playback, resonant filters, and delay lines. In addition, equal power stereo, quadraphonic and 8-channel ambisonic support is included. There’s also Granular synthesisers with Timestretching, FFTs and some Music Information Retrieval stuff.

Tracktion Engine


  • - a cross-platform c++ library for online estimation of the Continuous Wavelet Transform (CWT). The online estimation is based on a filterbank implementation of the CWT with minimal delay per scale and optimization based on multi-rate computation. The library also allows for offline estimation of the CWT using FFT.


  • LV2 programming for the complete idiot - an LV2 plugin programming guide for the complete idiot using a set of C++ classes. If you are not a complete idiot, you may want to read the LV2 spec and figure it out for yourself.
  • LV2 is an interface for writing audio processors, or plugins, in C/C++ which can be dynamically loaded into many applications, or hosts. This core specification is simple and minimal, but is designed so that extensions can be defined to add more advanced features, making it possibly to implement nearly any feature imaginable. API docs)




Java Sound


  • JSyn - allows you to develop interactive computer music programs in Java. You can run them as stand-alone applications, or as Applets in a web page. JSyn can be used to generate sound effects, audio environments, or music. JSyn is based on the traditional model of unit generators which can be connected together to form complex sounds. For example, you could create a wind sound by connecting a white noise generator to a low pass filter that is modulated by a random contour generator.


  • Beads is a software library written in Java for realtime audio. It was started by Ollie Bown in 2008. It is an open source project and has been developed with support from Monash University in Melbourne, via the Centre for Electronic Media Art‘s ARC Discovery Grant Project “Creative Ecosystems”, and a Small Grant for Early Career Researchers from the Faculty of Information Technology. Beads contributors includes Ollie Bown, Ben Porter and Benito.


  • JMSL is a Java API for music composition, interactive performance, and intelligent instrument design. With JMSL, the composer/programmer can create stand-alone musical applications or deploy applets on the web. JMSL supports JSyn, MidiShare, MidiPort, and JavaSound.



  • JFugue - Music Programming for Java™ and JVM Languages


  • JASS (Java Audio Synthesis System) is a unit generator based audio synthesis programming environment written in pure Java. Java 1.5 is required. The environment is based on a foundation structure consisting of a small number of Java interfaces and abstract classes, which implement the functionality needed to create filter-graphs, or "patches". Unit generators are created by extending the abstract classes and implementing a single method. Patches are created by linking together unit generators in arbitrary complex graph structures. Patches can be rendered in real-time with special unit generators that communicate with the audio hardware, which have been implemented using the JavaSound API and through JNI for some platforms.


  • jVSTwRapper - an easy and reliable wrapper to write audio plug-ins in Java. It enables you to develop VST (2.4), Audio Unit (AU) and LADSPA compatible audio plugins and virtual instruments plus user interfaces (Swing) and run them on Windows, Linux and Mac OSX. Five demo plugins (+src) are included.



  • PyJack - jack audio client module for python



  • python-wavefile - Pythonic libsndfile wrapper to read and write audio files.


  • PyWavelets is a free Open Source wavelet transform software for Python programming language. It is written in Python, Cython and C for a mix of easy and powerful high-level interface and the best performance.


  • Pippi - Computer music with python


  • Undulance is a Python software synthesis library. It mostly relies on PyPy to work at a sufficient speed, since CPython is too slow.


  • - Waveform and Audio Synthesis library in Go. Klangsynthese right now supports a number of features that will work regardless of OS, and a number of features specific to Windows where the hope is to move support to Linux and Darwin.



See Lua



1996 / OSS in 2002

  • SuperCollider is an environment and programming language for real time audio synthesis and algorithmic composition. It provides an interpreted object-oriented language which functions as a network client to a state of the art, realtime sound synthesis server.

  • Utopia is a SuperCollider library for the creation of networked music applications, and builds upon the work of the Republic Quark and other existing network systems in SuperCollider. It aims to be modular (features available largely 'à la carte'), secure (provides methods for authentication and encryption), and flexible (to the extent possible, it tries not to impose a particular design or architecture). It provides functionality for synchronisation, communication, code sharing, and data sharing.

  • Modality Toolkit - simplifies creation of highly personalised electronic instruments in SuperCollider by introducing a common code interface. This allows for uniform access to HID, MIDI, OSC and GUI-based conrollers, as well as switching of functionality, even at runtime.



  • Overtone is an open source audio environment being created to explore musical ideas from synthesis and sampling to instrument building, live-coding and collaborative jamming. We use the SuperCollider synth server as the audio engine, with Clojure being used to develop the APIs and the application. Synthesizers, effects, analyzers and musical generators can be programmed in Clojure.
lein repl
user=>(use '


  • - provides the basis for developing music systems. It is also designed so to scale to user needs, whether they are exploring and designing low-level signal processing algorithms, developing pre-written compositions, or creating interactive real-time systems. It offers a slim core engine designed to be highly customizable.


  • Mellite - an environment for creating experimental computer-based music and sound art. This system has been developed since 2012 by its author, Hanns Holger Rutz, and is made available under the GNU GPL open source license.



  • ChucK is a programming language for real-time sound synthesis and music creation. It is open-source and freely available on MacOS X, Windows, and Linux. ChucK presents a unique time-based, concurrent programming model that's precise and expressive (we call this strongly-timed), dynamic control rates, and the ability to add and modify code on-the-fly. In addition, ChucK supports MIDI, OpenSoundControl, HID device, and multi-channel audio. It's fun and easy to learn, and offers composers, researchers, and performers a powerful programming tool for building and experimenting with complex audio synthesis/analysis programs, and real-time interactive music. [23]

  • LiCK - Library for ChucK.


  • Processing is a flexible software sketchbook and a language for learning how to code within the context of the visual arts. Since 2001, Processing has promoted software literacy within the visual arts and visual literacy within technology. There are tens of thousands of students, artists, designers, researchers, and hobbyists who use Processing for learning and prototyping.



  • FAUST (Functional Audio Stream) is a functional programming language specifically designed for real-time signal processing and synthesis. FAUST targets high-performance signal processing applications and audio plug-ins for a variety of platforms and standards. Simply put, Faust lets one program dsp code once in a purely functional language, and compile it to various platforms including max/msp, supercollider, audio unit, vst, and more.


git clone git://


Grouping solution 1: [25]

a = vgroup("term1", nentry("a",1,0,10,1));
b = vgroup("term1", nentry("b",2,0,10,1));
x = vgroup("term2", nentry("x",3,0,10,1));
y = vgroup("term2", nentry("y",4,0,10,1));
process = a*x + b*y;

Solution 2:

a = nentry("v:term1/a",1,0,10,1);
b = nentry("v:term1/b",2,0,10,1);
x = nentry("v:term2/x",3,0,10,1);
y = nentry("v:term2/y",4,0,10,1);
process = a*x + b*y

process = _ : _;       // series combination (1 in, 1 out)
process = _ , _;       // parallel combination (2 ins, 2 outs)
process = +;           // summer (2 ins, 1 out)process = _,_ : +;     // same summer
process = _,_ : + : _; // same summer
process = -;           // signal subtractor
process = *;           // pointwise signal multiplier (nonlinear)7
process = /;           // pointwise signal divider (nonlinear)
process = mem;         // unit-sample delay
process = _, 1 : @;    // unit-sample delay
process = _,10 : @;    // ten-sample delay
process = a ~ b;       // feedback thru b around a
process = _ ~ _ ;      // feedback thru _ (generates 0)
process = mem ~ _;     // two-sample closed loop (generates 0)
process = + ~ _;       // digital integrator
process = _ <: _ , _;  // mono to stereo
process = _ <: _ , _, _, _;     // mono to quad
process = _ , _ <: _ , _, _, _; // stereo to quad (see diagram)
process = _ , _ :> _;           // stereo to mono [equiv to +]
process = _, _ , _ , _ :> _ ;   // quad to mono [equiv to +,+:+]

  • PDF: FAUST : an Efficient Functional Approach to DSP Programming - Yann Orlarey, Dominique Fober and Stephane Letz. FAUST is a programming language that provides a purely functional approach to signal processing while offering a high level of performance. FAUST aims at being complementary to existing audio languages by offering a viable and efficient alternative to C/C++to develop signal processing libraries, audio plug-ins or standalone applications. The language is based on a simple and well formed formal semantics. A FAUST program denotes a signal processor, a mathematical function that transforms input signals into output signals. Being able to know precisely what a program computes is important not only for programmers, but also for compilers needing to generate the best possible code. Moreover these semantics questions are crucial for the longterm preservation of music programs. The following paragraphs will give an overview of the language as well as a description of the compiler, including the generation of parallel code.

  • - provides an LV2 plugin architecture for the Faust programming language. The package contains the Faust architecture and templates for the needed LV2 manifest (ttl) files, a collection of sample plugins written in Faust, and a generic GNU Makefile as well as a shell script to compile plugins using the architecture.
  • - provides a VST plugin architecture for the Faust programming language. The package contains the Faust architecture, faustvst.cpp, the faust2faustvst helper script which provides a quick way to compile a plugin, a collection of sample plugins written in Faust, and a generic GNU Makefile for compiling and installing the plugins.

Snack Sound Toolkit

  • Snack Sound Toolkit - designed to be used with a scripting language such as Tcl/Tk or Python. Using Snack you can create powerful multi-platform audio applications with just a few lines of code. Snack has commands for basic sound handling, such as playback, recording, file and socket I/O. Snack also provides primitives for sound visualization, e.g. waveforms and spectrograms. It was developed mainly to handle digital recordings of speech, but is just as useful for general audio. Snack has also successfully been applied to other one-dimensional signals.

Music as Data


  • Extempore - a programming language and runtime environment designed to support 'cyberphysical programming'. Cyberphysical programming supports the notion of a human programmer operating as an active agent in a real-time distributed network of environmentally aware systems. The programmer interacts with the distributed real-time system procedurally by modifying code on-the-fly.

Central to the Extempore programming environment is a new systems programming language designed to support the programming of real-time systems in real-time. xtlang is designed to mix the high-level expressiveness of Lisp with the low-level expressiveness of C. xtlang uses an s-expression syntax common to Lisp, and more particularly to Scheme. xtlang also borrows many Lisp like semantics including first class closures, tail recursion and macros. However, xtlang also borrows heavily from systems languages like 'C' including static typing, low-level type expressivity, direct pointer manipulation and explicit memory managment (i.e. no GC). xtlang then extends these 'C' semantics with type-inferencing, ad-hoc polymorphism, reified generics, and zone/region based memory management.


  • Nyquist is a sound synthesis and composition language offering a Lisp syntax as well as an imperative language syntax (SAL) and a powerful integrated development environment.. Nyquist is an elegant and powerful system based on functional programming.


  • athenaCL system - an open-source, object-oriented composition tool written in Python. The system can be scripted and embedded, and includes integrated instrument libraries, post-tonal and microtonal pitch modeling tools, multiple-format graphical outputs, and musical output in Csound, SuperCollider, Pure Data, MIDI, audio file, XML, and text formats.

Musical parts are deployed as Textures, layered surface-defining objects containing numerous independent ParameterObjects to control pitch, tempo, rhythm, amplitude, panning, and instrument (Csound) parameters. The system includes an integrated library of Csound and SuperCollider instruments, and supports output for external Csound instruments, MIDI, and a variety of alternative formats. Over eighty specialized Generator, Rhythm, and Filter ParameterObjects provide tools for stochastic, chaotic, cellular automata based, Markov based, generative grammar and Lindenmayer system (L-system), wave-form, fractional noise (1/f), genetic, Xenakis sieve, linear and exponential break-point segments, masks, and various other algorithmic models. ParameterObjects can be embedded in other ParameterObjects to provide powerful dynamic and masked value generation. Textures can be combined and edited, and tuned with algorithmic Temperament objects. Texture Clones allow the filtering and processing of Texture events, performing transformations not possible with parameter generation alone.

Audio Programming Environment

  • Audio Programming Environment (APE) is an open-source audio plugin, that allows to directly script/code DSP real time, integrated in your signal chain. Utilizing a built-in code editor, compiler, console and a basic control surface API, testing and prototyping DSP is extremely easy and convenient.


  • AudioKit is a powerful audio toolkit for synthesizing, processing, and analyzing sounds. It contains several examples for iOS (iPhone & iPad) and Mac OSX, written in both Objective-C and in Swift. A test suite is provided for many of the operations included in AudioKit. A playground project can be used for trying out AudioKit instruments and for greatly speeding up the development of your own instruments and applications. [26]

The #MusicBricks Toolkit


  • Jacktube is an open source audio/MIDI processing program. It uses LADSPA and DSSI plugins to generate and process audio, and MIDI events to control its operation. The exact behavior is defined by using a simple scripting language to define rules. Even though Jacktube is primarily meant for audio work, it can be used in any signal processing application. The language has some superficial similarities to Perl, but the programming language is designed to be as small and efficient as possible for its purpose, namely setting up plugin graphs and responding to MIDI events.

Sonic Pi

  • Sonic Pi - The Live Coding Synth for Everyone. Simple enough for computing and music lessons. Powerful enough for professional musicians. Free to download with a friendly tutorial. Learn to code creatively by composing or performing music in an incredible range of styles from classical to algorave. Ruby DSL.



  • alda - Inspired by other music/audio programming languages such as PPMCK, LilyPond and ChucK, Alda aims to be a powerful and flexible programming language for the musician who wants to easily compose and generate music on the fly, using naught but a text editor. Alda is designed in a way that equally favors aesthetics, flexibility and ease of use, with (eventual) support for the text-based creation of all manner of music: classical, popular, chiptune, electroacoustic, and more! [27] [28] [29]

Platonic Music Engine

  • Platonic Music Engine takes an initial input from the user (like a name or number or random string of characters) and converts it using a non-random process into a piece of music, the Platonic Score. The software then allows you to manipulate this random-sounding music via the use of various style algorithms and quantizers into sounding like any style of music imaginable while still preserving the Platonic Score in its core.

Serpent / Aura

  • Serpent is the scripting language for Aura, a platform for computer music, animation, and interactive systems. Serpent was designed and implemented as a stand-alone, general purpose interpreter. Serpent is perhaps ideal as a game scripting language due to its real-time design and support for external C++ objects and C functions. It is open source, and I would be happy to share code as well as future design and development with others.
  • AuraRT is a software framework for creating interactive multimedia software, particularly advanced interactive music compositions. A subproject is AuraFX, a flexible signal processor configurable by end-users.


  • TMC - Tiny Music Compiler, a DSL (Domain-Specific Language) that describes a set of operations of audio files. It does not manipulate audio itself. Instead, it calls existing tools such as SoX.


  • bipscript - a simple programming environment for creating music. instantiate and connect LV2 plugins to create audio and MIDI networks, schedule MIDI and other control events directly on the plugins and system outputs, schedule logic to react to external events e.g. from a human performer.

"For this example we'll create two LV2 plugins: a software synthesizer and a reverb; we'll feed the output of the synth into the reverb and connect the reverb to the main system outputs. Also note that we set the initial value of the reverb amount to zero;"

local synth = Lv2.Plugin("", "Velo Bee")
local reverb = Lv2.Plugin("")
reverb.setControl("amount", 0.0)
local mainOutput = Audio.StereoOutput("main", "system:playback_1", "system:playback_2")



  • wcnt - Wav Composer Not Toilet is a not real time modular audio synthesis/sequencer/sampler application for GNU/Linux systems. It outputs audio into 8/16/24/32bit PCM or floating point format .WAV audio files. wcnt is commandline based and reads plain text files, within which definitions of modules and data objects are placed. Modules are where the synthesis/sampling/sequencing happens and operate on a sample by sample basis. Transmission of events between modules only occurrs at the time of the event. Occurrences of events are transparent, the data stream is continuous.


  • Tao is a software package for sound synthesis using physical models. It provides a virtual acoustic material constructed from masses and springs which can be used as the basis for building quite complex virtual musical instruments. Tao comes with a synthesis language for creating and playing instruments and a fully documented (eventually) C++ API for those who would like to use it as an object library.


  • Modus is an open source, cross-platform C++ library which allows you to handle music from code.



  • AeonWave is a low-level, hardware accelerated 4D spatialized audio library aimed at the professional simulation market. The software currently runs on Windows and Linux for ARM and x86 and tests have shown that AeonWave renders 3D audio between 450% and 1400% faster than any competing product depending on the hardware configuration. AeonWave started out as project Anaconda; a fast rendering new OpenAL implementation. After realizing this would not be good enough for spatialized 3D audio demands the library has been rewritten;

Game audio

  • - a high level 2D and 3D cross platform (Windows, Mac OS X, Linux) sound engine and audio library which plays WAV, MP3, OGG, FLAC, MOD, XM, IT, S3M and more file formats, and is usable in C++ and all .NET languages (C#, VisualBasic.NET, etc). It has all the features known from low level audio libraries as well as lots of useful features like a sophisticated streaming engine, extendable audio reading, single and multithreading modes, 3d audio emulation for low end hardware, a plugin system, multiple rolloff models and more. All this can be accessed via an extremely simple API.

Creative / live coding

to resort

  • - a type of computer programming in which the goal is to create something expressive instead of something functional. It is used to create live visuals and for VJing, as well as creating visual art and design, art installations, projections and projection mapping, sound art, advertising, product prototypes, and much more.


  • Fluxus -a rapid prototyping, playing and learning environment for 3D graphics, sound and games. Extends the Racket language with graphical commands and can be used within it’s own livecoding environment or from within the DrRacket IDE. Fluxus is crossplatform (Linux, Windows, OSX, Android, PS2), and is released under the GPL licence.


  • Impromptu - an OSX programming language and environment for composers, sound artists, VJ's and graphic artists with an interest in live or interactive programming. Impromptu is a Scheme language environment, a member of the Lisp family of languages. Impromptu is used by artist-programmers in livecoding performances around the globe.


  • ixi lang v3 live coding environment is an extremely simple and visual system, presenting a high entry level control over synth definitions and samples in SuperCollider. The core idea is to represent events in a spatial layout, thus merging musical code and musical scores. The score is active, i.e., if a method is performed upon the score, it changes in real time.. The development of ixi lang is part of a research involving human-machine interaction, the philosophy of technology and the culture of software use in music. In return for this free software we would like to ask you few questions regarding your experience of the software.


  • TidalCycles - or Tidal for short, is a language for live coding pattern. It allows you to make musical patterns with text, describing sequences and ways of transforming and combining them, exploring complex interactions between simple parts.

Tidal allows you to express music with very flexible timing, providing a little language for describing patterns as step sequences (which can be polyphonic and polymetric), some generators of continuous patterns (e.g. sinewaves, sawtooths) and a wide range of pattern transformations. Tidal is highly ‘composable’ in that pattern transformations can be easily combined together, allowing you to quickly create complex patterns from simple ingredients.


  • Livecodelab is a special secret place where you can make fancy "on-the-fly" 3d visuals and play awesomely offbeat (literally) sounds. "On-the-fly" meaning: as you type. Type just three letters: "box", and boom! a box appears. No clicking play, no waiting, no nothing. What are you waiting for? Try the magic. Press the button below and play with the examples.


  • Al-Jazari is livecoded entirely by gamepad, and employs a simple graphical language to allow robots to interact with each other and move over a terrain populated by audio triggers. The running code is displayed and edited in thought bubbles over each robot. For upcoming performance dates see this page.







  • Protoplug - a VST/AU plugin that lets you load and edit Lua scripts as audio effects and instruments. The scripts can process audio and MIDI, display their own interface, and use external libraries. Transform any music software into a live coding environment!



  • Klangmeister is a live coding environment for the browser. It lets you design synthesisers and compose music using computer code - without having to install anything on your own computer. Klangmeister works best in Chrome, because the synthesis features that it relies on have patchy support across the other browsers. [31]



  • Echolevel - Wulfcode - a MIDI live coding language and environment. The current release is written in Java and runs on MacOS, Windows and Linux, but I'm working on a new version that runs in a cross-platform Electron shell and uses Web Audio API and Web MIDI API - follow its progress here.
    • - Ostensibly for live-coding, but it’s dramatically simplified compared to the SuperCollider and Csound-based environments used by the live-coding community proper. It’s an object-oriented, text-based MIDI sequencer with its own syntax, and a repertoire of commands and structures that allow interesting looping motifs and polyrythmic phrases to be easily generated and manipulated on the fly.


  • FoxDot is a pre-processed Python-based programming language that provides a fast and user-friendly abstraction to SuperCollider. It also comes with its own IDE, which means it can be used straight out of the box and no fiddling around with config files.



  • - A wxWidgets launcher for Fragment Audio Server built for the Fragment Synthesizer, a web-based Collaborative Spectral Synthesizer. This program should compile on most platforms. This program is a simple native launcher which provide an easy to use interface to start the Fragment Audio Server, the launcher also provide a convenient way to configure the audio server for individual sessions and provide a direct way to launch the web. application pre-linked with the native audio server by passing ?fas=1 as a web. argument.

Fragment add sine waves together to produce sounds, the software gather frequencies from vertical slices containing the pixels data of a graphical WebGL powered canvas, each horizontal lines of the score is associated to a pure sine wave generator, all the vertical slices are grouped into one before being "fed" to the synthesis engine, the pixels data (red and green channel) determine the amplitude of the associated sine wave for each audio channels (it is stereophonic) and the vertical position of the pixel determine which sine wave generator is active, the synthesizer is mainly controlled by the visuals generated from the GLSL script.

The synthesizer support the WebMIDI API which is only supported by Chrome and Opera at the moment, it is possible to assign controllers to widgets and controls the GLSL script "uniform" variables.

Praxis Live

  • Praxis LIVE - an open-source hybrid visual environment for live creative coding. Praxis LIVE mixes intuitive real-time visual node editing, with a range of built-in components for audio, visual & data processing, together with an embedded compiler and editor for live-coding Processing, Java and GLSL.

While including specific support for audio and video processing, Praxis LIVE is designed to support other forms of cyber-physical coding.


  • Cinder - a C++ library for programming with aesthetic intent - the sort of development often called creative coding. This includes domains like graphics, audio, video, and computational geometry. Cinder is cross-platform, with official support for macOS, Windows, Linux, iOS, and Windows UWP. Cinder is production-proven, powerful enough to be the primary tool for professionals, but still suitable for learning and experimentation. Cinder is released under the 2-Clause BSD License.


JavaScript / Web Audio


  • Synthy - an online synthesiser and sequencer with live world output and colours made by Filip Hnízdo using the Web Audio API, the live server is powered by and Node.js. The database of patterns pushed to synthy is powered by the wonderful NeDB.



  • DSP.js is a comprehensive digital signal processing library for javascript. It includes many functions for signal analysis and generation, including Oscillators(sine, saw, square, triangle), Window functions (Hann, Hamming, etc), Envelopes(ADSR), IIR Filters(lowpass, highpass, bandpass, notch), FFT and DFT transforms, Delays, Reverb.


  • Tone.js is a Web Audio framework for creating interactive music in the browser. The architecture of Tone.js aims to be familiar to both musicians and audio programmers looking to create web-based audio applications. On the high-level, Tone offers common DAW (digital audio workstation) features like a global transport for scheduling and timing events and prebuilt synths and effects. For signal-processing programmers (coming from languages like Max/MSP), Tone provides a wealth of high performance, low latency building blocks and DSP modules to build your own synthesizers, effects, and complex control signals.


  • tonal is a modular, functional music theory library. Built from a collection of modules, it's able to create and manipulate tonal elements of music (pitches, chords, scales, keys). It deals with abstractions (not actual music) and while is designed for algorithmic composition and music generation, can be used to develop any kind of midi or audio software.


  • - a JavaScript library for real-time audio synthesis and composition from within the browser. It uses graph-based routing and pattern-based scheduling to make complex audio simple to program, and easy to understand.



  • - Wad is a Javascript library for manipulating audio using the new HTML5 Web Audio API. It greatly simplifies the process of creating, playing, and manipulating audio, either for real-time playback, or at scheduled intervals. Wad provides a simple interface to use many features one would find in a desktop DAW (digital audio workstation), but doesn't require the user to worry about sending XHR requests or setting up complex audio graphs.


Opimodus / OMN



Neural net

Audio formats

  • Xiph.Org's Digital Show & Tell- video on digital media explores multiple facets of digital audio signals and how they really behave in the real world.


  • - a method used to digitally represent sampled analog signals. It is the standard form of digital audio in computers, Compact Discs, digital telephony and other digital audio applications. In a PCM stream, the amplitude of the analog signal is sampled regularly at uniform intervals, and each sample is quantized to the nearest value within a range of digital steps.

Linear pulse-code modulation (LPCM) is a specific type of PCM where the quantization levels are linearly uniform. This is in contrast to PCM encodings where quantization levels vary as a function of amplitude (as with the A-law algorithm or the μ-law algorithm). Though PCM is a more general term, it is often used to describe data encoded as LPCM.

A PCM stream has two basic properties that determine the stream's fidelity to the original analog signal: the sampling rate, which is the number of times per second that samples are taken; and the bit depth, which determines the number of possible digital values that can be used to represent each sample.

  • - a signal encoder that uses the baseline of pulse-code modulation (PCM) but adds some functionalities based on the prediction of the samples of the signal. The input can be an analog signal or a digital signal. If the input is a continuous-time analog signal, it needs to be sampled first so that a discrete-time signal is the input to the DPCM encoder. DPCM was invented by C. Chapin Cutler at Bell Labs in 1950; his patent includes both methods.

Option 1: take the values of two consecutive samples; if they are analog samples, quantize them; calculate the difference between the first one and the next; the output is the difference, and it can be further entropy coded. Option 2: instead of taking a difference relative to a previous input sample, take the difference relative to the output of a local model of the decoder process; in this option, the difference can be quantized, which allows a good way to incorporate a controlled loss in the encoding. Applying one of these two processes, short-term redundancy (positive correlation of nearby values) of the signal is eliminated; compression ratios on the order of 2 to 4 can be achieved if differences are subsequently entropy coded, because the entropy of the difference signal is much smaller than that of the original discrete signal treated as independent samples.

  • - ADPCM) is a variant of differential pulse-code modulation (DPCM) that varies the size of the quantization step, to allow further reduction of the required data bandwidth for a given signal-to-noise ratio. Typically, the adaptation to signal statistics in ADPCM consists simply of an adaptive scale factor before quantizing the difference in the DPCM encoder. ADPCM was developed in the early 1970s at Bell Labs for voice coding, by P. Cummiskey, N. S. Jayant and James L. Flanagan


  • - a form of modulation used to represent an analog signal with a binary signal. In a PDM signal, specific amplitude values are not encoded into codewords of pulses of different weight as they would be in pulse-code modulation (PCM). Instead, it is the relative density of the pulses that corresponds to the analog signal's amplitude. The output of a 1-bit DAC is the same as the PDM encoding of the signal. Pulse-width modulation (PWM) is a special case of PDM where the switching frequency is fixed and all the pulses corresponding to one sample are contiguous in the digital signal. For a 50% voltage with a resolution of 8-bits, a PWM waveform will turn on for 128 clock cycles and then off for the remaining 128 cycles. With PDM and the same clock rate the signal would alternate between on and off every other cycle. The average is 50% for both waveforms, but the PDM signal switches more often. For 100% or 0% level, they are the same.


  • - a Microsoft and IBM audio file format standard for storing an audio bitstream on PCs. It is an application of the Resource Interchange File Format (RIFF) bitstream format method for storing data in "chunks", and thus is also close to the 8SVX and the AIFF format used on Amiga and Macintosh computers, respectively. It is the main format used on Windows systems for raw and typically uncompressed audio. The usual bitstream encoding is the linear pulse-code modulation (LPCM) format.

  • - an extension of the popular Microsoft WAV audio format and is the recording format of most file-based non-linear digital recorders used for motion picture, radio and television production. It was first specified by the European Broadcasting Union in 1997, and updated in 2001 and 2003. The purpose of this file format is the addition of metadata to facilitate the seamless exchange of sound data between different computer platforms and applications. It specifies the format of metadata, allowing audio processing elements to identify themselves, document their activities, and supports timecode to enable synchronization with other recordings. This metadata is stored as extension chunks in a standard digital audio WAV file.


  • MP3 (MPEG-1 or MPEG-2 Audio Layer III) is a patented encoding format for digital audio which uses a form of lossy data compression. It is a common audio format for consumer audio streaming or storage, as well as a de facto standard of digital audio compression for the transfer and playback of music on most digital audio players.


for f in *.wav ; do lame "$f" ; done

  • mp3fs is a read-only FUSE filesystem which transcodes between audio formats (currently FLAC to MP3) on the fly when files are opened and read. It can let you use a FLAC collection with software and/or hardware which only understands the MP3 format, or transcode files through simple drag-and-drop in a file browser. [41]


  • id3reader - a Python module that reads ID3 metadata tags in MP3 files. It can read ID3v1, ID3v2.2, ID3v2.3, or ID3v2.4 tags. It does not write tags at all.

  • - a Python module to handle audio metadata. It supports ASF, FLAC, MP4, Monkey's Audio, MP3, Musepack, Ogg Opus, Ogg FLAC, Ogg Speex, Ogg Theora, Ogg Vorbis, True Audio, WavPack, OptimFROG, and AIFF audio files. All versions of ID3v2 are supported, and all standard ID3v2.4 frames are parsed. It can read Xing headers to accurately calculate the bitrate and length of MP3s. ID3 and APEv2 tags can be edited regardless of audio format. It can also manipulate Ogg streams on an individual packet/page level.
  • eyeD3 - a Python tool for working with audio files, specifically mp3 files containing ID3 metadata (i.e. song info). It provides a command-line tool (eyeD3) and a Python library (import eyed3) that can be used to write your own applications or plugins that are callable from the command-line tool.


  • - AAC, is a standardized, lossy compression and encoding scheme for digital audio. Designed to be the successor of the MP3 format, AAC generally achieves better sound quality than MP3 at similar bit rates.


Container format.

  • - a metadata container used in the Vorbis, FLAC, Theora, Speex and Opus file formats. It allows information such as the title, artist, album, track number or other information about the file to be added to the file itself. However, as the official Ogg Vorbis documentation notes, “[the comment header] is meant for short, text comments, not arbitrary metadata; arbitrary metadata belongs in a separate logical bitstream (usually an XML stream type) that provides greater structure and machine parseability.”


  • Opus is a totally open, royalty-free, highly versatile audio codec. Opus is unmatched for interactive speech and music transmission over the Internet, but is also intended for storage and streaming applications. It is standardized by the Internet Engineering Task Force (IETF) as RFC 6716 which incorporated technology from Skype's SILK codec and Xiph.Org's CELT codec.

Best lossy audio format, replaces Vorbis and Speex.

ffmpeg -i input -acodec libopus -b:a bitrate -vbr on -compression_level 10 output



    • - a lossy audio compression format specifically tuned for the reproduction of human speech and also a free software speech codec that may be used on VoIP applications and podcasts. It is based on the CELP speech coding algorithm. Speex claims to be free of any patent restrictions and is licensed under the revised (3-clause) BSD license. It may be used with the Ogg container format or directly transmitted over UDP/RTP. It may also be used with the FLV container format. The Speex designers see their project as complementary to the Vorbis general-purpose audio compression project.

  • Codec 2 | Rowetel - Codec 2 is an open source speech codec designed for communications quality speech between 700 and 3200 bit/s. The main application is low bandwidth HF/VHF digital radio. It fills a gap in open source voice codecs beneath 5000 bit/s and is released under the GNU Lesser General Public License (LGPL).

The Codec 2 project also contains several modems (FDMDV, COHPSK and mFSK) carefully designed for digital voice over HF radio; GNU Octave simulation code to support the codec and modem development; and FreeDV – an open source digital voice protocol that integrates the modems, codecs, and FEC. FreeDV is available as a GUI application, an open source library (FreeDV API), and in hardware (the SM1000 FreeDV adaptor).


  • FLAC - stands for Free Lossless Audio Codec, an audio format similar to MP3, but lossless, meaning that audio is compressed in FLAC without any loss in quality. This is similar to how Zip works, except with FLAC you will get much better compression because it is designed specifically for audio, and you can play back compressed FLAC files in your favorite player (or your car or home stereo, see supported devices) just like you would an MP3 file. FLAC stands out as the fastest and most widely supported lossless audio codec, and the only one that at once is non-proprietary, is unencumbered by patents, has an open-source reference implementation, has a well documented format and API, and has several other independent implementations.


  • - a family of proprietary audio codec compression algorithms currently owned by Qualcomm. The original aptX algorithm was developed in the 1980s by Dr. Stephen Smyth as part of his Ph.D. research at Queen's University Belfast School of Electronics, Electrical Engineering and Computer Science; its design is based on time domain ADPCM principles without psychoacoustic auditory masking techniques.

Codec 2

  • Codec 2 - Codec 2 is an open source speech codec designed for communications quality speech between 700 and 3200 bit/s. The main application is low bandwidth HF/VHF digital radio. It fills a gap in open source voice codecs beneath 5000 bit/s and is released under the GNU Lesser General Public License (LGPL). The Codec 2 project also contains several modems (FDMDV, COHPSK and mFSK) carefully designed for digital voice over HF radio; GNU Octave simulation code to support the codec and modem development; and FreeDV – an open source digital voice protocol that integrates the modems, codecs, and FEC. FreeDV is available as a GUI application, an open source library (FreeDV API), and in hardware (the SM1000 FreeDV adaptor).


  • WavPack - a completely open audio compression format providing lossless, high-quality lossy, and a unique hybrid compression mode. For version 5.0.0, several new file formats and lossless DSD audio compression were added, making WavPack a universal audio archiving solution.


  • - DSD is the name of a trademark used by Sony and Philips for their system of digitally recreating audible signals for the Super Audio CD (SACD). DSD uses pulse-density modulation encoding—a technology to store audio signals on digital storage media that are used for the SACD. The signal is stored as delta-sigma modulated digital audio, a sequence of single-bit values at a sampling rate of 2.8224 MHz (64 times the CD audio sampling rate of 44.1 kHz, but only at 1/32768 of its 16-bit resolution). Noise shaping occurs by use of the 64-times oversampled signal to reduce noise and distortion caused by the inaccuracy of quantization of the audio signal to a single bit. Therefore, it is a topic of discussion whether it is possible to eliminate distortion in one-bit delta-sigma conversion.

  • YouTube: DSD Explained part 1 - At the 1996 AES Convention in Copenhagen the formerly CBS Research Lab introduced Direct Stream Digital, DSD for short, as a archive format that offered compact files while it contains sufficient information for conversion to sample rates as high as 352.8 kHz - unheard of in 1996. From there is went to SACD and now to DSD downloads. What is DSD all about? Hans Beekhuyzen explains.

  • Super Audio CD decoder - a command-line application which takes a Super Audio CD source and extracts a 24-bit high resolution wave file. It handles both DST and DSD streams. The application reads the following input: SACD image files (*.iso), Sony DSF files (*.dsf), Philips DSDIFF files (*.dff). Supported output sample rates: 88.2KHz, 96KHz, 176.4KHz, 192KHz

  • What is DoP (DSD over PCM)? - It involves taking groups of 16 adjacent 1-bit  samples from a DSD stream and packing them into the lower 16 bits of a 24/176.4 data stream. Data from the other channel of the stereo pair is packed the same way. A specific marker code in the top 8 bits identifies the data stream as DoP, rather than PCM. The resulting DoP stream can be transmitted through existing 24/192-capable USB, AES, Dual AES or SPDIF interfaces to a DoP-compatible DAC, which reassembles the original stereo DSD data stream COMPLETELY UNCHANGED. If something goes wrong and the data stream is decoded as PCM, the output will be low-level noise with faint music in the back ground, so it fails safely. This can happen if the computer erases the marker code by applying a volume adjustment.

  • - or DXD, a digital audio format that originally was developed for editing high-resolution recordings recorded in DSD, the audio standard used on Super Audio CD (SACD). As the 1-bit DSD format used on SACD is not suitable for editing, alternative formats such as DXD or DSD-Wide must be used during the mastering stage. In contrast with DSD-Wide or DSD Pure which offers level, EQ, and crossfade edits at the DSD sample rate (64fs, 2.822 MHz), DXD is a PCM signal with 24-bit resolution (8 bits more than the 16 bits used for Red Book CD) sampled at 352.8 kHz – eight times 44.1 kHz, the sampling frequency of Red Book CD. The data rate is 8.4672 Mbit/s per channel – three times that of DSD64.



Dolby Digital / AC-3

  • - the name for audio compression technologies developed by Dolby Laboratories. Originally named Dolby Stereo Digital until 1994, except for Dolby TrueHD, the audio compression is lossy. The first use of Dolby Digital was to provide digital sound in cinemas from 35mm film prints; today, it is now also used for other applications such as TV broadcast, radio broadcast via satellite, DVDs, Blu-ray discs and game consoles. This format has different names: Dolby Digital, DD (an abbreviation for Dolby Digital, often combined with channel count; for instance, DD 2.0, DD 5.1), AC-3 (Audio Codec 3, Advanced Codec 3, Acoustic Coder 3. [These are backronyms. Adaptive Transform Acoustic Coding 3 is a separate format developed by Sony.]). ATSC A/52 is name of the standard.

  • - also known as Enhanced AC-3 (and commonly abbreviated as DD+ or E-AC-3, or EC-3) is a digital audio compression scheme developed by Dolby Labs for transport and storage of multi-channel digital audio. It is a successor to Dolby Digital (AC-3), also developed by Dolby, and has a number of improvements including support for a wider range of data rates (32 Kbit/s to 6144 Kbit/s), increased channel count and multi-program support (via substreams), and additional tools (algorithms) for representing compressed data and counteracting artifacts. While Dolby Digital (AC-3) supports up to 5 full-bandwidth audio channels at a maximum bitrate of 640 Kbit/s, E-AC-3 supports up to 15 full-bandwidth audio channels at a maximum bitrate of 6.144 Mbit/s. The full set of technical specifications for E-AC-3 (and AC-3) are standardized and published in Annex E of ATSC A/52:2012, as well as Annex E of ETSI TS 102 366 V1.2.1 (2008–08), published by the Advanced Television Systems Committee.

Dolby AC-4

  • - Dolby AC-4 is an audio compression standard supporting multiple audio channels and/or audio objects. Support for 5.1 channel audio is mandatory and additional channels up to 7.1.4 are optional.[30] AC-4 provides a 50% reduction in bit rate over AC-3/Dolby Digital Plus

Dolby TrueHD

  • - a lossless multi-channel audio codec developed by Dolby Laboratories which is used in home-entertainment equipment such as Blu-ray Disc players and A/V receivers. It is one of the successors to the Dolby Digital (AC-3) surround sound codec, which is used as the audio standard for the DVD-Video format. In this application, Dolby TrueHD competes with DTS-HD Master Audio, a lossless codec from DTS.

Dolby TrueHD uses Meridian Lossless Packing (MLP) as its mathematical basis for compressing audio samples. MLP is also used in the DVD-Audio format, but details of Dolby TrueHD and the MLP Lossless format as used on DVD-Audio differ substantially. A Dolby TrueHD bitstream can carry up to 16 discrete audio channels. Sample depths up to 24 bits/sample and audio sample rates up to 192 kHz are supported. Like the more common legacy codec Dolby Digital, Dolby TrueHD bitstreams carry program metadata. Metadata is separate from the coding format and compressed audio samples, but stores relevant information about the audio waveform and provides control over the decoding process. For example, dialog normalization and dynamic range compression are controlled by metadata embedded in the Dolby TrueHD bitstream. Similarly, a Dolby Atmos encoded Dolby TrueHD stream contains metadata to extract and place the objects in relevant positions. Dolby TrueHD is a variable bit-rate codec.


  • - a family of proprietary audio compression algorithms developed by Sony. MiniDisc was the first commercial product to incorporate ATRAC in 1992. ATRAC allowed a relatively small disc like MiniDisc to have the same running time as CD while storing audio information with minimal loss in perceptible quality. Improvements to the codec in the form of ATRAC3, ATRAC3plus, and ATRAC Advanced Lossless followed in 1999, 2002, and 2006 respectively. Other MiniDisc manufacturers such as Sharp and Panasonic also implemented their own versions of the ATRAC codec. Sony has all but dropped the ATRAC related codecs in the USA and Europe and in their SonicStage powered 'Connect' Music Service (Sony's equivalent of iTunes) on 31 March 2008. However, it is being continued in Japan and various other countries.


  • - an early form of lossy compression for digital audio. It was originally developed in the early 1970s for point-to-point links within broadcasting networks. In the 1980s, broadcasters began to use NICAM compression for transmissions of stereo TV sound to the public.


  • - a device, invented in 1877, for the mechanical recording and reproduction of sound. In its later forms, it is also called a gramophone (as a trademark since 1887, as a generic name in the UK since 1910), or, since the 1940s, a record player. The sound vibration waveforms are recorded as corresponding physical deviations of a spiral groove engraved, etched, incised, or impressed into the surface of a rotating cylinder or disc, called a "record". To recreate the sound, the surface is similarly rotated while a playback stylus traces the groove and is therefore vibrated by it, very faintly reproducing the recorded sound. In early acoustic phonographs, the stylus vibrated a diaphragm which produced sound waves which were coupled to the open air through a flaring horn, or directly to the listener's ears through stethoscope-type earphones.

  • - the earliest commercial medium for recording and reproducing sound. Commonly known simply as "records" in their era of greatest popularity (c. 1896–1915), these hollow cylindrical objects have an audio recording engraved on the outside surface, which can be reproduced when they are played on a mechanical cylinder phonograph. In the 1910s, the competing disc record system triumphed in the marketplace to become the dominant commercial audio medium.
  • - also known as a gramophone record, especially in British English, or record) is an analog sound storage medium in the form of a flat disc with an inscribed, modulated spiral groove. The groove usually starts near the periphery and ends near the center of the disc. At first, the discs were commonly made from shellac; starting in the 1950s polyvinyl chloride became common. In recent decades, records have sometimes been called vinyl records, or simply vinyl, although this would exclude most records made until after World War II.

  • - more commonly called a phonograph cartridge or phono cartridge or (colloquially) a pickup, is an electromechanical transducer used in the playback of analog sound recordings called records on a record player, now commonly called a turntable because of its most prominent component but formally known as a phonograph in the US and a gramophone in the UK. The cartridge contains a removable or permanently mounted stylus, the tip - usually a gemstone like diamond or sapphire - of which makes physical contact with the record's groove. In popular usage and in disc jockey jargon, the stylus, and sometimes the entire cartridge, is often called the needle. As the stylus tracks the serrated groove, it vibrates a cantilever on which is mounted a permanent magnet which moves between the magnetic fields of sets of electromagnetic coils in the cartridge (or vice versa: the coils are mounted on the cantilever, and the magnets are in the cartridge). The shifting magnetic fields generate an electrical current in the coils. The electrical signal generated by the cartridge can be amplified and then converted into sound by a loudspeaker.

Playlist formats

playlist='play.m3u' ; if [ -f $playlist ]; then rm $playlist ; fi ; for f in *.mp3; do echo "$(pwd)/$f" >> "$playlist"; done
  # create m3u playlist with absolute file paths


  • - the direct digital-to-digital conversion of one encoding to another,[1] such as for movie data files (e.g., PAL, SECAM, NTSC), audio files (e.g., MP3, WAV), or character encoding (e.g., UTF-8, ISO/IEC 8859). This is usually done in cases where a target device (or workflow) does not support the format or has limited storage capacity that mandates a reduced file size,[2] or to convert incompatible or obsolete data to a better-supported or modern format.


  • SoundConverter - the leading audio file converter for the GNOME Desktop. It reads anything GStreamer can read (Ogg Vorbis, AAC, MP3, FLAC, WAV, AVI, MPEG, MOV, M4A, AC3, DTS, ALAC, MPC, Shorten, APE, SID, MOD, XM, S3M, etc...), and writes to Opus, Ogg Vorbis, FLAC, WAV, AAC, and MP3 files, or use any GNOME Audio Profile.

Perl Audio Converter

  • Perl Audio Converter - A tool for converting multiple audio types from one format to another. It supports the following audio formats: 3G2, 3GP, 8SVX, AAC, AC3, ADTS, AIFF, AL, AMB, AMR, APE, AU, AVR, BONK, CAF, CDR, CVU, DAT, DTS, DVMS, F32, F64, FAP, FLA, FLAC, FSSD, GSRT, HCOM, IMA, IRCAM, LA, MAT, MAUD, MAT4, MAT5, M4A, MP2, MP3, MP4, MPC, MPP, NIST, OFF, OFR, OFS, OPUS, OGA, OGG, PAF, PRC, PVF, RA, RAW, RF64, SD2, SF, SHN, SMP, SND, SOU, SPX, SRN, TAK, TTA, TXW, VOC, VMS, VQF, W64, WAV, WMA, and WV.

Secret Rabbit Code

  • Secret Rabbit Code - aka libsamplerate, is a Sample Rate Converter for audio. One example of where such a thing would be useful is converting audio from the CD sample rate of 44.1kHz to the 48kHz sample rate used by DAT players. SRC is capable of arbitrary and time varying conversions ; from downsampling by a factor of 256 to upsampling by the same factor. Arbitrary in this case means that the ratio of input and output sample rates can be an irrational number. The conversion ratio can also vary with time for speeding up and slowing down effects.

SRC provides a small set of converters to allow quality to be traded off against computation cost. The current best converter provides a signal-to-noise ratio of 145dB with -3dB passband extending from DC to 96% of the theoretical best bandwidth for a given pair of input and output sample rates. Since the library has few dependencies beyond that provided by the standard C library, it should compile and work on just about any operating system. It is known to work on Linux, MacOSX, Win32 and Solaris. With some relatively minor hacking it should also be relatively easy to port it to embedded systems and digital signal processors.


  • audiomap - a program which converts from any audio format to any other in a uniform & sane fashion so you don't have to learn all the options for conversion program x. It will preserve all tags in the process. It's goal is bullcrap free conversion. I wrote it because while there are plunty of shell scripts out there to convert things to/from a few formats, they suck at handling weird characters and often even spaces! On top of this, they usually do not properly preserve metadata. Audiomap works with funky chars, spaces, and preserves metadata - all whilst providing encoding/decoding through many formats.




  • AudioMove - a simple, easy to use GUI-based batch audio file copy-and-conversion program.



  • fmedia - fast media player/recorder/converter - a fast asynchronous media player/recorder/converter for Windows, Linux and FreeBSD. It provides smooth playback and recording even if devices are very slow. It's highly customizable and can be easily extended with additional plugins. Its low CPU & memory consumption saves energy when running on a notebook's battery. Play or convert audio files, record new audio tracks from microphone, save songs from Internet radio, and much more! fmedia is free and open-source project, and you can use it as a standalone application or as a library for your own software. fmedia can decode: .mp3, .ogg (Vorbis, Opus), .opus, .m4a/.mp4 (AAC, ALAC, MPEG), .mka/.mkv (AAC, ALAC, MPEG, Vorbis), .avi (AAC, MPEG), .aac, .mpc, .flac, .ape, .wv, .wav. fmedia can encode into: .mp3, .ogg, .opus, .m4a (AAC), .flac, .wav.


 arecord -D hw:0 -f cd test.wav



  • Ecasound is a software package designed for multitrack audio processing. It can be used for simple tasks like audio playback, recording and format conversions, as well as for multitrack effect processing, mixing, recording and signal recycling. Ecasound supports a wide range of audio inputs, outputs and effect algorithms. Effects and audio objects can be combined in various ways, and their parameters can be controlled by operator objects like oscillators and MIDI-CCs. A versatile console mode user-interface is included in the package.

  • Nama - manages multitrack recording, mixing and mastering using the Ecasound audio processing engine developed by Kai Vehmanen.
  • Ecasound Mastering Interface - a Python front end to ecasound. It looks a lot like Rackmount effect and can be used to create an Ecasound Chain Setup while playing with parameters in real time. It supports mixing, recording, filtering, and processing and can export to ECS files. It supports all ecasound options, chain operators, and controllers.
  • Visecas - a graphical user interface for Ecasound (, a software package written by Kai Vehmanen ( which is designed for multitrack audio processing. It starts Ecasound as a child process and communicates via a pipe using Ecasound's InterActive Mode (IAM) commands.


  • meterec works as a basic multi track tape recoder. The aim of this software is to minimise the interactions of the users with the computer and allow them to focus on their instrumental performance. For this reason meterec features are minimal. One of the main "limitation" is that meterec can only restart from time 0:00:00.00: if you srew one take, start it over again! rather than learning how to use a specific software to correct what you screw, meterec forces to learn and master your instrument. Good news is previous takes are kept in take history and if in the end, the first one was the best you could play, you can choose it in your final mix.


  • jack_capture is a program for recording soundfiles with jack. The default operation will record what you hear in your loudspeakers into a stereo wav file.


  • jrec2 - simple patched jack_capture, simple patch to the jack_capture example client, that implements silence detection and splitting of output files), can call hooks (invoke 3rd party software) upon detecting silence or audio. It include an optional random-playback control script that was used in an installation to record voice and if it detects silence plays back random snippets of previously recorded material.


  • jack-record is a light-weight JACK capture client to write an arbitrary number of channels to disk.



  • QJackRcd is a simple stereo recorder for Jack with few features as silence processing for automatic pause, file splitting, background file post-processing.


  • JACK Timemachine - I used to always keep a minidisc recorder in my studio running in a mode where when you pressed record it wrote the last 10 seconds of audio to the disk and then caught up to realtime and kept recording. The recorder died and haven't been able to replace it, so this is a simple jack app to do the same job. It has the advantage that it never clips and can be wired to any part of the jack graph.


  • Rotter is a Recording of Transmission / Audio Logger for JACK. It was designed for use by radio stations, who are legally required to keep a recording of all their output. Rotter runs continuously, writing to a new file every hour. Rotter can output files in servaral different strutures, including all files in a single directory or create a directory structure.The advantage of using a folder hierarchy is that you can store related files in the hour's directory.


See also Playback


  • - jplay2 is a command-line audio player, gluing JACK, libsamplerate and liblo (OSC control), it plays a single file (no playlist), but with ffmpeg & libsndfile it plays every file one throws at it (even DVD-vobs or timemachine-w64 ;-) ). Once started, it's only possible to interact with jplay2 via OSC or jack-transport.

Random Parallel Player


Command-line mixers


  • amixer - a command-line program for controlling the mixer in the ALSA soundcard driver. amixer supports multiple soundcards
amixer -c 0 | pcregrep "control"
  # shows all audio channels


  • JackMiniMix - a simple mixer for the Jack Audio Connection Kit with an OSC based control interface. It supports a user configurable number of stereo inputs, which can then be queried and controlled by sending it OSC messages. It is released under the GPL license.


  • - Listen on a keyboard event device and respond to key combinations with jack midi volume messages for use with a midi-aware jack mixer, like jackmix or jack_mixer.


ncurses mixers



Graphical mixers


  • Kmix - an application to allow you to change the volume of your sound card. Though small, it is full-featured, and it supports several platforms and sound drivers. Features: Support for ALSA and OSS sound systems, Plasma Desktop integrated on-screen-display for volume changes



  • QasMixer - a desktop mixer application for ALSA's "Simple Mixer Interface".


  • QasHctl - a mixer for ALSA's more complex "High level Control Interface".


  • alsamixergui - a FLTK based frontend for alsamixer. It is written directly on top of the alsamixer source, leaving the original source intact, only adding a couple of ifdefs, and some calls to the gui part, so it provides exactly the same functionality, but with a graphical userinterface.



Buggy, forgets card sometimes, menu/preferences don't open.

Jack mixers

Non Mixer


See also non-midi-mapper [46]


  • - uses QJackAudio, works with JACK. 24 channels routed to 8 subgroups each with direct out. Three-band parametric EQ for each channel. Aux send/return for each channel, so you can hook in other effects processors. Save and restore complete EQ states. Clean source code and free sofware licensed under GPL. Using latest Qt5, which means it runs on all major platforms.



  • - a GTK+ JACK audio mixer app with a look similar to its hardware counterpart. input to output, and send outputs. only MIDI control of one level and stereo balance! manual send groups, solo and mute.


  • jackmixdesk - an audio mixer for JACK with an OSC control interface and LASH support. It has a configurable number of inputs and pre/post sends/outs which can be controlled by sending it OSC messages. There is a XML config file and a GTK interface.


  • JackMaster - "Master Console" for the jack-audio-connection-kit. Number of inputs/subs and screen layout can only be changed by recompiling.



  • MU1 - a simple Jack app used to organise stereo monitoring. It was written originally for use with Ardour2, but still useful with Ardour3 as it provides some extra functions.

LV2 mixers



  • - for stereo balance control with optional per channel delay. balance.lv2 facilitates adjusting stereo-microphone recordings (X-Y, A-B, ORTF). But it also generally useful as "Input Channel Conditioner". It allows for attenuating the signal on one of the channels as well as delaying the signals (move away from the microphone). To round off the feature-set channels can be swapped or the signal can be downmixed to mono after the delay.


  • Kn0ck0ut - takes two mono 44.1KHz inputs and spectrally subtracts one from the other. It can be used to help create 'acapellas' - to extract vocals from a track - if an instrumental version (or section) of the track is available.


  • - A simple 4-channel stereo mixer. The main goal is to use it as a submixer on a 4 channel track, but you can use it everywhere you need a small 4 channel stereo mixer.

BalanceGain / BalanceWidth



Other mixers



  • faderratic - brings you cross-fading of 2 stereo inputs, but with a mind of its own and a ton of options to change the fade shape, length, limits, frequency and probability. faderratic works by generating a pulse on a tempo-sync frequency, and depending on the probability it may trigger a cross-fade event. You can optionally make the fader auto-return to either side and if you feel like it, trigger a fade manually or control the fader movement totally manually. Windows VST.

Spatial audio

  • - an audio microphone composed of four closely spaced subcardioid or cardioid (unidirectional) microphone capsules arranged in a tetrahedron. It was invented by Michael Gerzon and Peter Craven, and is a part of, but not exclusive to, Ambisonics, a surround sound technology. It can function as a mono, stereo or surround sound microphone, optionally including height information.

  • SpatDIF - Spatial Sound Description Interchange Format. SpatDIF is a format that describes spatial sound information in a structured way, in order to support real-time and non-real-time applications. The format serves to describe, store and share spatial audio scenes across audio applications and concert venues.
  • - a plugin (Mac AU/VST and VST Windows format) designed to compose multichannel space. It allows the user to spatialize the sound in 2D (up to 16 speakers) or in 3D (up to 128 speakers) under a dome of speakers (with the ServerGRIS, under development). SpatGRIS is a fusion of two former plugins by the GRIS: OctoGRIS and ZirkOSC with a lot of new festures.


  • - or, more commonly, stereo, is a method of sound reproduction that creates an illusion of multi-directional audible perspective. This is usually achieved by using two or more independent audio channels through a configuration of two or more loudspeakers (or stereo headphones) in such a way as to create the impression of sound heard from various directions, as in natural hearing. Thus the term "stereophonic" applies to so-called "quadraphonic" and "surround-sound" systems as well as the more common two-channel, two-speaker systems. It is often contrasted with monophonic, or "mono" sound, where audio is heard as coming from one position, often centered in the sound field (analogous to a visual field). In the 2000s, stereo sound is common in entertainment systems such as broadcast radio and TV, recorded music and the cinema.
  • - the aspect of sound recording and reproduction concerning the perceived spatial locations of the sound source(s), both laterally and in depth. An image is considered to be good if the location of the performers can be clearly located; the image is considered to be poor if the location of the performers is difficult to locate. A well-made stereo recording, properly reproduced, can provide good imaging within the front quadrant; a well-made Ambisonic recording, properly reproduced, can offer good imaging all around the listener and even including height information.


  • - the process of blending the left and right channels of a stereo audio recording. It is generally used to reduce the extreme channel separation often featured in early stereo recordings (e.g., where instruments are panned entirely on one side or the other), or to make audio played through headphones sound more natural, as when listening to a pair of external speakers.
  • [48] Headphones have extreme stereo separation--the right ear doesn't get to hear much of what's going on on the left. This leads to the impression the music's coming from inside your head, and sounds especially weird when instruments are panned hard to one side or the other. Crossfeed filters aim to fix this by letting the channels mix a little, but in a controlled way. The goal is to mimic what happens naturally when listening to music on speakers.

Leslie speaker

  • - a combined amplifier and two-way loudspeaker that projects the signal from an electric or electronic instrument, while modifying the sound by rotating the loudspeakers. It is most commonly associated with the Hammond organ, though it was later used for the guitar and other instruments. A typical Leslie speaker contains an amplifier, and a treble and bass speaker—though specific components depend upon the model. A musician controls the Leslie speaker by either an external switch or pedal that alternates between a slow and fast speed setting, known as "chorale" and "tremolo".

to sort

  • BLS1 is a digital realisation of the 'Blumlein Shuffler', invented by Alan Blumlen in the early 1930s and analysed in detail by Michael Gerzon in a paper presented at the 1993 AES Convention in San Francisco.

  • MONSTR - a multiband stereo imaging plugin, available for Windows, Mac, and Linux. It allows the user to control the stereo width of a sound in 3 different frequency bands, and so can be used to perform common tasks such as narrowing the bass frequencies while adding width to the highs, allowing fine control over the stereo image of your mix.

  • Holophon - a set of tools for the programming and real-time manipulation of sound trajectories across different speakers. Its main development is the Holo-Edit trajectory editor. It’s a graphical and algorithmic editor of sound trajectories. Holo-Edit makes it possible to draw and graphically edit trajectories across a complex sound system. It’s also possible to program those trajectories with different automatic functions. HoloEdit is a set of graphical editors and algorithmic functions for creating and manipulating sounds in space. This software allows for the precise positioning of multiple sounds in time and space (defined by a set of speakers). In order to do so, it associates sounds to trajectories (a set of points defined by their position in space (x, y, z) and their date). HoloEdit also supports the SDIF format so that it is possible to generate/transform sound trajectories from SDIF data. masOS, plus Holoboule for pd-extended.

Windows / Mac

  • MStereoExpander - offers expansion based on either actual samples or on delay, and provides stereo field correction to increase or reduce the clarity of the spatial differences between channels. It is fully mono-compatible.
  • MStereoProcessor - an advanced mastering multiband stereo analyzer and enhancer plugin, which lets you easily control the stereo image and the necessary perception of depth and space.
  • A1StereoControl - expand or limit the STEREO WIDTH of your tracks using only one single knob. This powerful technique can be used on single tracks or groups tracks while mixing or even on a master bus in final mastering situations. Windows/Mac
  • Proximity - an easy to use distance “pan-pot” based on several psycho-acoustic models. The idea is to give mixing engineer a reliable tool which allows him to manipulate the “depth” of several sound source in a straight forward and convincing manner.
  • Voxengo MSED - a professional audio encoder-decoder plugin for mid-side processing which is able to encode (split) the incoming stereo signal into two components: mid-side pair, and vice versa: decode mid-side signal pair into stereo signal. MSED is also able to work in the “inline” mode with the ability to adjust mid and side channels’ gain and panning without the need of using two plugin instances in sequence. MSED can be used to flip the phase of the mid and side channels by 180 degrees, and swap the stereo channels, and to extract the mid or side channel. MSED features the “plasma” vector scope, stereo correlation and balance meters which make it easier to monitor the stereo information present in the audio signal.
  • Voxengo Stereo Touch - This professional audio plugin implements a classic technique of transforming a monophonic track into spacious stereophonic track by means of mid/side coding technique. Stereo Touch is most effective on monophonic sounds without overly sharp transients: it works great for both acoustic and electric/overdriven guitars, synthetic pad sounds and even vocals.  By means of this plugin you can easily get spacious and even “surround” sounding tracks, without utilizing a double-tracked recording technique.
  • Upstereo - A FREE stereo enhancer. Stereo width slider going from mono to wide, bringing the stereo image out and towards the listener. Loudness control boost. Loudness overdrive option. Subtle Air & Bass boosters to lift and help the audio 'breath'. Movable 3D interface, with changeable colours and light, positions. Very low CPU usage.


Surround sound

  • - PanJack implements a real-time surround-sound panorama mixer. one or more audio input(s), two or more audio outputs / speakers, control via OSC or MIDI. (optional) bcf2000 fader/pan control. Note: Jack-Transport needs to be rolling in order for panjack to process audio. It creates jack-audio input and output ports and routes audio with a latency of 1 jack-cycle, applying a amplifications depending on faders and panorama-gain. The panorama-gain settings can be adjusted manually for each output channel or be modified indirectly using built-in maths for 2D (angle, separation) or X/Y-distance panning. Furthermore there is built-in functionality to automate whirl/leslie-rotate effects. panjack itself does not provide sequencer capabilities. Yet this feature can be archived easily by controlling panjack via OSC and any OSC-sequencer


  • - a series of multichannel audio technologies owned by DTS, Inc. (formerly known as Digital Theater Systems, Inc.), an American company specializing in digital surround sound formats used for both commercial/theatrical and consumer grade applications. It was known as The Digital Experience until 1995. DTS licenses its technologies to consumer electronics manufacturers.

Dolby Atmos

  • - allows up to 128 audio tracks plus associated spatial audio description metadata (most notably, location or pan automation data) to be distributed to theaters for optimal, dynamic rendering to loudspeakers based on the theater capabilities. Each audio track can be assigned to an audio channel, the traditional format for distribution, or to an audio "object." Dolby Atmos by default, has a 10-channel 7.1.2 bed for ambience stems or center dialogue, leaving 118 tracks for objects. Dolby Atmos home theaters can be built upon traditional 5.1 and 7.1 layouts. For Dolby Atmos, the nomenclature differs slightly: a 7.1.4 Dolby Atmos system is a traditional 7.1 layout with four overhead or Dolby Atmos enabled speakers.


  • - a sound source in an Ambiophonic system, made by two closely spaced loudspeakers that ideally span 10-30 degrees. Thanks to the cross-talk cancellation method, a stereo dipole can render an acoustic stereo image nearly 180° wide (single stereo dipole) or 360° (dual or double stereo dipole).


  • - a full-sphere surround sound technique: in addition to the horizontal plane, it covers sound sources above and below the listener. Unlike other multichannel surround formats, its transmission channels do not carry speaker signals. Instead, they contain a speaker-independent representation of a sound field called B-format, which is then decoded to the listener's speaker setup. This extra step allows the producer to think in terms of source directions rather than loudspeaker positions, and offers the listener a considerable degree of flexibility as to the layout and number of speakers used for playback.

  • - Researchers working on very high-order systems found no straightforward way to extend the traditional formats to suit their needs. Furthermore, there was no widely accepted formulation of spherical harmonics for acoustics, so one was borrowed from chemistry, quantum mechanics, computer graphics, or other fields, each of which had subtly different conventions. This led to an unfortunate proliferation of mutually incompatible ad-hoc formats and much head-scratching.
  • Ambisonics Component Ordering - The two primary component ordering formats for ambisonics are Furse-Malham, commonly called FuMa, and Ambisonics Channel Number, commonly called ACN. As seen in the following image, the former uses a lettered notation that - following alphabetical order per grouping - starts with the W (omni) channel, moves to its lower right, then its lower left, and then its lower center; then it moves to the next order and starts at the R, moves to its right, then to its left, then the further right, then the further left; then it moves to the next order and follows a similar pattern. On the other hand, the latter is numbered in a much easier to follow left-to-right order.

  • Spatialisation - Stereo and Ambisonic - Richard W.E. Furse, "This chapter discusses an approach to generating stereo or Ambisonic sound images using Csound. The focus here is not on the acoustics of the human head but on modelling sound in an acoustic space. We will use Csound to produce a ‘virtual’ acoustic space in which to move sounds and make recordings."

  • ARCADE - a patented spatial audio codec that allows encoding scene-based 3D audio over stereo with no additional metadata required. It allows encoding sources with height in a fully spherical manner. Its decoder is able to decode to virtually any 3D or 2D audio format, for example first-order or higher-order spherical harmonics (FOA, HOA), VBAP, Surround, Binaural with or without head-tracking etc. The decoder also works as an upmixer for any stereo content, to any of the formats it can decode to.


to sort

  • mcfx – multichannel audio plug-in suite, VST plug-ins for MacOS, Windows and Linuxm (mcfx_convolver, mcfx_delay, mcfx_filter, mcfx_gain_delay, mcfx_meter), these plug-ins are very handy if you want to process multiple channels in one go for example; multiple loudspeaker setups, Ambisonics (see ambiX); Microphone array post productions (eg. Eigenmike®)

  • - a set of free, open source and modular software tools for sound spatialization. It is comprised of 4 different types of software: spatialization renderers: standalone applications that render spatialized audio using ambisonics or amplitude panning; spatialization interfaces: standalone interfaces that generate spatial information to control the spatialization renderers via OSC; plugins: audio unit plugin and max for live devices to control the spatialization renderers via OSC; max objects: a library of objects for spatialization using ambisonics or amplitude panning in Cycling’74 Max.

"In short, it takes so-called 'pair-wise' panning - i.e. the panning of localised sounds between two loudspeakers - and does a little more math to extend it into triplet-wise panning. The three loudspeakers are arranged in a triangle layout. Localised sounds no longer just pan horizontally, between two positions, but now pan vertically too. This change means that we have extended from 1-dimensional movement into 2-dimensional movement.

"As Ville's diagram shows, as you add more triangles you can extend into the 3rd dimension too, by creating a 'mesh' similar to the polygons that describe 3D space in computer games. The amplitude of any sound 'moving through' the space is calculated for each of the nearest three speakers. The equation takes into account distance from the loudspeaker, and so VBAP differentiates from Ambisonics and irregular loudspeaker layouts can be supported. However there still needs to be a 'mesh' based on triangles, as any individual sound can only exist between the nearest three points. The emphasis here is still on satisfying a 'sweet spot', a localised and immobile audience. In this respect, VBAP is similar to Ambisonics."

  • - documentations and implementations of the Vector Base Amplitude Panning (VBAP). The VBAP is a spatialization techniques created by Ville Pulkki in the late 90's. For further information see the references. The VBAP is available as a C library with an implementation as externals for Pure Data (and also as abstractions).

Wave field synthesis


SoundScape Renderer


  • WONDER - a software suite for using Wave Field Synthesis and Binaural Synthesis. It's primary platform is Linux, but it can be used under OSX too.


  • - a method of recording sound that uses two microphones, arranged with the intent to create a 3-D stereo sound sensation for the listener of actually being in the room with the performers or instruments. This effect is often created using a technique known as "dummy head recording", wherein a mannequin head is outfitted with a microphone in each ear. Binaural recording is intended for replay using headphones and will not translate properly over stereo speakers. This idea of a three dimensional or "internal" form of sound has also translated into useful advancement of technology in many things such as stethoscopes creating "in-head" acoustics and IMAX movies being able to create a three dimensional acoustic experience.

  • - a cognitive process that involves the "fusion" of different auditory information presented binaurally, or to each ear. In humans, this process is essential in understanding speech as one ear may pick up more information about the speech stimuli than the other. The process of binaural fusion is important for computing the location of sound sources in the horizontal plane (sound localization), and it is important for sound segregation. Sound segregation refers the ability to identify acoustic components from one or more sound sources. The binaural auditory system is highly dynamic and capable of rapidly adjusting tuning properties depending on the context in which sounds are heard. Each eardrum moves one-dimensionally; the auditory brain analyzes and compares movements of both eardrums to extract physical cues and synthesize auditory objects.



  • Jack Meter is a basic console based DPM (Digital Peak Meter) for JACK. I wrote it for quickly checking remote signal levels, without having to run X11 to use a pretty graphical meter such as meterbridge.

JACK Meterbridge

  • JACK Meterbridge - software meterbridge for the UNIX based JACK audio system. It supports a number of different types of meter, rendered using the SDL library and user-editable pixmaps.


  • Ebumeter provides level metering according to the EBU R-128 recommendation. The current release implements all features required by the EBU document except the oversampled peak level monitoring.


  • meters.lv2 is a collection of audio-level meters with GUI in LV2 plugin format.
  • K-Meter - Implementation of a K-System meter according to Bob Katz’ specifications.

JACK bitmeter

  • JACK bitmeter - a diagnosis tool for JACK audio software on Linux (and perhaps other systems which have JACK and GTK+ 2.x). As its name might suggest, the bitmeter operates at the bare metal of JACK's I/O layer, looking at the 32 binary digits in each individual sample.


See Lighting#Visualisation


  • xoscope - a digital oscilloscope for Linux

  • jack_oscrolloscope - a simple waveform viewer for JACK. The waveform is displayed in realtime, so you can always see the signal the instant it comes through JACK's input port.

  • jack-scope - an oscilloscope for JACK under X11. jack-scope draws either a time domain signal trace or a self correlation trace. Multiple input channels are superimposed, each channel is drawn in a different color. jack-scope accepts OSC packets for interactive control of drawing parameters.

  • QOscC - a highly flexible and configurable software Oscilloscope with a large number of features. This includes support for any number of audio devices (ALSA or OSS), each with any number of channels. Each scope display can be configured individually to different display types and variants. e.g. you can chose from standard y-t mode (as on an usual oscilloscope), xy mode (e.g. for measuring the phase shift between two signals) of the FFT mode (to view a spectrum plot of the signal). This software is intended for electronic hobyists, who cannot afford a hardware oscilloscope or need a simple spectrum analyzer as well as for musicans for doing basic signal analysis.

Spectrum graph


  • spectrojack - A little spectrogram/audiogram/sonogram/whatever for jack. gtk 2 and fftw 3.
  • Spectrum 3D - a 3D audio spectrogram in real time or not from the microphone or an audio file (including recorded file from the microphone); it is compatible with Jack (jack-audio-connection-kit). Optionally, it supports multitouch gestures from touchscreen and touchpad. It is build with the Gstreamer, SDL (or Gtkglext), OpenGl, GTK+-2.0 and uTouch-Geis free libraries and is under GPL license.

  • xspect3d - a bespoke, radical drawing algorithm and buffer scheduling paradigm, to render a 3D sonic landscape in real time, at up to several hundred frames a second.

  • Photosounder Spiral - Spectrum analyser - a music analysis plugin. It's a fresh take on spectral analysis focused on allowing you to see and understand music and the notes that make it up instantly. This is achieved mainly by coiling the spectrum into a spiral framed by a chromatic circle, thus allowing you to instantly see what's happening musically and spectrally. - $



See Lighting#Visualisation

  • Signalizer - a all-in-one signal visualizing package with a bunch of unique focus-points; real-time audio visualization with optimized 3D GPU graphics, everything being scalable and zoomable gridlessly as well as being arbitrarily precise in both settings and display. Combined with a rich feature set, Signalizer is suited both for electrical/audio engineers fullscreen-inspecting signals, or for general small windows giving an overview of your audio as you create it.

  • sndpeek - real-time 3D animated display/playback, can use mic-input or wav/aiff/snd/raw/mat file (with playback), time-domain waveform, FFT magnitude spectrum, 3D waterfall plot

  • VSXu - VSX Ultra, is an OpenGL-based (hardware-accelerated), modular visual programming environment with its main purpose to visualize music and create graphic effects in real-time. Its intention is to bridge the gap between programmer and artist and enabling acreative and inspiring environment to work in for all parties involved. VSXu is built on a modular plug-in-based architecture so anyone can extend it and or make visualization presets ("visuals" or "states"). The program is free software which means it's free from restrictions, free to share and copy, free to adapt / modify and use it any way you like.


Sonic Visualiser

  • Sonic Visualiser - an application for viewing and analysing the contents of music audio files. The aim of Sonic Visualiser is to be the first program you reach for when want to study a musical recording rather than simply listen to it. We hope Sonic Visualiser will be of particular interest to musicologists, archivists, signal-processing researchers and anyone else looking for a friendly way to take a look at what lies inside the audio file. Sonic Visualiser is Free Software, distributed under the GNU General Public License (v2 or later) and available for Linux, OS/X, and Windows. It was developed at the Centre for Digital Music at Queen Mary, University of London.

Don't forget to install at least the QM vamp plugins.


  • Baudline is a time-frequency browser designed for scientific visualization of the spectral domain. Signal analysis is performed by Fourier, correlation, and raster transforms that create colorful spectrograms with vibrant detail. Conduct test and measurement experiments with the built in function generator, or play back audio files with a multitude of effects and filters. The baudline signal analyzer combines fast digital signal processing, versatile high speed displays, and continuous capture tools for hunting down and studying elusive signal characteristics.



  • SoundRuler is a tool for measuring and graphing sound and for teaching acoustics. Its visual interactive approach to analysis brings you the best of two worlds: the control of manual analysis and the objectivity and speed of automated analysis.

Binary download needs 32-bit libxp to be installed.


  • BRP-PACU - A cross platform dual channel FFT based Acoustic Analysis Tool to help engineers analyze live professional sound systems using the transfer function. One feature is the ability to capture four sample plots, average them, and invert to aid in final EQ.




  • Spek - helps to analyse your audio files by showing their spectrogram. Spek is free software available for Unix, Windows and Mac OS X.

Visual only.


  • zrtstr is a small command line application for detecting faux-stereo WAV-files, that is, files with two identical channels that should have been saved as mono. Such files are sometimes generated by some audio-editing software and DAWs (I’m looking at you, old Cubase 5). Gotten tired of receiving such files from clients for mixing, as they are using twice the necessary space and require twice the processing power, I decided to deal with this nuisance once and for all. zrtstr is a cross-platform application which runs very fast, thanks to being written in Rust.


  • DFasma is a free open-source software which is used to compare audio files in time and frequency. The comparison is first visual, using wavforms and spectra. It is also possible to listen to time-frequency segments in order to allow perceptual comparison. It is basically dedicated to analysis. Even though there are basic functionnalities to align the signals in time and amplitude, this software does not aim to be an audio editor.


  • AS Annotation is an application for the analysis and automated or manual annotation of sound files. It features state of the art sound analysis algortithms, specialized sound inspection tools and can import Standard MIDI files. ASAnnotation is based on AudioSculpt, a sound analysis and transformation software developed at IRCAM since 1996. In addition to the analysis and annotation features present in AS Annotation, AudioSculpt comes with state of the art sound processing, mostly based on an enhanced version of the phase vocoder. To store and exchange analysis and annotation data, ASAnnotation can use two formats; MIDI for notes and text, and SDIF for all analyses. The MIDI support facilitates the verification, alignment and correction of Standard MIDI Files to soundfiles. SDIF is a specialized format for sound description data, which combines very high precision with efficiency and interchangeability. Numerous other programs support SDIF, such as Max/MSP, OpenMusic, CLAM and SPEAR. A collection of utility programs can be used to convert SDIF files to text.



  • Toscanalyzer is a powerful audio analysis tool for mixing and mastering. Toscanalyzer helps you to mix and master better. It is not only an analysis tool but a complete guide to understand why your song sounds as it sounds. Toscanalyzer lets you compare audible and visually your project to any reference songs in a very convenient way. Toscanalyzer offers a clear project view including many options to analyze. The analysis gives you a detailed report about possible problems and in addition a clear guidance how to fix it.

Java. Doesn't work for me.


Raven Lite



  • QLoud - tool to measure loudspeaker frequency and step responses and distortions

Room EQ Wizard

  • REW - free room acoustics analysis software for measuring and analysing room and loudspeaker responses. The audio analysis features of REW help you optimise the acoustics of your listening room, studio or home theater and find the best locations for your speakers, subwoofers and listening position. It includes tools for generating audio test signals; measuring SPL and impedance; measuring frequency and impulse responses; measuring distortion; generating phase, group delay and spectral decay plots, waterfalls, spectrograms and energy-time curves; generating real time analyser (RTA) plots; calculating reverberation times; calculating Thiele-Small parameters; determining the frequencies and decay times of modal resonances; displaying equaliser responses and automatically adjusting the settings of parametric equalisers to counter the effects of room modes and adjust responses to match a target curve.




  • Transcribe! - an assistant for people who want to work out a piece of music from a recording, in order to write it out, or play it themselves, or both. It doesn't do the transcribing for you, but it is essentially a specialised player program which is optimised for the purpose of transcription. It has many transcription-specific features not found on conventional music players. It is also used by many people for play-along practice. It can change pitch and speed instantly, and you can store and recall any number of named loops. So you can practice in all keys, and you can speed up as well as slow down. There is some advice about play-along practice in Transcribe!'s help, under the heading "Various Topics". And it is also used for speech transcription. With its support for foot pedals and its superior slowed-down sound quality, it is an excellent choice for this purpose. There is some advice about speech transcription in Transcribe!'s help, under the heading "Various Topics".

Feature extraction


  • aubio is a tool designed for the extraction of annotations from audio signals. Its features include segmenting a sound file before each of its attacks, performing pitch detection, tapping the beat and producing midi streams from live audio. Because these tasks are difficult, we thought it was important to gather them in a dedicated library. To increase the fun, we have made these algorithms work in a causal way, so as to be used in real time applications with as low delay as possible. Functions can be used offline in sound editors and software samplers, or online in audio effects and virtual instruments.
  • Aubio-LV2-Plugins is an unoffial set of LV2 plugins which wrap the functionality of the audio analysis library Aubio. Currently it consists of a transient/steady state separator, and an onset detector.


  • Vamp is an audio processing plugin system for plugins that extract descriptive information from audio data — typically referred to as audio analysis plugins or audio feature extraction plugins.


  • - an Audio Activity Detection tool that can process online data (read from an audio device or from standard input) as well as audio files. It can be used as a command line program and offers an easy to use API.

Beats [per minute]

  • bonk - Pure Data unit [bonk~] is a very useful musical tool for performance and composition. It processes a stream of audio on its input and produces messages when it thinks the signal matches certain patterns. It doesn't have an audio output, just messages. What [bonk~] does is analyse the incoming signal.
  • MiniBPM is a simple, reliable tempo estimator for use in music audio applications. It quickly gets you a fixed beats-per-minute estimate from a sample of audio, provided the tempo doesn't change too much in it.
  • libbeat - a lightweight beat detection library for Qt. It currently supports ALSA and PulseAudio. It uses fftw to process the samples.
  • bpm-tools software is the result of some experiments I did into automatically calculating and tagging the tempo (in beats-per-minute) of music files. Right now the code serves as the best explanation of the algorithm — a relatively simple application of an autocorrelation by statistical sampling. As yet, there is no scientific comparison of the algorithm with others software.
  • BeatDetektor - uses a very simple statistical model designed from scratch by myself to detect the BPM of music and provides real-time feedback useful for visualization and synchronization.

  • - a causal beat tracking algorithm intended for real-time use. It is implemented in C++ with wrappers for Python and the Vamp plug-in framework.

  • MIDI Trigger - LV2 plugin which detects peaks by audio signal and sends MIDI notes.

  • BeatCounter - a simple plugin designed to facilitate beatmatching software and turntables. It displays the current tempo in beats per minute (BPM), and an accumulated average over the last few seconds. BeatCounter is the perfect tool for DJ’s that want to integrate computer effects with turntables or a live band.

  • Tapita - (snack in spanish) is a BPM detector trough keyboard, MIDI and jack written in C with GTK2.




  • Essentia - an open-source C++ library for audio analysis and audio-based music information retrieval released under the Affero GPLv3 license (also available under proprietary license upon request). It contains an extensive collection of reusable algorithms which implement audio input/output functionality, standard digital signal processing blocks, statistical characterization of data, and a large set of spectral, temporal, tonal
  • - C++ library to apply similarity measures and classifications on the results of audio analysis, including Python bindings. Together with Essentia it can be used to compute high-level descriptions of music.


  • openSMILE - feature extration tool enables you to extract large audio feature spaces in realtime. It combines features from Music Information Retrieval and Speech Processing. SMILE is an acronym forSpeech & Music Interpretation by Large-space Extraction. It is written in C++ and is available as both a standalone commandline executable as well as a dynamic library. The main features of openSMILE are its capability of on-line incremental processing and its modularity. Feature extractor components can be freely interconnected to create new and custom features, all via a simple configuration file. New components can be added to openSMILE via an easy binary plugin interface and a comprehensive API.


  • - an audio signal processing library written in Python with a strong focus on music information retrieval (MIR) tasks. The library is internally used by the Department of Computational Perception, Johannes Kepler University, Linz, Austria ( and the Austrian Research Institute for Artificial Intelligence (OFAI), Vienna, Austria ( Possible acronyms are: Madmom Analyzes Digitized Music Of Musicians, Mostly Audio / Dominantly Music Oriented Modules


  • jMIR - an open-source software suite implemented in Java for use in music information retrieval (MIR) research. It can be used to study music in the form of audio recordings, symbolic encodings and lyrical transcriptions, and can also mine cultural information from the Internet. It also includes tools for managing and profiling large music collections and for checking audio for production errors. jMIR includes software for extracting features, applying machine learning algorithms, applying heuristic error error checkers, mining metadata and analyzing metadata.



  • Audacity is a free, easy-to-use, multi-track audio editor and recorder for Windows, Mac OS X, GNU/Linux and other operating systems. The interface is translated into many languages. You can use Audacity to record live audio, record computer playback on any Windows Vista or later machine, convert tapes and records into digital recordings or CDs, edit WAV, AIFF, FLAC, MP2, MP3 or Ogg Vorbis sound files, cut, copy, splice or mix sounds together, change the speed or pitch of a recording, etc.


  • mhWaveEdit is a graphical program for editing, playing and recording sound files. It is lightweight, portable, user-friendly and handles large files very well. The program itself has only simple editing features such as cut'n'paste and volume adjustment but it can also use Ladspa effect plugins and the effects provided by the SoX application. It can also support additional file formats besides wav through libsndfile and mp3/ogg import and export through lame and oggenc/oggdec.


  • Sweep is an audio editor and live playback tool for GNU/Linux, BSD and compatible systems. It supports many music and voice formats including WAV, AIFF, Ogg Vorbis, Speex and MP3, with multichannel editing and LADSPA effects plugins.


  • ReZound aims to be a stable, open source, and graphical audio file editor primarily for but not limited to the Linux operating system.


  • ocenaudio is a cross-platform, easy to use, fast and functional audio editor. It is the ideal software for people who need to edit and analyze audio files without complications. ocenaudio also has powerful features that will please more advanced users.


  • WaveSurfer is an open source tool for sound visualization and manipulation. Typical applications are speech/sound analysis and sound annotation/transcription. WaveSurfer may be extended by plug-ins as well as embedded in other applications.



  • soniK is an open source digital audio editor for Linux, using the KDE platform. soniK allows you to record, edit and process sounds on your computer.



  • wavbreaker is a GTK wave file splitter for Linux and Unix-like operating systems licensed under the terms of the GNU General Public License. This application's purpose in life is to take a wave file and break it up into multiple wave files. It makes a clean break at the correct position to burn the files to an audio cd without any dead air between the tracks. It will only read wave files, so use an appropriate tool to convert ogg, mp3, etc. files and then break them up.



  • LAoE means Layer-based Audio Editor, and it is a rich featured graphical audiosample-editor, based on multi-layers, floating-point samples, volume-masks, variable selection-intensity, and many plugins suitable to manipulate sound, such as filtering, retouching, resampling, graphical spectrogram editing by brushes and rectangles, sample-curve editing by freehand-pen and spline and other interpolation curves, effects like reverb, echo, compress, expand, pitch-shift, time-stretch, and much more... And it is free of charge, under GPL license!


  • Snd is a sound editor modelled loosely after Emacs. It can be customized and extended using either s7 (included in the Snd sources), Ruby, or Forth.
  • San Dysth is a standalone realtime soft-synth written in SND. This softsynth has controls to generate various kinds of sounds inbetween white noise and pure tones. It also provides controllers to disturb the generated sound by using a "period counter" to extend the variety of the generated output. Common usage for the softsynth is organ-like sound, organic-like sound, alien-like sounds, water-like sounds, and various kinds of noise (noise artists could find this softsynth most useful).


  • GNUsound is a multitrack sound editor for GNOME 1 and 2. The current version is 0.7.5, which was released 6 July 2008.


  • Marlin - A GNOME Sample Editor. last updated 03-08-2004


  • GNoise - gtk+ or gnome (you can ./configure it either way) wave file editor for Linux. Prime considerations were for it to be speedy and be able to handle big files. So far it can: load and display files, generate a display cache, play the file, cut, copy, paste, (unlimited) undo, mute, fade in/out, reverse, normalize, and more. 2003

Processing tools



  • SoX - Sound eXchange, the Swiss Army knife of sound processing programs. SoX is a cross-platform (Windows, Linux, MacOS X, etc.) command line utility that can convert various formats of computer audio files in to other formats. It can also apply various effects to these sound files, and, as an added bonus, SoX can play and record audio files on most platforms.
play --show-progress -c 2 --null synth brownnoise reverb bass 6 treble -3 echos 0.8 0.9 1000 0.3 1800 0.25 [54]

play -n -c1 synth whitenoise band -n 100 20 band -n 50 20 gain +25  fade h 1 864000 1

play -c2 -n synth pinknoise band -n 280 80 band -n 60 25 gain +20 treble +40 500 bass -3 20 flanger 4 2 95 50 .3 sine 50 lin [55]

  • sonfilade - allows the user to rapidly strip junk audio from the beginning and end of audio files. It can be used, for example, to clean up files recorded with Streamripper (e.g., streamripper --xs_padding=5000:5000). Sonfilade is designed to be as effortless and fun as possible to use. An entire edit session can be carried out using only three keys and sound feedback as the entire user interface. (There is also text output, but it is non-essential.) Uses sox.



  • - a multi-platform package of audio handling routines that unifies the best open-source audio libraries. play .mp3, .ogg, .wav, .flac, .m4a, .opus and cdrom audio files. 16, 32 or float 32 bit resolution. record all types of input into file, in 16 or 32 bit resolution, mono or stereo. add DSP effects and filters, however many you want and record it. play multiple inputs and outputs simultaneously. internet audio streaming of mp3 and opus files. produce sound by the build-in synthesizer. Uos can use the SoundTouch, PortAudio, SndFile, Mpg123, Faad, OpusFile and Mp4ff audio libraries. Included in the package: Examples and binary libraries for Linux 32/64, arm-Raspberry Pi, Windows 32/64, Mac OSX 32 and FreeBSD 32/64.

Composers Desktop Project

  • The Sound Loom - an integrated graphic interface to the CDP sound-processing software, a comprehensive collection of over 500 instruments for sound transformation developed as practical working tools by composers over many years, available from the Composers' Desktop Project. The Sound Loom + CDP software is a powerful toolbox for composers, not a performance instrument. Using it, you can specify the parameters of any process to any degree of time-varying detail, detail you may have composed or have extracted from some other complex sound-event. You cannot, however, alter these parameters while the process is running. In compensation, the system offers almost any conceivable process for transforming sounds and sound-data (the data might be loudness envelopes, pitch-tracking information, spectral analysis data, filter specifications etc.) all running in a unified, intelligent environment. Trevor Wishart.i


  • Mammut does an FFT of the whole sound (no windows). Various operations can subsequently be done in the frequency domain, such as unlinear stretching of the spectrum, sprectrum shifting, etc. How is the program useful? Doing a giant FFT of the entire sound, as opposed to splitting the sound up into short windows, is unusual. Such a method implies that time-related parameters are included in the spectral coefficients in a non-intuitive manner, and changes in the frequency domain may radically change developments in the time domain. Mammut is a fairly unpredictable program, and the user will need to get used to letting go of controlling the time axis. The sounding results are often surprising and exciting.



  • FreqTweak is a tool for FFT-based realtime audio spectral manipulation and display. It provides several algorithms for processing audio data in the frequency domain and a highly interactive GUI to manipulate the associated filters for each. It also provides high-resolution spectral displays in the form of scrolling-raster spectrograms and energy vs frequency plots displaying both pre- and post-processed spectra.


  • TAPESTREA (Techniques And Paradigms for Expressive Synthesis, Transformation, and Rendering of Environmental Audio, or taps, is a unified framework for interactively analyzing, transforming and synthesizing complex sounds. Given one or more recordings, it provides well-defined means to: identify points of interest in the sound and extract them into reusable templates; transform sound components independently of the background and/or other events; continually resynthesize the background texture in a perceptually convincing manner; controllably place event templates over backgrounds, using a novel graphical user interface and/or scripts written in the ChucK audio programming language

Build fails on Linux. Fixing two or three indirect includes gets further, but fails on building it's included [old] chuck.


  • SPEAR (Sinusoidal Partial Editing Analysis and Resynthesis) is an application for audio analysis, editing and synthesis. The analysis procedure (which is based on the traditional McAulay-Quatieri technique) attempts to represent a sound with many individual sinusoidal tracks (partials), each corresponding to a single sinusoidal wave with time varying frequency and amplitude. Something which closely resembles the original input sound (a resynthesis) can be generated by computing and adding all of the individual time varying sinusoidal waves together. In almost all cases the resynthesis will not be exactly identical to the original sound (although it is possible to get very close).

Aside from offering a very detailed analysis of the time varying frequency content of a sound, a sinusoidal model offers a great deal of flexibility for editing and manipulation. SPEAR supports flexible selection and immediate manipulation of analysis data, cut and paste, and unlimited undo/redo. Hundreds of simultaneous partials can be synthesized in real-time and documents may contain thousands of individual partials dispersed in time. SPEAR also supports a variety of standard file formats for the import and export of analysis data.

Windows/Mac only :(


  • Ceres3 is a cut-and-paste spectral editor with musically enhanced graphic control over spectral activity of a sound file. It is a free educational program with no other aims, and it owes most of its framework to Oyvind Hammer's Ceres and Jonathan Lee's Ceres2. It has an X-window Motif/OpenMotif based GUI, organized around four principal menus with simple keyboard shortcuts.


  • ATS is a spectral modeling system based on a sinusoidal plus critical-band noise decomposition. The system can be used to analyze recorded sounds, transform their spectrum using a wide variety of algorithms and resynthesize them both out of time and in real time.
  • ATS is a software library of functions for spectral Analysis, Transformation, and Synthesis of sound based on a sinusoidal plus critical-band noise model. A sound in ATS is a symbolic object representing a spectral model that can be sculpted using a variety of transformation functions. Spectral data can be accessed trough an API, and saved to/loaded from disk. ATS is written in LISP, its analysis and synthesis algorithms are implemented using the CLM (Common Lisp Music) synthesis and sound processing language.

Only takes mono .wav files


  • Cecilia is an audio signal processing environment aimed at sound designers. Cecilia mangles sound in ways unheard of. Cecilia lets you create your own GUI using a simple syntax. Cecilia comes with many original built-in modules and presets for sound effects and synthesis.


  • Loris is an Open Source sound modeling and processing software package based on the Reassigned Bandwidth-Enhanced Additive Sound Model. Loris supports modified resynthesis and manipulations of the model data, such as time- and frequency-scale modification and sound morphing. The Loris programmers' interface supports the C, C++, and Python programming languages, and SWIG interface files are provided so that the API can be easily extended to a variety of other languages. The package includes a handful of utility programs for basic sound modeling and resynthesis, and standard UNIX/Linux tools that build and install the libraries, headers, and utilties.

SMS Tools

  • Spectral Modeling Synthesis Tools - SMS Tools is a set of techniques and software implementations for the analysis, transformation, and synthesis of musical sounds based on various spectral modeling approaches. These techniques can be used for synthesis, processing and coding applications, while some of the intermediate results might also be applied to other music related problems, such as sound source separation, musical acoustics, music perception, or performance analysis. The basic model and implementation were developed by Xavier Serra as part of his PhD thesis published 1989. Since then many extensions have been proposed at MTG-UPF and by other researchers.


  • FxEngine - an Open C++ Framework under LGPL license. The FxEngine Framework simplifies the plugin architecture for the data flow processing. It provides a full control to the plugin architecture for applications that require custom solutions.
  • FxJackPack - contains two plugins for the FxEngine framework which enables the recording and playback sound through JACK (Jack Audio Connection Kit).




  • SpectMorph is a free software project which allows to analyze samples of musical instruments, and to combine them (morphing). It can be used to construct hybrid sounds, for instance a sound between a trumpet and a flute; or smooth transitions, for instance a sound that starts as a trumpet and then gradually changes to a flute.

Spectral Toolbox

  • The Spectral Toolbox - a suite of analysis-resynthesis programs that locate relevant partials of a sound and allow them to be resynthesized at any specified frequencies. This enables a variety of techniques including spectral mappings (sending all partials of a sound to fixed destinations), spectral morphing continuously interpolating between the partials of a source sound and a destination) and dynamic tonality (a way of organizing the relationship between a family of tunings and a set of related timbres). A complete application called the TransFormSynth concretely demonstrates the methods using either a one-dimensional controller such as a midi keyboard or a two-dimensional control surface (such as a MIDI guitar, a computer keyboard, or the forthcoming Thummer controller). Requires installing either Max Runtime (free from cycling74) or Max/MSP (not free) and some java routines.


  • - a simple application which takes a small chunk (window) from input wav and outputs the frequencies one by one (a sweep) into another wav file. It is very useful to hear the harmonics (one by one) from a sound. It can be used as a spectrum tool for the blind people who are interested in sound analysis.





Paul's Extreme Sound Stretch


  • tcStretch - a Windows VST 2.4 plug-in for time stretching, pitch shifting, and blurring. Time stretch can be up to 1 million times slower. Pitch shift is plus or minus one octave. Blurring blends nearby spectral material to make the output less static. Playback is sensitive to transients in the source material. Playback rate and blur amount are automatically adjusted according to the transient contour of the material being stretched. Playing transients at a faster rate than non-transients tends to make the output sound less obviously stretched. Playing transients more slowly than non-transients emphasizes the stretchiness [good when playing in reverse mode with highly transient material]. Adding blur brings in some subtle (or not so subtle) randomness which helps to keep the output less static.

Rubber Band

  • Rubber Band Library is a high quality software library for audio time-stretching and pitch-shifting. It permits you to change the tempo and pitch of an audio stream or recording dynamically and independently of one another. Rubber Band Library is intended for use by developers creating their own application programs rather than directly by end users, although it does also include a simple (free) command-line utility program that you can use for fixed adjustments to the speed and pitch of existing audio files.

Play it Slowly

  • Play it Slowly - software to play back audio files at a different speed or pitch. It does also allow you to loop over a certain part of a file. It's intended to help you learn or transcribe songs. It can also play videos thanks to gstreamer. Play it slowly is intended to be used on a GNU/Linux system like Ubuntu.


  • StretchPlayer is an audio file player that allows you to change the speed of the song without changing the pitch. It will also allow you to transpose the song to another key (while also changing the speed). This is a very powerful tool for musicians who are learning to play a pre-recorded song.


  • PitchTempoPlayer (PTPlayer) is an audio player for Linux that allows to change pitch and speed (tempo) of the sound independently of each other. Fine tuning (less than half tone) is also possible, as well recording, exporting the modified audio file and managing a playlist.



  • SoundTouch is an open-source audio processing library for changing the Tempo, Pitch and Playback Rates of audio streams or audio files. The library additionally supports estimating stable beats-per-minute rates for audio tracks. The SoundTouch library is intended for application developers writing sound processing tools that require tempo/pitch control functionality, or just for playing around with the sound effects.

The SoundTouch library source kit includes also an example utility SoundStretch for processing .wav audio files from command-line interface.


Gnome Wave Cleaner



Neural network





~/.wine/drive_c/Program Files (x86)/VstPlugins
~/.wine/drive_c/Program Files/VstPlugins

  • JackAss is a VST plugin that provides JACK-MIDI support for VST hosts. Simply load the plugin in your favourite host to get a JACK-MIDI port. Each new plugin instance creates a new MIDI port.




  • LV2 - an open standard for audio plugins, used by hundreds of plugins and other projects. At its core, LV2 is a simple stable interface, accompanied by extensions which add functionality to support the needs of increasingly powerful audio software.

Unix paths:

  # list all lv2 plugins available

  • LV2 1.0 released, what's next? - "LV2 is a successor of both LADSPA (audio effects) and DSSI (instruments) with some backwards compatibility. The scope of the API more or less equals to the sum of LADSPA and DSSI, not in the last place thanks to its modular design."

  • LV2 MIDI - defines a data type for a MIDI message, midi:MidiEvent, which is normalised for fast and convenient real-time processing. MIDI is the Musical Instrument Digital Interface, a ubiquitous binary standard for controlling digital music devices. For plugins that process MIDI (or other situations where MIDI is sent via a generic transport) the main type defined here, midi:MidiEvent, can be mapped to an integer and used as the type of an LV2 Atom or Event.

  • Programming LV2 Plugins - a series of well-documented example plugins that demonstrate the various features of LV2. Starting with the most basic plugin possible, each adds new functionality and explains the features used from a high level perspective. API and vocabulary reference documentation explains details, but not the “big picture”. This book is intended to complement the reference documentation by providing good reference implementations of plugins, while also conveying a higher-level understanding of LV2.

  • lv2file - a simple program which you can use to apply effects to your audio files without much hassle.
  • lv2proc - generates an output sound file by applying a LV2 effect plugin to an input sound file.

  • LV2 Create - a GUI utility that lets you easily enter information about a plugin, without needing to know too many details about LV2 (certainly not about those godawful, over-engineered, developer/enduser hostile, inefficient, easily-broken TTL files. Terrible design for audio work). Then you click a button, and the utility creates the TTL files, and C skeleton code for the plugin. You just need to add your DSP code, and compile to create your plugin. It even generates the GNU Makefile for you.
  • Torture tester - a program to help with testing of LADSPA and LV2 plugins.

  • NASPRO bridges - a collection of bridges to LV2 that, once installed, allow you to use plugins developed for other plugin standards in LV2 hosts. As of now, it contains two bridges: a LADSPA 1.1 bridge and a DSSI 1.0.0/1.1.0 bridge.



See also #Audio programming and #Graphical programming

  • HISE - a cross-platform open source audio application for building virtual instruments. HISE emphasizes on sampling, but includes some basic synthesis features for making hybrid instruments. You can build patches, design a custom interface and compile them as a VST / AU plug-in or iOS app.



  • Carla - an audio plugin host, with support for many audio drivers and plugin formats. It has some nice features like automation of parameters via MIDI CC (and send output back as MIDI too) and full OSC control. Carla currently supports LADSPA (including LRDF), DSSI, LV2, VST2/3 and AU plugin formats, plus GIG, SF2 and SFZ file support. It uses JACK as the default and preferred audio driver but also supports native drivers like ALSA, DirectSound or CoreAudio.

There are 4 types of engine processing:

  • Single-client: (JACK driver only) - carla-jack-single
    • Same as Multi-client, except that all JACK ports belong to a single master client.
    • This is needed when a setup doesn't support multi-client JACK apps, such as LADISH.
  • Multi-client: (JACK driver only) - carla-jack-multi
    • Every single plugin is exposed as a new JACK client. Audio and MIDI ports are registered as needed.
  • Rack: - carla-rack
    • Plugins are processed in order, from top to bottom.
    • Plugins with non-stereo audio channels are not supported, but a forced-stereo option is available for Mono ones.
  • Patchbay: - carla-patchbay
    • Modular patchbay mode, just like in JACK Multi-client and many other modular applications.
    • Every plugin gets its own canvas group and ports allowing you to interconnect plugin audio and MIDI.

  # usage: /usr/bin/carla-single [arch (optional)] [format] [filename/uri] [label (optional)] [uniqueId (optional)]

Possible archs:

 - native (default)
 - linux32
 - linux64
 - win32
 - win64

Possible formats:

 - internal
 - ladspa
 - dssi
 - lv2
 - vst|vst2
 - gig
 - sf2
 - sfz


/usr/bin/carla-single internal midisplit
/usr/bin/carla-single dssi /usr/lib/dssi/
/usr/bin/carla-single lv2
/usr/bin/carla-single native vst /usr/lib/vst/
/usr/bin/carla-single win32 vst "~/.wine/drive_c/Program Files (x86)/VstPlugins/Kontakt 5.dll"

  • - Performer lets you manage all the songs in your setlist as individual Carla patches and loads each of them when you need it. Additionally Performer uses Okular or QWebEngine to display notes and chords of your songs.
  • Jost (dead) is the first open source multi-technology (native vst, ladspa, dssi) host in linux. It will mainly host a chain of plugins per instance, publishing jack, alsa and alsa_seq ports in order to be connected in your main stream flow. it still have some very good features that makes it a first class host.


  • MrsWatson - a command-line audio plugin host. It takes an audio and/or MIDI file as input, and processes it through one or more audio plugins. Currently MrsWatson only supports VST 2.x plugins, but more formats are planned in the future. MrsWatson was designed for primarily three purposes: Audio plugin development and testing, Automated audio processing for servers or other applications, Unit testing audio plugins

  • dssi-vst - Run Windows VST plugins on Linux. DSSI doesn't support host tempo to plugin features.

  • FST - a program by which uses Wine, Jack and Steinberg's VST Audio Plug-Ins SDK to enable the use of many VST audio plugins under Gnu/Linux.
  • FeSTige - a GUI for fst and dssi-vst, allowing you to run Windows VST plugins on Linux.
fsthost -g ~/.vst
  # build plugin db
export VST_PATH=~/VST:/usr/share/vst:/otherlocation

fsthost -g
  # Perl GTK menu to startup plugins
  # Perl GTK app for control via TCP socket
  # simple application to show known plugins ( read about XML DB )

export FSTMENU_GTK=2 # or 3

  • Airwave - a WINE-based VST bridge, that allows for the use of Windows 32- and 64-bit VST 2.4 audio plugins with Linux VST hosts
  • - "Airwave is very nice, but adding more than a few plugins to it is awfully tedious. So I've taken the matter into my own hands and written a script to add a large number of plugins to Airwave (plus the ability to edit their names) as a batch process." [58]

  • - a Linux vst plugin that runs Windows 64 bit vst's. To use LinVst, the file simply needs to be renamed to match the windows vst dll's filename.

  • VSTForx - a full-modular effect network creation tool which comes as a VST-plugin. With VSTForx you are able to load any number of VST-plugins and connect them anyway you want. Additional modules allow you to manipulate such signal chains and offer a whole new way in mixing and producing. Windows/Mac. $.q


  • ghostess - a rough start at a graphical DSSI host, based on jack-dssi-host, but capable of saving and restoring plugin configuration, as well as specifying MIDI channels and layering synths. ghostess includes three MIDI drivers: an ALSA sequencer MIDI driver, a (clumsy but functional) CoreMIDI driver (which allows ghostess to be used on Mac OS X), and a JACK MIDI driver for use with the MIDI transport in recent versions (>=0.105.0) of JACK. ghostess also comes with a universal DSSI GUI, which attempts to provide GUI services for any DSSI or LADSPA plugin, and may be used with any DSSI host.


  • JACK Rack is an effects "rack" for the JACK low latency audio API. The rack can be filled with LADSPA effects plugins and can be controlled using the ALSA sequencer. It's phat; it turns your computer into an effects box.
  • jackspa - A small utility which will host a LADSPA plugin, providing JACK ports for its audio inputs and outputs, and sliders in a gtkmm GUI for its control inputs. I find it useful for hosting plugins with odd port configurations (such as a vocoder or a ring modulator), and for testing plugins. This project is pretty hacky. I threw it together quickly because I needed it in a hurry, and as a result, it's fairly buggy, and the code is a mess. But, it does the job.
  • ng-jackspa is a set of simple user interfaces that host a LADSPA plugin, providing JACK ports for its audio inputs and outputs, and dynamic setting of its control inputs. Additionally, the plugin controls can be exported to or controlled by control voltages on standard JACK audio ports.
  • Soundtank hosts LADSPA plugins in "realtime objects" which embody the structure of the audio signal flow. RTObjects can be controlled in a completely customizeable fashion using MIDI events sent through the ALSA sequencer interface.
  • Stomper - a virtual pedalboard for guitar, using commonly-available audio plugins in a user-defined arrangement and MIDI for switching. It is intended for on-stage use and will be optimized as such.


  • zynjacku - JACK based, GTK (2.x) host for LV2 synths. It has one JACK MIDI input port (routed to all hosted synths) and one (two for stereo synths) JACK audio output port per plugin. Such design provides multi-timbral sound by running several synth plugins.

  • Jalv is a simple but fully featured LV2 host for Jack. It runs LV2 plugins and exposes their ports as Jack ports, essentially making any LV2 plugin function as a Jack application.

  • Synthpod is both LV2 host and plugin. It can be run as a standalone app and be used as a tool for live performances or general audio and event filtering. Or it can be run as a plugin itself inside another host (or inside itself) to add support for non-linear patching where only strictly linear connections are supported (e.g. as in most DAWs). Patching of audio channels is clickless.

  • Elven that comes with this software package is written for revision 2 of the LV2 specification and is NOT compatible with revisions 3 and later. It may work, it may break subtly or it may give your computer the swine flu.

  • - the UI for the MOD software. It's a webserver that delivers an HTML5 interface and communicates with mod-host. It also communicates with the MOD hardware, but does not depend on it to run.


Equalisation / mastering


  • Alsaequal - a real-time adjustable equalizer plugin for ALSA. It can be adjusted using an ALSA compatible mixer, like alsamixergui or alsamixer. Alsaequal uses the Eq CAPS LADSPA Plugin as it's default equalizer but you can change it to use almost any LADSPA plugin, like mbeq from the swh-plugin package.


  • jackEQ - intended to provide an accessible method for tweaking the treble, mid and bass of any JACK aware applications output. Designed specifically for live performance, it is modelled on various DJ mixing consoles which the main author Patrick Shirkey (aka DJ Kotau) has worked with live. LADSPA.


  • NotNotchFilter - a performance-oriented filter designed to replace the mid-EQ found in a standard 3-band DJ mixer. The key advantage of this filter is that it cleanly cuts out single voices or instruments in a track, whereas a standard 3-band filter dampens them. This is because NotNotchFilter, as the name suggests, is not actually a notch filter. Rather, it is a combination of a hipass and lopass filter which work on opposite sides of the target frequency.


  • HiLoFilter - a simple hipass and lopass filter which can be easily controlled with a single knob. It is loosely inspired by the same type of filter found on some Pioneer DJM mixers, and also the The Pilgrim, another great plugin which provides roughly the same functionality.


  • x42-eq - a 4 band parametric equalizer with additional low+high shelf filters, Low and High-pass, as well as an optional, custom GUI displaying the transfer function and realtime signal spectrum or spectrogram. It is available as LV2 plugin and standalone JACK-application.


  • Lv2fil - Stereo and mono LV2 plugins, four-band parametric equalisers


  • EQ10Q - an audio plugin bundle over the LV2 standard ( implementing a powerful and flexible parametric equalizer and more (gate)


  • Luftikus - a digital adaptation of an analog EQ with fixed half-octave bands and additional high frequency boost. As an improvement to the hardware it allows deeper cuts and supports a keep-gain mode where overall gain changes are avoided.


  • JAMin - the JACK Audio Connection Kit (JACK) Audio Mastering interface. JAMin is an open source application designed to perform professional audio mastering of stereo input streams. It uses LADSPA for digital signal processing (DSP)

XO Wave

  • XO Wave - a digital audio workstation designed to meet the needs of audio and video professionals, with a focus on CD Mastering and audio for video work. XO Wave provides professional-grade capabilities for manipulating audio, with a familiar and elegant interface. Support for direct recording from any Core Audio device, importing and exporting files in a variety of formats (including songs from iTunes playlists), CD burning, multi-track editing, support for many Audio Units, and QuickTime synchronization and export, make XO Wave a great tool for all kinds of audio work, including podcasting, vodcasting, CD Mastering, mixing and more! Java.


Windows VST

Windows / Mac

Compression / limiting


  • VLevel - a tool to amplify the soft parts of music so you don't have to fiddle with the volume control. It looks ahead a few seconds, so it can change the volume gradually without ever clipping. Because the volume is changed gradually, "dynamic contrast" is preserved.


  • DPL1 - a look-ahead digital peak limiter, the kind you would use as the final step to avoid clipping when mastering or mixing. It can be used as an effect on individual instrument tracks as well. Latency is 1.2 ms rounded up to the nearest multiple of 8, 16 or 32 samples depending on sampling frequency. This amounts to 56 samples at 44.1 kHz, 64 samples at 48 kHz, and twice those values for 88.2 or 96 kHz.

Radium Compressor

  • - The system compressor in Radium is also available as an independent audio plugin. The unique interface: Accurately visualize the sound compression. Rapidly helps you find good compressor settings. With the Radium Compressor, you spend less time listening and finetuning. The interface quickly makes you find the sound you want. Other Features: Top class audio quality. The DSP code is implemented by Julius O. Smith III. Julius O. Smith III is a professor in Music and EE at Stanford University, and he is one of the legends in audio research. The compressor is based on code Julius has written for Faust. OSS/$

Visual Compressor

Molot Lite


  • - A compressor with character. A bit experimental: It works and sounds wonderfull, but has too many parameters, so is a bit fiddly to use. Also; I have no idea what to name the parameters, or how to explain a lot of them.

Xhip Compressor

  • Xhip Compressor - designed to loosely model a circuit I designed myself. It is like a pedal or what you might find in a channel-strip. It isn't designed to function as a limiter and it doesn't have instant response, look-ahead, low distortion or any of that stuff. That is exactly what makes it useful.

Xhip Limiter

  • Xhip Limiter - A counterpart to the compressor for limiting with the most simple design possible. This limiter is designed for maximal sustain and minimal distortion. It aims to be as transparent as possible in terms of timbre without using any advanced techniques.

x42-limiter / dpl.lv2

  • x42 Digital Peak Limiter - (aka dpl.lv2) is a look-ahead digital peak limiter based on Fons Adriaensen's DPL-1. It is intended to be used as final step to avoid clipping when mixing or mastering, but can also be used on individual tracks as well. It comes in mono and stereo variants, the stereo version applies the same gain reduction to both channels. Latency is 1.2 ms, rounded up to the nearest multiple of 8, 16 or 32 samples depending on sampling frequency. This amounts to 56 samples at 44.1 kHz, 64 samples at 48 kHz, and twice those values for 88.2 or 96 kHz.

Windows VST

  • nova67p - a parallel parametric equalizer plugin combined with a compressor. The compressor can optionally operate in frequency dependent and split-band modes. In this case the plugin operates as a parallel dynamic equalizer.
  • sixtyfive is a vintage-style, RMS Compressor, inspired by the dbx® 165A, a classic 1970’s compressor found in many studios, but it also adds a couple of new twists. It’s a soft-knee RMS compressor with a vintage flavour. The RMS detection and soft-knee help to give a smooth and musical compression experience, the non-linear response imparts character and the gentle saturation brings colour and warmth. As well as all of the originals features, this version adds parallel compression, extended knob ranges and peak metering
  • GTA - a vintage style ‘character’ compressor, designed like its muscle car name-sake for brute power, pure speed and to make a loud noise. It is stripped down for ease of use and has a unique vintage colouring on top.
Rough Rider 2 Compressor
  • Rough Rider Compressor - a modern compressor with a bit of "vintage" style bite and a uniquely warm sound. Perfect for adding compression effects to your drum buss, it also sounds great with synth bass, clean guitar, and backing vocals. Definitely not an all-purpose compressor, Rough Rider is at its best when used to add pump to rhythmic tracks. Rough Rider is available as both 32- and 64-bit VSTs for Windows, and as a Universal Binary AU/VST for OSX.
W1 Limiter
  • W1 Limiter - a clone of Waves L1, with identical output, as well as an approximation of Waves L2.



  • abGate - a LV2 noise gate plugin in the LV2 format to manage noise. A noise gate is a component which attenuates an audio signal when it falls below a set threshold, so it can be applied to an audio track which has one or more periods of silence where no noise should be apparent.

Xhip Gate

  • Xhip Gate - A gate effect including side-chain input and a few other additional features. Ideally the trigger/threshold detector and envelope generator would be separate plug-ins, although this is not so practical in the majority of cases. Instead a general-purpose AhR envelope is used.


  • GT10QM / GT10QS - Versatile noise-gate plugin with mono (M) and stereo (S) versions. This plugin provides the usual controls in a dynamic gate processor: threshold, range, attack, release and hold. But also features some extra parameters to allow an accurate adjustment. The gain control is useful to bring the input sound level in the correct VU meter range to be able to tune the threshold right. Even though the internal processing is performed in floating point at 32 bits, it’s helpful to adjust the input gain in a standard range to achieve an optimum level visualization. Together with VU meter, there is also a gain reduction meter that aids to set up the controls. Besides the power of dynamic section controls (DYN), there is also a filtered side chain section (SC). Here it is possible to adjust a low pass filter (LPF) and a high pass filter (HPF) and listen the actual side chain signal after filtering with the “Key” button. This side chain flexibility is very interesting to avoid unwanted noises to start triggering the gate.






  • moot - a flexible audio mute plugin with a number of additional features. The mute switch (Hit Me) can be assigned to a single midi keyboard key and act in 3 ways: as a latch (one press mute, release stays muted, next press unmute, release stays unmuted), in default mode (hold down to mute, release to play audio), in invert mode (hold down to play audio, release to mute). Windows VST2.
  • gator - creates a random steppy, gated effect, triggering volume between 2 adjustable values by a tempo-sync'd probability-based step sequencer. An LFO can modulate the volume when triggered and trigger pulse length, attack and release can shape the sound. Windows VST2
  • StormGate1 - an unique and innovative amplitude rhythmic gating effect which lets you draw the gating patterns freely by hand or with the aid of powerful drawing tools. PC VST format and MAC VST and AU formats.

  • A1TriggerGate - "There are many sequenced / rhythmic gates out there, but honestly nothing really satisfied me yet. So I decided to write my own plugin for this task..." - Windows/Mac


Calf Vintage Delay

  • Calf Vintage Delay - based on bpm-oriented delay time settings. Additionally the delayed signal is processed by a filter to simulate old tape-machine based delay effects. Some options of the stereo distribution of the delayed signal makes it very flexible and wide-ranged in sound.


  • TAL-Dub - a vintage style delay effect. It can be used for a wide range of delay effects from clean to extreme distorted, resonating never ending delays :-)

TAL-DUB-II is an extended version of TAL-Dub-I with a completely new sound engine. A 4x oversampled distortion stage allows to add vintage distortion to the delayd signal, but its also possible to make clean delays. A sinus LFO has the possibility to modulate delay time and low pass filter cutoff. Adjustable LFO stereo width is also included. An analog sounding 6dB low pass filter with resonance and a 3dB high cut filter are also parts of TAL-DUB-II. Different routing options open a wide range of possibilities.

TAL-DUB-III is an easy to use delay device with some special features. Its no tape delay emulation and has its own sound. It has an alias free saturation stage, a non-linear 6dB low pass and a 3dB high pass filter that are included in the feedback path of the device. An input drive knob allows to adjust the saturation level. Pop-up menues show the current values of volume, delay-time and feedback knobs. A tab button allowing to adjust the delay time for live sessions.


Xhip mDelay

  • Xhip mDelay - dual delay with stereo, cross and ping-pong modes. It has an LFO and can be used to create anything from flangers and chorus to ordinary delays and strange detuned echos all the way up to extreme pitch modulation effects.


  • - a free and open source delay VST effect modelled after a renowned vintage tape recorder. Its aim is to provide music producers with a fun and useful tool that sounds and works like vintage tape echo boxes, as well as to serve as an educational tool for anyone interested in audio development and digital signal processing.


  • - The evolution of my bolliedelay.lv2, features full control over two seperate channels, HPF/LPF on both, the delay path and feedback path, fractional delay, that enables modulation

Pitched Delay


  • BOWECHO - Quad Modular Delay. Use the key R7P1R64720175164548606433 to unlock the demo version. OS X, Windows and Linux.


  • Tapiir - a simple and flexible audio effects processor, inspired on the classical magnetic tape delay systems used since the early days of electro-acoustic music composition. It provides a graphical user interface consisting of six delay lines, or "taps", which can introduce an almost arbitrarily big or small delay to their inputs and can be feed back to each other. A wide set of effects can be easily achieved by properly configuring and connecting the delay lines: complex echo patterns, resonances, filtering, etc. Delays, interconnections and gains can all be controlled in real time.


  • Tom Pong is a ping pong delay VST Plugin. Windows only. A ping pong delay is a delay which alternates from one speaker to another. If the balance is set to 0.5, a stereo signal will invert itself each time the delay buffer is played back. If the balance is set to either side of 0.5, the signal bounces back and forth between the speakers. Tom Pong can be synced to the host sequencer, or delay times can be set by sample count. Feedback and output level are also adjustable.

  • combover comes back from the archives | de la Mancha plugins - a comb delay, and four more comb delays, and a step sequencer that does odd things, and a pool of prime numbers that make stuff happen, and some mix n match pitch detection. It also has an xy pad and wet / dry controls that let some stuff through and other stuff not. It does crazy stuff, you should stop reading these words and find out for yourself.


See also #Convolution


  • Multiverb is an audio plug-in that produces a reverb effect on a monaural or stereo audio track. It is available as both a VST plug-in for Windows (Win32 and Win64) and an LV2 plug-in for Linux. Multiverb implements an acoustic system that is an interconnection of multi-port acoustic elements. For a full technical description of the Multiverb algorithm please see my paper in the Journal of Multidisciplinary Engineering Science and Technology (JMEST).


Dragonfly Reverb


  • mverb - Studio quality, open-source reverb. Its release was intended to provide a practical demonstration of Dattorro’s figure-of-eight reverb structure and provide the open source community with a high quality reverb.


  • REV1 - a reworked version of the reverb originally developed for Aeolus. Its character is more 'hall' than 'plate', but it can be used on a wide variety of instruments or voices. It is not a spatialiser - the early reflections are different for the L and R inputs, but do not correspond to any real room. They have been tuned to match left and right sources to some extent.

KR-Reverb FS

  • KR-Reverb FS - an easy to use Reverb processor based on features found on our commercial product KR_Space. KR-Reverb FS is designed for ease of use by adjusting internally the equalization and damping controls to optimal levels for producing a warm reverb sound suitable for a wide range of applications.


  • Protoverb - an experimental reverb based on the idea of a "room simulator". Most algorithmic reverbs try to avoid resonances or model the reflections of sound from a rooms walls. Protoverb does the opposite. It builds up as many room resonances as possible, modeling the body of air in the room. It therefore does not need to modulate or colour the signal. The result is a very natural sounding reverbration with some interesting features: Long standing frequencies resonate louder, as if the air takes some time to get excited. Multiple instruments don't mash into a diffuse mud, they stay distinct. If you play a short melody, the room seems to repeat a ghost echo of that melody. Those properties are indeed found in churches and large halls, but they're rarely found in conventional algorithmic reverbs.

Xhip Reverb

  • Xhip Reverb - A reverb effect designed to avoid an ultra-smooth fade-to-noise decay.

Teufelsberg Reverb

Sound retainer

Sostenuto, "infinite sustain"

  • Infamous Plugins: Stuck - a clone of the electro-harmonix freeze. It drones the note being played when the "Stick It!" port is set to 1 (or the CV port input goes above 1), causing the note to be "stuck". Once the port falls below 1 the drone is released with a decay set in seconds. The drone is added to the dry signal (so original signal is passed through at all times un-processed). This plugin is pretty useless except in live situations, though I'd love someone to creatively prove me wrong.


  • Richter - a two LFO tremolo, the interaction of which creates far more complex volume oscillations than can be created with a single LFO. The rate and depth controls of each LFO can each be modulated by two more LFOs to create variations in frequency and depth, and tempo sync is available on all oscillators. VST, VST3, and Audio Unit, Windows, Mac, and Linux.

  • Xhip Tremolo - a standard tremolo with the addition of a couple features. One is that it has phase adjustment for the LFO, allowing it to produce a "wide" tremolo panning effect. Second is that the waveform can be adjusted between pulse-like, linear and saturated which produce different characters mimicking various old tremolo effects.

Windows VST


  • Xhip Phaser - standard phaser effect with a variable number of stages. This phaser doubles as a frequency dependent delay, chorus effect and vibrato because it allows you to use up to 128 stages.


  • vm.lv2 - a virtual machine plugin bundle

  • - Songbird is a modulated vowel/formant filter. Select two vowels and then modulate between them using either the manual slider or the LFO. There are five vowel sounds and two modulation modes available to choose from. The "freq" modulation mode allows you to modulate a single filter between the two vowels chosen, to create a powerful vocal sound. The "blend" mode provides a more subtle effect, modulating the mix between two parallel filters. Songbird is ideal for creating both vocal bass sounds and subtle filter sweeps.

  • Xhip Vocal - A simple phoneme synthesizing filter made from several parallel formant filters. This filter provides AEIOU formants with control over center frequency and Q with smooth interpolation between formant sets with the phoneme control.

  • DtBlkFx - a freeware Fast-Fourier-Transform (FFT) based Multi effect VST plug-in for Windows and Mac. Precision parametric equalizing with sharp-roll off, adjust individual harmonics of a sound. Harmonic based (or comb) filtering, including active harmonic tracking. Various types of noise control, change contrast between loud and soft frequencies, clip frequencies or apply sound smearing. Frequency shifting, harmonic and non-harmonic shifting, including active harmonic repitch.

  • sfilter - creates a stepped filtered sequence, to create gating, sweeps or rhythmic modulation of filter cut-off. It uses a variable state filter, varying between 2 adjustable cut-off values according to a tempo-sync’d step sequencer. An LFO can also modulate the filter for extra movement
  • pfilter - creates a steppy, gated effect, but using filter cut-off instead of volume to give a wider range of possibilities. It uses a variable state filter, triggered between 2 adjustable cut-off values by a tempo-sync’d probability-based step sequencer. An LFO can modulate the filter when triggered and trigger pulse length, attack and release can shape the sound

  • sumo - an effect plugin to make any sound as fat as you like. It’s good for fattening up leads and basses, adding some weight to pads, making your vocals chubby and your drums obese


  • You Wa Shock ! - VST/Winamp effect to brighten up and maximize any truck you hand it.


TF Chorus

  • - an LV2 and LADSPA port of TF Chorus by TraumFlug. The great, unique chorus effect was found in the Armstrong package of an old PPA. Any information about this plugin seems to have disappeared from the internets. There was no license, except that it is implied to be "free, open source software" (see src/tfcho_orig.cpp, the original source) and that the rest of Armstrong is released under GPLv2. For now let's say this port is under the same license as the original, whatever it may be.

JPC Ensemble Chorus


Ring modulation

  • - performs classic ring-modulation effects using one or two oscillators. With a clean interface that gives easy access to more advanced controls like our adjustable phase difference and shape features, including editable custom waveforms and harmonics.
  • ring thing is a multi-flavour ring modulator, with frequency and mix level controlled by an XY pad and each axis modulated by it’s own tempo-sync LFO. The modulation in both axes is shown graphically on the XY pad.


Wolf Shaper

Windows VST

  • - goes beyond a traditional wave-shaping plugin. Unlike the conventional approach of providing a few predefined patterns, MWaveShaper lets you construct your own shape creating a much greater range and control over your sound.
  • - an extremely powerful multi-comb filter plugin. Using its 2 extremely versatile modulators it becomes a powerful processor, which can follow a simple LFO, react to input levels, MIDI note, input pitch...



  • SDRR - built to satisfy almost all of your saturation desires. It provides a comprehensive set of controls to manipulate the character of the saturation to make it fit exactly. SDRR offers four different main modes: TUBE, DIGI, FUZZ, DESK and reacts dynamically to the input signal. Each mode has its unique crosstalk behavior, which can be switched off or exaggerated. A unique RMS level difference metering mode makes level matching an easy task. SDRR can be different things: a saturation, a compressor, an EQ, a bit-crusher, a subtle stereo widener, or simply add some movement to your tracks with the DRIFT control. Add warmth, depth and character to your tracks with SDRR.
  • IVGI - can deliver very soft and subtle saturation, that feels at home on the master buss. It is equally capable of very dense and dirty distortion effects to spice up single tracks. IVGI's base sound is comparable to the DESK mode in the big brother SDRR. Windows/Mac VST.

Rotating speaker

  • - x42-whirl is a designed to imitate the sound and properties of the electromechanical rotating speaker device that brought world-wide fame to the name and products Don Leslie. It is a standalone version of the effect that came to be with the setBfree synth. Rather than simulating the net effect of the electromechanical device, x42-whirl physically models the properties, which results in very accurate representation of the sound of the real device. Since all individual parts are modelled, x42-whirl not only provides advanced control, but also facilitates customizations, some of which are not feasible in the mechanical device.


  • Arcangel is a jack effect for arctan distortion. Sounds nice and grungy without clipping at high levels, and sounds nice at lower levels.

  • Analog distortion emulation developed by mod team (lv2). The effects were developed suposing you have a -15dB input signal (measured with digital peak meter) when you play loud, so its recommended that you ajust your input gain to this level. We recommend that you use only the stable plugins: -DS1 -Big Muff Pi
  • deteriorate-lv2 - A set of plugins to deteriorate the sound quality of live inputs. The set contains two plugins: A basic granulator, a basic downsampler
  • Deathcrush is a distorsion plugin, made up of some raw effects (such as bitcrusher and compressor) to really ruin your gentle sounds.

  • WubFlip - It sort of flips high or low values beyond a threshold making a dirty distorted mess of the sound that might be useful for people making wanting big dirty breaks or synths or something like that. Play around with the sliders. The upper threshold slider needs to be higher (to the right of) than the lower threshold slider otherwise you'll get no sound. The difference between them effects the sound. Then the multiplier slider effects how much "flipping" gets done. LADSPA.

  • Carve - a wave shaping distortion with several available waveforms and two distortion units, which can be configured in serial, parallel or stereo, and blended with the dry sound. VST, VST3, and Audio Unit, Windows, Mac, and Linux

Windows VST

  • Gorgon - Distortion. Glitchmachines Subvert is the evolution of Gorgon. The download contains the unlocked installers. OS X and Windows.

  • - a serious tool for extreme distortion lovers. It converts the audio into limited fixed-point precision form, from a 1 single bit up to 16 bits per sample, and lets you access each bit, applying several operations.
  • FuzzPlus3 - vintage fuzz pedal model, plus a new filter, self-feedback, and a modern procedural user interface. Windows/Mac
  • thrummaschine - a 3-band distortion effect with independant, LFO-driven filters. Make your bass, mid and high frequencies oscillate at different speeds, shapes and pan, with whatever flavour and level of distortion you dial in for each band
  • Imperfection - an effect plugin to put some lofi back into your pristine 64bit audio. Who wants hi fidelity reproduction when you can reduce the quality, take out some of that bottom end, add a smear of saturation and bring your noise floor back up. Hmmm, perfectly imperfect
  • bent - a circuit-bent resynthesis effect. It will recreate the incoming audio into an approximation of itself using a waveform-morphing audio oscillator. Depending on the volume and pitch of the audio, it will gate, stutter and morph the output in sync with your host tempo
  • - freq show - screws around with your audio and ouputs an unholy version of whatever you fed it
  • GClip - Free VST wave-shaping signal clipper. Clip peaks off audio with abrupt or smooth wave-shaping. Graph and waveform displays assist in setting the clip level according to the source material. Oversampling can be enabled to reduce aliasing.

Frequency shifting

  • Frequency Shifter - a VST™2.4 software effect for Microsoft Windows® written in native C++ code. Frequency shifting up to ±5000 Hz. Optional LFO with five waveforms. Four frequency ranges, three mix modes.
  • - an extremely versatile frequency shifter. Unlike pitch-shifters it doesn't keep harmonic relationships and can provide everything from mild stereo expansion to complete sonic destruction.

Pitch shifting

  • Autotalent began as the result of a week of recreational signal processing in May 2009. It's a real-time pitch correction plugin. You specify the notes that a singer is allowed to hit, and Autotalent makes sure that they do. You can also use Autotalent for more exotic effects, like the Cher / T-Pain effect, making your voice sound like a chiptune, adding artificial vibrato, or messing with your formants. Autotalent can also be used as a harmonizer that knows how to sing in the scale with you. Or, you can use Autotalent to change the scale of a melody between major and minor or to change the musical mode. LADSPA.
  • talentledhack - an LV2 port of Tom Baran's Autotalent, with added features and improved performance.
  • AT1 - an 'autotuner', normally used to correct the pitch of a voice singing (slightly) out of tune. Compared to 'Autotalent' it provides an improved pitch estimation algorithm, and much cleaner resampling. AT1 does not include formant correction, so it should be used to correct small errors only and not to really transpose a song. The 'expected' pitch can be controlled by Midi (via Jack only), or be a fixed set of notes. AT1 can probably be used on some instruments as well, but is primarily designed to cover the vocal range. It's also usable as a quick and dirty guitar tuner.
  • x42-autotune - aka fat1.lv2, is an auto-tuner based on Fons Adriaensen's zita-at1. The main differences to zita-at1 are that the LV2 plugin version reports its latency to the host, saves the state with the session and the MIDI input has sidechain semantics.
  • TAL-Vocoder is a vintage vocoder emulation with 11 bands that emulates the sound of vocoders from the early 80’s. It includes analog modeled components in combination with digital algorithms such as the SFFT (Short-Time Fast Fourier Transform). This vocoder does not make a direct convolution of the carrier and modulation signal as other digital vocoders maybe do. It includes an envelope follower for every of the eleven bands. This vocoder is optimized for voice processing and includes some algorithms for consonants to make the voice more intelligible. The carrier signal is a VCO (Voltage Controlled Oscillator) with a Pulse, Saw, Noise and SubOsc. But it’s also possible to use the left stereo input as carrier. This way every sound source can be used as carrier signal.
  • vocoder (JACK standalone). It's a complete rewrite of the old vocoder, now done in C++ using FLTK.

  • VocProc - an LV2 plugin for pitch shifting (with or without formant correction), vocoding, automatic pitch correction and harmonizing of singing voice.

  • - Turns any monophonic sound into a synthesizer, preserving the pitch and spectral dynamics of the input. The name was chosen because I use it mostly to turn my voice into a singing robot, and it's made in Faust.

Windows VST

  • Vintage Vocoder - real-time audio effect - VST and DXI plug-in for PC/MAC. Originally a commercial product published by Sonicism Digital Audio Solutions in 2002. This software was used for the robot voices and sound effects in the computer game Freelancer.

  • La Voz Cantante - a 512 channel vocoder. The modulator input - usually a sung or simpjy spoken voice - is analyzed with respect to its spectral content, which is then applied to the other sound source. The latter may be any externally supplied signal ranging from pink noise, synth pads, guitar or even drums. Alternatively, there is an internal, MIDI driven synth which is optimized for best speech reproduction fidelity. You can blend the high frequencies with noise for more natural sounding plosives and fricatives. Threre is also a noise gate, a compressor and a stereo reverb on board.


  • MIDI Choir - will take a single-pitched audio source and transpose it in real time according to the supplied MIDI notes. My main motivation to create MIDI Choir was to be able to sing harmonies live, however the product may also be used for studio work

Phase vocoder

  • pvc - PVC is a collection of phase vocoder signal processing routines and accompanying shell scripts for use in the transformation and manipulation of sounds. It is written in C and designed to be used in a UNIX environment.
  • pv in WaoN project is yet another phase vocoder implementation for my understanding of the process behind WaoN and others. Here is what you can do: time streching/shrinking without pitch changing (by rate option) and pitch shifting without time streching (by pitch option)
  • pvoc is a collection of LADSPA units and a command line tool for time compression/expansion of sound data making use of the phase-vocoding technique[1].

Noise reduction


messy section

  • - In acoustics, reverberation is the convolution of the original sound with echoes from objects surrounding the sound source. In digital signal processing, convolution is used to map the impulse response of a real room on a digital audio signal.

In electronic music convolution is the imposition of a spectral or rhythmic structure on a sound. Often this envelope or structure is taken from another sound. The convolution of two signals is the filtering of one through the other.

  • - impulse response function (IRF), of a dynamic system is its output when presented with a brief input signal, called an impulse. More generally, an impulse response refers to the reaction of any dynamic system in response to some external change. In both cases, the impulse response describes the reaction of the system as a function of time (or possibly as a function of some other independent variable that parameterizes the dynamic behavior of the system). In all these cases, the dynamic system and its impulse response may be actual physical objects, or may be mathematical systems of equations describing such objects.

  • QLoud - tool to measure loudspeaker frequency and step responses and distortions

  • Aliki - an integrated system for Impulse Response measurements, using the logaritmic sweep method developed by Prof. Angelo Farina. Release 0.0.3-beta is available on the downloads page. It's still very incomplete but it has been used for real measurement work.
  • deconvolv - convolution and deconvolution of WAV files. Supported signal processing functions: correlation, convolution, de-convolution, convolution with Hilbert transformation, de-convolution with Hilbert transformation.

  • The HISSTools Impulse Response Toolbox: Convolution for the Masses - this paper introduces the HISSTools project, and its first release, the HISSTools Impulse Response Toolbox (HIRT); a set of tools for solving problems relating to convolution and impulse responses (IRs). Primarily, the aims and de- sign criteria for the HISSTools project are discussed. The elements of the HIRT are then outlined, along with mo- tivating factors for its development, underlying technolo- gies, design considerations and potential applications.
    • - HISSTools first release is a set of tools for working with convolution and impulse responses in Max. This set of object addresses various tasks, including measuring impulse responses, spectral display from realtime data/ buffers, and buffer-based convolution, deconvolution and inversion.
  • ExpoChirpToolbox - an impulse response (IR) measurement tool chain in Pure Data, available for Windows, OSX and Linux. It implements the Exponential Sine Sweep method which has been so succesfully advocated by Angelo Farina. The toolbox is in developement, and has functionality for the generation of test signals, recording test responses, IR editing and basic IR analysis. The edited IR can be applied as a convolution filter in the toolbox. This page shows screenshots, and the tool can be downloaded from the page bottom.
  • DRC - a program used to generate correction filters for acoustic compensation of HiFi and audio systems in general, including listening room compensation. DRC generates just the FIR correction filters, which can be used with a real time or offline convolver to provide real time or offline correction. DRC doesn't provide convolution features, and provides only some simplified, although really accurate, measuring tools.

  • REW - free room acoustics analysis software for measuring and analysing room and loudspeaker responses. The audio analysis features of REW help you optimise the acoustics of your listening room, studio or home theater and find the best locations for your speakers, subwoofers and listening position. It includes tools for generating audio test signals; measuring SPL and impedance; measuring frequency and impulse responses; measuring distortion; generating phase, group delay and spectral decay plots, waterfalls, spectrograms and energy-time curves; generating real time analyser (RTA) plots; calculating reverberation times; calculating Thiele-Small parameters; determining the frequencies and decay times of modal resonances; displaying equaliser responses and automatically adjusting the settings of parametric equalisers to counter the effects of room modes and adjust responses to match a target curve.

  • BruteFIR - a software convolution engine, a program for applying long FIR filters to multi-channel digital audio, either offline or in realtime. Its basic operation is specified through a configuration file, and filters, attenuation and delay can be changed in runtime through a simple command line interface. The FIR filter algorithm used is an optimised frequency domain algorithm, partly implemented in hand-coded assembler, thus throughput is extremely high. In realtime, a standard computer can typically run more than 10 channels with more than 60000 filter taps each.
  • IR - a no-latency/low-latency, realtime, high performance signal convolver especially for creating reverb effects. Supports impulse responses with 1, 2 or 4 channels, in any soundfile format supported by libsndfile.
  • SpecMatch - can be used to adapt the sound produced by a Guitarix setting to another recorded sound. It can also be used independently of Guitarix (cf. specmatch --help). Then you will need another convolver like the LV2 Convolution Reverb to use the produced filter. You can also use just the Python modules (e.g. from specmatch import SmoothedIR).
  • keFIR - provides music producers and sound engineers with a zero-latency FIR filter effect designed to help them enhance their tracks and generate astonishing sounds. Windows VST.

  • Jconvolver - a Convolution Engine for JACK, based on FFT convolution and using non-uniform partition sizes: small ones at the start of the IR and building up to the most efficient size further on. It can perform zero-delay processing with moderate CPU load. Jconvolver uses the convolution engine designed for Aella, a convolution application for reverberation processing (to be announced later). This distributes the calculation over up to five threads, one for each partition size, running at priorities just below the the one of JACK's processing thread. This engine is a separate library that will be documented as soon as I can find the time.
  • - a Convolution Engine for JACK using FFT-based partitioned convolution with multiple partition sizes. It's a command line version of what will be the core of the Aella reverb processor, but without the special reverb feautures, preset management, reverb envelope editing etc. that Aella will have.

  • - a vst/au plug-in which recreates a (analog) filter response with the use of a recording of a filtered train of impulses. It is created with the use of the JUCE framework. The plug-in is already in a working state, but still under construction.

  • HybridReverb2 - a convolution-based reverberation effect which combines the superior sound quality of a convolution reverb with the tuning capability of a feedback delay network. The sound quality of a convolution reverb depends on the quality of the used room impulse responses. HybridReverb2 comes with a set of room impulse responses which were synthesized with tinyAVE, an auralization software which was developed at the Institute of Communication Acoustics, Ruhr-Universität Bochum (Borß and Martin, 2009; Borß, 2009a). These room impulse responses are designed for a speaker setup with two front and two rear speakers (Borß, 2009b). For a full surround sound effect, you will need two plugins, one plugin which uses a "front" preset for the front channels and a second plugin which uses the corresponding "rear" preset for the rear channels.

  • Voxengo Deconvolver - offers a very convenient environment in which to deconvolve large sets of recorded files for use with convolution plug-ins that support only a small subset of available bit-depths. Windows $.


  • truc is a multi-effect VST plug-in, with 4 banks of effects controlled by the movement of 2 pucks. The top puck controls the level of each effect bank, the bottom puck modulates any 4 of 13 parameters within the effect banks. This allows continuous morphing of the sound by moving the pucks to vary the impact of each effect bank. As well as manually controlling each puck (either by mouse or midi controller) you can lock the pucks together and/or set them to move automatically, either randomly or by a configurable LFO, all in sync with your project tempo. Windows VST.
  • truc2 is a multi-effect plug-in with 4 different effect modules and two automated XY pads to modulate their levels and parameters. It is designed to add variation and movement, anywhere along the scale of subtle to overkill and is suitable for any material. The 4 effect modules are DIRT, GRAIN, RING and DELAY. The first XY pad modulates the volume/mix level of each module, whilst the second XY pad modulates any 4 of the 15 automatable parameters. Both XY pads can be moved manually and additionally automated with a variety of LFO shapes, speeds and depths. Windows VST.

  • SynthTrack - an effect plugin. Applied to audio tracks, this plug-in applies Filter ADSR and LFO effects to synths on flat chords or moving sound waves. With this effect you can create chopping effects to your favorite sounds. It's ideal for creating the typical rhythmic gated pad sounds. With the envelope controlled step sequencer it's even possible to turn your pad sound into a powerful arp-like sequence. This plugin synchronizes to the host sequencer / DAW tempo. Windows/Mac VST.

Disto:Fx Free

  • Disto:Fx Free (Dirty Sound Destructor) - a multi-fx with distortion/dynamic shaper/saturation/filter/ring modulator/phaser/EQ and Input-Output control units. macOS AU/VST, Win VST 32-bit/64-bit.

Amplio 2.0

  • Amplio 2.0 - (VST Effect) is a way to enhance boring and dull sound. It's three band Equalizer with adjustable bands and additional multiband effect modules. Plugin was originally designed to enhance drum patterns, but version 2 is powerfull enough to be used on any type of sound.



  • Fracture - features a buffer effect, a multimode filter, three LFOs and a delay. The order of the effects in the processing chain can also be reconfigured. This plugin is geared toward adding glitchy articulations and abstract textures to your projects. Use it on anything from drums and percussion to synth lines and sound effects. Fracture’s intuitive interface and diverse features make it simple to give your projects a unique technical edge.


  • ExEf - Extreme Effect, is an extremely powerful and flexible Real Time effect engine running on a PC under LINUX. It is designed to work with guitars, microphones and other instruments. It can run both in X Window System and command line. It supports both recording an post-processing. For easy start, try some presets! Requirepments include PC - Pentium MMX or above (Alpha may work as well), Full duplex soundcard.

Rack based


  • guitarix is a virtual guitar amplifier for Linux running on Jack Audio Connection Kit. It is free as in speech and free as in beer. The available sourcecode allows to build it on other UNIX-like systems, too, namely for BSD and for MacOSX.


  • Rakarrack is a richly featured multi-effects processor emulating a guitar effects pedalboard. Effects include compressor, expander, noise gate, graphic equalizer, parametric equalizer, exciter, shuffle, convolotron, valve, flanger, dual flange, chorus, musicaldelay, arpie, echo with reverse playback, musical delay, reverb, digital phaser, analogic phaser, synthfilter, varyband, ring, wah-wah, alien-wah, mutromojo, harmonizer, looper and four flexible distortion modules including sub-octave modulation and dirty octave up. Most of the effects engine is built from modules found in the excellent software synthesizer ZynAddSubFX. Presets and user interface are optimized for guitar, but Rakarrack processes signals in stereo while it does not apply internal band-limiting filtering, and thus is well suited to all musical instruments and vocals. Rakarrack is designed for Linux distributions with Jack Audio Connection Kit.


  • GNUitar is guitar effects software that allows you to use your PC as guitar processor. It includes the following effects: wah-wah, sustain, distortion, reverberator, echo, delay, tremolo, vibrato, and chorus/flanger.



  • CP-GFX is simply a Cross Platform Guitar Effect Processor. The aim of the project is to create an extensible and easy to use program which is easy to port to different platforms an operating systems. Currently in development are Linux x86 and Win32 builds.


  • RedFX - FX Processor (for guitar mainly) Effects: Noise Filter, Compressor, Wah, Distortion, Tremolo, Phaser, Flanger-Vibrato, Pitch Shifter, Delay, Reverb, EQ.


  • Ecamegapedal is real-time effects processor software with a graphical user interface for controlling the effect parameters. It is meant to be used as a virtual guitar-fx or studio effects box. In addition to real-time operation, it also supports reading from and writing to audio files. All audio object and effect plugin types provided by the Ecasound libraries are supported. This includes ALSA, JACK, OSS, aRts, over 20 file formats, over 30 effect types, LADSPA plugins, and multi-operator effect presets. The implementation is based on the Ecasound and Qt libraries.


  • - Guitar effects program with plug-in in support allowing you to play guitar through your computer. Includes custom plug-ins for distortion, gain, recording, piped audio input, pitch tuner, etc. Supports alsa, jack, lasdpa audio plug-ins, and jconv convolution engine for guitar cabinet simulation.

Various collections

  • TAP-plugins is short for Tom's Audio Processing plugins. It is a bunch of LADSPA plugins for digital audio processing, intended for use in a professional DAW environment such as Ardour. These plugins should compile and run on any recent (that is, not seriously outdated) GNU/Linux system. They don't require any special libraries besides the standard GNU C and math libraries, which are expected to be provided on the machine used for compiling.
  • CAPS is a collection of audio plugins comprising basic virtual guitar amplification and a small range of classic effects, signal processors and generators of mostly elementary and occasionally exotic nature. LADSPA.
  • LSP (Linux Studio Plugins) is a collection of open-source plugins currently compatible with LADSPA and LV2 formats. Phase Detector, Delay Compensator Mono, Delay Compensator Stereo, Delay Compensator X2 Stereo.
  • Infamous Plugins is a collection of open-source LV2 plugins. It hopefully helps fill some holes, supplying non-existing plugins for linux audio. There is little interest in creating ANOTHER compressor, or ANOTHER EQ when myriad other excellent lv2 versions of such already exist. At least until I become interested in making one of those things and feel I can do something different...
  • ArtyFX - a plugin bundle of artistic real-time audio effects. The aim of this plugin collection is to allow the designing of your sound just as you desired using a fast, efficient workflow.
  • Calf Studio Gear - available exclusively for LINUX-based operating systems and runs as a stand-alone effect rack connectable through Jack sound server or as plug-ins in every audio host that is able to fire up LV2 compilant devices, e.g. the highly recommended Ardour Audio Workstation. Play your SF2 sample banks, create filthy organs, fatten your sounds with phasers, delays, reverbs and other FX, process your recordings with gates, compressors, deesser and finally master your stuff with multiband dynamics - for free!

  • mda-vst - Windows VST, including Bandisto - Multi-band distortion. BeatBox - Drum replacer, Combo - Amp & speaker simulator, De-ess - High frequency dynamics processor, Degrade - Sample quality reduction, Delay - Simple stereo delay with feedback tone control, Detune - Simple up/down pitch shifting thickener, Dither - Range of dither types including noise shaping, DubDelay - Delay with feedback saturation and time/pitch modulation, Dynamics - Compressor / Limiter / Gate, Envelope - Envelope follower / VCA, Image - Stereo image adjustment and M-S matrix, Leslie - Rotary speaker simulator, Limiter - Opto-electronic style, limiter, Loudness - Equal loudness contours for bass EQ and mix correction , Multiband - Multi-band compressor with M-S processing modes, Overdrive - Soft distortion, Re-Psycho! - Drum loop pitch changer, RezFilter - Resonant filter with LFO and envelope follower, Round Panner - 3D panner, Shepard - Continuously rising/falling tone generator, Splitter - Frequency / level crossover for setting up dynamic processing, Stereo Simulator - Haas delay and comb filtering, Sub-Bass Synthesizer - Several low frequency enhancement methods, Talkbox - High resolution vocoder, TestTone - Signal generator with pink and white noise, impulses and sweeps, Thru-Zero Flanger - Classic tape-flanging simulation, Tracker - Pitch tracking oscillator, or pitch tracking EQ, Vocoder - Switchable 8 or 16 band vocoder, VocInput - Pitch tracking oscillator for generating vocoder carrier input
  • MDA-LV2 is an LV2 port of the MDA plugins by Paul Kellett. It contains 36 high-quality plugins for a variety of tasks. This is a more or less faithful port of both the effects and instrument plugins. The only functional difference in code is to support LV2-style toggle ports (> 0.0 is on, rather than 0.5). All the plugins have been tested, and thanks to several bug fixes this collection should be more reliable than the original.

  • SAFE Plug-ins (SAFE stands for Semantic Audio Feature Extraction) are a series of DAW plug-ins that allow the user to provide timbral descriptions of the audio they are processing. The plug-in then analyses the audio and saves the anonymous data to our server. This data is collected from all users and analysed to give a general synopsis of the types of sound that a given descriptor is used for. All this information can then be used to create a series of ‘semantic plug-in settings’. Users will be able to load plug-in settings by typing in descriptive words regarding the timbre of the sound being processed. The more people who upload descriptors to the server the more perceptually representative the downloaded plug-in settings will get.
  • x42-plugins - professional audio processing units available as LV2-plugins and JACK-applications
  • ReaPlugs VST FX Suite - Want to use some of the comprehensive FX plug-ins that REAPER provides, but stuck in another host? Haven't made the switch yet? Fear not -- you can download ReaPlugs, a package of FX that includes many of the plug-ins that come with REAPER, for free!

  • DISTRHO Mini Series - This collection currently includes: 3-Band EQ, 3-Band Splitter, Ping Pong Pan

  • Russolo Suite - a collection of LV2 plugins (and in the future, hopefully, VST) developed by Valerio Orlandini and named after the Futurist musician Luigi Russolo. For the moment, the attention is focused on the first part of this project: a sufficiently crazy synthesizer, called (what a surprise) Crazynth, and a do-it-all effect, called Omnifono.

  • Computer Music Toolkit (CMT) is a collection of LADSPA plugins for use with software synthesis and recording packages on Linux. See the license before use.

  • GVST - several free VST effects and instruments for Windows. For the main part they are designed to be simple, light-weight and efficient, although some are more ambitious and some more experimental. Effects; GBand - Band-pass filter. GChorus - Chorus effect. GClip - Wave-shaping signal clipper. GComp - Compressor. GComp2 - Compressor. GDelay - Delay effect. GDuckDly - Ducking delay effect. GFader - Signal gain (-100 to 0 dB). GGain - Signal gain (-12 to 12 dB). GGate - Gate. GGrain - Granular resynthesis. GHi - High-pass filter. GLow - Low-pass filter. GLFO - Triple LFO effect. GMax - Limiter. GMonoBass - Bass stereo imaging effect. GMulti - Multi-band compressor and stereo enhancer. GNormal - Noise generator for avoiding denormal problems. GRevDly - Reverse delay effect. GSnap - Pitch-correction. GTune - Chromatic tuner.

  • ELE - the Excellent Low-latency Effects
  • ExEf (Extreme Effect) is an extremely powerful and flexible Real Time effect engine running on a PC under LINUX. It is designed to work with guitars, microphones and other instruments. It can run both in X Window System and command line.

  • Creox is a real-time sound processor. You can plug your electric guitar or any other musical instrument directly to the PC's sound card and start experimenting with various sound effects. Creox has a nice user-friendly GUI, a preset support, a low-latency DSP engine and each effect parameter can be altered "on the fly".

  • Louderbox is a complete 8 band audio processor. Louderbox is intended to be used with software stereo and R[B]DS generators (but perfectly usable for other things (such as web "radio") using the jack audio connection kit under Linux (and possibly other systems but this is untested). LADSPA.

  • Mustajuuri - an audio signal processing application and toolkit. It is designed to meet wide range of needs. The first and foremost is real-time effects processing. Mustajuuri can process guitar, vocals or any instrument with ease. It is also useful if you have a virtual reality system with more than 10 loudspeakers and you wonder how to control them all :-)

  • BetabugsAudio :::plug-ins - here you will find the plug-ins that we have available for download. These will have download buttons beneath them. Any GUIs without a download button are currently in development and not presently available. All other completed GUIs that are currently in need of a caring and affectionate programmer are available for viewing on the "job ads" page.

Anarchy Effects is a cross-platform bundle consisting of 5 audio plugins, each of which does a different novel form of frequency domain processing. VST versions now comply to VST 2.4 standard and plugins in different formats are available for both Mac and PC. Features: Parameter automation using MIDI controllers or VST automation. Complies with VST 2.4 standard. 32-bit & 64-bit versions for Mac (VST/AU) & PC (VST).

  • SpectralAutopan – assigns different pan positions to the different component pitches in the input signal. The effect pitch has on pan position is controlled by control points, which can change in pan position and pitch according to LFOs. This adds stereo depth and motion to sounds.
  • Corkscrew – mixes together multiple pitch-shifts of the input signal, increasing or decreasing their pitches in parallel, and fading them in/out at the extremes of their range. This creates the illusion of a sound that seems to continually rise or fall, but doesn’t actually change in average pitch.
  • HarmonicAdder – creates harmonic resonances by pitch shifting the dominant frequencies in your input signal by the various intervals in the harmonic series (octave, octave+fifth, two octaves, two octaves+major third etc). These harmonics can be mixed with the dry input signal to make it more resonant, or used on their own as a new sound.
  • LengthSeparator – bisects the input signal according to the lengths of its component frequencies. Short sounds become the ‘transient’ part, long sounds become the ‘stable’ part. These parts can be isolated (ie the other part removed), or assigned different pan positions to create stereo movement.
  • Convoluter – applies a convolution matrix to the spectral representation of the input signal. This bends the sound along the continuum between pure sine tones and pure noise.
  • GeoSynth - a vst instrument made several years ago, but never released – I wasn’t as excited as I’d hoped about the sounds it made. But it’s here now so you can judge for yourself. There’s no doubting that it’s a great idea, whether the sounds light your candle or not.
  • SwarmSynth - a vst instrument which uses a flocking algorithm to control a bank of oscillators as they move through an envelope-constrained 5 dimensional parametric hyperspace.

  • Svep Phaser - Flanger - Chorus (VST + AU + AAX) - Svep is a stereo modulation filter effect suitable for any sound. All parameters are easily editable in one screen and the clean and responsive user interface encourages creativity. Tweak it to produce anything from old-school phasers to subtle choruses.
  • SyndtSphere - VST + AU. basically a sphere version of the polyphonic synthesizer Syndt. With a minimalistic approach, it features a unique experience of ”surfing” between presets. All parameters are morphed according to the proximity of the different presets. By rotating a sphere that consists of more than 70 professionally created states, anyone can dial in the perfect sound without having to deal with specific parameters. In addition to the sphere, a ping-pong delay and a few more global settings are available.

  • Tweakbench - free VST instruments and free VST effects


  • KVR Audio is a community and news site for popular Audio Plug-in formats and related subjects, such as sample libraries and mobile apps. Our mission is to supply up to date news to VST, AU, RTAS, DX and DSSI/LADSPA plug-in and iOS and Android App users in a friendly, up-front and timely manner.

Audio and MIDI looping

See also Sampling#Audio looping and MIDI#MIDI looping


  • Giada - a free, minimal, hardcore audio tool for DJs, live performers and electronic musicians. How does it work? Just pick up your channel, fill it with samples or MIDI events and start the show by using this tiny piece of software as a loop machine, drum machine, sequencer, live sampler or yet as a plugin/effect host. Giada aims to be a compact and portable virtual device for Linux, Mac OS X and Windows for production use and live sets.

What can you control with MIDI:

  • Global elements — sequencer, metronome, main volumes and so on, stored inside the configuration file and you set them once;
  • Per-channel elements — channel on/off, mute, volume, solo and so on, stored inside the patch and you set them whenever you create a new song.

No MIDI mod system, each binding is 'channel' specific ('channel' being the Giada term for a sample or sequence), which doesn't seem like it would scale well.

Algorithmic / generative

oof mess

  • - the technique of using algorithms to create music. Algorithms (or, at the very least, formal sets of rules) have been used to compose music for centuries; the procedures used to plot voice-leading in Western counterpoint, for example, can often be reduced to algorithmic determinacy. The term is usually reserved, however, for the use of formal procedures to make music without human intervention, either through the introduction of chance procedures or the use of computers. Some algorithms or data that have no immediate musical relevance are used by composers as creative inspiration for their music. Algorithms such as fractals, L-systems, statistical models, and even arbitrary data (e.g. census figures, GIS coordinates, or magnetic field measurements) have been used as source materials.

  • - field of study among musicians and computer scientists with a goal of producing successful pop music algorithmically. It is often based on the premise that pop music is especially formulaic, unchanging, and easy to compose. The idea of automating pop music composition is related to many ideas in algorithmic music, Artificial Intelligence (AI) and computational creativity.



  • BackupBand - a music auto-arranger. It has a virtual drummer, bassist, and rhythm guitarist. These 3 "musicians" follow your chord changes live (as you play some MIDI instrument, such as a keyboard) and they play along with you in perfect time. It's like having a live rhythm section backing you up. The rhythm section knows how to play in 60 different styles such as Rock, Disco, HipHop, Heavy Metal, Reggae, Swing, various latin styles, etc. You can also create your own styles for them to play. The bassist plays a rickenbacker, fender precision, synth, and double (acoustic) bass. The guitarist plays a les paul, steel string, and nylon string. The drummer plays 6 kits. You can also create your own multi-sampled guitars, basses, and kits for them to play.


  • Expresseur - Play any score, improvise over a chord-grid, invent new instruments. Even if you are not a musician. Load a MusicXML score. Expresseur compiles the notes of the score. You interpret the rythm with your feeling, interacting with the other musicians/singers. Select a list of chords. Play the Chords, the bass, the scales. Always in tune! Invent new instruments. Connect any sensor. Script new music logic.


  • MMA—Musical MIDI Accompaniment - an accompaniment generator. It creates MIDI tracks for a soloist to perform over from a user supplied file containing chords and MMA directives. MMA is very versatile and generates excellent tracks. It comes with an extensive user-extendable library with a variety of patterns for various popular rhythms, detailed user manuals, and several demo songs.
  • LinuxBand is a GUI front-end for MMA (Musical MIDI Accompaniment). Type in the chords, choose the groove and LinuxBand will play a musical accompaniment for you. It’s an open source alternative to Band-in-a-Box.
  • LeMMA is a simple GUI “front-end” written in Python for MMA (Musical MIDI Accompaniment – also written in Python). I wrote it so that I can easily churn out chord progressions. Just enter the chords, select the grooves and press “Play”. Should work for both Linux and Windows.


  • Ryth(M)aid + GUI - Little GUI - Jazz - Practice - Program, which plays a bass, drums and piano track based on a given set of chord changes and probabilities. Uses TSE3 for midi code and gtk+ for GUI.


  • SoundHelix - a free, versatile and flexible Java framework for composing and playing algorithmic random music based on constrained random generation (CRG). SoundHelix is an algorithmic random music generator (including a built-in MIDI sequencer) which can play generated songs on MIDI devices in real-time. It can also write the generated songs as MIDI files.



  • Machina is a MIDI sequencer based on Finite State Automata. A machine can be constructed manually from the user interface, recorded from MIDI input (free-form or step), or loaded from a MIDI file. The probability of arcs can be manipulated to build a machine that produces structured but constantly changing output. This way, Machina can be used as a generative recording tool that plays back patterns similar to, but not identical to, the original input.


  • Infno - an algorithmic generator of electronic dance music fully implemented in SuperCollider 3, the latest generative music work in a line of 'Infinite Length Pieces'. The program attempts to model the production of electropop and dance music styles with a closer union of parts than typical in many previous algorithmic composition systems. Voices (including a percussion section, bass, chord and lead lines) are not independently created: the parts can cross-influence each other based on underlying harmonic ideas, rhythmic templates and already generated lines. The eventual system posits a potential for any one part to influence or in the extreme force the recalculation of another, both from top-down and bottom-up information impacting on compositional preference. In particular, dynamic programming is used to choose melodic lines for counter melodies under cost constraints of register, harmonic template, existing voices and voice leading heuristics.



  • Impro-Visor - short for “Improvisation Advisor”, is a music notation program designed to help jazz musicians compose and hear solos similar to ones that might be improvised. The objective is to improve understanding of solo construction and tune chord changes. There are other, secondary, things it can do, such as improvise on its own. It has also been used for transcription. Because rhythm-section (e.g. piano, bass, drums) accompaniment is automatically generated from chords, Impro-Visor can be used as a play-along device. Now having a wider array of accompaniment styles, its use is not limited to jazz.


  • bassline - generates randomly wandering jazz-like basslines. It mostly plays one note per beat, but sometimes throws in a pair of swung quavers. It tends to keep moving either up or down, but sometimes it turns round and starts going the other way. It prefers intervals of one and two semitones, but from time to time it throws in larger intervals.

to sort

  • Tapis Bourgeois is a batch music generator for contrepoint and tonal music, written in PERL. It outputs a partially random midifile, constrained by the mesure-wise configuration file.
  • Cycle Plus One is a musical pattern generator that can be used to explore interesting sonic experiences. Using a fixed rhythmic profile, a steady eighth note pulse, you can experiment with tonality and density within a specified cycle of beats. The application can help composers create a matrix of tones that can be exported as MIDI or musical XML. From there, the material can be worked with further in a sequencing or music notation program. Cycle Plus One is meant to be a starting point for experimentation, allowing the composer to play with variation using a minimal amount of musical material.
  • GRAMophone II is an algorithmic generator of music composition. The music is generated using two kinds of formal grammar: Chomsky's regular grammar (or Type 3) for a TOP DOWN approach to the composition and a reduced version of Lindenmayer grammar for a BOTTOM UP approach.

  • Grammidity - a preliminary release of a reworking of a reworking of a program I wrote back in the mid 90's. This is a genetic programming system based on the idea of using a kind of grammar as the underlying gene. Grammars are stored in a raw format, then parsed into productions. Those productions are expanded. The resulting strings are then evaluated (usually in a kind of basic stack machine) and the evaluations rated and evolved.

  • Cycle Plus One - a musical pattern generator that can be used to explore interesting sonic experiences. Using a fixed rhythmic profile, a steady eighth note pulse, you can experiment with tonality and density within a specified cycle of beats. The application can help composers create a matrix of tones that can be exported as MIDI or musical XML. From there, the material can be worked with further in a sequencing or music notation program. Cycle Plus One is meant to be a starting point for experimentation, allowing the composer to play with variation using a minimal amount of musical material.

  • Random Phase Music Generator - small program that generates random phase music. Phasing is the process of looping the same pattern of music on two or more tape recorders (running at slightly different speeds), so they will slowly shift out of synchronization and produce the out of phase effect. Play with this program to experiment with this technique!

  • JChordBox - a library tool that can generate backing tracks from a chord progression and a music style (containing music templates or grooves). A music Style is describe using an XML and a MIDI file. You can generate an XML Style file from a MIDI file by adding markers to delimit grooves (or music templates). An XML Song file describes a chord progression and sets the music style to use. JChordBox comes with several command line tools (GenerateSong, CreateStyleFromMidiFile, SongPlayer …).

  • chasp - creates simple accompanying pieces of different genres. To accomplish this ASP is used to solve the problem of chord progressions, based on the rules proposed by the theory of harmony. This results into a harmonic sequence that eventually provides the basis for the creation of simple musical pieces by applying genre-specific templates, through an additional imperative control framework.

  • - MMG is a simple command-line tool that can generate musical patterns is based on "the Molecular Music Box" by Duncan Lockerby. The properties of the algorithm can easily be defined in a JSON file, which will then be rendered into a MIDI file, which can in turn be opened in DAW music software or be played back by synthesizers.

  • - The Mozart Digital Composer project is an attempt at a heuristics based approach, in place of traditional neural networks, for creating computer generated music. The basic principle is to use computer software, in this case a Java SE application, to compose and play, using a MIDI device, music that imitates, as closely as possible, music composed by a Human. The project is currently under active development and many planned features, such as the ability to select the type of scale (e.g major, minor, mixolydian, etc) and key, are in the works.

  • - a free Csound generative beat tool - part of a larger compositional environment I'm building. It will constantly create perfectly looped rhythms that can easily be dropped into a program like Ableton Live. Leave it running all night, and you'll wake up with hundreds of loops in your folder! It sounds similar to a Nord Drum or modular analog drums. It can also produce some 808-ish sounds.

  • Jnana - a generative musical accompaniment system integrated into Ableton Live. It has the ability to analyze MIDI input and generate new material in a similar style. It can analyze input in real-time or from desired clips within Ableton and can populate Ableton clips with new material.

  • Muzoti - a new approach to music composition based on a theory of human evolution. Compose with Muzoti and give your loved ones a completely unique album of classical music. [70]

  • QuasiMusic - turns quasiperiodic tilings of the plane into something like a MIDI version of a player-piano roll. This was inspired by a suggestion made by Akira Bergman in a discussion with John Baez. [71]

  • - computer accompaniment that learns in real-time from the human host pianist. When the host pianist stops playing for a given amount of time, the computer AI will then improvise in the space using the style learned from the host.

One-line algorithmic C

  • - a bytebeat synthesizer implemented on the Arduino. It's not the first bytebeat synthesizer on the Arduino, but I think it's the first that does real-time composite video visualizations of the signal, using the TVout library hacked to remove its audio output.

Pure Data

Max for Live

  • Jnana - Jnana Live plug-in will take live input from the Ableton track and integrate it into the analysis each time a phrase has “ended”. The end of a phrase is determined simply by noticing when the input stops for a given amount of time. When an input phrase has completed, the plug-in has the ability to auto-generate a “response” to this phrase based on all the phrases seen thus far.


  • - a lexicon of systems and research. This site provides a comprehensive research resource for computer aided algorithmic music composition, including over one-thousand research listings, over one hundred system listings, cross referenced links to research, links to software downloads and documentation, and web-based tools for searching and filtering the complete lexicon.

  • Procedural Audio Now! - is a monthly Queen Mary University of London meetup for people interested in developing Procedural Audio systems for video games and other interactive media.


  •ürfelspiel - was a system for using dice to randomly 'generate' music from precomposed options. These 'games' were quite popular throughout Western Europe in the 18th century. Several different games were devised, some that did not require dice, but merely 'choosing a random number.'

  • - employed in algorithmic music composition, particularly in software such as CSound, Max and SuperCollider. In a first-order chain, the states of the system become note or pitch values, and a probability vector for each note is constructed, completing a transition probability matrix (see below). An algorithm is constructed to produce output note values based on the transition matrix weightings, which could be MIDI note values, frequency (Hz), or any other desirable metric.

  • Computer Music Algorithms (4th Ed 2018) - explains algorithmic music and contains 57 programs, 20 styles, & 24 chapters that will generate music of different styles, new each time executed, as a midi file. You need only a 'c' compiler and a midi player to play the midi files generated. The 'styles' produce music from ancient Greek music to the algorithm of Kircher (1650) to all styles including the game music and the author's own unique fractal music and others of his. The algorithms are explained in technical terms of music theory, including the special data structures constructed. There are folders of c files for each chapter with code, that may be compiled and when executed, generate the midi files. The last program added here is the ancient Greek music program, which included two years of intense research. Also, my fractal music programs are a result of 20 years of development and improvement. Reverse music was nearly my dissertation topic, but it was solved in one of the chapters.

  • GERP - an attempt to generate stylistically valid EDM using human-informed machine-learning. We have employed experts (mainly Chris Anderson) to hand-transcribe 100 tracks in four genres: Breaks, House, Dubstep, and Drum and Bass. Aspects of transcription include musical details (drum beats, percussion parts, bass lines, melodic parts), timbral descriptions (i.e. “low synth kick, mid acoustic snare, tight noise closed hihat”), signal processing (i.e. the use of delay, reverb, compression and its alteration over time), and descriptions of overall musical form. This information is then compiled in a database, and machine analysed to produce data for generative purposes. Two different systems have been created to interpret this data: GESMI (created by Arne Eigenfeldt/loadbang) and GEDMAS (created by Chris Anderson/Pittr Patter). GEDMAS began producing EDM tracks in June 2012, while GESMI produced her first fully autonomous generation in March 2013. It is interesting to note the similarities of the systems (due to the shared corpus) and the differences (due to the different creative choices made in the implementation). Closed source?
  • Cellular automata and music - Take computers, mathematics, and the Java Sound API, add in some Java code, and you've got a recipe for creating some uniquely fascinating music. IBM Staff Software Engineer Paul Reiners demonstrates how to implement some basic concepts of algorithmic music composition in the Java language. He presents code examples and resulting MIDI files generated by the Automatous Monk program, which uses the open source jMusic framework to compose music based on mathematical structures called cellular automata.

  • MusicAlgorithms - interactive tools that provide a unique learning experience for users, regardless of their musical training. Students of music composition can explore algorithmic composition, while others can create musical representations of models for the purpose of aural interpretation and analysis. Here, the algorithmic process is used in a creative context so that users can convert sequences of numbers into sounds.


  • - the audio counterpart to evolutionary art, whereby algorithmic music is created using an evolutionary algorithm. The process begins with a population of individuals which by some means or other produce audio (e.g. a piece, melody, or loop), which is either initialized randomly or based on human-generated music. Then through the repeated application of computational steps analogous to biological selection, recombination and mutation the aim is for the produced audio to become more musical. Evolutionary sound synthesis is a related technique for generating sounds or synthesizer instruments. Evolutionary music is typically generated using an interactive evolutionary algorithm where the fitness function is the user or audience, as it is difficult to capture the aesthetic qualities of music computationally. However, research into automated measures of musical quality is also active. Evolutionary computation techniques have also been applied to harmonization and accompaniment tasks. The most commonly used evolutionary computation techniques are genetic algorithms and genetic programming.



  • - a programmable virtual drum machine based on algorhythmic and evolving beats. An 8-channel drum machine and beat generator supporting .wav sample playback and MIDI. It generates patterns up to 4 measures long in any of six time signatures and at virtually any BPM. It allows the user to mutate or regenerate any or all of the drum sounds in a pattern or to edit them by hand via the pattern editing window.

Neural net



Windows VST

  • Cube Breath - a standalone application for Windows created with SynthEdit and SAVIhost, but will also run as a VST effect. It is a realtime, fully automatic pop music generator. Audio input is vocoded in tune with the music, allowing the user to "instantly transform their shopping list, answering machine messages or office memos into floor-filling number one hit singles".

  • Breath Cube - employs the 'pop music generating' engine of Cube Breath and adds a 3-band synthetic voice to "challenge your listening skills". Breath Cube was created using SynthEdit and runs as a standalone with the included SAVIHost application or as a VST plug-in.


  • Mixtikl is a dedicated, integrated & powerful multi-platform generative music editor, mixer, arranger and cell sequencer. It includes many modifiable generative music templates that you can easily mix together. To generate its sounds it features the Partikl Sound Engine, a powerful sound source comprising a modular synth with Soundfont (SF2)/DLS support + live FX. - $
  • Noatikl 3 Noatikl, an immensely deep & powerful app for generative MIDI music composition AND sound design, Generative Music Composer → for iOS, Mac, Windows, VST/AU. - $



  • Boodler is an open-source soundscape tool -- continuous, infinitely varying streams of sound. Boodler is designed to run in the background on a computer, maintaining whatever sound environment you desire. Boodler is extensible, customizable, and modular. Each soundscape is a small piece of Python code -- typically less than a page. A soundscape can incorporate other soundscapes; it can combine other soundscapes, switch between them, fade them in and out. This package comes with many example soundscapes. You can use these, modify them, combine them to arbitrary levels of complexity, or write your own.


Random Parallel Player

  • - Takes a bunch of audio files as tracks and plays them back randomly creating new music each playthrough. The core rule of RPP: No human interaction once the playback has started. RPP is based on an idea of Louigi Verona. The included audio samples in example.rpp were created by him. You can read about the original project here


  • - a VSTi plugin which generates random atmospheric soundscapes with samples scraped from FreeSound. We had originally imagined that the plugin colud generate soundscapes resembling parks, public places, nature, etc. However, the resulting sounds that it makes are generally quite surreal and creepy, hence the name. :)


  • - a cross-platform desktop app which runs in menubar. Foco boosts your productivity by creating perfect productive environment. It has the best sounds for getting work done .


  • myNoise - background noises and relaxing soundscape generator, web/app


See also Synthesis#Graphics synthesis

  • - the use of non-speech audio to convey information or perceptualize data. Auditory perception has advantages in temporal, spatial, amplitude, and frequency resolution that open possibilities as an alternative or complement to visualization techniques. For example, the rate of clicking of a Geiger counter conveys the level of radiation in the immediate vicinity of the device.

  • CodeSounding - CodeSounding is an open source sonification framework which makes possible to hear how any existing Java program "sounds like", by assigning instruments and pitches to code statements (if, for, etc) and playing them as they are executed at runtime. In this way the flowing of execution is played as a flow of music and its rhythm changes depending on user interaction.

  • HyperMammut - transform sounds to images and vice-versa using single BIG Fourier Transforms (or DCT/DST,etc.).


  • Scalable Music: Automatic Music Retargeting and Synthesis - S. Wenner, J.C. Bazin, A. Sorkine-Hornung, C. Kim, M. Gross. In this paper we propose a method for dynamic rescaling of music, inspired by recent works on image retargeting, video reshuffling and character animation in the computer graphics community. Given the desired target length of a piece of music and optional additional constraints such as position and importance of certain parts, we build on concepts from seam carving, video textures and motion graphs and extend them to allow for a global optimization of jumps in an audio signal. Based on an automatic feature extraction and spectral clustering for segmentation, we employ length-constrained least-costly path search via dynamic programming to synthesize a novel piece of music that best fulfills all desired constraints, with imperceptible transitions between reshuffled parts. We show various applications of music retargeting such as part removal, decreasing or increasing music duration, and in particular consistent joint video and audio editing.

  • - a python library that aims to make it easy to create audio by piecing together bits of other audio files. This library was originally written to enable my research in audio editing user interfaces, but perhaps someone else might find it useful.


See also WebDev#Web Audio API, Drumming#Web



  • FMIT (Free Music Instrument Tuner) is a graphical utility for tuning your musical instruments, with error and volume history and advanced features.
  • LINGOT is a musical instrument tuner. It's accurate, easy to use, and highly configurable. Originally conceived to tune electric guitars, it can now be used to tune other instruments. It looks like an analogue tuner, with a gauge indicating the relative shift to a certain note, found automatically as the closest note to the estimated frequency.
  • - an oscilloscope style guitar tuner. It uses jack for audio input and opengl for rendering. The signal is displayed in both normal and XY mode, using an automatically selected not as the reference.



  • Audiogrep - transcribes audio files and then creates "audio supercuts" based on search phrases. It uses CMU Pocketsphinx for speech-to-text and pydub to stitch things together. [79]

  • Simon is an open source speech recognition program that can replace your mouse and keyboard. The system is designed to be as flexible as possible and will work with any language or dialect.

  • Pi-Voice - The beginnings of a Star Trek-like computer. Run the program, speak into your microphone and hear the response from your speakers.

  • - makes it easy for developers to build applications and devices that you can talk or text to. Our vision is to empower developers with an open and extensible natural language platform. learns human language from every interaction, and leverages the community: what’s learned is shared across developers.


to sort/categorise




  • Festival - or The Festival Speech Synthesis System, offers a general framework for building speech synthesis systems as well as including examples of various modules. As a whole it offers full text to speech through a number APIs: from shell level, though a Scheme command interpreter, as a C++ library, from Java, and an Emacs interface. Festival is multi-lingual (currently English (British and American), and Spanish) though English is the most advanced. Other groups release new languages for the system. And full tools and documentation for build new voices are available through Carnegie Mellon's FestVox project (


  • Festvox - aims to make the building of new synthetic voices more systemic and better documented, making it possible for anyone to build a new voice. Specifically we offer: Documentation, including scripts explaining the background and specifics for building new voices for speech synthesis in new and supported languages. Example speech databases to help building new voices. Links, demos and a repository for new voices. This work is firmly grounded within Edinburgh University's Festival Speech Synthesis System and Carnegie Mellon University's small footprint Flite synthesis engine.


  • MaryTTS is an open-source, multilingual Text-to-Speech Synthesis platform written in Java. It was originally developed as a collaborative project of DFKI’s Language Technology Lab and the Institute of Phonetics at Saarland University. It is now maintained by the Multimodal Speech Processing Group in the Cluster of Excellence MMCI and DFKI.


  • eSpeak is a compact open source software speech synthesizer for English and other languages, for Linux and Windows. eSpeak uses a "formant synthesis" method. This allows many languages to be provided in a small size. The speech is clear, and can be used at high speeds, but is not as natural or smooth as larger synthesizers which are based on human speech recordings.

OpenSource SpeechSynth


Assistive Context-Aware Toolkit


  • Praat - doing phonetics by computer


  • gnuspeech - makes it easy to produce high quality computer speech output, design new language databases, and create controlled speech stimuli for psychophysical experiments. gnuspeechsa is a cross-platform module of gnuspeech that allows command line, or application-based speech output. The software has been released as two tarballs that are available in the project Downloads area of [85]

Project Merlin



Adobe VoCo

VST Speek




  • IPOX - an experimental, all-prosodic speech synthesizer, developed many years ago by Arthur Dirksen and John Coleman. It is still available for downloading, and was designed to run on a 486 PC running Windows 3.1 or higher, with a 16-bit Windows-compatible sound card, such as the Soundblaster 16. It still seems to run on e.g. XP, but I haven't tried it on Vista.


Pink Trombone

  • Pink Trombone - Bare-handed procedural speech synthesis, version 1.1, March 2017, by Neil Thapen




  • ESPS - Entropic Signal Processing System, is a package of UNIX-like commands and programming libraries for speech signal processing. As a commercial product of Entropic Research Laboratory, Inc, it became extremely widely used in phonetics and speech technology research laboratories in the 1990's, in view of the wide range of functions it offered, such as get_f0 (for fundamental frequency estimation), formant (for formant frequency measurement), the xwaves graphical user interface, and many other commands and utilities. Following the acquisition of Entropic by Microsoft in 1999, Microsoft and AT&T licensed ESPS to the Centre for Speech Technology at KTH, Sweden, so that a final legacy version of the ESPS source code could continue to be made available to speech researchers. At KTH, code from the ESPS library (such as get_f0) was incorporated by Kåre Sjölander and Jonas Beskow into the Wavesurfer speech analysis tool. This is a very good alternative way to use many ESPS functions if you want a graphical user interface rather than scripting.

NICO toolkit

  • NICO toolkit - mainly intended for, and originally developed for speech recognition applications, a general purpose toolkit for constructing artificial neural networks and training with the back-propagation learning algorithm. The network topology is very flexible. Units are organized in groups and the group is a hierarchical structure, so groups can have sub-groups or other objects as members. This makes it easy to specify multi-layer networks with arbitrary connection structure and to build modular networks.

Speech Research Tools

  • - Software for speech research. It includes programs and libraries for signal processing, along with general purpose scientific libraries. Most of the code is in Python, with C/C++ supporting code. Also, contains code releases corresponding to publishe


  • Higgins Annotation Tool - can be used to transcribe and annotate speech with one or more audio tracks (such as dialogue). Windows.


See also Video

  • Xjadeo is a software video player that displays a video-clip in sync with an external time source (MTC, LTC, JACK-transport). Xjadeo is useful in soundtrack composition, video monitoring or any task that requires to synchronizing movie frames with external events.



  • The Box of No Return - a Linux-based musical synthesizer platform, suitable for live musicianship, designed to handle multiple patches with enormous demands, and switch between them with zero delay and zero cutout.  If you sit in your home studio and use single SoundFonts with a laptop and simple GUI, you don't need this.  If you play live, and pile on the tone generators and filters in patch development in order to feel and deliver the unyielding power of the musical harmonic roar, a full implementation of the BNR may suit you well.  There are obviously middle grounds too ☺, and there are articles here to help in general.


  • FRACT - a musical exploration game. You arrive in a forgotten place and explore the unfamiliar landscape to discover the secrets of an abandoned world that was once built on sound. As you start to make sense of this strange new environment, you work to rebuild its machinery by solving puzzles and bring the world back to life by shaping sound and creating music in the game.


  • Augment - an amazing way to listen to the world. It harmonizes your listening experience which helps you to be less distracted and stressed. The Augment app filters your acoustic environment, takes out harsh sounds and turns stressful noise into harmonic sound environments. Try it now, it's a free download on the app Store!