From Things and Stuff Wiki


  • Regular expression - a sequence of characters that defines a search pattern, mainly for use in pattern matching with strings, or string matching, i.e. "find and replace"-like operations. The concept arose in the 1950s, when the American mathematician Stephen Kleene formalized the description of a regular language, and came into common use with the Unix text processing utilities ed, an editor, and grep, a filter.

In modern usage, "regular expressions" are often distinguished from the derived, but fundamentally distinct concepts of regex or regexp, which no longer describe a regular language. See below for details.

Regexps are so useful in computing that the various systems to specify regexps have evolved to provide both a basic and extended standard for the grammar and syntax; modern regexps heavily augment the standard. Regexp processors are found in several search engines, search and replace dialogs of several word processors and text editors, and in the command lines of text processing utilities, such as sed and AWK.
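
The basic and extended standards mentioned above are the two POSIX grammars, usually abbreviated BRE and ERE. The following is a minimal sketch of the difference, using grep's default basic syntax and its -E extended mode; the sample words and variable names are made up for illustration.

```shell
# Sample input (hypothetical): one word per line.
words=$(printf '%s\n' color colour colouur)

# Basic syntax (BRE): interval braces must be backslash-escaped.
bre=$(printf '%s\n' "$words" | grep 'colou\{0,1\}r')

# Extended syntax (ERE): braces, ? and + work without escaping.
ere=$(printf '%s\n' "$words" | grep -E 'colou?r')

printf 'BRE matches:\n%s\n' "$bre"
printf 'ERE matches:\n%s\n' "$ere"
```

Both patterns express "an optional u", so they select the same two lines; only the notation differs.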

Many programming languages provide regexp capabilities, some built-in, for example Perl, JavaScript, Ruby, AWK, and Tcl, and others via a standard library, for example .NET languages, Java, Python, POSIX C and C++ (since C++11). Most other languages offer regexps via a library.



See also Languages#Perl

  • PCRE - Perl Compatible Regular Expressions - The PCRE library is a set of functions that implement regular expression pattern matching using the same syntax and semantics as Perl 5. PCRE has its own native API, as well as a set of wrapper functions that correspond to the POSIX regular expression API. The PCRE library is free, even for building proprietary software.


Web tools

  • RegExr is an online tool to learn, build, & test Regular Expressions (RegEx / RegExp). Results update in real-time as you type. Roll over a match or expression for details. Save & share expressions with others. Explore the Library for help & examples. Undo & Redo with Ctrl-Z / Y. Search for & rate Community patterns.

  • Debuggex - Online visual regex tester. JavaScript, Python, and PCRE.

  • Rubular - a Ruby regular expression editor
  • PCREck - a multi-dialect regular expression editor
  • txt2re - regular expression generator (perl php python java javascript coldfusion c c++ ruby vb vbscript j# c#)



  • grep - a command-line utility for searching plain-text data sets for lines matching a regular expression. Its name comes from the ed command g/re/p (globally search a regular expression and print), which has the same effect: doing a global search with the regular expression and printing all matching lines. Grep was originally developed for the Unix operating system, but is available today for all Unix-like systems.
grep "apple" *.txt
grep '^a.ple' oldbashimplementations.txt
  # lines that begin with the letter a, followed by any one character, followed by the letter sequence ple (quoted so the shell passes the pattern through unchanged)
grep "stuff" sqldump.sql | fold -w 200 | grep -C 1 "stuff"

The first grep gets the (mile-wide) line that has the match, then fold will split the mile-wide line into 200 char long lines, and "grep -C 1" will show only the one 200 char wide line where the match is + 1 line of context before and after. [9]
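
The pipeline can be tried without a real dump. This sketch builds a stand-in file (the 995-character line and the variable names are assumptions for illustration) and counts the lines the pipeline returns.

```shell
# Build a stand-in "dump": a single 995-character line with the word
# "stuff" in the middle (hypothetical data).
tmp=$(mktemp)
pad=$(printf '%0495d' 0 | tr 0 x)            # 495 filler characters
printf '%sstuff%s\n' "$pad" "$pad" > "$tmp"

# Same pipeline as above: find the line, fold it to 200 columns,
# then keep the folded piece with the match plus one piece either side.
result=$(grep "stuff" "$tmp" | fold -w 200 | grep -C 1 "stuff")

printf '%s\n' "$result" | wc -l
rm -f "$tmp"
```

One caveat: fold cuts at fixed columns, so a match that happens to straddle a 200-character boundary is split in two and the second grep will not find it.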





ag --hidden --ignore .git --ignore .winscp -l -g ""
  # lists all files



  • strings - a program in Unix-like operating systems that finds and prints text strings embedded in binary files such as executables. It can be used on object files and core dumps. Strings are recognized by looking for sequences of at least 4 (by default) printable characters terminating in a NUL character (that is, null-terminated strings). Some implementations provide options for determining what is recognized as a printable character, which is useful for finding non-ASCII and wide character text. Common usage includes piping its output to grep and fold or redirecting the output to a file.
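
A small sketch of the common "pipe to grep" usage, assuming the strings utility (typically shipped with binutils) is installed. The binary file is fabricated with printf, so its contents are hypothetical.

```shell
# Fabricate a tiny "binary": NUL-separated text mixed with control bytes.
tmp=$(mktemp)
printf 'abc\0hello world\0\001\002xy\0' > "$tmp"

# strings keeps runs of at least 4 printable characters by default,
# so "abc" and "xy" are dropped; grep then filters what is left.
found=$(strings "$tmp" | grep 'hello')

printf '%s\n' "$found"
rm -f "$tmp"
```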

Google Code Search


Search and replace

  • regexxer is a nifty GUI search/replace tool featuring Perl-style regular expressions

  • Hyperscan is a high-performance multiple regex matching library. It follows the regular expression syntax of the commonly-used libpcre library, yet functions as a standalone library with its own API written in C. Hyperscan uses hybrid automata techniques to allow simultaneous matching of large numbers (up to tens of thousands) of regular expressions, as well as matching of regular expressions across streams of data. [13]


echo "test string oldWord yadayada" | sed 's/oldWord/newWord/g'
sed -i 's/search/replace/' filename
sed -i 's#test#replace#' filename
  # in-place editing of a file, alternative separators
find . -name "*.html" -exec sed -i "s/oldWord/newWord/g" '{}' \;
  # replace text in multiple files [15]
echo '<a href="index.html"><img src="logo.svg" id="site-logo"></a>
          <h1>Site Title</h1>' | sed 'N; s@</a>\
          <h1>Site Title</h1>@\
          <h1>Site Title</h1></a>@g'
  # multiline replacement; the input is single-quoted so its inner double quotes survive
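
Alternative separators are mostly useful when the pattern or replacement itself contains slashes. This is a small sketch with a made-up path, writing to standard output rather than editing a file in place.

```shell
# Hypothetical input line containing a path.
line='root=/var/www/html'

# With / as the separator, every slash in the path must be escaped.
escaped=$(printf '%s\n' "$line" | sed 's/\/var\/www/\/srv\/web/')

# With # as the separator, the same substitution needs no escaping.
plain=$(printf '%s\n' "$line" | sed 's#/var/www#/srv/web#')

printf '%s\n' "$escaped" "$plain"
```

Both commands perform the same substitution; the second is simply easier to read.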


awk '{print $NF}' file
  # print the last field of each line
  # "GG      TC    CC" to "G G      T C       C C"
awk ' { gsub("GG","G G");gsub("TC","T C");gsub("CC","C C");print } ' file
  # [17]
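
The two awk idioms above can be checked on a single line of input. In this sketch the sample row and the variable names are assumptions for illustration.

```shell
# Hypothetical genotype row, whitespace-separated.
row='GG TC CC'

# $NF is the last field of the record.
last=$(printf '%s\n' "$row" | awk '{ print $NF }')

# gsub replaces every match in the record; here a space is inserted
# between the two letters of each pair.
spaced=$(printf '%s\n' "$row" | awk '{ gsub("GG","G G"); gsub("TC","T C"); gsub("CC","C C"); print }')

printf '%s\n%s\n' "$last" "$spaced"
```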



  • Bsed - Simple, English syntax on top of Perl text processing. Designed to replace simple uses of sed/grep/AWK/Perl. Bsed is a stream editor. In contrast to interactive text editors, stream editors process text in one go, applying a command to an entire input stream or open file. [22]



  • Regex Crossword is a crossword puzzle game, where the crossword clues are defined using regular expressions. [23] [24]