Regex
General
- https://en.wikipedia.org/wiki/Regular_expression - a sequence of characters that define a search pattern, mainly for use in pattern matching with strings, or string matching, i.e. "find and replace"-like operations. The concept arose in the 1950s, when the American mathematician Stephen Kleene formalized the description of a regular language, and came into common use with the Unix text processing utilities ed, an editor, and grep, a filter.
In modern usage, "regular expressions" are often distinguished from the derived, but fundamentally distinct concepts of regex or regexp, which no longer describe a regular language.
Regexps are so useful in computing that the various systems to specify regexps have evolved to provide both a basic and extended standard for the grammar and syntax; modern regexps heavily augment the standard. Regexp processors are found in several search engines, search and replace dialogs of several word processors and text editors, and in the command lines of text processing utilities, such as sed and AWK.
Many programming languages provide regexp capabilities, some built-in, for example Perl, JavaScript, Ruby, AWK, and Tcl, and others via a standard library, for example .NET languages, Java, Python, POSIX C and C++ (since C++11). Most other languages offer regexps via a library.
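As a small illustration of the distinction drawn above between formal regular expressions and modern regexps: a backreference lets a pattern require that the same text appears twice, which no regular language can express. The sketch below uses POSIX basic syntax with grep; the sample words are made up for the example.

```shell
# \(..*\) captures one or more characters; \1 must repeat exactly that text.
# The set of "doubled" strings (some text followed by itself) is not a
# regular language, so this goes beyond regular expressions in the formal sense.
doubled=$(printf '%s\n' abab abcabc abca xyzzy | grep '^\(..*\)\1$')
printf '%s\n' "$doubled"
```

Only abab and abcabc consist of one string repeated twice, so only those two lines are printed.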
POSIX
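The basic (BRE) and extended (ERE) syntaxes mentioned above differ mainly in which operators need a backslash. A minimal sketch with grep, using invented sample lines:

```shell
sample=$(printf '%s\n' 'color' 'colour' 'colouur' 'aa' 'aaa')

# BRE (plain grep): interval braces are written with backslashes.
bre=$(printf '%s\n' "$sample" | grep '^a\{3\}$')

# ERE (grep -E): the same operator is written bare, and ? + | are available.
ere=$(printf '%s\n' "$sample" | grep -E '^a{3}$')
optional_u=$(printf '%s\n' "$sample" | grep -E '^colou?r$')

printf 'BRE: %s\nERE: %s\n' "$bre" "$ere"
printf '%s\n' "$optional_u"
```

Both interval forms select the same line (aaa), and the ERE pattern with ? selects color and colour but not colouur.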
PCRE
See also Languages#Perl
- PCRE - Perl Compatible Regular Expressions - The PCRE library is a set of functions that implement regular expression pattern matching using the same syntax and semantics as Perl 5. PCRE has its own native API, as well as a set of wrapper functions that correspond to the POSIX regular expression API. The PCRE library is free, even for building proprietary software.
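Perl-style syntax adds constructs that the POSIX syntaxes lack, such as \d and lookahead. The sketch below assumes GNU grep built with PCRE support (the -P option); other grep implementations may not provide it. The price list is invented for the example.

```shell
# \d+ matches a run of digits; (?= USD) is a lookahead that requires " USD"
# to follow without making it part of the match.
# -o prints only the matched text. -P needs GNU grep with PCRE support.
usd_amounts=$(printf '%s\n' '12 USD' '7 EUR' '305 USD' | grep -oP '\d+(?= USD)')
printf '%s\n' "$usd_amounts"
```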
Guides
- http://stackoverflow.com/questions/16621738/d-less-efficient-than-0-9
- http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags
Web tools
- RegExr is an online tool to learn, build, & test Regular Expressions (RegEx / RegExp). Results update in real-time as you type. Roll over a match or expression for details. Save & share expressions with others. Explore the Library for help & examples. Undo & Redo with Ctrl-Z / Y. Search for & rate Community patterns.
- Debuggex - Online visual regex tester. JavaScript, Python, and PCRE.
- Build RegEx - A Regular Expression GUI.
- RegexPal - a JavaScript regular expression tester
- Automatic Generation of Regular Expressions from Examples - with examples and generation from csv
- Rubular - a Ruby regular expression editor
- Regexper - JS
- PCREck - a multi-dialect regular expression editor
- Regular Expression Analyzer - An online regular expression tool that helps analyzing regular expression structure.
- txt2re - regular expression generator (perl php python java javascript coldfusion c c++ ruby vb vbscript j# c# c++.net vb.net)
Search
grep
- https://en.wikipedia.org/wiki/grep - a command-line utility for searching plain-text data sets for lines matching a regular expression. Its name comes from the ed command g/re/p (globally search a regular expression and print), which has the same effect: doing a global search with the regular expression and printing all matching lines. Grep was originally developed for the Unix operating system, but is available today for all Unix-like systems.
grep "apple" *.txt
grep '^a.ple' oldbashimplementations.txt # lines that begin with the letter a, followed by any one character, followed by the letter sequence ple
grep "stuff" sqldump.sql | fold -w 200 | grep -C 1 "stuff"
The first grep selects the (mile-wide) line that contains the match, fold then splits that line into 200-character lines, and "grep -C 1" shows only the 200-character line containing the match plus one line of context before and after. [9]
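To try the fold pipeline without a real dump, the sketch below builds a hypothetical one-line file and runs the same three commands on it. The file contents and variable names are assumptions made for the example; head -c is not in POSIX but is available in GNU and BSD userlands.

```shell
# Hypothetical stand-in for a dump with one very long line:
# 450 x's, the word "stuff", then 450 y's (905 characters in total).
dump=$(mktemp)
{
  head -c 450 /dev/zero | tr '\0' 'x'
  printf 'stuff'
  head -c 450 /dev/zero | tr '\0' 'y'
  printf '\n'
} > "$dump"

# Same pipeline as above: select the long line, cut it into 200-character
# pieces, then keep the piece containing the match plus one piece either side.
context=$(grep "stuff" "$dump" | fold -w 200 | grep -C 1 "stuff")

# Count the 200-character pieces that were kept.
printf '%s\n' "$context" | wc -l
rm -f "$dump"
```

Here the match sits in the third of five pieces, so three pieces are printed. One limitation of the approach: if the search term happens to straddle a fold boundary, the second grep will not find it.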
sgrep
- sgrep - search a file for a structured pattern
sift
ack
- ack - a tool like grep, designed for programmers with large heterogeneous trees of source code, ack is written purely in portable Perl 5 and takes advantage of the power of Perl's regular expressions.
ag
- The Silver Searcher - or Ag is a tool for searching code. It started off as a clone of Ack, but their feature sets have since diverged slightly. In typical usage, Ag is 5-10x faster than Ack.
ag --hidden --ignore .git --ignore .winscp -l -g "" # lists all files
ripgrep
strings
- https://en.wikipedia.org/wiki/strings_(Unix) - a program in Unix-like operating systems that finds and prints text strings embedded in binary files such as executables. It can be used on object files and core dumps. Strings are recognized by looking for sequences of at least 4 (by default) printable characters terminating in a NUL character (that is, null-terminated strings). Some implementations provide options for determining what is recognized as a printable character, which is useful for finding non-ASCII and wide character text. Common usage includes piping its output to grep and fold or redirecting the output to a file.
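The recognition rule described above (runs of at least 4 printable characters) can be approximated with tr and grep. This is only a sketch of the idea, not how strings is implemented: it ignores the NUL-terminator detail and any encoding options, and the sample bytes are invented.

```shell
# Hypothetical binary blob: short and long printable runs separated by
# NUL and other non-printable bytes.
blob=$(mktemp)
printf 'ab\0hello\0\001\002world!\0abc\0' > "$blob"

# Turn every run of non-printable bytes into a single newline, then keep
# only the runs that are at least 4 characters long.
found=$(LC_ALL=C tr -cs '[:print:]' '[\n*]' < "$blob" | grep '....')

printf '%s\n' "$found"
rm -f "$blob"
```

The runs ab and abc are shorter than 4 characters and are dropped, leaving hello and world!.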
Google Code Search
- https://github.com/google/codesearch - Fast, indexed regexp search over large file trees
qgrep
- https://github.com/zeux/qgrep - Fast regular expression grep for source code with incremental index updates
CUDA grep
- CUDA grep - We successfully created a parallel regular expression matcher using CUDA for GPUs. Our implementation is anywhere from 2x-10x faster than grep depending on the workload and about 68x faster than the perl regex engine. We think that this makes it a viable candidate for use in the real world. [13]
Search and replace
- regular expressions 101 — an online regex tester for javascript, php, pcre and python.
- My Regex Tester - PHP PCRE with search and replace
- REGex TESTER - ver. 1.5.3
- Regex Tester 2.0 alpha
- regexxer is a nifty GUI search/replace tool featuring Perl-style regular expressions
- Hyperscan is a high-performance multiple regex matching library. It follows the regular expression syntax of the commonly-used libpcre library, yet functions as a standalone library with its own API written in C. Hyperscan uses hybrid automata techniques to allow simultaneous matching of large numbers (up to tens of thousands) of regular expressions, as well as matching of regular expressions across streams of data. [14]
sed
echo "test string oldWord yadayada" | sed 's/oldWord/newWord/g'
sed -i 's/search/replace/' filename # in-place editing of a file
sed -i 's#test#replace#' filename # the same, with an alternative separator
find . -name "*.html" -exec sed -i "s/oldWord/newWord/g" '{}' \; # replace text in multiple files [16]
printf '%s\n' '<a href="index.html"><img src="logo.svg" id="site-logo"></a>' '<h1>Site Title</h1>' | sed 'N; s@</a>\n<h1>Site Title</h1>@\n<h1>Site Title</h1></a>@g' # multiline replacement (\n in the replacement text is GNU sed syntax)
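A bare -i, as used above, is GNU sed syntax; BSD sed expects a backup suffix as the argument to -i. Attaching the suffix directly (-i.bak) is accepted by both, and keeps a copy of the original. A minimal sketch on a throwaway file, with hypothetical file name and contents:

```shell
tmpdir=$(mktemp -d)
printf 'path=/usr/local/bin\n' > "$tmpdir/conf.txt"

# '#' as the separator avoids escaping the slashes inside the paths.
# -i.bak edits the file in place and saves the original as conf.txt.bak.
sed -i.bak 's#/usr/local/bin#/opt/bin#' "$tmpdir/conf.txt"

edited=$(cat "$tmpdir/conf.txt")
backup=$(cat "$tmpdir/conf.txt.bak")
printf 'edited: %s\nbackup: %s\n' "$edited" "$backup"
rm -rf "$tmpdir"
```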
awk
awk '{ print $NF }' file # print the last field of each line
awk '{ gsub("GG","G G"); gsub("TC","T C"); gsub("CC","C C"); print }' file # "GG TC CC" to "G G T C C C" [18]
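To see what the two awk one-liners above do on concrete input, the sketch below feeds them sample lines on standard input instead of a file. The input strings are made up for the example.

```shell
# $NF is the last field of the current line (NF is the number of fields).
last_field=$(printf 'alpha beta gamma\n' | awk '{ print $NF }')

# gsub replaces every occurrence of its first argument within the line;
# here each two-letter group is rewritten with a space in the middle.
spaced=$(printf 'GG TC CC\n' |
  awk '{ gsub("GG","G G"); gsub("TC","T C"); gsub("CC","C C"); print }')

printf '%s\n%s\n' "$last_field" "$spaced"
```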
- https://ia802309.us.archive.org/25/items/pdfy-MgN0H1joIoDVoIC7/The_AWK_Programming_Language.pdf [20]
- mawk – pattern scanning and text processing language - an interpreter for the AWK Programming Language.
- https://github.com/cup/lake - Portable standard library for Awk
- GoAWK - an AWK interpreter written in Go [22]
sd
- https://github.com/chmln/sd - an intuitive find & replace CLI.
bsed
- https://github.com/andrewbihl/bsed - Simple, English syntax on top of Perl text processing. Designed to replace simple uses of sed/grep/AWK/Perl. Bsed is a stream editor. In contrast to interactive text editors, stream editors process text in one go, applying a command to an entire input stream or open file. [23]
Library
- Regex Colorizer - JS library
Other
- Regex Crossword is a crossword puzzle game, where the crossword clues are defined using regular expressions. [24] [25]
- https://codegolf.stackexchange.com/questions/17718/meta-regex-golf
- http://nbviewer.ipython.org/url/norvig.com/ipython/xkcd1313.ipynb [26]
- http://regex.alf.nu/