regex:cheat_sheet

Differences

This shows you the differences between two versions of the page.

regex:cheat_sheet [2020/08/25 19:30] 192.168.1.1
regex:cheat_sheet [2021/05/20 23:54] (current) peter
Line 1: Line 1:
 ====== Regex - Cheat Sheet ====== ====== Regex - Cheat Sheet ======
  
 +<code>
 +Cheat Sheet
 +Character classes
 +. any character except newline
 +\w \d \s word, digit, whitespace
 +\W \D \S not word, digit, whitespace
 +[abc] any of a, b, or c
 +[^abc] not a, b, or c
 +[a-g] character between a & g
 +Anchors
 +^abc$ start / end of the string
 +\b word boundary
 +Escaped characters
 +\. \* \\ escaped special characters
 +\t \n \r tab, linefeed, carriage return
 +\u00A9 unicode escaped ©
 +Groups & Lookaround
 +(abc) capture group
 +\1 backreference to group #1
 +(?:abc) non-capturing group
 +(?=abc) positive lookahead
 +(?!abc) negative lookahead
 +Quantifiers & Alternation
 +a* a+ a? 0 or more, 1 or more, 0 or 1
 +a{5} a{2,} exactly five, two or more
 +a{1,3} between one & three
 +a+? a{2,}? match as few as possible
 +ab|cd match ab or cd
 +</code>
 +
 +----
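
The entries in the added cheat sheet can be tried out directly. The following is a minimal sketch using Python's standard `re` module; the sample strings and variable names are illustrative assumptions, not part of the wiki page, and other engines may differ in details.

```python
import re

# Character classes: \d matches a digit, [^abc] anything except a, b, or c.
digits = re.findall(r"\d", "a1b22")                 # ['1', '2', '2']
not_abc = re.findall(r"[^abc]", "abcde")            # ['d', 'e']

# Anchors: ^ and $ pin the pattern to the start and end of the string.
whole = re.search(r"^abc$", "abc") is not None      # True
partial = re.search(r"^abc$", "xabc") is not None   # False

# \b only matches at a word boundary, so "concat" is skipped.
bounded = re.findall(r"\bcat\b", "cat concat cat.")  # ['cat', 'cat']

# Groups: \1 refers back to whatever group 1 captured.
doubled = re.search(r"(\w)\1", "hello").group()     # 'll'

# Greedy vs. lazy quantifiers on the same input.
greedy = re.search(r"a+", "aaaa").group()           # 'aaaa'
lazy = re.search(r"a+?", "aaaa").group()            # 'a'

# Alternation: match ab or cd.
either = re.findall(r"ab|cd", "abxcd")              # ['ab', 'cd']
```
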
  
 ===== Basic regex ===== ===== Basic regex =====
Line 26: Line 57:
 |\s|.NET, Python 3, JavaScript: "whitespace character": any Unicode separator|a\sb\sc|a b c| |\s|.NET, Python 3, JavaScript: "whitespace character": any Unicode separator|a\sb\sc|a b c|
 |\D|One character that is not a digit as defined by your engine's \d|\D\D\D|ABC| |\D|One character that is not a digit as defined by your engine's \d|\D\D\D|ABC|
-|\W|One character that is not a word character as defined by your engine's \w|\W\W\W\W\W|*-+=)| +|\W|One character that is not a word character as defined by your engine's \w|\W\W\W\W\W|<nowiki>*-+=)</nowiki>|
 |\S|One character that is not a whitespace character as defined by your engine's \s|\S\S\S\S|Yoyo| |\S|One character that is not a whitespace character as defined by your engine's \s|\S\S\S\S|Yoyo|
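
The rows in this hunk can be checked against their sample matches. This sketch assumes Python 3's `re` module, where `\d` on `str` patterns is Unicode-aware by default, so it also illustrates the "as defined by your engine's \d" caveat. The `rows` list and variable names are hypothetical.

```python
import re

# (pattern, sample match) pairs copied from the table rows above.
rows = [
    (r"a\sb\sc", "a b c"),
    (r"\D\D\D", "ABC"),
    (r"\W\W\W\W\W", "*-+=)"),
    (r"\S\S\S\S", "Yoyo"),
]
# True for every pattern whose sample is matched in full.
results = {pattern: re.fullmatch(pattern, sample) is not None
           for pattern, sample in rows}

# \D is the complement of whatever \d means in the current mode.
arabic_three = "\u0663"  # ARABIC-INDIC DIGIT THREE
unicode_mode = re.fullmatch(r"\D", arabic_three) is not None          # False: it is a digit
ascii_mode = re.fullmatch(r"\D", arabic_three, re.ASCII) is not None  # True: \d is [0-9] only
```
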
  
Line 202: Line 233:
 ^Lookaround^Legend^Example^Sample Match^ ^Lookaround^Legend^Example^Sample Match^
 |(?=…)|Positive lookahead|(?=\d{10})\d{5}|01234 in 0123456789| |(?=…)|Positive lookahead|(?=\d{10})\d{5}|01234 in 0123456789|
-|(?<=…)|Positive lookbehind|(?<=\d)cat|cat in 1cat| +|<nowiki>(?<=…)</nowiki>|Positive lookbehind|<nowiki>(?<=\d)cat</nowiki>|cat in 1cat|
 |(?!…)|Negative lookahead|(?!theatre)the\w+|theme| |(?!…)|Negative lookahead|(?!theatre)the\w+|theme|
 |(?<!…)|Negative lookbehind|\w{3}(?<!mon)ster|Munster| |(?<!…)|Negative lookbehind|\w{3}(?<!mon)ster|Munster|
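
The four lookaround rows can be reproduced with Python's `re` module, as sketched below. The helper `first_match` and the extra counter-example inputs are assumptions added for illustration. Lookarounds are zero-width: they test the surrounding text without adding it to the match.

```python
import re

def first_match(pattern, text):
    """Return the first matched substring, or None when nothing matches."""
    m = re.search(pattern, text)
    return m.group() if m else None

# Positive lookahead: ten digits must follow, but only five are consumed.
ahead = first_match(r"(?=\d{10})\d{5}", "0123456789")       # '01234'
too_short = first_match(r"(?=\d{10})\d{5}", "012345678")    # None (only nine digits)

# Positive lookbehind: "cat" must be preceded by a digit.
behind = first_match(r"(?<=\d)cat", "1cat")                 # 'cat'

# Negative lookahead: a "the..." word that is not "theatre".
not_ahead = first_match(r"(?!theatre)the\w+", "theme")      # 'theme'
rejected = first_match(r"(?!theatre)the\w+", "theatre")     # None

# Negative lookbehind: three word characters that are not "mon", then "ster".
not_behind = first_match(r"\w{3}(?<!mon)ster", "Munster")   # 'Munster'
blocked = first_match(r"\w{3}(?<!mon)ster", "monster")      # None
```
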
Line 211: Line 242:
  
 ^Class Operation^Legend^Example^Sample Match^ ^Class Operation^Legend^Example^Sample Match^
-|[…-[…]]|.NET: character class subtraction. One character that is in those on the left, but not in the subtracted class.|[a-z-[aeiou]]|Any lowercase consonant| +|<nowiki>[…-[…]]</nowiki>|.NET: character class subtraction. One character that is in those on the left, but not in the subtracted class.|<nowiki>[a-z-[aeiou]]</nowiki>|Any lowercase consonant| 
-|[…-[…]]|.NET: character class subtraction.|[\p{IsArabic}-[\D]]|An Arabic character that is not a non-digit, i.e., an Arabic digit| +|<nowiki>[…-[…]]</nowiki>|.NET: character class subtraction.|<nowiki>[\p{IsArabic}-[\D]]</nowiki>|An Arabic character that is not a non-digit, i.e., an Arabic digit| 
-|[…&&[…]]|Java, Ruby 2+: character class intersection. One character that is both in those on the left and in the && class.|[\S&&[\D]]|An non-whitespace character that is a non-digit.| +|<nowiki>[…&&[…]]</nowiki>|Java, Ruby 2+: character class intersection. One character that is both in those on the left and in the && class.|<nowiki>[\S&&[\D]]</nowiki>|A non-whitespace character that is a non-digit.| 
-|[…&&[…]]|Java, Ruby 2+: character class intersection.|[\S&&[\D]&&[^a-zA-Z]] An non-whitespace character that a non-digit and not a letter.| +|<nowiki>[…&&[…]]</nowiki>|Java, Ruby 2+: character class intersection.|<nowiki>[\S&&[\D]&&[^a-zA-Z]]</nowiki>|A non-whitespace character that is a non-digit and not a letter.| 
-|[…&&[^…]]|Java, Ruby 2+: character class subtraction is obtained by intersecting a class with a negated class|.[a-z&&[^aeiou]]|An English lowercase letter that is not a vowel.| +|<nowiki>[…&&[^…]]</nowiki>|Java, Ruby 2+: character class subtraction is obtained by intersecting a class with a negated class.|<nowiki>[a-z&&[^aeiou]]</nowiki>|An English lowercase letter that is not a vowel.| 
-|[…&&[^…]]|Java, Ruby 2+: character class subtraction|[\p{InArabic}&&[^\p{L}\p{N}]]|An Arabic character that is not a letter or a number|+|<nowiki>[…&&[^…]]</nowiki>|Java, Ruby 2+: character class subtraction|<nowiki>[\p{InArabic}&&[^\p{L}\p{N}]]</nowiki>|An Arabic character that is not a letter or a number| 
 + 
 +---- 
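
Python's standard `re` module supports neither the .NET subtraction syntax nor the Java/Ruby `&&` intersection syntax shown in the table above. The sketch below emulates both with a lookahead placed before an ordinary class. This is a workaround chosen for illustration, not something the table describes, and the variable names are hypothetical.

```python
import re

# Emulates .NET [a-z-[aeiou]] and Java [a-z&&[^aeiou]]:
# a lowercase ASCII letter, provided the next character is not a vowel.
consonant = re.compile(r"(?![aeiou])[a-z]")

# Emulates Java [\S&&[\D]]: the next character must be non-whitespace
# (lookahead) and is then consumed as a non-digit.
non_space_non_digit = re.compile(r"(?=\S)\D")

consonants = consonant.findall("regex")         # ['r', 'g', 'x']
symbols = non_space_non_digit.findall("a 1-")   # ['a', '-']
```

The lookahead tests the same single character that the class then consumes, which is why the two conditions behave like an intersection.
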
 + 
 +===== Other Syntax ===== 
 + 
 +^Syntax^Legend^Example^Sample Match^ 
 +|\K|Keep Out.  Perl, PCRE (C, PHP, R…), Python's alternate regex engine, Ruby 2+: drop everything that was matched so far from the overall match to be returned.|prefix\K\d+|12| 
 +|\Q…\E|Perl, PCRE (C, PHP, R…), Java: treat anything between the delimiters as a literal string. Useful to escape metacharacters.|\Q(C++ ?)\E|(C++ ?)| 
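
Python's built-in `re` module supports neither `\K` nor `\Q…\E`, but both effects can be approximated. The sketch below uses `re.escape` in place of `\Q…\E`, and a fixed-width lookbehind or a capture group in place of `\K`. These are substitute techniques, and the sample strings are assumptions.

```python
import re

# \Q(C++ ?)\E equivalent: escape the metacharacters, then match literally.
literal = "(C++ ?)"
pattern = re.escape(literal)
found = re.search(pattern, "written in (C++ ?) maybe").group()     # '(C++ ?)'

# prefix\K\d+ equivalent 1: a lookbehind keeps "prefix" out of the match.
# Python's re requires the lookbehind to have a fixed width.
via_lookbehind = re.search(r"(?<=prefix)\d+", "prefix12").group()  # '12'

# prefix\K\d+ equivalent 2: match everything, return only the captured group.
via_group = re.search(r"prefix(\d+)", "prefix12").group(1)         # '12'
```
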
 + 
regex/cheat_sheet.1598383811.txt.gz · Last modified: 2020/08/25 19:30 by 192.168.1.1
