Python Regular Expressions - Picone Press
PYTHON REGULAR EXPRESSIONS
Copyright © tutorialspoint.com
A regular expression is a special sequence of characters that helps you match or find other strings or sets of strings, using a specialized syntax held in a pattern. Regular expressions are widely used in the UNIX world.
The module re provides full support for Perl-like regular expressions in Python. The re module raises the exception re.error if an error occurs while compiling or using a regular expression.
We will cover two important functions that are used to handle regular expressions. But a small thing first: various characters have a special meaning when they are used in a regular expression. To avoid any confusion while dealing with regular expressions, we will use raw strings, written as r'expression'.
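The two points above, raw strings and re.error, can be sketched as follows. This is a minimal illustration written for Python 3; the sample text and patterns are made up for the example and are not part of the original tutorial.

```python
import re

# In a normal string literal "\b" is a backspace character;
# in a raw string r"\b" stays as backslash + b, which re reads as a word boundary.
plain = "\bcat\b"
raw = r"\bcat\b"

text = "a cat sat"
plain_hit = re.search(plain, text)   # looks for backspace, "cat", backspace
raw_hit = re.search(raw, text)       # looks for the whole word "cat"
print(plain_hit)                     # None
print(raw_hit.group())               # cat

# An invalid pattern raises re.error when it is compiled.
try:
    re.compile(r"(unclosed")
    compile_failed = False
except re.error as exc:
    compile_failed = True
    print("Bad pattern:", exc)
```

Without the r prefix, the backslash sequences are interpreted by Python before the re module ever sees them, which is why raw strings are used for every pattern here.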
The match Function
This function attempts to match the RE pattern to string with optional flags.
Here is the syntax for this function:
re.match(pattern, string, flags=0)
Here is the description of the parameters:
Parameter    Description
pattern      This is the regular expression to be matched.
string       This is the string, which is searched to match the pattern at the beginning of the string.
flags        You can specify different flags using bitwise OR (|). These are modifiers, which are listed in the table below.
The re.match function returns a match object on success, None on failure. We use the group(num) or groups() method of the match object to get the matched expression.
Match Object Method    Description
group(num=0)           This method returns the entire match (or the specific subgroup num).
groups()               This method returns all matching subgroups in a tuple (empty if there weren't any).
Example:
#!/usr/bin/python3
import re

line = "Cats are smarter than dogs"

# Greedy (.*) captures everything before " are "; lazy (.*?) captures the next word.
matchObj = re.match(r'(.*) are (.*?) .*', line, re.M | re.I)

if matchObj:
    print("matchObj.group() :", matchObj.group())
    print("matchObj.group(1) :", matchObj.group(1))
    print("matchObj.group(2) :", matchObj.group(2))
else:
    print("No match!!")
When the above code is executed, it produces the following result:
matchObj.group() : Cats are smarter than dogs
matchObj.group(1) : Cats
matchObj.group(2) : smarter
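The groups() method listed in the table is not used in the example above, so here is a small sketch of it next to group(num). It is written for Python 3 and reuses the same sentence; the variable names are only illustrative.

```python
import re

line = "Cats are smarter than dogs"
m = re.match(r'(.*) are (.*?) .*', line, re.M | re.I)

if m:
    whole = m.group()     # same as m.group(0): the entire match
    first = m.group(1)    # text captured by the first pair of parentheses
    both = m.groups()     # every captured subgroup, as a tuple
    print(whole)          # Cats are smarter than dogs
    print(first)          # Cats
    print(both)           # ('Cats', 'smarter')

# A pattern without parentheses has no subgroups, so groups() is empty.
no_groups = re.match(r'Cats', line).groups()
print(no_groups)          # ()
```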
The search Function
This function searches for the first occurrence of the RE pattern within string with optional flags. Here is the syntax for this function:
re.search(pattern, string, flags=0)
Here is the description of the parameters:
Parameter    Description
pattern      This is the regular expression to be matched.
string       This is the string, which is searched to match the pattern anywhere in the string.
flags        You can specify different flags using bitwise OR (|). These are modifiers, which are listed in the table below.
The re.search function returns a match object on success, None on failure. We use the group(num) or groups() method of the match object to get the matched expression.
Match Object Method    Description
group(num=0)           This method returns the entire match (or the specific subgroup num).
groups()               This method returns all matching subgroups in a tuple (empty if there weren't any).
Example:
#!/usr/bin/python3
import re

line = "Cats are smarter than dogs"

# Use re.search here: it scans the whole string for the first match.
matchObj = re.search(r'(.*) are (.*?) .*', line, re.M | re.I)

if matchObj:
    print("matchObj.group() :", matchObj.group())
    print("matchObj.group(1) :", matchObj.group(1))
    print("matchObj.group(2) :", matchObj.group(2))
else:
    print("No match!!")
When the above code is executed, it produces the following result:
matchObj.group() : Cats are smarter than dogs
matchObj.group(1) : Cats
matchObj.group(2) : smarter
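To make the "first occurrence" behaviour more visible, here is a short sketch in Python 3 where the pattern could match in two places. The longer sentence and the patterns are invented for this illustration.

```python
import re

line = "Cats are smarter than dogs, and dogs are louder than cats"

# search scans forward and stops at the first place the pattern matches.
found = re.search(r'(\w+) than (\w+)', line)
if found:
    print(found.group())    # smarter than dogs
    print(found.groups())   # ('smarter', 'dogs')

# When nothing matches, search returns None, so test the result before using it.
missing = re.search(r'birds', line)
print(missing)              # None
```

The later text "louder than cats" also fits the pattern, but search never reaches it because the earlier match wins.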
Matching vs Searching:
Python offers two different primitive operations based on regular expressions: match checks for a match only at the beginning of the string, while search checks for a match anywhere in the string (this is what Perl does by default).
Example:
#!/usr/bin/python3
import re

line = "Cats are smarter than dogs"

# match only succeeds if the pattern matches at the start of the string
matchObj = re.match(r'dogs', line, re.M | re.I)
if matchObj:
    print("match --> matchObj.group() :", matchObj.group())
else:
    print("No match!!")

# search scans the whole string for the first match
matchObj = re.search(r'dogs', line, re.M | re.I)
if matchObj:
    print("search --> matchObj.group() :", matchObj.group())
else:
    print("No match!!")
When the above code is executed, it produces the following result:
No match!!
search --> matchObj.group() : dogs
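One detail that is easy to miss: the re.M flag does not change where match looks. A minimal Python 3 sketch, using a made-up two-line string:

```python
import re

text = "first line\ndogs on the second line"

# Even with re.M, match only tries position 0 of the whole string.
at_start = re.match(r'dogs', text, re.M)
print(at_start)                 # None

# search with ^ and re.M can match at the start of any line.
line_start = re.search(r'^dogs', text, re.M)
print(line_start.group())       # dogs
```

So to find a pattern at the beginning of any line, combine search with ^ and re.M rather than relying on match.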
Search and Replace:
One of the most important re methods that use regular expressions is sub.
Syntax:
re.sub(pattern, repl, string, count=0, flags=0)
This method replaces occurrences of the RE pattern in string with repl. All occurrences are replaced unless count is given, in which case at most count replacements are made. This method returns the modified string.
Example:
#!/usr/bin/python3
import re

phone = "2004-959-559 # This is Phone Number"

# Delete Python-style comments
num = re.sub(r'#.*$', "", phone)
print("Phone Num :", num)

# Remove anything other than digits
num = re.sub(r'\D', "", phone)
print("Phone Num :", num)
When the above code is executed, it produces the following result:
Phone Num : 2004-959-559
Phone Num : 2004959559
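The example above always replaces every occurrence. In the current re module the optional limit on the number of replacements is the keyword argument count; the sketch below (Python 3, with a shortened phone string chosen for the illustration) shows the difference.

```python
import re

phone = "2004-959-559"

# Default count=0 replaces every occurrence.
all_replaced = re.sub(r'-', "", phone)

# count=1 replaces only the first occurrence.
first_only = re.sub(r'-', "", phone, count=1)

print(all_replaced)   # 2004959559
print(first_only)     # 2004959-559
```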
Regular-expression Modifiers - Option Flags
Regular expressions may include an optional modifier to control various aspects of matching. The modifiers are specified as an optional flag. You can provide multiple modifiers by combining them with bitwise OR (|), as shown previously. Each modifier is one of these:
Modifier    Description
re.I        Performs case-insensitive matching.
re.L        Interprets words according to the current locale. This interpretation affects the alphabetic group (\w and \W), as well as word boundary behavior (\b and \B).
re.M        Makes $ match the end of a line (not just the end of the string) and makes ^ match the start of any line (not just the start of the string).
re.S        Makes a period (dot) match any character, including a newline.
re.U        Interprets letters according to the Unicode character set. This flag affects the behavior of \w, \W, \b, \B.
re.X        Permits "cuter" regular expression syntax. It ignores whitespace (except inside a set [] or when escaped by a backslash) and treats unescaped # as a comment marker.
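A quick sketch of four of these flags in action, written for Python 3. The sample text is invented, and re.findall (which returns all matches as a list) is used only to keep the example short; it is not covered in the text above.

```python
import re

text = "Python\npython rocks"

# re.I: ignore case, so both spellings are found.
case_insensitive = re.findall(r'python', text, re.I)

# re.M combined with re.I: ^ matches at the start of each line.
line_starts = re.findall(r'^p\w+', text, re.M | re.I)

# re.S: the dot also matches the newline, so the match can span both lines.
without_s = re.search(r'Python.python', text)
with_s = re.search(r'Python.python', text, re.S)

# re.X: whitespace and # comments inside the pattern are ignored.
verbose = re.compile(r'''
    (\d{4})   # first block of digits
    -         # literal dash
    (\d{3})   # second block of digits
''', re.X)
blocks = verbose.match("2004-959-559").groups()

print(case_insensitive)   # ['Python', 'python']
print(line_starts)        # ['Python', 'python']
print(without_s)          # None
print(blocks)             # ('2004', '959')
```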
Regular-expression patterns:
Except for the control characters (+ ? . * ^ $ ( ) [ ] { } | \), all characters match themselves. You can escape a control character by preceding it with a backslash.
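Escaping can be sketched like this in Python 3. The price string is a made-up example, and re.escape is a helper from the re module that is not mentioned in the text above.

```python
import re

price = "Total: $5.00 (approx.)"

# Unescaped, "$" anchors the end of the string and "." matches any character,
# so this pattern cannot find the literal text "$5.00".
unescaped = re.search(r'$5.00', price)

# Escaping each special character with a backslash makes it match itself.
escaped = re.search(r'\$5\.00', price)

# re.escape builds the escaped pattern from a literal string.
auto = re.search(re.escape("$5.00"), price)

print(unescaped)          # None
print(escaped.group())    # $5.00
print(auto.group())       # $5.00
```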
The following table lists the regular expression syntax that is available in Python:
Pattern         Description
^               Matches beginning of line.
$               Matches end of line.
.               Matches any single character except newline. Using the re.S option allows it to match newline as well.
[...]           Matches any single character in brackets.
[^...]          Matches any single character not in brackets.
re*             Matches 0 or more occurrences of preceding expression.
re+             Matches 1 or more occurrences of preceding expression.
re?             Matches 0 or 1 occurrence of preceding expression.
re{n}           Matches exactly n occurrences of preceding expression.
re{n,}          Matches n or more occurrences of preceding expression.
re{n,m}         Matches at least n and at most m occurrences of preceding expression.
a|b             Matches either a or b.
(re)            Groups regular expressions and remembers matched text.
(?imx)          Temporarily toggles on i, m, or x options within a regular expression. If in parentheses, only that area is affected.
(?-imx)         Temporarily toggles off i, m, or x options within a regular expression. If in parentheses, only that area is affected.
(?: re)         Groups regular expressions without remembering matched text.
(?imx: re)      Temporarily toggles on i, m, or x options within parentheses.
(?-imx: re)     Temporarily toggles off i, m, or x options within parentheses.
(?#...)         Comment.
(?= re)         Specifies position using a pattern. Doesn't have a range.
(?! re)         Specifies position using pattern negation. Doesn't have a range.
(?> re)         Matches independent pattern without backtracking.
\w              Matches word characters.
\W              Matches nonword characters.
\s              Matches whitespace. Equivalent to [ \t\n\r\f\v].
\S              Matches nonwhitespace.
\d              Matches digits. Equivalent to [0-9].
\D              Matches nondigits.
\A              Matches beginning of string.
\Z              Matches end of string. (In Perl, \Z also matches just before a final newline; in Python's re module it matches only at the very end.)
\z              Matches end of string.
\G              Matches point where last match finished. (Perl syntax; not supported by Python's re module.)
\b              Matches word boundaries when outside brackets. Matches backspace (0x08) when inside brackets.
\B              Matches nonword boundaries.
\n, \t, etc.    Matches newlines, carriage returns, tabs, etc.
\1...\9         Matches nth grouped subexpression.
\10             Matches nth grouped subexpression if it matched already. Otherwise refers to the octal representation of a character code.
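A few rows of the table can be tried out directly. The following is a minimal Python 3 sketch; the sample strings are invented, and each line picks one pattern from the table.

```python
import re

# re{n,m} with \d: a block of 3 or 4 digits.
digits = re.search(r'\d{3,4}', "tel 2004-959").group()

# (?: re) groups without capturing, so only the second group is remembered.
noncap = re.search(r'(?:Mr|Ms)\. (\w+)', "Ms. Lovelace").groups()

# (?= re) checks what follows without consuming it.
lookahead = re.search(r'\w+(?=ing\b)', "keep matching").group()

# \1 refers back to the text of the first group: this finds a doubled word.
doubled = re.search(r'\b(\w+) \1\b', "it is is fine").group()

print(digits)      # 2004
print(noncap)      # ('Lovelace',)
print(lookahead)   # match
print(doubled)     # is is
```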
REGULAR-EXPRESSION EXAMPLES
Literal characters:
Example    Description
python     Match "python".
Character classes:
................