Regular Expressions
Python's re library provides several functions for working with regular
expressions, which makes it a skill worth mastering.
This chapter will walk you through the important concepts of regular
expressions with Python. You will start by importing re, the Python library that
supports regular expressions. Then you will see how basic (ordinary) characters
are used for performing matches, followed by wildcard or special characters. Next,
you will learn about using repetitions in your regular expressions. You will also
learn how to create groups and named groups within your search for ease of
access to matches.
import re
pattern = r"Cookie"
sequence = "Cookie"
if re.match(pattern, sequence):
    print("Match!")
else:
    print("Not a match!")
Output
Match!
Most letters and other ordinary characters match themselves, as you saw in the
example.
The match() function returns a match object if the pattern matches at the
beginning of the text. Otherwise, it returns None.
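A minimal sketch of that behaviour, reusing the Cookie pattern. The variable names and sample strings are illustrative assumptions, not part of the original example. It also contrasts re.match(), which only looks at the start of the string, with re.search(), which scans the whole string.

```python
import re

pattern = r"Cookie"

# re.match() only succeeds if the pattern matches at the start of the string.
at_start = re.match(pattern, "Cookie jar")
not_at_start = re.match(pattern, "A Cookie")

# re.search() scans the whole string and returns the first match it finds.
anywhere = re.search(pattern, "A Cookie")

print(at_start.group())   # Cookie
print(not_at_start)       # None
print(anywhere.group())   # Cookie
```

Because a failed match returns None, calling .group() directly on the result is only safe when you know the pattern matches.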
. - A period. Matches any single character except the newline character.
import re
print(re.search(r'Co.k.e', 'Cookie').group())
Output
Cookie
[abc] - Matches a single character that is a, b, or c.
[a-zA-Z0-9] - Matches a single character in the range a to z, A to Z, or 0 to 9.
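The two character classes above can be tried out with a short sketch. The sample strings and variable names are assumptions chosen for illustration, and re.findall() is used here only to show every match at once.

```python
import re

# [abc] matches a single character that is a, b, or c.
single = re.search(r"[abc]", "xyzb").group()

# [a-zA-Z0-9] matches a single letter or digit, so it skips the leading "#".
alnum = re.search(r"[a-zA-Z0-9]", "#7up").group()

# findall() collects every non-overlapping match in the string.
all_abc = re.findall(r"[abc]", "cabbage")

print(single)   # b
print(alnum)    # 7
print(all_abc)  # ['c', 'a', 'b', 'b', 'a']
```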
Repetitions
Writing out long patterns character by character quickly becomes tedious.
Fortunately, the re module handles repetitions using the following special
characters:
+ - Checks if the preceding character appears one or more times starting from
that position.
import re
print(re.search(r'Py+thon', 'Pyyyython').group())
Output
Pyyyython
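* - Checks if the preceding character appears zero or more times starting from
that position.
import re
print(re.search(r'Ca*o*kie', 'Cookie').group())
Output
Cookie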
? - Checks if the preceding character appears exactly zero or one time starting
from that position.
import re
print(re.search(r'Pythoi?n', 'Python').group())
Output
Python
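To see how the repetition characters differ, the following sketch runs one pattern per quantifier against the same inputs. The sample strings and the results dictionary are assumptions for illustration. re.fullmatch() is used so that the whole string has to match the pattern.

```python
import re

samples = ["Pthon", "Python", "Pyyython"]
results = {}

for text in samples:
    plus = re.fullmatch(r"Py+thon", text) is not None  # one or more y
    star = re.fullmatch(r"Py*thon", text) is not None  # zero or more y
    opt = re.fullmatch(r"Py?thon", text) is not None   # zero or one y
    results[text] = (plus, star, opt)
    print(text, plus, star, opt)

# Pthon False True True
# Python True True True
# Pyyython True True False
```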
But what if you want to check for an exact number of repetitions, for example
when checking the validity of a phone number in an application? The re
module handles this gracefully as well, using the following quantifiers:
{x} - Repeat exactly x times.
{x,} - Repeat at least x times.
{x,y} - Repeat at least x times but no more than y times. Do not put a space
after the comma: Python's re treats {x, y} as literal text, not as a repetition.
import re
print(re.search(r'\d{9,10}', '0987654321').group())
Output
0987654321
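The phone number check mentioned earlier could be sketched as follows. The ten-digit format and the helper name is_valid_phone are assumptions for illustration; real phone number formats vary. re.fullmatch() is used because re.search() would accept any string that merely contains a long enough run of digits.

```python
import re

def is_valid_phone(number: str) -> bool:
    """Hypothetical helper: accept only strings of exactly ten ASCII digits."""
    return re.fullmatch(r"[0-9]{10}", number) is not None

print(is_valid_phone("0987654321"))   # True
print(is_valid_phone("098765432"))    # False, only nine digits
print(is_valid_phone("09876-4321"))   # False, contains a non-digit

# {x,} and {x,y} are greedy: they take as many repetitions as allowed.
at_least_three = re.search(r"[0-9]{3,}", "12 12345").group()
two_to_three = re.search(r"[0-9]{2,3}", "12345").group()

print(at_least_three)  # 12345
print(two_to_three)    # 123
```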