Accessing Values in Strings: 'Hello World!' "Python Programming"


Strings are among the most popular types in Python. We can create them simply by enclosing
characters in quotes. Python treats single quotes the same as double quotes. Creating strings is
as simple as assigning a value to a variable. For example −
var1 = 'Hello World!'
var2 = "Python Programming"

Accessing Values in Strings


Python does not support a character type; a single character is simply a string of length one, and
is therefore also considered a substring.
To access substrings, use the square brackets for slicing along with the index or indices to
obtain your substring. For example –

#!/usr/bin/python3

var1 = 'Hello World!'
var2 = "Python Programming"

print("var1[0]:", var1[0])
print("var2[1:5]:", var2[1:5])
When the above code is executed, it produces the following result −
var1[0]: H
var2[1:5]: ytho

Updating Strings
You can "update" an existing string by (re)assigning a variable to another string. The new value
can be related to its previous value or to a completely different string altogether. For example –

#!/usr/bin/python3

var1 = 'Hello World!'

print("Updated String :-", var1[:6] + 'Python')
When the above code is executed, it produces the following result −
Updated String :- Hello Python
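
The quotation marks around "update" matter: a string cannot be changed in place. The following is a minimal sketch of that behaviour; the variable names are chosen only for illustration.

```python
greeting = 'Hello World!'

# Strings are immutable: item assignment raises TypeError.
try:
    greeting[0] = 'J'
    assignment_failed = False
except TypeError:
    assignment_failed = True

# "Updating" therefore means building a new string and rebinding a name.
updated = greeting[:6] + 'Python'

print(assignment_failed)
print(updated)
print(greeting)   # the original string is unchanged
```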

String Special Operators


Assume string variable a holds 'Hello' and variable b holds 'Python', then −

Operator, Description & Example

+
Concatenation - Adds values on either side of the operator
a + b will give HelloPython

*
Repetition - Creates new strings, concatenating multiple copies of the same string
a*2 will give HelloHello

[]
Slice - Gives the character from the given index
a[1] will give e

[:]
Range Slice - Gives the characters from the given range
a[1:4] will give ell

in
Membership - Returns True if a character exists in the given string
'H' in a will give True

not in
Membership - Returns True if a character does not exist in the given string
'M' not in a will give True

r/R
Raw String - Suppresses the actual meaning of escape characters. The syntax for raw
strings is exactly the same as for normal strings with the exception of the raw
string operator, the letter "r", which precedes the quotation marks. The "r" can be
lowercase (r) or uppercase (R) and must be placed immediately preceding the
first quote mark.
print(r'\n') prints \n and print(R'\n') prints \n

%
Format - Performs string formatting
See the next section
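
As a quick check of the operators above, here is a short sketch using the same values for a and b. The other variable names are only illustrative.

```python
a = 'Hello'
b = 'Python'

concatenated = a + b      # concatenation
repeated = a * 2          # repetition
single_char = a[1]        # indexing
char_range = a[1:4]       # range slice
has_h = 'H' in a          # membership gives a bool
lacks_m = 'M' not in a
raw = r'\n'               # raw string: a backslash followed by the letter n

print(concatenated, repeated, single_char, char_range)
print(has_h, lacks_m)
print(raw, len(raw))
```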

String Formatting Operator


One of Python's coolest features is the string format operator %. This operator is unique to
strings and makes up for the lack of functions from C's printf() family. Following is a
simple example −
#!/usr/bin/python3

print("My name is %s and weight is %d kg!" % ('KCP', 21))

When the above code is executed, it produces the following result −
My name is KCP and weight is 21 kg!
Here is the complete list of symbols that can be used along with % −

Sr.No. Format Symbol & Conversion

1
%c
character

2
%s
string conversion via str() prior to formatting

3
%i
signed decimal integer

4
%d
signed decimal integer

5
%u
unsigned decimal integer (obsolete; identical to %d)

6
%o
octal integer

7
%x
hexadecimal integer (lowercase letters)

8
%X
hexadecimal integer (UPPERcase letters)

9
%e
exponential notation (with lowercase 'e')

10
%E
exponential notation (with UPPERcase 'E')

11
%f
floating point real number

12
%g
the shorter of %f and %e

13
%G
the shorter of %f and %E

Other supported symbols and functionality are listed in the following table −

Sr.No. Symbol & Functionality

1
*
argument specifies width or precision

2
-
left justification

3
+
display the sign

4
<sp>
leave a blank space before a positive number

5
#
add the octal prefix '0o' or the hexadecimal prefix '0x' or '0X', depending on whether 'o', 'x' or 'X' was
used.

6
0
pad from left with zeros (instead of spaces)

7
%
'%%' leaves you with a single literal '%'

8
(var)
mapping variable (dictionary arguments)

9
m.n
m is the minimum total width and n is the number of digits to display after the decimal point (if applicable)
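
The flags in this table are easier to follow with concrete values. The sketch below exercises each of them once; the variable names and sample values are illustrative only.

```python
left = '[%-6s]' % 'ab'        # '-' left-justifies within a width of 6
signed = '%+d' % 21           # '+' always displays the sign
spaced = '% d' % 21           # ' ' leaves a blank before a positive number
hexed = '%#x' % 255           # '#' adds the 0x prefix
octal = '%#o' % 8             # '#' adds the 0o prefix
padded = '%05d' % 42          # '0' pads from the left with zeros
star = '%*d' % (5, 42)        # '*' takes the width from the arguments
percent = '%d%%' % 90         # '%%' gives a literal percent sign
mapped = '%(name)s weighs %(kg)d kg' % {'name': 'KCP', 'kg': 21}
fixed = '%8.2f' % 3.14159     # m.n: width 8, two digits after the point

for value in (left, signed, spaced, hexed, octal, padded, star, percent, mapped, fixed):
    print(repr(value))
```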

Triple Quotes
Python's triple quotes come to the rescue by allowing strings to span multiple lines, including
verbatim NEWLINEs, TABs, and any other special characters.
The syntax for triple quotes consists of three consecutive single or double quotes.

#!/usr/bin/python3

para_str = """this is a long string that is made up of
several lines and non-printable characters such as
TAB ( \t ) and they will show up that way when displayed.
NEWLINEs within the string, whether explicitly given like
this within the brackets [ \n ], or just a NEWLINE within
the variable assignment will also show up.
"""
print(para_str)
When the above code is executed, it produces the following result. Note how every single
special character has been converted to its printed form, right down to the last NEWLINE at the
end of the string between the "up." and closing triple quotes. Also note that NEWLINEs occur
either with an explicit carriage return at the end of a line or its escape code (\n) −
this is a long string that is made up of
several lines and non-printable characters such as
TAB ( ) and they will show up that way when displayed.
NEWLINEs within the string, whether explicitly given like
this within the brackets [
], or just a NEWLINE within
the variable assignment will also show up.
Raw strings do not treat the backslash as a special character at all. Every character you put into
a raw string stays the way you wrote it –
#!/usr/bin/python3

print('C:\\nowhere')
When the above code is executed, it produces the following result −
C:\nowhere
Now let's make use of a raw string. We put the expression in r'expression' as follows –

#!/usr/bin/python3

print(r'C:\\nowhere')
When the above code is executed, it produces the following result −
C:\\nowhere

Unicode String
In Python 3, all strings are represented in Unicode. In Python 2, strings are stored internally as 8-bit
ASCII, so a 'u' prefix was required to make a string Unicode. The prefix is no longer necessary.
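
A small sketch of what this means in practice, assuming UTF-8 as the encoding. The sample word and variable names are illustrative.

```python
text = 'caf\u00e9'                  # a str holds Unicode code points
data = text.encode('utf-8')         # encoding produces a bytes object
restored = data.decode('utf-8')     # decoding the bytes gives back the str

print(len(text), len(data))         # the accented letter needs two bytes in UTF-8
print(restored == text)
print(u'caf\u00e9' == text)         # the u prefix is accepted but redundant
```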
Built-in String Methods
Python includes the following built-in methods to manipulate strings −

Sr.No. Methods & Description

1 capitalize()

Capitalizes first letter of string

2 center(width, fillchar)

Returns a string padded with fillchar with the original string centered to a total of width columns.

3 count(str, beg = 0,end = len(string))

Counts how many times str occurs in string or in a substring of string if starting index beg and ending
index end are given.

4 decode(encoding = 'UTF-8',errors = 'strict')

Decodes a bytes object using the codec registered for encoding and returns a string. In Python 3 this is a
method of bytes, not of str. encoding defaults to 'UTF-8'.

5 encode(encoding = 'UTF-8',errors = 'strict')

Returns the string encoded as a bytes object; on error, the default is to raise a UnicodeError (a subclass of
ValueError) unless errors is given with 'ignore' or 'replace'.

6 endswith(suffix, beg = 0, end = len(string))

Determines if string or a substring of string (if starting index beg and ending index end are given) ends
with suffix; returns true if so and false otherwise.

7 expandtabs(tabsize = 8)

Expands tabs in string to multiple spaces; defaults to 8 spaces per tab if tabsize not provided.

8 find(str, beg = 0, end = len(string))

Determines if str occurs in string, or in a substring of string if starting index beg and ending index end are
given; returns the index if found and -1 otherwise.

9 index(str, beg = 0, end = len(string))

Same as find(), but raises an exception if str not found.

10 isalnum()

Returns true if string has at least 1 character and all characters are alphanumeric and false otherwise.

11 isalpha()

Returns true if string has at least 1 character and all characters are alphabetic and false otherwise.

12 isdigit()

Returns true if string contains only digits and false otherwise.

13 islower()

Returns true if string has at least 1 cased character and all cased characters are in lowercase and false
otherwise.

14 isnumeric()

Returns true if string contains only numeric characters and false otherwise.

15 isspace()

Returns true if string contains only whitespace characters and false otherwise.

16 istitle()

Returns true if string is properly "titlecased" and false otherwise.

17 isupper()

Returns true if string has at least one cased character and all cased characters are in uppercase and false
otherwise.

18 join(seq)

Merges (concatenates) the strings in sequence seq into a single string, with string as the separator. Every
element of seq must itself be a string.

19 len(string)

Returns the length of the string. Note that len() is a built-in function, not a string method.

20 ljust(width[, fillchar])

Returns a space-padded string with the original string left-justified to a total of width columns.

21 lower()

Converts all uppercase letters in string to lowercase.

22 lstrip()

Removes all leading whitespace in string.

23 maketrans()

Returns a translation table to be used in translate function.

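24 max(str)

Returns the max alphabetical character from the string str. max() is a built-in function, not a string method.

25 min(str)

Returns the min alphabetical character from the string str. min() is a built-in function, not a string method.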

26 replace(old, new [, max])


Replaces all occurrences of old in string with new or at most max occurrences if max given.

27 rfind(str, beg = 0,end = len(string))

Same as find(), but search backwards in string.

28 rindex( str, beg = 0, end = len(string))

Same as index(), but search backwards in string.

29 rjust(width[, fillchar])

Returns a space-padded string with the original string right-justified to a total of width columns.

30 rstrip()

Removes all trailing whitespace of string.

31 split(sep=None, maxsplit=-1)

Splits string according to delimiter sep (runs of whitespace if not provided) and returns list of substrings;
performs at most maxsplit splits if given, so the list has at most maxsplit + 1 elements.

32 splitlines(keepends=False)

Splits string at all line boundaries and returns a list of each line, with NEWLINEs removed unless keepends is true.

33 startswith(str, beg=0,end=len(string))

Determines if string or a substring of string (if starting index beg and ending index end are given) starts
with substring str; returns true if so and false otherwise.

34 strip([chars])

Performs both lstrip() and rstrip() on string

35 swapcase()

Inverts case for all letters in string.

36 title()

Returns "titlecased" version of string, that is, all words begin with uppercase and the rest are lowercase.

37 translate(table)

Translates string according to translation table table (as built by maketrans()); characters mapped to None
are removed.

38 upper()

Converts lowercase letters in string to uppercase.

39 zfill (width)

Returns original string leftpadded with zeros to a total of width characters; intended for numbers, zfill()
retains any sign given (less one zero).

40 isdecimal()

Returns true if a unicode string contains only decimal characters and false otherwise.
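
A few of the methods above in action. This is a short sketch rather than a complete tour; the sample strings and variable names are illustrative.

```python
title = '  python programming  '

stripped = title.strip()                  # removes leading and trailing whitespace
capitalized = stripped.capitalize()
titled = stripped.title()
centered = 'ab'.center(6, '*')
occurrences = stripped.count('p')
position = stripped.find('gram')          # index of the first occurrence
missing = stripped.find('java')           # -1 when not found
words = stripped.split()                  # splits on whitespace by default
one_split = 'a,b,c'.split(',', 1)         # at most one split
joined = '-'.join(words)
lines = 'one\ntwo\n'.splitlines()
zero_filled = '-42'.zfill(6)              # the sign is kept in front
swapped = 'abc'.translate(str.maketrans('abc', 'xyz'))

print(stripped, capitalized, titled, centered)
print(occurrences, position, missing)
print(words, one_split, joined, lines)
print(zero_filled, swapped)
```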

The string Module


The string module supplies several useful string attributes:
ascii_letters
The string ascii_lowercase+ascii_uppercase
ascii_lowercase
The string 'abcdefghijklmnopqrstuvwxyz'
ascii_uppercase
The string 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
digits
The string '0123456789'
hexdigits
The string '0123456789abcdefABCDEF'
octdigits
The string '01234567'
punctuation
The string of the 32 characters !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~ (i.e., all ASCII characters
that are deemed punctuation characters in the 'C' locale; does not depend on
which locale is active)
printable
The string of those ASCII characters that are deemed printable (i.e., digits, letters,
punctuation, and whitespace)

whitespace
A string containing all ASCII characters that are deemed whitespace:
space, tab, linefeed, carriage return, formfeed, and vertical tab
You should not rebind these attributes; the effects of doing so are undefined, since
other parts of the Python library may rely on them.
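
These constants are handy for simple character checks. The sketch below uses a hypothetical helper, is_hex(), that is not part of the string module.

```python
import string

letters_ok = string.ascii_letters == string.ascii_lowercase + string.ascii_uppercase
punctuation_count = len(string.punctuation)

def is_hex(text):
    """Hypothetical helper: True if text is non-empty and only hexadecimal digits."""
    return bool(text) and all(ch in string.hexdigits for ch in text)

print(letters_ok, punctuation_count)
print(is_hex('1aF'), is_hex('xyz'))
```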

Text Wrapping and Filling


The textwrap module can be used for wrapping and formatting plain text. It formats text by
adjusting the line breaks in the input paragraph.
The TextWrapper instance attributes (and keyword arguments to the constructor) are as follows:
• width: The maximum length allowed for the wrapped lines. Its default value is 70.
• expand_tabs: Its default value is True. If true, all tab characters in the input text are expanded
to spaces.
• tabsize: Its default value is 8. If expand_tabs is true, each tab character is expanded to zero or more
spaces, depending on the current column and the given tab size.
• replace_whitespace: Its default value is True. If true, after tab expansion but before wrapping, each
whitespace character is replaced with a single space. The whitespace characters replaced are tab,
newline, vertical tab, formfeed, and carriage return ('\t\n\v\f\r').
• drop_whitespace: Its default value is True. If true, whitespace at the beginning and end of every
line (after wrapping but before indenting) is dropped.
• initial_indent: Its default value is '' (the empty string). The given string is prepended to the first
line of wrapped output.
• subsequent_indent: Its default value is '' (the empty string). The given string is prepended to all
lines of wrapped output except the first.
• placeholder: Its default value is ' [...]'. This string is appended to the end of the output text if it
has been truncated.
• max_lines: Its default value is None. If the value is not None, the output text contains at most
max_lines lines, with placeholder at the end of the output.
• break_long_words: Its default value is True. If true, words longer than width are broken so that
every line fits in the given width. If false, long words are not broken and are put on a line by
themselves, in order to minimize the amount by which width is exceeded.
• break_on_hyphens: Its default value is True. If true, wrapping occurs on whitespace and right after
hyphens in compound words. If false, line breaks occur only on whitespace, but you also need to set
break_long_words to False if you want truly unbreakable words.
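
Several of these attributes can be combined in one TextWrapper. The following is a minimal sketch; the sample sentence and the chosen values are illustrative.

```python
import textwrap

text = "The textwrap module adjusts line breaks so that every line fits a given width."

wrapper = textwrap.TextWrapper(
    width=30,
    initial_indent='* ',       # prefix for the first line only
    subsequent_indent='  ',    # prefix for every later line
    max_lines=2,               # truncate the output after two lines
    placeholder=' [...]',      # appended when the output is truncated
)
lines = wrapper.wrap(text)

for line in lines:
    print(line)
```

Because the sample text does not fit in two lines of 30 characters, the last line should end with the placeholder.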

Functions provided by the textwrap module:

textwrap.wrap(text, width=70, **kwargs): This function wraps the input paragraph such that each line
in the paragraph is at most width characters long. The wrap method returns a list of output lines. The
returned list is empty if the wrapped output has no content. Default width is taken as 70.

import textwrap

value = """This function wraps the input paragraph such that each line
in the paragraph is at most width characters long. The wrap method
returns a list of output lines. The returned list
is empty if the wrapped
output has no content."""

# Wrap this text.
wrapper = textwrap.TextWrapper(width=50)
word_list = wrapper.wrap(text=value)

# Print each line.
for element in word_list:
    print(element)

textwrap.fill(text, width=70, **kwargs): The fill() convenience function works like textwrap.wrap()
except that it returns the data joined into a single, newline-separated string. This function wraps the
single paragraph in text and returns a single string containing the wrapped paragraph.

import textwrap

value = """This function returns the answer as STRING and not LIST."""

# Wrap this text.
wrapper = textwrap.TextWrapper(width=50)
string = wrapper.fill(text=value)

print(string)

textwrap.dedent(text): This function removes any common leading whitespace from every line
in the input text. This allows docstrings or embedded multi-line strings to line up with the left edge of
the display while still being indented in the source code.

import textwrap

s = '''\
    hello
      world
    '''
print(repr(s))     # prints '    hello\n      world\n    '

text = textwrap.dedent(s)
print(repr(text))  # prints 'hello\n  world\n'

textwrap.shorten(text, width, **kwargs): This function truncates the input string so that its length
is at most the given width. First, all whitespace in the string is collapsed by
replacing each run of whitespace with a single space. If the modified string fits in the given width, it is
returned; otherwise, words are dropped from the end so that the remaining words plus the
placeholder fit within width.

import textwrap

sample_text = """This function wraps the input paragraph such that each line
in the paragraph is at most width characters long. The wrap method
returns a list of output lines. The returned list
is empty if the wrapped
output has no content."""
wrapper = textwrap.TextWrapper(width=50)

dedented_text = textwrap.dedent(text=sample_text)
original = wrapper.fill(text=dedented_text)

print('Original:\n')
print(original)

shortened = textwrap.shorten(text=original, width=100)
shortened_wrapped = wrapper.fill(text=shortened)

print('\nShortened:\n')
print(shortened_wrapped)

textwrap.indent(text, prefix, predicate=None): This function adds the given prefix to the
beginning of the selected lines of the text. The predicate argument can be used to control which lines are
indented; by default, lines that consist solely of whitespace are not indented.

import textwrap

s = 'hello\n\n \nworld'
s1 = textwrap.indent(text=s, prefix=' ')
print(s1)
print("\n")
s2 = textwrap.indent(text=s, prefix='+ ', predicate=lambda line: True)
print(s2)

NOTE:

• String Formatting (From BOOK)
• The pprint module (From BOOK)
• The reprlib module (From BOOK)

Python - Regular Expressions

A regular expression is a special sequence of characters that helps you match or find
other strings or sets of strings, using a specialized syntax held in a pattern. Regular
expressions are widely used in the UNIX world.
The Python module re provides full support for Perl-like regular expressions in Python.
The re module raises the exception re.error if an error occurs while compiling or using
a regular expression.
We will cover two important functions that are used to handle regular
expressions. But a small thing first: various characters have a
special meaning when they are used in a regular expression. To avoid any confusion
while dealing with regular expressions, we will use raw strings, written as r'expression'.
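
To see why raw strings matter, compare the same pattern written with and without the r prefix. This is a small sketch; the sample text and variable names are illustrative.

```python
import re

# In a normal string literal '\b' is a backspace character (0x08),
# so this pattern does not contain a word-boundary assertion.
plain = re.search('\bcat\b', 'a cat sat')

# In a raw string the backslash reaches the re module untouched,
# so \b is interpreted as a word boundary.
raw = re.search(r'\bcat\b', 'a cat sat')

# An invalid pattern raises re.error when it is compiled.
try:
    re.compile(r'(unclosed')
    compile_failed = False
except re.error:
    compile_failed = True

print(plain)
print(raw.group())
print(compile_failed)
```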

The match Function
This function attempts to match the RE pattern at the beginning of string, with optional flags.
Here is the syntax for this function −
re.match(pattern, string, flags=0)
Here is the description of the parameters −

Sr.No. Parameter & Description

1
pattern
This is the regular expression to be matched.

2
string
This is the string, which would be searched to match the pattern at the beginning of string.

3
flags
You can specify different flags using bitwise OR (|). These are modifiers, which are listed in
the table below.

The re.match function returns a match object on success, None on failure. We
use the group(num) or groups() method of the match object to get the matched expression.

Sr.No. Match Object Method & Description

1
group(num=0)
This method returns entire match (or specific subgroup num)

2
groups()
This method returns all matching subgroups in a tuple (empty if there weren't any)

Example

#!/usr/bin/python3
import re

line = "Cats are smarter than dogs"

matchObj = re.match(r'(.*) are (.*?) .*', line, re.M|re.I)

if matchObj:
    print("matchObj.group() :", matchObj.group())
    print("matchObj.group(1) :", matchObj.group(1))
    print("matchObj.group(2) :", matchObj.group(2))
else:
    print("No match!!")

When the above code is executed, it produces following result −


matchObj.group() : Cats are smarter than dogs
matchObj.group(1) : Cats
matchObj.group(2) : smarter

The search Function
This function searches for first occurrence of RE pattern within string with
optional flags.
Here is the syntax for this function −
re.search(pattern, string, flags=0)
Here is the description of the parameters −
Sr.No. Parameter & Description

1
pattern
This is the regular expression to be matched.

2
string
This is the string, which would be searched to match the pattern anywhere in the string.

3
flags
You can specify different flags using bitwise OR (|). These are modifiers, which are listed in
the table below.

The re.search function returns a match object on success, None on failure. We
use the group(num) or groups() method of the match object to get the matched expression.

Sr.No. Match Object Method & Description

1
group(num=0)
This method returns entire match (or specific subgroup num)

2
groups()
This method returns all matching subgroups in a tuple (empty if there weren't any)

Example

#!/usr/bin/python3
import re

line = "Cats are smarter than dogs"

searchObj = re.search(r'(.*) are (.*?) .*', line, re.M|re.I)

if searchObj:
    print("searchObj.group() :", searchObj.group())
    print("searchObj.group(1) :", searchObj.group(1))
    print("searchObj.group(2) :", searchObj.group(2))
else:
    print("Nothing found!!")

When the above code is executed, it produces following result −


searchObj.group() : Cats are smarter than dogs
searchObj.group(1) : Cats
searchObj.group(2) : smarter

Matching Versus Searching


Python offers two different primitive operations based on regular
expressions: match checks for a match only at the beginning of the string,
while search checks for a match anywhere in the string (this is what Perl does by
default).
Example

#!/usr/bin/python3
import re

line = "Cats are smarter than dogs"

matchObj = re.match(r'dogs', line, re.M|re.I)

if matchObj:
    print("match --> matchObj.group() :", matchObj.group())
else:
    print("No match!!")

searchObj = re.search(r'dogs', line, re.M|re.I)

if searchObj:
    print("search --> searchObj.group() :", searchObj.group())
else:
    print("Nothing found!!")

When the above code is executed, it produces the following result −


No match!!
search --> searchObj.group() : dogs
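
One detail worth checking is that match stays tied to the beginning of the string even when re.M is given. A short sketch with an illustrative two-line string:

```python
import re

text = "first line\nsecond line"

at_start = re.match(r'first', text)                # matches at position 0
not_at_start = re.match(r'second', text)           # None: not at the very beginning
even_multiline = re.match(r'second', text, re.M)   # still None: re.M does not change match
found = re.search(r'^second', text, re.M)          # search with re.M finds the line start

print(at_start.group())
print(not_at_start, even_multiline)
print(found.group(), found.start())
```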

Search and Replace


One of the most important re functions that use regular expressions is sub.
Syntax
re.sub(pattern, repl, string, count=0, flags=0)
This function replaces occurrences of the RE pattern in string with repl, substituting
all occurrences unless count is provided. It returns the modified string.
Example

#!/usr/bin/python3
import re

phone = "2004-959-559 # This is Phone Number"

# Delete Python-style comments
num = re.sub(r'#.*$', "", phone)
print("Phone Num :", num)

# Remove anything other than digits
num = re.sub(r'\D', "", phone)
print("Phone Num :", num)

When the above code is executed, it produces the following result −


Phone Num : 2004-959-559
Phone Num : 2004959559
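
The sketch below shows three further ways of using re.sub: limiting the number of replacements with count, referring to captured groups in the replacement, and passing a function as the replacement. The sample value and variable names are illustrative.

```python
import re

text = "2004-959-559"

# count limits how many matches are replaced (0, the default, means all).
first_only = re.sub(r'-', '/', text, count=1)

# \1, \2, ... in the replacement refer to groups captured by the pattern.
reordered = re.sub(r'(\d+)-(\d+)-(\d+)', r'\3-\2-\1', text)

# The replacement may also be a function that receives the match object.
doubled = re.sub(r'\d+', lambda m: str(int(m.group()) * 2), text)

print(first_only)
print(reordered)
print(doubled)
```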

Regular Expression Modifiers: Option Flags


The functions of the re module accept an optional flags argument that controls various aspects
of matching. You can provide multiple
modifiers by combining them with bitwise OR (|), as shown previously. Each modifier may be one
of these −

Sr.No. Modifier & Description

1
re.I
Performs case-insensitive matching.

2
re.L
Interprets words according to the current locale. This interpretation affects the alphabetic
group (\w and \W), as well as word boundary behavior (\b and \B). In Python 3 this flag can be used
only with bytes patterns.

3
re.M
Makes $ match the end of a line (not just the end of the string) and makes ^ match the start of
any line (not just the start of the string).

4
re.S
Makes a period (dot) match any character, including a newline.

5
re.U
Interprets letters according to the Unicode character set. This flag affects the behavior of \w,
\W, \b, \B. In Python 3 this is already the default for string patterns.

6
re.X
Permits "cuter" regular expression syntax. It ignores whitespace (except inside a set [] or
when escaped by a backslash) and treats unescaped # as a comment marker.
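
The effect of the most common flags can be sketched as follows. The sample text and variable names are illustrative.

```python
import re

text = "Cats\nare\nSMART"

# re.I: case-insensitive matching.
insensitive = re.findall(r'smart', text, re.I)

# re.M: ^ and $ match at every line, not only at the ends of the string.
line_starts = re.findall(r'^\w+', text, re.M)

# re.S: the dot also matches a newline.
without_s = re.search(r'Cats.are', text)
with_s = re.search(r'Cats.are', text, re.S)

# re.X: whitespace and # comments inside the pattern are ignored.
verbose = re.compile(r"""
    (\d{4})   # four digits
    -         # a literal hyphen
    (\d{3})   # three digits
""", re.X)
groups = verbose.match("2004-959").groups()

# Flags are combined with bitwise OR.
combined = re.findall(r'^[a-z]+$', text, re.I | re.M)

print(insensitive, line_starts)
print(without_s, with_s.group() == 'Cats\nare')
print(groups, combined)
```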

Regular Expression Patterns


Except for the metacharacters (+ ? . * ^ $ ( ) [ ] { } | \), all characters match themselves.
You can escape a metacharacter by preceding it with a backslash.
The following table lists the regular expression syntax that is available in Python −

Sr.No. Pattern & Description

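1
^
Matches beginning of string, and beginning of each line when re.M is set.

2
$
Matches end of string (or just before a trailing newline), and end of each line when re.M is set.

3
.
Matches any single character except newline. Using the re.S option allows it to match newline as
well.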

4
[...]
Matches any single character in brackets.

5
[^...]
Matches any single character not in brackets

6
re*
Matches 0 or more occurrences of preceding expression.

7
re+
Matches 1 or more occurrence of preceding expression.

8
re?
Matches 0 or 1 occurrence of preceding expression.

9
re{ n}
Matches exactly n number of occurrences of preceding expression.

10
re{ n,}
Matches n or more occurrences of preceding expression.

11
re{ n, m}
Matches at least n and at most m occurrences of preceding expression.

12
a| b
Matches either a or b.

13
(re)
Groups regular expressions and remembers matched text.

14
(?imx)
Turns on the i, m, or x options for the whole regular expression. In current Python versions this form
must appear at the start of the pattern.

15
(?-imx)
Toggles off i, m, or x options in Perl-style syntax. Python's re module supports turning flags off only in
the scoped form (?-imx: re) listed below.

16
(?: re)
Groups regular expressions without remembering matched text.

17
(?imx: re)
Temporarily toggles on i, m, or x options within parentheses.

18
(?-imx: re)
Temporarily toggles off i, m, or x options within parentheses.

19
(?#...)
Comment.

20
(?= re)
Specifies position using a pattern. Doesn't have a range.

21
(?! re)
Specifies position using pattern negation. Doesn't have a range.

22
(?> re)
Matches independent pattern without backtracking (available in Python 3.11 and later).

23
\w
Matches word characters.

24
\W
Matches nonword characters.

25
\s
Matches whitespace. For ASCII text, equivalent to [ \t\n\r\f\v].

26
\S
Matches nonwhitespace.

27
\d
Matches digits. For ASCII text, equivalent to [0-9].

28
\D
Matches nondigits.

29
\A
Matches beginning of string.

30
\Z
Matches only at the end of string.

31
\z
Matches end of string in Perl-style syntax. Python's re module has traditionally used \Z for this.

32
\G
Matches point where last match finished in Perl-style syntax. Not supported by Python's re module.

33
\b
Matches word boundaries when outside brackets. Matches backspace (0x08) when inside
brackets.

34
\B
Matches nonword boundaries.

35
\n, \t, etc.
Matches newlines, carriage returns, tabs, etc.

36
\1...\9
Matches nth grouped subexpression.

37
\10
Matches nth grouped subexpression if it matched already. Otherwise refers to the octal
representation of a character code.

Regular Expression Examples


Literal characters
Sr.No. Example & Description

1
python
Match "python".

Character classes
Sr.No. Example & Description

1
[Pp]ython
Match "Python" or "python"

2
rub[ye]
Match "ruby" or "rube"

3
[aeiou]
Match any one lowercase vowel

4
[0-9]
Match any digit; same as [0123456789]

5
[a-z]
Match any lowercase ASCII letter

6
[A-Z]
Match any uppercase ASCII letter

7
[a-zA-Z0-9]
Match any of the above

8
[^aeiou]
Match anything other than a lowercase vowel

9
[^0-9]
Match anything other than a digit

Special Character Classes


Sr.No. Example & Description

1
.
Match any character except newline

2
\d
Match a digit: [0-9]

3
\D
Match a nondigit: [^0-9]

4
\s
Match a whitespace character: [ \t\r\n\f\v]

5
\S
Match nonwhitespace: [^ \t\r\n\f\v]

6
\w
Match a single word character: [A-Za-z0-9_]

7
\W
Match a nonword character: [^A-Za-z0-9_]

Repetition Cases
Sr.No. Example & Description

1
ruby?
Match "rub" or "ruby": the y is optional

2
ruby*
Match "rub" plus 0 or more ys

3
ruby+
Match "rub" plus 1 or more ys

4
\d{3}
Match exactly 3 digits

5
\d{3,}
Match 3 or more digits

6
\d{3,5}
Match 3, 4, or 5 digits

Nongreedy repetition
This matches the smallest number of repetitions −

Sr.No. Example & Description


1
<.*>
Greedy repetition: matches "<python>perl>"

2
<.*?>
Nongreedy: matches "<python>" in "<python>perl>"
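
The difference can be checked directly with the same sample text. A minimal sketch:

```python
import re

text = "<python>perl>"

greedy = re.search(r'<.*>', text).group()       # takes as much as possible
nongreedy = re.search(r'<.*?>', text).group()   # stops at the first '>'

print(greedy)
print(nongreedy)
```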

Grouping with Parentheses


Sr.No. Example & Description

1
\D\d+
No group: + repeats \d

2
(\D\d)+
Grouped: + repeats \D\d pair

3
([Pp]ython(, )?)+
Match "Python", "Python, python, python", etc.

Backreferences
This matches a previously matched group again −

Sr.No. Example & Description

1
([Pp])ython&\1ails
Match python&pails or Python&Pails

2
(['"])[^\1]*\1
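Single or double-quoted string. \1 matches whatever the 1st group matched. \2 matches
whatever the 2nd group matched, etc. Note that inside brackets \1 is not treated as a backreference, so
(['"]).*?\1 is a more reliable way to write this pattern in Python.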
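
A short sketch of backreferences at work. The quoted-string pattern here uses a nongreedy .*? between the quotes, which is an illustrative variant rather than the exact pattern from the table.

```python
import re

# \1 must repeat exactly what group 1 captured, including its case.
pattern = re.compile(r'([Pp])ython&\1ails')
lower = pattern.fullmatch('python&pails')
upper = pattern.fullmatch('Python&Pails')
mixed = pattern.fullmatch('Python&pails')   # 'P' was captured, but 'p' follows

# The closing quote must be the same kind as the opening quote.
quoted = re.findall(r'''(['"])(.*?)\1''', '''say "hi" or 'bye' now''')
contents = [body for _quote, body in quoted]

print(lower is not None, upper is not None, mixed is None)
print(contents)
```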

Alternatives
Sr.No. Example & Description

1
python|perl
Match "python" or "perl"

2
rub(y|le)
Match "ruby" or "ruble"

3
Python(!+|\?)
"Python" followed by one or more ! or one ?

Anchors
Anchors specify the match position.

Sr.No. Example & Description

1
^Python
Match "Python" at the start of a string or internal line

2
Python$
Match "Python" at the end of a string or line

3
\APython
Match "Python" at the start of a string

4
Python\Z
Match "Python" at the end of a string

5
\bPython\b
Match "Python" at a word boundary

6
\brub\B
\B is nonword boundary: match "rub" in "rube" and "ruby" but not alone

7
Python(?=!)
Match "Python", if followed by an exclamation point.

8
Python(?!!)
Match "Python", if not followed by an exclamation point.
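
Anchors and lookaheads match a position rather than characters. The sketch below checks a few rows of the table; the sample strings are illustrative.

```python
import re

# Lookahead: the '!' must follow, but it is not part of the match.
followed = re.search(r'Python(?=!)', 'Python!')
not_followed = re.search(r'Python(?=!)', 'Python?')

# Negative lookahead: matches only when '!' does not follow.
negative = re.search(r'Python(?!!)', 'Python?')

# \b requires a word boundary on both sides of "Python".
whole_words = re.findall(r'\bPython\b', 'Python Pythonic Python3 Python')

# \B requires that "rub" is followed by another word character.
inside_words = re.findall(r'\brub\B', 'rub ruby rube')

print(followed.group(), followed.end())
print(not_followed, negative.group())
print(len(whole_words), len(inside_words))
```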

Special Syntax with Parentheses


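Sr.No. Example & Description

1
R(?#comment)
Matches "R". All the rest is a comment

2
R(?i)uby
Case-insensitive while matching "uby" in Perl-style syntax. Current Python versions require a global
flag such as (?i) to appear at the start of the pattern, so the scoped form below is preferred.

3
R(?i:uby)
Case-insensitive while matching "uby" only

4
rub(?:y|le)
Group only without creating \1 backreference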
