Find and Replace Text and Other Data in A Word Document 5
regular expressions in Word. For a list of the wildcards you can use, see the section Wildcards for items you want
to find and replace above.
This example uses a combination of wildcard characters and character codes to transpose names that
contain middle initials. If you're unfamiliar with character codes, see the section Use codes to find letters,
formatting, fields, or special characters above.
Whenever you use this expression on names that reside in a table, you must first convert that table
to text.
If the table contains more than one column, copy the column containing the names to a blank
document and convert it to text there.
After you transpose the names, convert the text back to a table. You can then delete the original
column and replace it with your changed data.
1. If you haven't already done so, start Word and create a new, blank document.
2. Insert a blank table into the document. Make the table 1 column wide by 4 rows high.
3. Copy these names individually, and paste each one into a blank table cell:
Doris X. Hartwig
Tamara Y. Johnston
Daniel Shimshoni
4. Select the table, and on the Table Tools Layout tab, in the Data group, click Convert to Text.
5. Select Paragraph marks as the text separator, and then click OK.
1. On the Home tab, in the Editing group, click Replace to open the Find and Replace dialog box.
2. Select the Use wildcards check box (you may need to click More to see the check box), and then
type the following expression in the Find what box:
(*) ([! ]@)^13
Make sure you include a space between the two sets of parentheses and after the exclamation
point. If you haven't seen the ^13 character before, we explain what it does in the next section.
3. Type the following expression in the Replace with box:
\2, \1^p
4. Select the list of names, and then click Replace All. Word transposes the names and either middle
initials or middle names, like so:
2. On the Insert tab, in the Tables group, click Table, and then click Convert Text to Table.
3. In the Convert Text to Table dialog box, under Separate text at, click Paragraphs, and then click
OK.
Let's look at the individual pieces of the expression to see how they work, starting with the expression in
the Find what box.
The entire expression looks for two groups of patterns: a first name with a middle initial (or a middle
name) and a last name. The (*) finds all first names. Notice that there's a space after it.
([! ]@)^13
The exclamation point excludes any character specified in the brackets. In this case, [! ] means "find
everything but spaces." Its effect is to trim the space from in front of the last names.
The @ character finds one or more occurrences of the previous character, so it simply ensures that all
spaces in front of the last name are removed.
We need to know where the last name ends, so we also use the ^13 character to search for the
paragraph mark at the end of each line. However, because we don't plan to reuse the paragraph mark,
we surround everything else with parentheses.
You can try this by copying the names to your test document again (make sure you separate them with
paragraph marks), and then search using ([! ]@)^13 in the Find what box. Search matches each last
name.
Because search starts again at the beginning of the next line, we use the asterisk wildcard character (*)
to match everything from there to the beginning of the next last name.
Because we don't plan to reuse the space in front of the last name, we leave it outside the
parentheses so that it belongs to neither of the two groups:
(*) ([! ]@)^13
Important: Be careful when using the ^13 character code. Normally, you can
use the ^p character code to search for paragraph marks. However, that code
does not work in wildcard searches. Instead, you need to use the substitute code
^13. Although the ^p character code does not work in wildcard searches, you
should use it in wildcard replace operations because it includes formatting
information, and the ^13 character does not. In addition, you cannot assign style
information to the ^13 character at all. Misusing the ^13 code in a replace
operation can essentially convert your document into a file that you cannot
format.
The "replace" expression (\2, \1^p) does the actual transposition. In the Replace with box, the \2,
characters tell search to write the second pattern first and to add a comma after the pattern. The \1^p
characters tell search where to write the first pattern and to write a paragraph mark after that pattern.
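To check the logic outside Word, the same transposition can be sketched with Python's re module. This is only an illustrative analogue, not Word's own search engine: the function name transpose_names is hypothetical, [^ \n]+ stands in for [! ]@, and the end-of-line anchor $ stands in for ^13. Because the newline is not consumed by the match, the replacement does not need to write it back.

```python
import re

# Rough Python analogue of
#   Find what:    (*) ([! ]@)^13
#   Replace with: \2, \1^p
# Group 1 is everything before the last space on the line;
# group 2 is the run of non-space characters that ends the line.
NAME_LINE = re.compile(r"^(.*) ([^ \n]+)$", re.MULTILINE)

def transpose_names(text: str) -> str:
    """Rewrite each 'First [Middle] Last' line as 'Last, First [Middle]'."""
    return NAME_LINE.sub(r"\2, \1", text)

names = "Doris X. Hartwig\nTamara Y. Johnston\nDaniel Shimshoni\n"
result = transpose_names(names)
print(result)
```

With the sample names from the steps above, each last name moves to the front of its line, followed by a comma and the remaining names.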
This example uses regular expressions to convert dates in European format to dates in the U.S. format.
1. Copy and paste the following date into your document: 28th May 2003
2. Open the Find and Replace dialog box, and type the following expression in the Find what box:
([0-9]{1,2})([dhnrst]{2}) (<[ADFJMNOS]*>) ([0-9]{4})
Make sure you insert a space between the following opening and closing parentheses: 2}) (<[ and
*>) ([0.
3. Type the following expression in the Replace with box:
\3 \1, \4
Let's start with the expression in the Find what box. The expression works by breaking dates down into
four patterns, denoted by the sets of parentheses. Each pattern contains the components that you find
in all dates written in the style that you used in the example. Working from left to right:
The number range [0-9] matches a single digit in the first pattern. Because the day can
consist of two digits, we tell search to return either one-digit or two-digit days: {1,2}. The result
is the first pattern: ([0-9]{1,2}).
Ordinals make up the second pattern. Ordinals consist of "th," "nd," "st," and "rd," so we add those
letters to a range [dhnrst]. Because ordinals always consist of two letters, we restrict the letter count
to two: ([dhnrst]{2}).
Next comes a space, followed by literal and wildcard characters that find month names. All month
names begin with one of these capital letters: ADFJMNOS. We don't know how many characters follow
each capital letter, so we follow them with the asterisk (*). We're only interested in the month name
itself, so we use the less-than and greater-than characters (< and >) to limit the results to the
individual word. The result is the third pattern: (<[ADFJMNOS]*>).
Finally, we search for the year. We use the same number range, but this time we restrict the count to
four digits: ([0-9]{4}).
Notice that in the Replace with box we wrote only three of the four patterns. We omitted the
ordinal (the "th") from the date because dates in the U.S. format don't use ordinals. If you want to leave
the ordinal in the date, enter \3 \1\2, \4 in the Replace with box. In this case, you enter a space both
after the 3 and after the comma, but nowhere else.
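The four patterns and both replacement strings can be tried out in Python as a rough cross-check. This is a sketch under assumptions, not Word's implementation: to_us_date and keep_ordinal are hypothetical names, and \b...\b with \w* approximates Word's < and > word markers and its * wildcard.

```python
import re

# Approximate translation of the Word wildcard expression into Python regex.
# Group 1: day, group 2: ordinal, group 3: month name, group 4: year.
DATE = re.compile(r"([0-9]{1,2})([dhnrst]{2}) (\b[ADFJMNOS]\w*\b) ([0-9]{4})")

def to_us_date(text: str, keep_ordinal: bool = False) -> str:
    """Reorder 'day+ordinal Month year' into 'Month day, year'."""
    # \3 \1, \4 drops the ordinal; \3 \1\2, \4 keeps it.
    replacement = r"\3 \1\2, \4" if keep_ordinal else r"\3 \1, \4"
    return DATE.sub(replacement, text)

print(to_us_date("28th May 2003"))                     # May 28, 2003
print(to_us_date("28th May 2003", keep_ordinal=True))  # May 28th, 2003
```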
At this point, you may ask how to handle dates in which the name of the month isn't spelled out, such
as 28/05/03. You search using this expression:
([0-9]{1,2})/([0-9]{1,2})/([0-9]{2})
And you replace using this expression, which writes the month before the day:
\2/\1/\3
If the date takes the format of 28/05/2003, you use {4} in the last pattern instead of {2}.
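A small Python sketch shows the effect of the year count on numeric dates. The function name swap_day_month and its year_digits parameter are hypothetical, and the \b word boundaries are an addition of this sketch (they are not part of the Word expression) so that a two-digit pattern does not match the first half of a four-digit year.

```python
import re

def swap_day_month(text: str, year_digits: int = 2) -> str:
    """Turn day/month/year into month/day/year for the given year length."""
    pattern = r"\b([0-9]{1,2})/([0-9]{1,2})/([0-9]{%d})\b" % year_digits
    # Group 2 (month) is written first, then group 1 (day), then the year.
    return re.sub(pattern, r"\2/\1/\3", text)

print(swap_day_month("28/05/03"))                   # 05/28/03
print(swap_day_month("28/05/2003", year_digits=4))  # 05/28/2003
```

Calling swap_day_month("28/05/2003") with the default two-digit count leaves the text unchanged, which mirrors the need to switch to {4} for four-digit years.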
The previous example uses the following argument to find either one-digit or two-digit dates: {1,2}. In
this case, a comma separates the two values. However, your regional settings in Windows control the
list separator that you use. If your regional settings specify the use of semicolons as list separators, you
must use them instead of commas.
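If you keep a collection of expressions written with commas, a short script can rewrite the counts for a different list separator before you paste them into Word. The helper below is purely illustrative: localize_counts is a hypothetical name, and it only touches counts of the form {n,m} or {n,}.

```python
import re

def localize_counts(expression: str, list_separator: str) -> str:
    """Rewrite {n,m} and {n,} counts to use the given list separator."""
    return re.sub(
        r"\{([0-9]+),([0-9]*)\}",
        lambda m: "{" + m.group(1) + list_separator + m.group(2) + "}",
        expression,
    )

# A fixed count such as {2} has no separator and is left alone.
print(localize_counts("([0-9]{1,2})/([0-9]{1,2})/([0-9]{2})", ";"))
```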
To find out which list separator your operating system specifies, do the following:
1. Open Control Panel. (Right-click the Windows Start button, and then click Control Panel in
Windows 8 and later. In Windows 7, click the Start button, and then click Control Panel.)
3. Click Change date, time, or number format, and then click Additional settings.
4. Click the Numbers tab, and then locate the List separator entry.
In some countries, honorific titles (Mr., Mrs., and so on) do not include periods. This example shows you
how to add periods to or remove them from honorifics. From this point on, we assume that you know
how to use the Find and Replace dialog box.
<([DM][rs]{1,2})( )
Notice that the expression uses a second pattern containing a blank space. That space normally would
follow the honorific if the period was not there. This expression adds the period:
\1.\2
To remove the periods instead, search using this expression:
<([DM][rs]{1,2}).
And replace using this expression:
\1
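Both directions can be sketched in Python to see which titles the range [rs]{1,2} covers (Dr, Mr, Mrs, Ms). The function names are hypothetical, \b approximates Word's < marker, and the period is escaped because, unlike in Word wildcards, an unescaped period in a Python regex matches any character.

```python
import re

def add_periods(text: str) -> str:
    """Insert a period between an honorific and the space that follows it."""
    return re.sub(r"\b([DM][rs]{1,2})( )", r"\1.\2", text)

def remove_periods(text: str) -> str:
    """Drop the period that follows an honorific."""
    return re.sub(r"\b([DM][rs]{1,2})\.", r"\1", text)

sample = "Mr Smith met Mrs Jones and Dr Lee"
with_periods = add_periods(sample)
print(with_periods)
print(remove_periods(with_periods))
```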
When you use this expression, you may want to sort the list first to place duplicate rows next to each
other. Also, you need to remove all blank paragraph marks. In other words, if you use blank paragraphs
to separate blocks of text, like so:
You can use your favorite method to remove the blank paragraphs, but here's one that finds two
or more consecutive paragraph marks. Search using this expression (the @ character repeats the
match, so a whole run of empty lines is found at once):
(^13)\1@
And replace using this expression:
^p
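The same cleanup is easy to prototype in Python on plain text. This is a sketch, with the hypothetical name collapse_blank_paragraphs, that treats the newline character as the paragraph mark.

```python
import re

def collapse_blank_paragraphs(text: str) -> str:
    """Replace every run of two or more newlines with a single newline."""
    return re.sub(r"\n{2,}", "\n", text)

sample = "First block\n\n\nSecond block\n"
print(collapse_blank_paragraphs(sample))
```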
Now let's look at ways to replace text. This expression finds any sequence of two consecutive identical
paragraphs:
(*^13)\1
This expression also matches longer repetitions of text that end in paragraphs. For example, run the
expression against the following list:
Search finds the first four lines and stops only when the overall pattern changes. In contrast, if you run
the expression against this list:
To search for a greater number of identical items, add more placeholders. For example, this expression
finds three consecutive identical paragraphs:
(*^13)\1\1
You can also use braces to do the same thing. The following examples find two and three identical
paragraphs, respectively:
(*^13){2} (*^13){3}
(*^13){2,3}
(*^13){2,}
You can replace any of those expressions with the following string:
\1
In addition, you can repeat the find-and-replace operation as needed to replace all the duplicate
paragraphs in your document, or you can add the @ wildcard character and have the expression repeat
the operation for you:
(*^13)\1@
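As a rough cross-check of the backreference idea, the following Python sketch removes adjacent repeated lines from plain text. It is an analogue rather than Word's engine: drop_repeated_paragraphs is a hypothetical name, the sample list is made up for illustration, and the ^ anchor is an addition of this sketch that forces each match to start at the beginning of a line.

```python
import re

# (.*\n) captures one whole line including its newline;
# \1+ then requires one or more identical copies directly after it.
REPEATS = re.compile(r"^(.*\n)\1+", re.MULTILINE)

def drop_repeated_paragraphs(text: str) -> str:
    """Keep a single copy of each run of adjacent identical lines."""
    return REPEATS.sub(r"\1", text)

sample = "apple\napple\napple\npear\npear\nplum\n"
print(drop_repeated_paragraphs(sample))
```

Lines that are identical but not adjacent are left alone, which is why sorting first matters.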
You can also use this method to remove duplicate rows from a table. To do so, first remove any merged cells,
and then sort the table to place duplicate rows adjacent to each other. Next, convert your table to text.
(On the Table menu, point to Convert, and then click Table to text; when prompted, use the tab
delimiter.) After you make your replacements, convert the text back to a table.
More examples
For more examples of how to use regular expressions in Word, see Finding and replacing characters
using wildcards on the MVP FAQ site.