Filter Commands
CONTENTS
Filters – definition
To format text – pr
Pick lines from the beginning – head
Pick lines from the end – tail
Extract characters – cut
Join two lines / files – paste
Sort, merge and remove – sort
Find unique and nonunique lines – uniq
Change, delete or squeeze characters – tr
Delimiter
A delimiter is one or more characters that separate text
strings. Common delimiters are commas (,), semicolons
(;), quotes (", '), braces ({}), pipes (|), and slashes (/ \).
When a program stores a lot of data, it may use a delimiter
to separate the data values.
For example, "john|doe" uses a pipe as its delimiter, so a
program or script can distinguish the first name from the
last name in the string.
A pipe combines two or more commands: the output of one
command acts as the input to the next command, whose
output may in turn act as the input to a further command,
and so on. It can be visualized as a temporary connection
between two or more commands, programs, or processes.
The command-line programs that do the further processing
are referred to as filters.
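The two ideas above can be sketched together in a POSIX shell. This is only an illustration: the variable names record, first, and last are made up for the example, and cut is used here simply as a convenient way to split on the delimiter.

```shell
# Split the pipe-delimited string from the text into its two fields.
# The | is quoted so the shell hands it to cut instead of starting a pipeline.
record='john|doe'
first=$(printf '%s\n' "$record" | cut -d '|' -f 1)
last=$(printf '%s\n' "$record" | cut -d '|' -f 2)
echo "first=$first last=$last"

# A three-stage pipeline: each command reads the previous command's output.
printf '%s\n' pear apple fig | sort | head -n 1
```

The first echo prints first=john last=doe, and the pipeline prints apple, the first line of the sorted list.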
SIMPLE FILTERS
Filters are commands that accept data from standard input,
manipulate it, and write the results to standard output.
Each filter performs a simple function.
Some commands use a delimiter, such as the pipe (|) or colon (:).
Many filters work well with delimited fields, and some
simply won't work without them.
The piping mechanism allows the standard output of
one filter to serve as the standard input of another.
A filter reads data from standard input when
used without a filename argument, and from the
named file otherwise.
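The three ways a filter can receive its data can be sketched as follows. The file name letters.txt and the scratch directory are assumptions made for the example; sort stands in for any filter.

```shell
# Create a small sample file (hypothetical name) in a scratch directory.
work=$(mktemp -d)
printf '%s\n' b a c > "$work/letters.txt"

# 1. Filename given: the filter opens the file itself.
sort "$work/letters.txt"

# 2. No filename: the filter reads standard input, redirected from the file.
sort < "$work/letters.txt"

# 3. No filename: standard input arrives from another command through a pipe.
cat "$work/letters.txt" | sort | head -n 2
```

The first two commands print the same three sorted lines; the third prints only a and b because head trims the sorted stream.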
THE SIMPLE DATABASE
Several UNIX commands for text editing and shell
programming are illustrated with a sample file, emp.lst.
Each line of this file has six fields separated by five
delimiters.
The details of an employee are stored on a single
line:
2233 | a.k.shukla | g.m | sales | 12/12/52 | 6000
pr: paginating files
The sample file dept.lst contains:
cat dept.lst
01|accounts|6213
02|progs|5423
03|marketing|6521
04|personnel|2365
05|production|9876
06|sales|1006
The pr command prepares a file for printing by adding
suitable headers, footers, and formatted text.
pr adds five lines of margin at the top and bottom of each page.
pr dept.lst
May 06 10:38 1997 dept.lst page 1
01|accounts|6213
02|progs|5423
03|marketing|6521
04|personnel|2365
05|production|9876
06|sales|1006
Output: The header shows the date and time the file was
last modified, along with the filename and the page
number. The file's contents are printed unchanged.
pr options
-k prints the output in k columns (k is an integer)
-t suppresses the header and footer
-h uses a header of the user's choice in place of the filename
-d double-spaces the input
-n numbers each line, which helps in debugging
-o n offsets the lines by n spaces, increasing the left
margin of the page
+k starts printing from page k:
pr +10 chap01
starts printing from page 10
-l n sets the page length to n lines:
pr -l 54 chap01
sets the page length to 54 lines
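A few of these options can be tried on the dept.lst data shown earlier. This is a minimal sketch: the scratch directory and the header text "Department list" are assumptions, and the exact header layout differs between pr implementations.

```shell
# Recreate dept.lst in a scratch directory.
work=$(mktemp -d)
cat > "$work/dept.lst" <<'EOF'
01|accounts|6213
02|progs|5423
03|marketing|6521
04|personnel|2365
05|production|9876
06|sales|1006
EOF

# Default: a header with modification time, filename, and page number.
pr "$work/dept.lst" | head -n 5

# -t drops the header and footer, so only the six data lines remain.
pr -t "$work/dept.lst"

# -n numbers each line; -d double-spaces the output.
pr -t -n "$work/dept.lst"
pr -t -d "$work/dept.lst"

# -h replaces the filename in the header with a title of our choice.
pr -h "Department list" "$work/dept.lst" | head -n 5
```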
head
Displays the top of the file.
It displays the first 10 lines of the file when used
without an option. dept.lst has only six lines, so all of
them are shown:
head dept.lst
01|accounts|6213
02|progs|5423
03|marketing|6521
04|personnel|2365
05|production|9876
06|sales|1006
The -n option specifies a line count:
head -n 3 dept.lst
01|accounts|6213
02|progs|5423
03|marketing|6521
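The 10-line default is easier to see on a file longer than ten lines. The sketch below generates a hypothetical 12-line file, sample.txt, purely for illustration.

```shell
# Build a 12-line sample file (hypothetical name) in a scratch directory.
work=$(mktemp -d)
i=1
while [ "$i" -le 12 ]; do
  echo "line $i"
  i=$((i + 1))
done > "$work/sample.txt"

# No option: the first 10 lines ("line 1" through "line 10").
head "$work/sample.txt"

# -n 3: only the first three lines.
head -n 3 "$work/sample.txt"
```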
tail
Displays the end of the file.
It displays the last 10 lines of the file when used
without an option.
tail emp.lst
Options
-n specifies a line count
tail -n 3 emp.lst
-f monitors the growth of a file, printing new lines as they are appended
-c extracts bytes rather than lines
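The options can be sketched on a small file. The file name log.txt and its contents are made up for the example.

```shell
# Five short lines in a scratch file (hypothetical name).
work=$(mktemp -d)
printf '%s\n' one two three four five > "$work/log.txt"

# -n 2: the last two lines (four, five).
tail -n 2 "$work/log.txt"

# -c 5: the last five bytes, i.e. "five" plus its trailing newline.
tail -c 5 "$work/log.txt"

# -f keeps the file open and prints lines as they are appended.
# It runs until interrupted with Ctrl-C, so it is left commented out here.
# tail -f "$work/log.txt"
```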
cut
It is used for slicing a file vertically.
head -n 5 emp.lst | tee shortlist
selects the first five lines of emp.lst and saves them to
shortlist.
We can cut columns by using the -c option with a
comma-delimited list of column numbers or ranges (cutting columns)
cut -c 6-22,24-32 shortlist
cut -c -3,6-22,28-34,55- shortlist
Most files don’t contain fixed length lines, so we have
to cut fields rather than columns (cutting fields)
-d for the field delimiter
-f for the field list
cut -d \| -f 2,3 shortlist | tee cutlist1
displays the second and third fields of shortlist
and saves the output in cutlist1. Here | is escaped to
prevent the shell from interpreting it as the pipeline character.
To extract the remaining fields, we use
cut -d \| -f 1,4- shortlist > cutlist2
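Both kinds of cutting can be sketched on a two-line shortlist. The first record is the one shown earlier (written without padding spaces); the second record is invented for the example, as is the scratch directory.

```shell
# Build a small shortlist in a scratch directory.
# The second row is a made-up sample record.
work=$(mktemp -d)
cat > "$work/shortlist" <<'EOF'
2233|a.k.shukla|g.m|sales|12/12/52|6000
1001|j.doe|clerk|accounts|01/01/70|4000
EOF

# Cutting columns: characters 1-4 hold the employee id in these rows.
cut -c 1-4 "$work/shortlist"

# Cutting fields: quote or escape the | so the shell passes it to cut.
cut -d '|' -f 2,3 "$work/shortlist" | tee "$work/cutlist1"
cut -d \| -f 1,4- "$work/shortlist" > "$work/cutlist2"
cat "$work/cutlist2"
```

Quoting the delimiter ('|') and escaping it (\|) are equivalent; either keeps the shell from starting a pipeline.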
paste
What we cut with cut can be pasted back together with
the paste command, which joins files vertically:
paste cutlist1 cutlist2
This lets us view two files side by side, line by line.
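A minimal sketch of paste on two small files. The file contents are sample data, and -d is a standard paste option that is not described above; it is included only to show how the original delimiter can be restored.

```shell
# Two files holding complementary fields (sample data).
work=$(mktemp -d)
printf '%s\n' 'a.k.shukla|g.m' 'j.doe|clerk' > "$work/cutlist1"
printf '%s\n' '2233|sales' '1001|accounts' > "$work/cutlist2"

# Default: corresponding lines are joined with a tab.
paste "$work/cutlist1" "$work/cutlist2"

# -d sets the joining character, here the files' own field delimiter.
paste -d '|' "$work/cutlist1" "$work/cutlist2"
```

With -d '|' the first output line is a.k.shukla|g.m|2233|sales.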
sort
This command arranges data in ascending or descending
order.
By default, sorting follows the ASCII collating sequence:
whitespace first, then numerals, uppercase letters, and
finally lowercase letters.
Options
-r: reverses the sort order.
-u: removes repeated lines.
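The default order and the two options can be sketched as below. Setting LC_ALL=C is an assumption made so that the plain ASCII order applies; in other locales sort may interleave uppercase and lowercase letters.

```shell
# Default order in the C locale: digits, then uppercase, then lowercase.
printf '%s\n' banana Apple 42 apple | LC_ALL=C sort
# prints 42, Apple, apple, banana (one per line)

# -r reverses the order.
printf '%s\n' banana Apple 42 apple | LC_ALL=C sort -r

# -u keeps only one copy of each repeated line.
printf '%s\n' pear fig pear fig | sort -u
# prints fig, pear
```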
uniq
It requires sorted input, because it compares only adjacent lines, and prints one copy of each line.
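Why the input must be sorted can be seen in a short sketch. The helper function depts and its data are made up for the example; -d and -u are standard uniq options for selecting repeated and non-repeated lines.

```shell
# Hypothetical helper that emits an unsorted list with repeats.
depts() { printf '%s\n' sales accounts sales progs accounts; }

# Unsorted input: no duplicates are adjacent, so all five lines survive.
depts | uniq

# Sorted input: one copy of each line (accounts, progs, sales).
depts | sort | uniq

# -d prints only the lines that are repeated (accounts, sales).
depts | sort | uniq -d

# -u prints only the lines that are not repeated (progs).
depts | sort | uniq -u
```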
tr (translating characters)
Syntax: tr options exp1 exp2
tr takes its input only from standard input; it does not accept a filename argument.
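The three uses named in the contents (change, delete, squeeze) can be sketched as follows. The input strings are sample data fed through a pipe, since tr reads only standard input.

```shell
# Change: replace every | with a colon.
printf '%s\n' '01|accounts|6213' | tr '|' ':'
# prints 01:accounts:6213

# Change a range: lowercase letters to uppercase.
printf '%s\n' 'sales' | tr 'a-z' 'A-Z'
# prints SALES

# Delete: -d removes every occurrence of the listed characters.
printf '%s\n' '12/12/52' | tr -d '/'
# prints 121252

# Squeeze: -s collapses each run of a repeated character into one.
printf '%s\n' 'a   b    c' | tr -s ' '
# prints a b c
```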
Regular Expression
Regular Expression provides an ability to match a “string of text” in a very
flexible and concise manner. A “string of text” can be further defined as a
single character, word, sentence or particular pattern of characters.
Like the shell’s wildcards, which match similar filenames with a single
expression, grep uses an expression of a different sort to match a group of
similar patterns.
[ ]: Matches any one of a set of characters
[ ] with hyphen: Matches any one of a range of characters
^: The pattern following it must occur at the beginning of a line
^ inside [ ]: Matches any one character that is not in the specified set
$: The pattern preceding it must occur at the end of a line
. (dot): Matches any one character
\ (backslash): Removes the special meaning of the character following it
*: Matches zero or more occurrences of the previous character
.* (dot followed by *): Matches nothing or any number of characters
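Each metacharacter above can be tried with grep on a small file. The file names.txt and its five lines are invented for the example, and LC_ALL=C is set on the range patterns as an assumption so that a-g covers only lowercase letters.

```shell
# Sample data (made up) in a scratch file.
work=$(mktemp -d)
cat > "$work/names.txt" <<'EOF'
agarwal
aggarwal
Agrawal
gupta 3.5
wilcox
EOF
f="$work/names.txt"

grep '[aA]g' "$f"             # set: agarwal, aggarwal, Agrawal
LC_ALL=C grep '^[a-g]' "$f"   # range at line start: agarwal, aggarwal, gupta 3.5
LC_ALL=C grep '^[^a-g]' "$f"  # negated set at line start: Agrawal, wilcox
grep 'al$' "$f"               # end of line: agarwal, aggarwal, Agrawal
grep 'w.l' "$f"               # dot: every line except gupta 3.5
grep '3\.5' "$f"              # escaped dot matches a literal period: gupta 3.5
grep 'ag*ar' "$f"             # zero or more g: agarwal, aggarwal
grep 'a.*l' "$f"              # anything between a and l: agarwal, aggarwal, Agrawal
```

The patterns are single-quoted so that the shell passes characters such as *, $, and \ to grep unchanged.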
grep command
The grep filter searches a file for a particular pattern of
characters and displays all lines that contain that pattern.
The pattern that is searched for in the file is referred to as
the regular expression (grep stands for "globally search for a
regular expression and print").
Syntax:
grep [options] pattern [files]
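The basic form of the syntax, without any options, can be sketched on the dept.lst data. The scratch directory and the search word finance are assumptions for the example.

```shell
# Recreate dept.lst in a scratch directory.
work=$(mktemp -d)
cat > "$work/dept.lst" <<'EOF'
01|accounts|6213
02|progs|5423
03|marketing|6521
04|personnel|2365
05|production|9876
06|sales|1006
EOF

# Pattern and file: print every line that contains "sales".
grep 'sales' "$work/dept.lst"
# prints 06|sales|1006

# No file: grep acts as a filter on standard input.
head -n 3 "$work/dept.lst" | grep 'progs'
# prints 02|progs|5423

# No match: grep prints nothing and returns a non-zero exit status.
grep 'finance' "$work/dept.lst" || echo "no match"
```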
Options Description