UNIT4
✓ Looping
Outline
• Advanced Shell Programming
• Filtering utilities: grep, sed etc.
• awk utility
• Splitting (cat, cut, head and tail), comparing (cmp, comm, diff),
sorting (sort), merging and ordering files (paste, uniq)
Shell Programming
• vi Editor:-
🞂 The default editor that comes with the UNIX operating system is called vi.
(Editing text)
yy → Copies (yanks) the current line.
10yy → Copies 10 lines, starting from the current line.
p → Puts the copied text after the cursor; copied lines are placed below the current line. P puts the text before the cursor instead.
🞂 Deleting and joining text
🞂 dd → Deletes the line the cursor is on.
🞂 x → Deletes the character under the cursor.
🞂 X → Deletes the character before the cursor.
🞂 2x → Deletes two characters starting at the cursor; 2dd deletes two lines
starting at the current line.
🞂 J → Joins the current line with the next one. A count joins that many
lines.
🞂 U → Restores the current line to the state it was in before the
cursor entered the line.
🞂 u → Undoes the last change made to the file. In classic vi, typing u
again redoes the change; in Vim, u keeps undoing and Ctrl-R redoes.
Pattern Matching and Wildcard Characters
🞂 * Matches zero or more characters in a file (or directory) name.
🞂 ? Matches exactly one character.
🞂 [ijk] Matches a single character: i, j or k.
🞂 [x-z] Matches a single character within the ASCII range x to z.
🞂 [!ijk] Matches a single character that is not i, j or k.
🞂 [!x-z] Matches a single character that is not within the ASCII range
x to z.
🞂 Examples:-
🞂 $ ls chap* → (matches all files and directories whose names start with
chap, so it can replace ls chap1 chap2 chap3)
🞂 $ echo * → (lists all non-hidden files in the current directory)
🞂 $ ls ???.txt → (matches .txt files with exactly 3 characters before .txt)
🞂 $ ls .??? → (matches hidden file names with exactly 3 characters after the
dot)
🞂 $ ls emp*.txt → (matches all .txt files whose names start with emp)
🞂 $ ls chap? → (matches names such as chap1, chapx, chap5, chapy)
🞂 $ cp * bca → (copies all files from the current directory into the bca
directory)
🞂 $ rm * → (deletes all non-hidden files in the current directory)
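The wildcard rules can be tried safely in a scratch directory. The sketch below is only an illustration: the file names are made up for the demo, and it assumes a POSIX shell with mktemp available. It uses echo so that the shell's expansion of each pattern is visible.

```shell
#!/bin/sh
# Sketch: wildcard expansion on hypothetical sample files in a scratch directory.
LC_ALL=C; export LC_ALL            # fixed ordering of the expanded names
demo=$(mktemp -d) || exit 1
cd "$demo" || exit 1
touch chap1 chap2 chapx chap10 abc.txt emp1.txt emp_old.txt .env

star=$(echo chap*)            # names starting with chap
one=$(echo chap?)             # chap plus exactly one character
digit=$(echo chap[0-9])       # chap plus one digit
not_digit=$(echo chap[!0-9])  # chap plus one character that is not a digit
three=$(echo ???.txt)         # exactly three characters before .txt
emp=$(echo emp*.txt)          # .txt files starting with emp
hidden=$(echo .???)           # a dot followed by exactly three characters
all=$(echo *)                 # every non-hidden name

printf '%s\n' "$star" "$one" "$digit" "$not_digit" "$three" "$emp" "$hidden" "$all"

cd / && rm -rf "$demo"        # clean up the scratch directory
```

Note that chap? does not match chap10, and * does not match the hidden file .env.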
• The sed command in UNIX supports regular expressions, which allow it to
perform complex pattern matching.
🞂 Syntax: sed options 'address action' file(s)
- The address and action are enclosed within single quotes.
Advanced Filter Commands
• sed command:- (Stream Editor)
🞂 Addressing in sed is done in two ways: by line number (for example 4 or 2,4) and by pattern (for example /Anita/).
🞂 Substituting (modifying) text:
🞂 sed 's/Anu/diyu/g' emp.lst
🞂 sed 's/India/USA/g' emp.lst
🞂 sed '/Anita/ s/USA/India/g' emp.lst # substitute only on lines that match Anita
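A small sketch of the difference between an unaddressed and an addressed substitution. The contents of emp.lst are not given in these notes, so the three records below are an assumed, hypothetical sample.

```shell
#!/bin/sh
# Sketch: global substitution versus substitution limited by a pattern address.
demo=$(mktemp -d) || exit 1
# Hypothetical sample data in the form name|country
cat > "$demo/emp.lst" <<'EOF'
Anu|India
Anita|USA
Ravi|USA
EOF

# No address: the substitution is attempted on every line.
all=$(sed 's/India/USA/g' "$demo/emp.lst")

# Pattern address: the substitution runs only on lines that match /Anita/.
only_anita=$(sed '/Anita/ s/USA/India/g' "$demo/emp.lst")

printf '%s\n---\n%s\n' "$all" "$only_anita"
rm -rf "$demo"
```

In the second command Ravi's record keeps USA because the address restricts the action to Anita's line. Neither command changes emp.lst itself; sed writes the result to standard output.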
🞂 Deleting a line:
🞂 $ sed '4d' emp.lst # delete the 4th line
🞂 $ sed '$d' emp.lst # delete the last line
🞂 $ sed '2,4d' emp.lst # delete lines 2 to 4
🞂 $ sed '/UK/d' emp.lst # delete lines that contain UK
🞂 $ sed '/^$/d' emp.lst # delete empty lines
🞂 $ sed -i '/^$/d' emp.lst # delete empty lines permanently (edits the file in place)
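The delete addresses can be checked on a small file. This is a sketch under assumptions: the five-line emp.lst is hypothetical sample data, and the permanent edit is done with a temporary file and mv so that it runs on any sed. The -i option does the same thing in one step, but its exact syntax differs between GNU and BSD sed.

```shell
#!/bin/sh
# Sketch: deleting lines by number, by range and by pattern.
demo=$(mktemp -d) || exit 1
emp="$demo/emp.lst"
# Hypothetical sample data; the third line is empty.
printf '%s\n' 'Anu India' 'Ravi UK' '' 'Mira USA' 'Tom UK' > "$emp"

drop_range=$(sed '2,4d' "$emp")    # keeps lines 1 and 5
drop_last=$(sed '$d' "$emp")       # keeps lines 1 to 4
drop_uk=$(sed '/UK/d' "$emp")      # removes every line containing UK
drop_blank=$(sed '/^$/d' "$emp")   # removes the empty line

# Without -i the file itself is untouched: it still has 5 lines.
lines_before=$(grep -c '' "$emp")

# Portable stand-in for sed -i: write the result to a new file, then replace.
sed '/^$/d' "$emp" > "$emp.new" && mv "$emp.new" "$emp"
lines_after=$(grep -c '' "$emp")

printf 'before=%s after=%s\n' "$lines_before" "$lines_after"
rm -rf "$demo"
```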
• awk arithmetic operators:-
Operator  Meaning         Example
+         Add             x + y
-         Subtract        x - y
*         Multiply        x * y
/         Divide          x / y
%         Modulus         x % y
^         Exponentiation  x ^ y
• Example:-
• Logical Operators:-
• $ awk '($2 > 5) && ($2 <= 15) {print $0}' file
• $ awk '$3 == 100 || $4 > 50' file
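The operators can be exercised on a small table. The sketch below assumes a hypothetical three-column file (item, quantity, price); the file name stock and its contents are invented for the illustration.

```shell
#!/bin/sh
# Sketch: awk logical and arithmetic operators on hypothetical data.
demo=$(mktemp -d) || exit 1
# Columns: item quantity price
cat > "$demo/stock" <<'EOF'
pen 3 10
book 8 120
bag 15 450
lamp 20 300
EOF

# && : both conditions must hold (quantity above 5 and at most 15).
in_range=$(awk '($2 > 5) && ($2 <= 15) {print $0}' "$demo/stock")

# || : either condition is enough (quantity is 3, or price above 400).
either=$(awk '$2 == 3 || $3 > 400 {print $1}' "$demo/stock")

# Arithmetic on fields: quantity multiplied by price.
totals=$(awk '{print $1, $2 * $3}' "$demo/stock")

# Exponentiation and modulus, evaluated in a BEGIN block with no input.
power=$(awk 'BEGIN {print 2 ^ 10, 17 % 5}')

printf '%s\n' "$in_range" "$either" "$totals" "$power"
rm -rf "$demo"
```

When a pattern has no action, as in the || example from the notes, awk prints the whole matching line.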
• awk command:-
• Output Statements:-
• print :- prints simple output
• printf :- prints formatted output (similar to C printf)
• sprintf :- returns a formatted string (similar to C sprintf)
• Function: print
• Writes to standard output
• Output is terminated by ORS (output record separator)
• The default ORS is a newline
• If called with no parameters, it prints $0
• Printed parameters are separated by OFS (output field separator)
• The default OFS is a single space
• Escape sequences for control characters are allowed:
• \n \f \a \t \\ …
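The three output statements and the two separators can be compared on one input record. The record alice 30 4.5 and the format strings below are assumptions chosen for the illustration.

```shell
#!/bin/sh
# Sketch: print, printf and sprintf, plus the effect of OFS and ORS.
line='alice 30 4.5'

# print with a comma between arguments inserts OFS (a single space by default).
plain=$(echo "$line" | awk '{print $1, $2}')

# Changing OFS and ORS changes the separator and the record terminator.
custom=$(echo "$line" | awk 'BEGIN {OFS = "-"; ORS = "!\n"} {print $1, $2}')

# print with no arguments prints the whole record, $0.
whole=$(echo "$line" | awk '{print}')

# printf formats like C printf and adds no newline unless the format has one.
formatted=$(echo "$line" | awk '{printf "%-6s|%3d|%.2f\n", $1, $2, $3}')

# sprintf returns the formatted string instead of printing it.
built=$(echo "$line" | awk '{s = sprintf("%s:%03d", $1, $2); print s}')

printf '%s\n' "$plain" "$custom" "$whole" "$formatted" "$built"
```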
• Some System Variables (Built-in variables ):-
cut Command
Option  Use
cut -c  Select only the characters from each line as specified in LIST
cut -b  Select only the bytes from each line as specified in LIST
cut -f  Cut the input using a list of fields. The default field delimiter is TAB; it can be overridden with the -d option
cut -d  Specify the delimiter that separates fields. The default is TAB, and this option overrides it
cut Command Example
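A minimal sketch of the four cut options. The input strings and the tab-separated file emp.tsv are hypothetical and exist only for this illustration.

```shell
#!/bin/sh
# Sketch: selecting characters, bytes and fields with cut.
demo=$(mktemp -d) || exit 1
# Hypothetical tab-separated records: name, department, salary
printf 'anu\tsales\t4200\nravi\thr\t3900\n' > "$demo/emp.tsv"

chars=$(echo 'abcdefgh' | cut -c 2-4)            # characters 2 to 4
bytes=$(echo 'abcdefgh' | cut -b 1,8)            # bytes 1 and 8
fields=$(cut -f 1,3 "$demo/emp.tsv")             # fields 1 and 3, TAB-delimited
colon=$(echo 'root:x:0:0' | cut -d ':' -f 1,3)   # fields 1 and 3, colon-delimited

printf '%s\n' "$chars" "$bytes" "$fields" "$colon"
rm -rf "$demo"
```

The selected fields are written with the same delimiter that was used to split the input, so the last command prints root:0.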
head Command
• head makes it easy to output the first part (10 lines by default) of files.
🞂 Syntax :
head [OPTION]... [FILE]...
🞂 Example :
Option   Use
head -n  Print the first n lines instead of the first 10; with a leading '-' on the count, print all but the last n lines of each file
head -c  Print the first n bytes of each file; with a leading '-' on the count, print all but the last n bytes of each file
head -q  Never print headers identifying file names
head Command Example
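A short sketch of head on a hypothetical 12-line file of numbers. The negative count and the -q option are included as they behave in GNU head; other implementations may not accept them, so treat those two lines as an assumption about the platform.

```shell
#!/bin/sh
# Sketch: head with line counts and byte counts.
demo=$(mktemp -d) || exit 1
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 11 12 > "$demo/nums"   # hypothetical data

default_count=$(head "$demo/nums" | grep -c '')   # number of lines printed by default
first3=$(head -n 3 "$demo/nums")                  # first three lines
first5bytes=$(head -c 5 "$demo/nums")             # first five bytes, newlines included

# GNU head only (assumption): a negative count drops lines from the end.
all_but_last10=$(head -n -10 "$demo/nums" 2>/dev/null)

# With several files, head prints a "==> name <==" header before each one;
# -q suppresses those headers.
quiet=$(head -q -n 1 "$demo/nums" "$demo/nums" 2>/dev/null)

printf '%s\n' "$default_count" "$first3" "$first5bytes"
rm -rf "$demo"
```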
tail Command
• tail prints the last few lines (10 by default) of a file and then exits.
🞂 Syntax :
tail [OPTION]... [FILE]...
Option   Use
tail -n  Output the last num lines, instead of the default (10)
tail -c  Output the last num bytes of each file
tail -q  Never output headers
tail Command Example
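A short sketch of tail on a hypothetical 12-line file of numbers. It also shows the +num form of -n, which is not listed in the table above but is standard: it starts output at line num instead of counting from the end.

```shell
#!/bin/sh
# Sketch: tail with line counts and byte counts.
demo=$(mktemp -d) || exit 1
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 11 12 > "$demo/nums"   # hypothetical data

default_count=$(tail "$demo/nums" | grep -c '')   # number of lines printed by default
last3=$(tail -n 3 "$demo/nums")                   # last three lines
from11=$(tail -n +11 "$demo/nums")                # everything from line 11 onward
last3bytes=$(tail -c 3 "$demo/nums")              # last three bytes: "12" and a newline

printf '%s\n' "$default_count" "$last3" "$from11" "$last3bytes"
rm -rf "$demo"
```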
cmp Command
• The cmp command in Linux/UNIX compares two files byte by byte and tells
you whether they are identical.
• If a difference is found, it reports the byte and line number where the first
difference is found.
• If no differences are found, by default, cmp returns no output.
🞂 Syntax :
• cmp [OPTION]... FILE1 [FILE2 [SKIP1 [SKIP2]]]
cmp Command Example
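A minimal sketch of cmp on hypothetical files. The exit status is the reliable way to use the result in a script: 0 means identical and 1 means different. The exact wording of the message varies slightly between implementations, so the sketch only relies on the status.

```shell
#!/bin/sh
# Sketch: comparing files byte by byte with cmp.
demo=$(mktemp -d) || exit 1
printf 'apple\nbanana\ncherry\n' > "$demo/a"
printf 'apple\nbanana\ncherry\n' > "$demo/same"
printf 'apple\nbanXna\ncherry\n' > "$demo/b"     # differs at byte 10, on line 2

same_out=$(cmp "$demo/a" "$demo/same"); same_status=$?   # no output, status 0
diff_out=$(cmp "$demo/a" "$demo/b");    diff_status=$?   # one-line report, status 1

printf 'identical: status=%s output=[%s]\n' "$same_status" "$same_out"
printf 'different: status=%s output=[%s]\n' "$diff_status" "$diff_out"
rm -rf "$demo"
```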
diff Command
Option   Use
diff -b  Ignore changes in the amount of white space
diff -i  Ignore case differences
diff Command Example
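A small sketch of how -i and -b change the verdict of diff. The three one-line files are hypothetical. diff exits with status 0 when it finds no differences and 1 when it finds some.

```shell
#!/bin/sh
# Sketch: diff with and without the -i and -b options.
demo=$(mktemp -d) || exit 1
printf 'Hello World\n'    > "$demo/base"
printf 'hello world\n'    > "$demo/case"    # differs from base only in case
printf 'Hello    World\n' > "$demo/space"   # differs from base only in spacing

report=$(diff "$demo/base" "$demo/case"); plain_status=$?   # reports a change on line 1

diff -i "$demo/base" "$demo/case"  > /dev/null; nocase_status=$?    # case ignored
diff "$demo/base" "$demo/space"    > /dev/null; space_status=$?     # spacing counts
diff -b "$demo/base" "$demo/space" > /dev/null; nospace_status=$?   # spacing ignored

printf '%s\n' "$report"
printf 'plain=%s -i=%s spacing=%s -b=%s\n' \
    "$plain_status" "$nocase_status" "$space_status" "$nospace_status"
rm -rf "$demo"
```

The first line of the report, 1c1, means that line 1 of the first file must be changed to match line 1 of the second.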
sort Command
Option    Use
sort -c   Check whether the given file is already sorted
sort -r   Reverse the result of comparisons
sort -n   Compare according to string numerical value
sort -nr  Sort numeric data in reverse order
sort -k   Sort a table on the basis of a chosen column
sort -b   Ignore leading blanks
sort Command Example
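A sketch of the sort options on a hypothetical two-column file. The locale is fixed to C here as an assumption, so that the plain (non-numeric) ordering is the same on every system.

```shell
#!/bin/sh
# Sketch: lexical, numeric, reverse and column-based sorting.
LC_ALL=C; export LC_ALL
demo=$(mktemp -d) || exit 1
printf '10 pear\n9 apple\n100 fig\n' > "$demo/data"   # hypothetical data

lexical=$(sort "$demo/data")        # character by character: 10, 100, 9
numeric=$(sort -n "$demo/data")     # by numeric value: 9, 10, 100
reverse=$(sort -nr "$demo/data")    # numeric, largest first
by_name=$(sort -k 2 "$demo/data")   # by the second column

# -c only checks the order: status 0 if sorted, non-zero otherwise.
sort -c "$demo/data" 2>/dev/null; unsorted_status=$?
sort "$demo/data" > "$demo/sorted"
sort -c "$demo/sorted"; sorted_status=$?

printf '%s\n---\n%s\n---\n%s\n---\n%s\n' "$lexical" "$numeric" "$reverse" "$by_name"
rm -rf "$demo"
```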
paste Command
• The paste command displays the corresponding lines of multiple files side-
by-side.
🞂 Syntax :
paste [-options] [file]
🞂 Example :
Option    Use
paste -d  Reuse characters from LIST instead of tabs
paste -s  Paste one file at a time instead of in parallel
paste Command Example
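A minimal sketch of paste in its parallel and serial modes. The two files names and depts are hypothetical.

```shell
#!/bin/sh
# Sketch: merging files line by line with paste.
demo=$(mktemp -d) || exit 1
printf 'anu\nravi\n' > "$demo/names"
printf 'sales\nhr\n' > "$demo/depts"

parallel=$(paste "$demo/names" "$demo/depts")      # line 1 with line 1, joined by TAB
comma=$(paste -d ',' "$demo/names" "$demo/depts")  # same, joined by a comma
serial=$(paste -s "$demo/names" "$demo/depts")     # each file becomes one output line
joined=$(paste -s -d ',' "$demo/names")            # all lines of one file on one line

printf '%s\n---\n%s\n---\n%s\n---\n%s\n' "$parallel" "$comma" "$serial" "$joined"
rm -rf "$demo"
```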
uniq Command
• uniq reports or filters out repeated lines in a file.
• It can remove duplicates, show a count of occurrences, show only
repeated lines, ignore certain characters and compare on specific fields.
🞂 Syntax :
uniq [OPTION]... [INPUT [OUTPUT]]
Option   Use
uniq -u  Print only unique lines
uniq -d  Print only duplicated lines, one per group
uniq -D  Print all duplicate lines
uniq -c  Prefix lines with a number representing how many times they occurred
uniq -i  Ignore case when comparing
uniq Command Example
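A sketch of the uniq options on a hypothetical list of fruit names. uniq only compares adjacent lines, so the input is usually sorted first; the sample file here is already in sorted order.

```shell
#!/bin/sh
# Sketch: removing, selecting and counting repeated lines with uniq.
demo=$(mktemp -d) || exit 1
printf '%s\n' apple apple banana cherry cherry cherry > "$demo/fruit"

dedup=$(uniq "$demo/fruit")             # one copy of every line
only_unique=$(uniq -u "$demo/fruit")    # lines that occur exactly once
only_dup=$(uniq -d "$demo/fruit")       # one copy of each repeated line

# -c pads the count with leading spaces; awk is used here only to tidy it.
counts=$(uniq -c "$demo/fruit" | awk '{print $1 ":" $2}')

# -i treats Apple and apple as the same line and keeps the first one.
nocase=$(printf 'Apple\napple\n' | uniq -i)

printf '%s\n---\n%s\n---\n%s\n---\n%s\n---\n%s\n' \
    "$dedup" "$only_unique" "$only_dup" "$counts" "$nocase"
rm -rf "$demo"
```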