
I am given data in the following format:

comp.os.linux announce 0000002587 02190 m

comp.arch 00000 28874 y

utsa.cs.3423 00000000004 000000000001 y

I am supposed to process it so that it looks like:

comp.os.linux announce m

comp.arch y

utsa.cs.3423 y

I have tried s/^[0-9]//g and it seems to work well, but the last line is missing the 4 numbers.

  • Made a correction in the 2nd line of your expected output, the last record should have been y and not m
    – Inian
    Commented Jan 19, 2017 at 5:53

2 Answers


With awk, print the first and last fields, and include the second field only if it consists solely of alphabetic characters:

awk '$2~/^[[:alpha:]]+$/ {print $1, $2, $NF; next} {print $1, $NF}' file.txt

If you insist on using sed:

sed -E 's/^([^[:blank:]]+)[[:blank:]]+([[:alpha:]]+)?.*[[:blank:]]([^[:blank:]]+)$/\1 \2 \3/'

For lines whose second field is not purely alphabetic, this leaves two spaces between the two output fields; you can append a second substitution to collapse them:

sed -E 's/^([^[:blank:]]+)[[:blank:]]+([[:alpha:]]+)?.*[[:blank:]]([^[:blank:]]+)$/\1 \2 \3/; s/  / /'

Example:

% cat file.txt
comp.os.linux announce 0000002587 02190 m
comp.arch 00000 28874 y
utsa.cs.3423 00000000004 000000000001 y

% awk '$2~/^[[:alpha:]]+$/ {print $1, $2, $NF; next} {print $1, $NF}' file.txt
comp.os.linux announce m
comp.arch y
utsa.cs.3423 y

% sed -E 's/^([^[:blank:]]+)[[:blank:]]+([[:alpha:]]+)?.*[[:blank:]]([^[:blank:]]+)$/\1 \2 \3/' file.txt
comp.os.linux announce m
comp.arch  y
utsa.cs.3423  y

% sed -E 's/^([^[:blank:]]+)[[:blank:]]+([[:alpha:]]+)?.*[[:blank:]]([^[:blank:]]+)$/\1 \2 \3/; s/  / /' file.txt
comp.os.linux announce m
comp.arch y
utsa.cs.3423 y
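
If the real rule is "drop every field that is made up only of digits", a loop over the fields may be easier to read than matching on the second field. The following is only a sketch under that assumption; the function name strip_numeric_fields is hypothetical and not part of the commands above.

```shell
#!/bin/sh
# Hypothetical helper: assumes the goal is to remove every whitespace-separated
# field that consists only of digits, keeping the other fields in order.
strip_numeric_fields() {
    awk '{
        out = ""
        for (i = 1; i <= NF; i++) {
            # Keep the field unless it is digits from start to end.
            if ($i !~ /^[0-9]+$/) {
                out = (out == "" ? $i : out " " $i)
            }
        }
        print out
    }' "$@"
}

# Usage with the sample records from the question (read from stdin).
printf '%s\n' \
    'comp.os.linux announce 0000002587 02190 m' \
    'comp.arch 00000 28874 y' \
    'utsa.cs.3423 00000000004 000000000001 y' | strip_numeric_fields
```

A field such as utsa.cs.3423 is kept because it contains letters and dots as well as digits. The function also accepts file names as arguments, e.g. strip_numeric_fields file.txt.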

With sed:

sed 's/ [0-9 ]\+[0-9]\+//' file

Output:

comp.os.linux announce m
comp.arch y
utsa.cs.3423 y
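
Note that \+ in a basic regular expression is a GNU extension, so the command above may not behave the same with other sed implementations. A possible portable spelling, assuming the same matching rule is wanted, writes each "one or more" as the item followed by its starred form. The function name strip_counts is hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper: same pattern as the GNU command above, but written
# with only POSIX basic regular expression syntax (x\+ becomes xx*).
strip_counts() {
    sed 's/ [0-9 ][0-9 ]*[0-9][0-9]*//' "$@"
}

# Usage with the sample records from the question (read from stdin).
printf '%s\n' \
    'comp.os.linux announce 0000002587 02190 m' \
    'comp.arch 00000 28874 y' \
    'utsa.cs.3423 00000000004 000000000001 y' | strip_counts
```

Like the original, this removes only the first run of space-separated numbers on each line, which is enough for the sample data.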
