Delete multiple rows in csv file

Question

I'm working on this assignment to delete rows from a CSV file with different customers. I've figured out how to delete one specific customer by using this code:

delete() {
  awk -F "\"*;\"*" '$1 != '$@' {print $ALL}' input.csv > output.csv
}

delete $@

However, now I have to delete multiple customers at the same time. I can identify a customer by their customer number, which is stored in the first column of the CSV file. I'm supposed to create an array for the different customer numbers and a while loop to iterate over the array, but I can't seem to figure it out.

What is $ALL? How is print $ALL different from print? And why are you passing $@ in single quotes? That means it won't be expanded. — terdon, Commented Jun 27, 2018 at 13:05
@terdon the single quotes appear to close before and then open again after $@ (it's still not a good way to pass shell parameters to awk though) — steeldriver, Commented Jun 27, 2018 at 14:11

David Foerster · Accepted Answer · 2018-06-27 17:05:03Z

I'm not sure why you are wrapping this in a shell function - I will assume that's a requirement of your assignment.

First, note that using "*;"* as a field separator in Awk is not a robust way to handle quoted CSV fields - it fails, for example, if either the first or the last field on a line is quoted, and it doesn't preserve quoted delimiters (i.e. quoted fields that actually contain a ;), which defeats the whole point of quoting CSV fields.
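To see both failure modes on a single record, here is a small sketch. The sample line is made up for illustration and is not taken from the question's data:

```shell
# Hypothetical sample line: every field is quoted, and the second field
# contains a semicolon of its own.
line='"1001";"Smith; John";"Utrecht"'

# The separator regex only swallows quotes that touch a semicolon, so the
# leading quote of the first field stays attached to $1.
first=$(printf '%s\n' "$line" | awk -F '"*;"*' '{print $1}')

# The semicolon inside the quoted second field is treated as a separator too,
# so the record is split into four fields instead of three.
count=$(printf '%s\n' "$line" | awk -F '"*;"*' '{print NF}')

printf 'first field: %s\nfield count: %s\n' "$first" "$count"
```

With the leading quote still attached, a comparison of $1 against the bare customer number would not match for this line.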

Second, you should not try to pass shell variables (or positional parameters) into an Awk expression that way - the correct way is either to export them and access them via the ENVIRON array, or to use the -v command-line option. So your "single customer" implementation would be better written

delcust() {
  awk -F '"*;"*' -v cust="$1" '$1 != cust' input.csv > output.csv
}
delcust "$1"
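For completeness, the ENVIRON route might look like the sketch below. The function name delcust_env and the variable name CUST are made up for this example; the file names are the ones from the question.

```shell
# Same filter as delcust, but the customer number reaches awk through its
# environment instead of -v. CUST is a hypothetical variable name; the
# assignment in front of the command applies to this one awk call only.
delcust_env() {
  CUST="$1" awk -F '"*;"*' '$1 != ENVIRON["CUST"]' input.csv > output.csv
}

# Usage, assuming input.csv exists in the current directory:
#   delcust_env 1002
```

One practical difference: awk interprets backslash escape sequences in a -v assignment, whereas a value read from ENVIRON is used verbatim.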

While you could modify this to pass multiple positional parameters, I'd suggest passing the customer list via standard input and parsing it as a file of values; that way you can do a canonical Awk lookup based on an associative array (or hash):

delcusts() {
  printf '%s\n' "$@" | awk -F'"*;"*' 'NR==FNR {custs[$0]=1; next} !($1 in custs)' - input.csv > output.csv
}
delcusts "$@"

Note that you don't need an explicit print in Awk, since printing the record is the default action when a pattern evaluates to true (non-zero or non-empty).
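To watch the two-file lookup work end to end, here is a sketch on made-up data. It assumes a variant of the function, called delcusts_files here, that takes the input and output file names as its first two arguments so it can run in a scratch directory; that signature is not part of the assignment.

```shell
# Variant of delcusts: $1 is the input file, $2 the output file, and all
# remaining arguments are customer numbers to delete (hypothetical signature).
delcusts_files() {
  src=$1
  dst=$2
  shift 2
  # While awk reads the customer list from standard input, NR==FNR holds and
  # each number becomes a key of custs; for the CSV file only rows whose first
  # field is not such a key are printed.
  printf '%s\n' "$@" |
    awk -F'"*;"*' 'NR==FNR {custs[$0]=1; next} !($1 in custs)' - "$src" > "$dst"
}

# Made-up sample data in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' '1001;Smith;Utrecht' '1002;Jansen;Delft' \
              '1003;Bakker;Leiden' '1004;Visser;Breda' > "$workdir/input.csv"

delcusts_files "$workdir/input.csv" "$workdir/output.csv" 1002 1004
cat "$workdir/output.csv"
```

Only the rows for customers 1001 and 1003 should remain in the output file.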

dessert · Accepted Answer · 2018-06-27 13:13:13Z

There is not really a need for an array. You could define your function like this:

delete() {
  # $1 may be a regex alternation such as 'bla|foo'; the anchors make it match the whole first field
  awk -v customer="^($1)\$" -F ";" '$1 !~ customer {print}' input.csv >output.csv
}

I didn't understand how you defined the field separator, so I changed it to be able to test. The relevant part is the negated regular expression match !~. I also used awk's -v option, which can save you a lot of shell quoting headaches.

With this you can use a parameter like this to delete multiple customers:

delete 'bla|foo'

For an input.csv like this:

bla;blu;bli
foo;faa;fii
blafoo;blufaa;blifii

it would yield

blafoo;blufaa;blifii

in output.csv.
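The blafoo line survives only because the function wraps its argument in ^( ... )$. A small sketch comparing the anchored and unanchored patterns on the same three lines (the variable names are made up for this example):

```shell
sample='bla;blu;bli
foo;faa;fii
blafoo;blufaa;blifii'

# Anchored, as in delete(): the pattern has to match the whole first field,
# so only the bla and foo rows are dropped.
anchored=$(printf '%s\n' "$sample" |
  awk -v customer='^(bla|foo)$' -F ';' '$1 !~ customer')

# Unanchored, for comparison: "bla" matches anywhere in the field, so the
# blafoo row is dropped as well and nothing is left.
unanchored=$(printf '%s\n' "$sample" |
  awk -v customer='bla|foo' -F ';' '$1 !~ customer')

printf 'anchored:   [%s]\nunanchored: [%s]\n' "$anchored" "$unanchored"
```

Because the argument is used as a regular expression, customer identifiers that contain regex metacharacters would need escaping before being passed in.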

If you really want to use an array you could in addition define a little helper function that prepares the array for use with the delete() function above:

join() { local IFS=\|; echo "$*"; }

With this you are able to define a bash array and convert it to regex alternate syntax:

$ a=(bla blu)
$ join "${a[@]}"
bla|blu

Then you could call delete() like this:

$ a=(customer1 customer2)
$ delete "$(join "${a[@]}")"

(Small side note for zsh users: the join() function is not needed for zsh, you could simply use the following parameter expansion: ${(j:|:)a} to join all array elements with the | character)
