5

I want to see how many lines exist in each file that has been found using the find command.

I know I can use wc -l to find the number of lines in a single file, but this does not work when piped from the output of find:

find -type f -name package.json | wc -l

This returns the count of the found files. I want to return the count of lines of each found file.

2 Answers

10

wc takes the list of files whose bytes/chars/words/lines to count as arguments.

When called with no arguments, it reports those bytes/chars/words/lines for its stdin. So if you pipe find to wc -l, you get the number of newline characters in the output of find: the number of found files plus the number of newline characters in their paths.
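
As a rough illustration of that point, here is a minimal sketch that builds a throwaway tree (the directory names and file contents are made up for the demonstration) and shows the piped count exceeding the number of files when a path contains a newline:

```shell
# Throwaway tree: two package.json files, one inside a directory
# whose name contains a newline character (hypothetical names).
tmp=$(mktemp -d)
mkdir -p "$tmp/a" "$tmp/b
c"
printf '1\n2\n3\n' > "$tmp/a/package.json"
printf '1\n' > "$tmp/b
c/package.json"

# wc -l reads find's output on stdin and counts newline characters in it:
# one per printed path, plus one for the newline embedded in the second path.
# tr strips the padding some wc implementations add.
piped=$(cd "$tmp" && find . -name package.json -type f | wc -l | tr -d ' ')

echo "files found: 2, wc -l on find's output: $piped"
rm -rf "$tmp"
```

Here the count comes out as 3 even though only two files match, and neither file's own line count appears anywhere.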

The GNU implementation of wc can also take the list of files NUL-delimited from a file with the --files0-from option, where it treats - as meaning stdin (not the file called -), so you can do:

find . -name package.json -type f -print0 |
  wc -l --files0-from=-

With any standard find or wc implementation, you could get find to pass the list of file paths as arguments to wc with:

find . -name package.json -type f -exec wc -l {} +

But if there's a large number of matching files, that could end up running wc several times resulting in several occurrences of a total line.

wc prints the total line when given at least 2 files to process, so to skip the total line, you could do:

find . -name package.json -type f -exec wc -l {} ';'

Though that would be very inefficient as forking a process and executing a command for each file is quite expensive.
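
To see the difference between the two forms side by side, a small sketch along these lines may help (the tree under the temporary directory is invented for the example):

```shell
# Throwaway tree with two matching files (hypothetical layout).
tmp=$(mktemp -d)
mkdir -p "$tmp/a" "$tmp/b"
printf '1\n2\n3\n' > "$tmp/a/package.json"
printf '1\n' > "$tmp/b/package.json"

# {} + : wc receives both paths in one invocation, so it appends a total line.
batched=$(cd "$tmp" && find . -name package.json -type f -exec wc -l {} +)

# {} ';' : wc is invoked once per file, so no total line is printed.
per_file=$(cd "$tmp" && find . -name package.json -type f -exec wc -l {} ';')

printf 'with {} +:\n%s\n\nwith {} ;:\n%s\n' "$batched" "$per_file"
rm -rf "$tmp"
```

With only two files, the first form prints three lines (two counts and the total) and the second prints two.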

If it's the total you're actually interested in, then you'd do:

find . -name package.json -type f -exec cat {} + | wc -l

Where we feed the concatenation of the contents of those files to wc.
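
A minimal sketch of that on a made-up tree, assuming nothing beyond standard find, cat and wc:

```shell
# Throwaway tree: 3 lines in one file, 1 line in the other (hypothetical).
tmp=$(mktemp -d)
mkdir -p "$tmp/a" "$tmp/b"
printf '1\n2\n3\n' > "$tmp/a/package.json"
printf '1\n' > "$tmp/b/package.json"

# find may run cat more than once, but all of cat's output goes down the
# same pipe, so the single wc sees everything and prints one number.
total=$(cd "$tmp" && find . -name package.json -type f -exec cat {} + | wc -l | tr -d ' ')

echo "total lines: $total"
rm -rf "$tmp"
```

Because wc runs only once here, the repeated-total problem of -exec wc -l {} + cannot occur.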

With zsh and any wc, you could do:

wc -l -- **/package.json(D.)

(D for Dotglob to get hidden ones as well like find does and . to only include regular files as the equivalent of -type f).

That has the advantage of giving you a sorted list and avoiding the ./ prefix.

This time, if there are no matching files or too many of them, you'll get an error.

With GNU wc, you can avoid those errors by passing the glob expansion NUL-delimited to wc -l --files0-from=- with:

print -rNC1 -- **/package.json(ND.) | wc -l --files0-from=-

Also beware that in the JSON format, newline characters (which wc -l counts) are not significant, so I'm not sure that's a useful metric you're getting.

You could return the number of elements in some array in those files for instance instead with:

find . -name package.json -type f -exec \
  jq -r '[.devDependencies|length,input_filename]|@csv' {} +

(assuming the file paths are UTF-8 encoded text and here giving you the result in CSV format).

0
5

You can use xargs to pipe standard input into the argument vector where you need it:

find -type f -name package.json | xargs wc -l

Or simply let shell command substitution fill it:

wc -l $(find -type f -name package.json)
8
  • 1
    This doesn't appear to add anything that isn't already included in Stéphane's answer. It also has the drawback that the commands break if there are special characters in the filename (space or newline) although this does not apply to the specific file used here.
    – doneal24
    Commented Jan 6, 2023 at 20:18
  • 1
    See also Why is looping over find's output bad practice? for more details as to why those two approaches are incorrect. Commented Jan 6, 2023 at 21:13
  • @doneal24, that's not limited to space or newline. For xargs, there's also other whitespace characters, quotes and backslashes and with some xargs implementations non-text file names. For $(...) there are all the characters of $IFS (by default also includes TAB) and the wildcard ones. It may apply here as the directories those package.json files are found in may very well contain those characters. Commented Jan 6, 2023 at 21:16
  • 2
    @StéphaneChazelas is the concern white-space, or is there an additional problem? I have been doing this but with -print0 on find, and -0 on xargs. I have never found a problem yet. Commented Jan 7, 2023 at 0:02
  • 1
    @ctrl-alt-delor see find . -print0 | xargs -0 cmd vs find . -exec cmd {} +. Also, Roman is not using -0 here, so the problem is with a bunch of characters (and non-characters), not just space. Commented Jan 7, 2023 at 10:55
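
The breakage discussed in these comments can be reproduced with a sketch like the following; the directory name "my project" is made up, and the behaviour shown for xargs assumes its default input parsing (no -0 or -d option):

```shell
# Throwaway tree with a space in a directory name (hypothetical).
tmp=$(mktemp -d)
mkdir -p "$tmp/my project"
printf '1\n2\n' > "$tmp/my project/package.json"

# Default xargs splits "./my project/package.json" at the space into two
# arguments, so wc is asked to open two paths that do not exist and fails.
if (cd "$tmp" && find . -name package.json -type f | xargs wc -l) >/dev/null 2>&1
then xargs_ok=yes
else xargs_ok=no
fi

# -exec hands each path to wc as a single argument, whatever it contains.
exec_count=$(cd "$tmp" && find . -name package.json -type f -exec wc -l {} + | awk '{print $1}')

echo "xargs succeeded: $xargs_ok; -exec counted $exec_count lines"
rm -rf "$tmp"
```

The same kind of splitting affects wc -l $(find ...), where the shell splits on $IFS characters and also expands wildcards in the result.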
