18

I have a command <streaming ls> | wc -l, it works fine, but the <streaming ls> takes a while, which means I don't get the final line count until a few minutes later.

Is there a way to have the output of wc -l update in real time?

3 Answers

32

You can’t use wc -l for this, but you can produce a running count of lines seen using other tools, for example AWK:

<streaming ls> | awk '{ printf "%d\r", NR } END { print NR }'

This will update the count of lines seen every time a line is seen, and finish with the total number of lines at the end of the process.

For commands producing lots of output, the overhead can be reduced by printing every n lines:

… | awk 'NR % 10 == 0 { printf "%d\r", NR } END { print NR }'

(for n = 10) or by printing every second:

… | awk 'systime() > lasttime { lasttime = systime(); printf "%d\r", NR } END { print NR }'

(or every n seconds by changing the condition to >= lasttime + n).

3
  • 6
    If your input has a huge number of lines that come in fast, you can speed this up by only updating the count every 10 lines (NR % 10 == 0 { printf ...}), and printing the exact count at the end. Even more fancy would be to print an update when a line comes in only if it's been 100 ms since the last print, maybe with an if() inside the rule. But +1, this is a good simple starting point that's sufficient for some use-cases, like commands that produce lines somewhat slowly, or if terminal updates aren't a bottleneck. Commented Nov 2, 2022 at 19:19
  • 1
    I’ve added those variants to the answer, thanks. AWK (even GNU) doesn’t deal with sub-second intervals AFAICT, but every second or even every n seconds should be sufficient for a long-running job. Commented Nov 4, 2022 at 10:57
  • Or print a result only when the current line count has increased by say 20%, so it will be fast to begin with and slow down over time. This would be useful for potentially huge inputs where one can't even estimate a reasonable size. Marking each line of output with a timestamp might be useful too. Commented Nov 4, 2022 at 20:09
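
The 20% idea from the last comment can be sketched in AWK as well. This is only an illustration, not part of the original answer: produce is a hypothetical stand-in for the slow command, and next_report is a variable name chosen here.

```shell
# Hypothetical stand-in for the slow command: emits 1000 lines.
produce() {
    awk 'BEGIN { for (i = 1; i <= 1000; i++) print "line", i }'
}

# Print an update only once the count has grown by at least 20% since the
# previous update, then print the exact total at the end.
# next_report starts out unset (0), so the very first line triggers an update.
count_with_backoff() {
    awk 'NR >= next_report { printf "%d\r", NR; next_report = NR * 1.2 }
         END { print NR }'
}

# Interactive use would simply be: produce | count_with_backoff
# To capture the final total, turn the carriage returns into newlines
# and keep the last line.
total=$(produce | count_with_backoff | tr '\r' '\n' | tail -n 1)
echo "total: $total"
```

Updates are frequent while the count is small and become rarer as it grows, so the terminal is written to only a few dozen times even for very large inputs.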
25

You could use pv to give you a progress report:

cmd | pv -lbtr | wc -l
  • -l for line-based (reports the number of lines instead of bytes).
  • -b to report the number of bytes (well, lines here because of -l)
  • -t to report the time spent
  • -r to report the current rate (number of lines per second; see also -a for the average rate).

Beware that a file name can span several lines, so wc -l on the output of ls is not guaranteed to give you a file count unless you use an option like -b or -q, which render newline characters in file names as \n or ? respectively.
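
A small demonstration of that caveat, as a sketch only. It assumes an ls (GNU ls, for instance) that writes names verbatim when its output goes to a pipe; demo_dir and the file name are made up for the example.

```shell
# Scratch directory so the demo does not touch real files.
demo_dir=$(mktemp -d)

# Create ONE file whose name contains an embedded newline.
touch "$demo_dir/first line
second line"

# Plain ls writes the raw newline, so wc -l counts the single file twice.
raw_count=$(ls "$demo_dir" | wc -l)

# With -q the newline is shown as ?, so each file occupies exactly one line.
safe_count=$(ls -q "$demo_dir" | wc -l)

echo "ls | wc -l:    $raw_count"
echo "ls -q | wc -l: $safe_count"

rm -rf "$demo_dir"
```

Some ls implementations always replace unprintable characters, in which case both counts will agree.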

4
  • 4
    While your final warning is technically correct, it’s a vanishingly small edge case because it’s exceedingly difficult for a regular person to accidentally create a file with such a name, and most people have no need for multi-line filenames. It’s important to keep such cases in mind when coding, but quite often they’re just not worth worrying about when working with a shell interactively. Commented Nov 2, 2022 at 22:09
  • 5
    @AustinHemmelgarn, no need for an accident. It's very easy to create those files voluntarily if you're inclined to exploit bugs in code that incorrectly assumes file names can't contain newline characters. Commented Nov 3, 2022 at 11:05
  • 3
    @AustinHemmelgarn Not really difficult, happens to me all the time. Example: I open a PDF from the web, want to save it under a recognizable name, so I just copy the title from the PDF and paste it into the Save As dialog. Unfortunately, if the title in the PDF is split across multiple lines, the embedded newlines get copied into the filename.
    – TooTea
    Commented Nov 4, 2022 at 15:31
  • 2
    pv is a really powerful tool. Commented Nov 5, 2022 at 13:31
3

I used to use something like watch -n 1 your command. I'm not sure whether that is of any use in your case; I'm no guru, it's just the first thing that came to mind.

https://man7.org/linux/man-pages/man1/watch.1.html

watch - execute a program periodically, showing output fullscreen

-n, --interval seconds Specify update interval. The command will not allow quicker than 0.1 second interval, in which the smaller values are converted. Both '.' and ',' work for any locales. The WATCH_INTERVAL environment can be used to persistently set a non-default interval (following the same rules and formatting).

3
  • 5
    This doesn’t work, because the purpose here is to track a long-running, single instance of a command, not repeat a command periodically. Commented Nov 3, 2022 at 9:33
  • okay, thanks for clarification. one less dead end solution:)
    – hocikto
    Commented Nov 3, 2022 at 10:32
  • 4
    One could redirect the output into a file and then use watch wc -l file
    – allo
    Commented Nov 3, 2022 at 13:43
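
The file-based approach from the last comment could look roughly like this. It is a sketch under assumptions: slow_producer is a hypothetical stand-in for the long-running command, and a plain polling loop is used instead of watch so that it also runs non-interactively.

```shell
# Hypothetical stand-in for the long-running command: 3 lines, one per second.
slow_producer() {
    for i in 1 2 3; do
        echo "line $i"
        sleep 1
    done
}

out_file=$(mktemp)

# Run the command ONCE, in the background, writing its output to a file.
slow_producer > "$out_file" &
producer_pid=$!

# Poll the file while the command is still running. Interactively,
# running `watch -n 1 wc -l "$out_file"` in another terminal does the same job.
while kill -0 "$producer_pid" 2>/dev/null; do
    printf '%d\r' "$(( $(wc -l < "$out_file") ))"
    sleep 1
done
wait "$producer_pid"

# Exact total once the command has finished.
final_count=$(( $(wc -l < "$out_file") ))
printf '%d\n' "$final_count"

rm -f "$out_file"
```

Unlike watch your command, this runs the command only once; only the cheap wc -l is repeated.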
